Context is Invaluable, and Lets you Understand What your Security Event and Alert Information Mean.
In my previous column, I discussed how important it is to add context to data. Context applied to security information gives you contextual security. But context is only one step in maximizing the value of your security information. To get the biggest bang for the buck you need to be correlating your data. Adding context to data gives you information. Correlation adds even more information by evaluating relationships between pieces of information.
What does that mean?
Let’s use a non-security example. I dump a 3000-piece puzzle on the table. I tell you nothing about the puzzle, but you have to put it together. You can look at a piece, and add context to that piece. Is it a corner piece, a side piece, or a middle piece? Does the piece have a prong or a divot? (Okay, so I don’t know the formal “puzzle terms” for something that sticks out of a puzzle piece or a shape into which another piece plugs, so go with it.) You can look at the picture on the piece. Is that something red and round? Is that something shiny? All of these observations add context to the pieces, as well as the puzzle as a whole. You might group the pieces that have red on them, as well as the pieces that are shiny, to see if you can find anything in common or see a pattern. You then start assembling the frame of the puzzle by looking at the sides and corners.
But now you are correlating. When you look at how the pieces fit together, you are looking at the relationship between those pieces. That is correlation. Next, you look at the red pieces, and see how they fit together. After you assemble three or four pieces, you recognize that the red is a clown nose. Correlation gives you even more data. Then you assemble some shiny pieces and realize it is a shiny hubcap on a wheel. Correlation. And better yet, you can correlate those larger pieces of information together and realize that the puzzle probably includes a clown car. The correlation you have done helps you to recognize the giant daisy that squirts water, and the huge green shoe sticking out of the trunk. You can then look for other pieces that support or deny your initial thoughts. You utilize your correlated information to assemble multiple clowns, and the car. Contextual information enabled you to start building, but it was correlation that actually let you make progress on the puzzle. It will be correlation that lets you finish.
Of course, the same rules apply with information security. The context is invaluable, and lets you understand what your event and alert information means. But, the correlation of those events into a bigger picture of what is happening in your environment is even more important.
Back to Joe’s
Joe’s Hat, Boot, and Shoe Company has a pretty normal manufacturing environment. They understand where their cool data is. They have an IDS and monitoring system that helps them see ongoing security alerts, and have a dedicated management system to monitor the performance of their assembly lines. They even have some context when an IT engineer knows that 220.127.116.11 is their central order database, which is some of their most important data. But they really have no capacity to do meaningful correlation. If they do any correlation it is only because the IT engineer just knows that a series of systems are related, so recognizes when they all have alerts – but this is probably more luck than common practice.
Acme BigBoxStore (BBS) is a large retailer, with a significant online presence. Their overall architecture is quite similar to that of Joe’s. The single biggest difference in the way they manage security is the fact that BBS uses data correlation to help define and manage security events. This gives them much more control over the way they deal with things as they happen.
Each organization experiences the same set of events.
1. An external port scan.
2. A series of login failures on an external system.
3. A series of login failures on an internal system.
4. A login on a privileged database account.
5. Outbound traffic out of normal baseline.
Joe’s is probably not worried about the external port scan. They happen all of the time.
Joe’s probably also pretty much ignores the login failures on an external system. For that matter, they don’t see a successfully guessed password and a logon, since it doesn’t report as an event. Joe’s may show some concern over a series of login failures on an internal system, but this pretty much depends on the context of the event. If the login failures are on an internal website, it is likely that no one will care. If the login failures are on the FLOOR301 assembly line server, which only talks to an internal application that has the username and password stored in a database, Joe’s gets concerned because there should be no failed logins.
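The context-dependent handling of internal login failures can be sketched in a few lines of Python. This is only an illustration, not how Joe's actually does it: the asset table, the "intranet-web" host name, the "failures_expected" field and the choice to investigate unknown hosts are all assumptions made for the example. Only FLOOR301 comes from the scenario above.

```python
# Hypothetical asset inventory. Only FLOOR301 appears in the scenario;
# the other entry and the field names are invented for illustration.
ASSET_CONTEXT = {
    "intranet-web": {"role": "internal website", "failures_expected": True},
    "FLOOR301": {"role": "assembly line server", "failures_expected": False},
}


def triage_login_failures(host: str, failure_count: int) -> str:
    """Return 'ignore' or 'investigate' for login failures seen on a host."""
    if failure_count <= 0:
        return "ignore"  # nothing happened, nothing to triage
    context = ASSET_CONTEXT.get(host)
    if context is None:
        # Assumption: a host with no recorded context deserves a look.
        return "investigate"
    # A service account with a stored password should never fail to log in,
    # so any failure on such a host is worth investigating.
    return "ignore" if context["failures_expected"] else "investigate"


print(triage_login_failures("intranet-web", 4))
print(triage_login_failures("FLOOR301", 1))
```

The same raw event, a failed login, produces two different answers purely because of what is known about the host.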
The Joe’s I dealt with did not care about a privileged login, since they lacked context and their level of security paranoia was relatively low. If Joe’s sees the elevated traffic levels, it may be cause for concern, but for the most part, it is simply another event in a flood of other events. Keep in mind that Joe’s did not just get these five events. Joe’s got these five events along with another 3,000 or so events that evening. Chances are that IT staff at Joe’s are not alerted to anything.
Run the same 3,000 events through BBS. The rules start the same. BBS could not care less about the 17th external port scan run against them that week. BBS may also not be worried about a series of external login failures. But when those failures are immediately followed by a success, correlation kicks into action. Was this a user mistyping a username and/or password, or was this a successfully guessed password? At the very least, the correlation engine has this marked as “curious”. And it is marked curiouser and curiouser when the engine checks back in time and sees that 10 minutes earlier BBS had been port scanned. Suddenly, the port scan is not “just another port scan”. The correlation engine then evaluates context about the events. Can it see anything else interesting about them, such as whether the port scan and any of the login attempts came from the same source IP address? That adds context to the correlation.
Then, BBS sees the login failures internally. Given that this followed shortly after the suspicious external logins, this series of events is now marked with an elevated concern of something more like “interesting”. Their internal systems report the privileged account logon as a matter of course, and it is only really interesting if it falls in a reasonable time sequence in the series of events that the correlation engine is now tracking. Elevated outbound traffic volume would be the last straw. The correlation engine looked at 3,000 events, picked out a series of five, and determined that they were related, that they fit together like the corner of a puzzle.
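The chain of reasoning described for BBS can be sketched as a tiny ordered-sequence matcher in Python. This is a toy under stated assumptions, not a real correlation engine: the event kind names, the Event fields, the 30-minute gap and the sample IP address (from the documentation range) are all invented for the example, and the successful external login is modeled as its own step in the chain.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass(frozen=True)
class Event:
    time: datetime
    kind: str         # hypothetical event type label
    source: str = ""  # source IP address, when known


# The sequence from the scenario, in the order the engine expects it.
CHAIN = [
    "port_scan",
    "external_login_failures",
    "external_login_success",
    "internal_login_failures",
    "privileged_login",
    "outbound_traffic_anomaly",
]


def correlate(events, max_gap=timedelta(minutes=30)):
    """Return the events that complete CHAIN in order, or [] if none do.

    Each step must occur within max_gap of the previous matched step.
    """
    matched = []
    for event in sorted(events, key=lambda e: e.time):
        # A partial chain that has gone quiet for too long is dropped.
        if matched and event.time - matched[-1].time > max_gap:
            matched = []
        if event.kind == CHAIN[len(matched)]:
            matched.append(event)
            if len(matched) == len(CHAIN):
                return matched
        elif len(matched) == 1 and event.kind == CHAIN[0]:
            matched[0] = event  # keep only the most recent port scan
    return []


def scan_and_login_share_source(chain):
    """Extra context: did the port scan and the external login share an IP?"""
    return bool(chain) and chain[0].source != "" and chain[0].source == chain[2].source


t0 = datetime(2024, 1, 1, 22, 0)
minutes = lambda n: timedelta(minutes=n)
events = [
    Event(t0 - timedelta(days=3), "port_scan", "198.51.100.9"),  # old noise
    Event(t0, "port_scan", "203.0.113.7"),
    Event(t0 + minutes(5), "disk_warning"),                      # unrelated
    Event(t0 + minutes(10), "external_login_failures", "203.0.113.7"),
    Event(t0 + minutes(11), "external_login_success", "203.0.113.7"),
    Event(t0 + minutes(20), "internal_login_failures"),
    Event(t0 + minutes(25), "privileged_login"),
    Event(t0 + minutes(40), "outbound_traffic_anomaly"),
]
chain = correlate(events)
print([e.kind for e in chain], scan_and_login_share_source(chain))
```

A production engine would track many partial chains at once and score them rather than demand an exact sequence, but the core idea is the same: individually boring events become interesting once their order and timing line up.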
What happens next depends on how BBS has defined their security profile. At the very least, an internal alert is issued. The BBS that I know is prepared to act on their alerts, and they would terminate outbound traffic at the firewall.
The five events are obviously a dramatic oversimplification. So is the “five out of 3,000”. In reality, this could be thousands, and potentially millions, of events, depending on the exact environment of the client. Hence the term “correlation engine”. If your environment consists of six systems, and one IT guy knows them all, you may have your correlation right there. But if you are of any size, doing meaningful correlation manually is going to be more a matter of luck than skill.
But first, you need the data. Then you need to be able to transform that data into information by adding quality context. Lastly, you need to apply correlation to see how the pieces fit together. If you have all three, you can figure out the puzzle. If you think you are going to rely on manual processing and brute force, then I am pretty sure that I have an X-Acto knife and rubber mallet I can sell you.