Big Data Security Analytics: Building The War Room

After my last column complaining about the hype to delivery ratio in Big Data for security analytics, I seem to have convinced some people that I’m anti-Big-Data. That’d be like ordering the tide not to come in (and as far as we can tell, Cnut was misunderstood when he tried that too). Let me take the other side this time – what do we know about effective use of security analytics?

First, it works. Second, what it delivers may not be what you initially expect. You might almost come to see it as “the journey is its own reward”. The process of reaching for more sophisticated analytics forces some much-needed discipline in other areas that have been too long neglected.

Think of a classic World War II “war room” – at its center, a situational map on a table. People around the edges receive intelligence updates from different sources in the outside world, process them, and feed them to the people standing around the map, who update the positions of forces on the terrain. I think this is the vision for security analytics and big data. The trick, of course, is to remove most of the people – to set up machines to integrate all the diverse incoming data feeds and to update the positions on the map, keeping people around only to look at the visuals, think about the strategic implications, and act decisively. It’s a great vision for security operations; it’s just a little complicated in practice, because we tend to overlook some basic needs that must be met before we can build it.

First, we need the table – the map – on which all this action will occur. Seems obvious, right? How pointless would it be to just get the intel feeds and try to process “unit 14 is taking incoming fire”, without being able to see where unit 14 is, and the other information we know about terrain in the area? Can they get to higher ground? Is an air strike appropriate? Do we know which direction moves them towards or away from the enemy? All these are questions of context – context you don’t have if you don’t have a reasonably accurate scale model of the environment.

Distressingly, most organizations have no reasonably up-to-date map of their infrastructure – the terrain on which the modern cyber battle is fought. Nor do they have what military strategists call “force accounting” – knowing your own troops’ strengths, locations, and status. In a war, all of this is vital, but in a corporate environment, asset inventory just never gets the same cachet – it’s hard to stifle a yawn when the accounting team wants to put asset tags on machines, or figure out who owns what and what it’s for. Is it any wonder that information is perpetually being lost? (I don’t mean theft – I mean simple entropy in the record keeping of what your equipment is supposed to do.)

Then there’s the data quality problem – well known to military people, but often overlooked by those building out security risk management “war rooms”. How reliable is that intel coming in to your central location? How do you process the feeds, looking for contradictions or other suggestions that the data has faults? That is, you know the data has faults in principle, but what are you doing to address it? In classic war rooms, this is a highly skilled and ultimately human job – the “intelligence analyst”. It’s not going to be easy to automate this for big data security analytics, but we have already taken some good early strides.
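To make the contradiction-hunting idea concrete, here is a minimal sketch of the simplest possible automated cross-check: two hypothetical feeds each claim to know the operating system of a host, and we flag the IPs where the claims disagree. The feed names and record shapes are illustrative assumptions, not any specific product’s schema.

```python
# Minimal sketch of automated "intelligence analysis": cross-check two
# hypothetical data feeds that each report an OS per IP, and flag the
# IPs where the feeds contradict each other. All data is illustrative.

def find_contradictions(feed_a, feed_b):
    """Return IPs where both feeds report a value, but the values differ."""
    conflicts = {}
    for ip in feed_a.keys() & feed_b.keys():  # only IPs both feeds know
        if feed_a[ip] != feed_b[ip]:
            conflicts[ip] = (feed_a[ip], feed_b[ip])
    return conflicts

scanner_feed = {"10.0.0.5": "Linux", "10.0.0.9": "Windows"}
agent_feed   = {"10.0.0.5": "Linux", "10.0.0.9": "FreeBSD"}

print(find_contradictions(scanner_feed, agent_feed))
# {'10.0.0.9': ('Windows', 'FreeBSD')}
```

A real deployment would weigh feeds by trust level and freshness rather than treating every disagreement equally, but even this naive pass surfaces the records a human analyst should look at first.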

The next issue to think about is unit management. In a war room, you need to know whether the units on the map are strong or weak, ready or in disarray. The same is true in security analytics – you need up-to-date status information. Asset management alone can’t get you there – you need automated rules to assess readiness in meaningful ways.
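One way to read “automated rules to assess readiness” is as a set of predicates evaluated over each asset record. The sketch below assumes hypothetical field names and thresholds purely for illustration; an asset is “ready” only when every rule passes, and the function reports which rules fail so you know why a unit is in disarray.

```python
# Sketch of automated readiness rules: each rule is a named predicate
# over an asset record; an asset is "ready" only if all rules pass.
# Field names and thresholds are illustrative assumptions.

RULES = {
    "patched recently":  lambda a: a["days_since_patch"] <= 30,
    "endpoint agent up": lambda a: a["agent_running"],
    "owner on record":   lambda a: a["owner"] is not None,
}

def assess(asset):
    """Return the list of rule names the asset fails (empty list = ready)."""
    return [name for name, rule in RULES.items() if not rule(asset)]

asset = {"days_since_patch": 90, "agent_running": True, "owner": None}
print(assess(asset))
# ['patched recently', 'owner on record']
```

Keeping rules as data rather than hard-coded logic makes it cheap to add new readiness criteria as your “war room” matures.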

Warfare and security would both be much easier if this were all we needed – the terrain (context) and the asset readiness/hardening info. Unfortunately, it’s not, but then, that’s why a war room is built specifically to put all the data together. It’s not good enough to know the state of your individual units; you also have to look at the whole, because it truly is more than the sum of the parts. Fifteen units, all reporting they are battle-hardened and ready to go, but split up and disorganized over a wide area, are very different from the same units drawn up in a tight battle-line formation. So you have to look at the system as a whole, not just the elements. In IT security, this is where the technical effort gets really quite complex, but it’s also where a lot of people tend to skip over the details – the lazy myth of “pile up the data, and I’m sure enlightenment will just follow”. Unfortunately, there are even vendors willing to play into that – to provide good “checklist thinking” or ways to manage individual elements, but no coherent way to look at the system as a whole. (If you’re thinking “Maginot Line”, you get 10 points for predicting the line of thought, but -10 for being an overeducated history geek. Apparently the idea that the generals “forgot” to complete the defensive wall around France, leaving a gap in the Low Countries, is a gross over-simplification. It’s a pity – it would make a great analogy for modern IT security defensive gaps otherwise!)

Hopefully the war room idea is helpful as a way to think through the requirements to achieve meaningful security analytics. To build that strategic “war room” level view of your real security situation, you have to focus initial investments on rather more basic items, such as a table. No, I don’t mean wood-based security investments (although some of our security “dashboards” could use a little more veneer and polish) – I mean the map, the basic terrain on which you are going to engage this stealthy, rapacious, and fast-moving enemy. Building this map may be both more difficult, and more enlightening and valuable, than you anticipate going in. Fortunately, automation can help, but it can’t solve political inability to obtain the data!

Aiming for achievable goals will also help build success for the further layers I’m suggesting in the war room analogy. You’re going to start combining feeds of data to put assets on the map. The first thing you’ll find as you combine the feeds is they disagree, and even contradict each other. Learning to handle this is another discipline that is both surprising, and seriously valuable. Fortunately, techniques do exist to turn these distortions into valuable signal – identifying, for example, where your scanner says “there are troops (aka assets) here”, but there’s no such “here” on your table, your map of the network. (The opposite happens too – it’s much easier to realize “hey, I have no record of any hosts in this large part of my network” if you can correlate the data you’ve got with the map.)
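Both reconciliation problems above – troops reported where the map shows nothing, and map regions where no troops were observed – reduce to simple set comparisons once the feeds and the map share a common key. This sketch assumes a toy inventory and scan feed; the addresses and subnets are made up for illustration.

```python
# Sketch of reconciling a scan feed against the network "map": find
# hosts the scanner saw that the inventory has no record of, and
# inventory subnets where the scanner saw nothing at all.
import ipaddress

inventory = {"10.0.0.7", "10.0.0.8"}             # hosts we think we have
inventory_subnets = ["10.0.0.0/28", "10.0.1.0/28"]  # the "map"
scanned = {"10.0.0.7", "10.0.0.99"}              # hosts the scanner saw

# Troops reported "here", but no such "here" on the map:
unknown_hosts = scanned - inventory

# Parts of the map where no troops were observed at all:
dark_subnets = [
    net for net in inventory_subnets
    if not any(ipaddress.ip_address(h) in ipaddress.ip_network(net)
               for h in scanned)
]

print(unknown_hosts)   # {'10.0.0.99'}
print(dark_subnets)    # ['10.0.1.0/28']
```

The interesting engineering is in building that common key (IP, hostname, and MAC rarely line up cleanly across feeds), but the payoff is exactly the two lists of surprises the column describes.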

After these two levels – gathering and mapping your assets – you can start asking more significant questions, like “which devices are properly hardened?”, and more interestingly, because there will always be devices that aren’t ready, “where are they and where do they lead?”. Checklists are a good start, but checklist results are far more valuable if you can see them in context of your “war room” map.
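The “where do they lead?” question is naturally a graph problem: once the map records which segments can talk to which, you can walk outward from each unhardened device and see whether a critical asset is in reach. The sketch below uses a tiny hypothetical adjacency graph and breadth-first search; the device names and topology are invented for illustration.

```python
# Sketch of putting checklist failures in context: given a simple
# network adjacency graph and a set of unhardened devices, find which
# critical assets each unhardened device can reach. Data is illustrative.
from collections import deque

graph = {
    "dmz-web": ["app-1"],
    "app-1":   ["db-1"],
    "db-1":    [],
    "printer": [],
}
unhardened = {"dmz-web", "printer"}
critical = {"db-1"}

def reachable(start):
    """Breadth-first search: the set of nodes reachable from `start`."""
    seen, queue = {start}, deque([start])
    while queue:
        node = queue.popleft()
        for nbr in graph.get(node, []):
            if nbr not in seen:
                seen.add(nbr)
                queue.append(nbr)
    return seen

exposed = {dev: sorted(reachable(dev) & critical) for dev in unhardened}
print(exposed)
```

Here the unhardened web server can reach the database while the unhardened printer reaches nothing critical – the same checklist failure, but very different priorities once seen on the map.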

Understanding the whole end to end picture is key. Once you have that, you can both understand your real situation, and also engage in advanced cyber war-gaming, simulating scenarios and preparing responses. It’s a great way to work, but it’s only possible once the map and the assets on it correspond to the real world, out there beyond the security team’s bunker.

Security analytics is a great idea – it’s flavor of the month in the industry, and it’s a great direction. We’ve got tons of data – how can we best make use of them? My advice is to think seriously about “terrain mapping” and “force accounting” – two military terms that roughly translate into “know what and where your stuff is” and “know whether you’re ready”. There are people ready to sell you all manner of intelligence feeds, but what use are they if you can’t pull them into a war room and correlate them with your real situation?

Dr. Mike Lloyd is Chief Technology Officer at RedSeal Networks. He has more than 25 years of experience in the modeling and control of fast-moving, complex systems. He has been granted 20 patents on security, network assessment, and dynamic network control. Before joining RedSeal, Dr. Lloyd was CTO at RouteScience Technologies (acquired by Avaya), where he pioneered self-optimizing networks. Lloyd was previously principal architect at Cisco on the technology used to overlay MPLS VPN services across service provider backbones. He joined Cisco through the acquisition of Netsys Technologies. He holds a degree in mathematics from Trinity College, Dublin, Ireland, and a PhD in stochastic epidemic modeling from Heriot-Watt University, Edinburgh, Scotland.