Machine Learning, or Artificial Intelligence as it is sadly and erroneously being marketed, is all the rage right now. We are being promised a brand new emerging world where digital minions jump at our every whim to fulfil our dreams and wishes. It even promises to do away with pesky employees and their meat-body demands and expectations.
This is, of course, just the latest in a stream of hyped-up marketing that overpromises and underdelivers. The more sober truth right now is that Artificial Intelligence is like the dancing bear at a circus. We are not fascinated because the bear dances well, because it doesn't. We are fascinated that the bear dances at all.
To underline this point, Gartner recently compared the craze for every company to pretend to be an AI company to the "greenwashing" that led every company to brand itself as environmentally conscious and friendly. If you look at Gartner's Hype Cycle for Emerging Technologies, though, Machine Learning is at the Peak of Inflated Expectations. As anyone familiar with Gartner's Hype Cycles knows, after that comes the Trough of Disillusionment. And as anyone familiar with the history of AI knows, we've been here before, several times. Each time, AI was unable to deliver what it promised and suffered through lost decades of scarce funding, the so-called AI winters. We are now on the 5th generation of AI.
Of course, that is not to say that the current batch of machine learning technologies does not provide any benefits at all. It is more a case of understanding what sort of problems they are suited to solve and where the limitations lie. Sadly, that requires a PhD in Mathematics, Physics or, well, Artificial Intelligence. If you've got one of those, congratulations, there may be a career opportunity available to you in advising organizations on where to invest.
For the rest of us mortals though, there are some basic questions you can ask any vendor that is touting their ML capabilities to understand whether their machine is actually learning something useful, or just wasting CPU cycles to look cool.
What does your machine learn?
This question may seem obvious, but you would be surprised how many vendors struggle to provide a coherent answer. First of all, does it learn anything at all? I mention this specifically because, in my time at Gartner, I saw a number of vendors that erroneously marketed more basic analytical and computational approaches, such as statistical analysis or correlation, as Machine Learning. Secondly, if it is learning something, how is that learning remembered, and how does it replay what it has learned?
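To make that distinction concrete, here is a sketch of the kind of thing that is often mislabelled as ML. The numbers and the three-sigma threshold are made-up illustrations, but the point stands: this is descriptive statistics. Nothing is learned, remembered, or refined from feedback.

```python
import statistics

# Daily login counts for a user (illustrative numbers).
history = [42, 38, 45, 40, 44, 39, 41]
mean = statistics.mean(history)
stdev = statistics.stdev(history)

def is_anomalous(count, threshold=3.0):
    """Flag values more than `threshold` standard deviations from
    the historical mean. Pure statistics: no model is trained and
    nothing updates itself as new data arrives."""
    return abs(count - mean) / stdev > threshold

print(is_anomalous(43))   # -> False, within normal variation
print(is_anomalous(400))  # -> True, flagged
```

A vendor shipping exactly this and calling it "machine learning" is the pattern the paragraph above warns about.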
Where does it learn it?
Does it learn it in a lab or in your environment? The former means that the model or analysis will not be based on your unique profile, whether that is your data, infrastructure or environment. It also means that the machine will not automatically be aware of or adapt to changes in the data it is analysing specifically for you. The latter means that the machine and the resulting models or analysis will be based on your specific circumstances and is learning to adapt to your organization.
A hybrid of the two can be entirely acceptable. Some problems that ML can be applied to have multiple components: for example, first learning how to detect attacks in general, and then learning from your environment to establish a normal baseline. But something that learns only in a lab has two inherent weaknesses. Unless it tackles a very specific and narrow problem (recognizing cats, for example), it will struggle to deal with the variety and complexity of real-world data, and you have very little control over and insight into what it has learned, rendering it a black box with no way for you to evaluate the inputs.
There is an additional dimension to this: does it learn in situ in your environment, or does it send the data to the cloud for learning? The latter has obvious implications for data privacy and compliance.
How does it learn it?
Any vendor claiming to do machine learning should be able to provide you with a high-level overview of which ML approaches their implementation follows. Supervised, Unsupervised and Reinforcement Learning are the keywords here, as well as the high-level algorithmic descriptions. If they can't pinpoint this using one of the many excellent cheat sheets that are available (see here and here for example), my advice is to run a mile. Knowing which approach(es) a vendor is using will at best permit further research and evaluation of whether this is the correct algorithm for the problems they are trying to solve; at worst, it shows that the vendor has at least a modicum of knowledge of what their own ML is doing.
Some vendors may excuse themselves from this question by saying it is proprietary or a trade secret. That is a poor excuse: you are not asking for the mathematical details of how the algorithms have been implemented, and I have not come across a single vendor who has invented a new and unknown approach. A serious and credible vendor will tell you. Many vendors have adapted open-source algorithms available off the shelf and achieved similar results to those that claim to have a supernatural proprietary implementation. If someone claims to have improved upon those, ask them how.
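For reference, "supervised" simply means learning from labelled examples. A minimal illustration is a one-nearest-neighbour classifier; the feature names and data below are invented for the sketch, not any vendor's actual model.

```python
import math

# Labelled training data (made-up): (bytes sent, login hour) -> label.
training = [((200, 9), "normal"), ((250, 10), "normal"),
            ((90000, 3), "attack"), ((120000, 2), "attack")]

def classify(point):
    """Supervised learning at its simplest: predict the label of
    the nearest labelled training example (1-nearest-neighbour)."""
    _, label = min(training, key=lambda ex: math.dist(ex[0], point))
    return label

print(classify((300, 11)))   # -> normal
print(classify((80000, 4)))  # -> attack
```

The "learning" here is nothing more than storing the labelled examples, which is exactly why the question of what a vendor's machine remembers, and how, is worth asking.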
Why does it learn it?
Why that specific set of data? Why that specific approach? There are any number of different algorithms and approaches that can be applied. Cluster Analysis, for example, permits the grouping of a set of objects based on related or similar attributes, so it could be used to automatically identify groups of hosts or users that have a similar function or configuration, and also to determine variations and differences between those groups.
Any vendor should be able to explain why they selected that specific approach, and why they chose it over another. This could be based on accuracy over speed or vice versa, in which case they need to be able to explain why accuracy is less important, or why speed has a higher priority.
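The cluster-analysis example mentioned above can be sketched in a few lines. This is a deliberately minimal k-means with naive initialization, on invented host attributes; a real implementation would need better seeding and a convergence check.

```python
import math

def kmeans(points, k, iters=20):
    """Minimal k-means: assign each point to its nearest centroid,
    then move each centroid to the mean of its assigned points."""
    centroids = points[:k]  # naive init: first k points as centroids
    for _ in range(iters):
        groups = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda i: math.dist(p, centroids[i]))
            groups[nearest].append(p)
        # recompute each centroid; keep the old one if its group is empty
        centroids = [
            tuple(sum(c) / len(g) for c in zip(*g)) if g else centroids[i]
            for i, g in enumerate(groups)
        ]
    return groups

# Hypothetical host attributes: (open ports, mean daily connections).
hosts = [(2, 40), (3, 55), (2, 48),          # workstation-like
         (25, 900), (30, 1100), (28, 950)]   # server-like
for group in kmeans(hosts, 2):
    print(group)
```

On this toy data the workstations and servers separate into two groups, the kind of "hosts with a similar function or configuration" grouping described above.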
What does it solve?
This, of course, is the million-dollar question. A few simple criteria are the following:
Does it solve a problem that would be impossible to solve with less sophisticated means, or that would be unfeasible or inefficient to solve any other way?
This question is important and intended to identify gratuitous machine learning, ML for the sake of ML. The best example I can come up with is a use case many UEBA vendors routinely show off: identifying whether a user has logged in from two different locations at the same time. This does not actually require machine learning to solve. A simple correlation is sufficient (User A is using IP addresses X and Y to access system Z), and some SIEMs have been able to do this for years.
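That correlation really is a few lines of code. The sketch below flags a user hitting the same system from two different IP addresses within a time window; the field names and five-minute window are illustrative assumptions, not any SIEM's schema.

```python
from collections import defaultdict
from datetime import datetime, timedelta

def simultaneous_logins(events, window=timedelta(minutes=5)):
    """Flag (user, system) pairs seen from more than one IP within
    `window`. Plain correlation: no model, no training, no learning."""
    seen = defaultdict(list)   # (user, system) -> [(time, ip), ...]
    alerts = []
    for e in sorted(events, key=lambda e: e["time"]):
        key = (e["user"], e["system"])
        # drop events for this key that have aged out of the window
        seen[key] = [(t, ip) for t, ip in seen[key] if e["time"] - t <= window]
        for _, ip in seen[key]:
            if ip != e["ip"]:
                alerts.append((e["user"], e["system"], ip, e["ip"]))
        seen[key].append((e["time"], e["ip"]))
    return alerts

events = [
    {"user": "A", "system": "Z", "ip": "10.0.0.1",
     "time": datetime(2018, 1, 1, 9, 0)},
    {"user": "A", "system": "Z", "ip": "192.168.1.7",
     "time": datetime(2018, 1, 1, 9, 2)},
]
print(simultaneous_logins(events))
# -> [('A', 'Z', '10.0.0.1', '192.168.1.7')]
```

If a vendor needs a neural network to produce that alert, the machine learning is decorative.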
More importantly, does it solve more than one problem, or is it the machine learning equivalent of Rain Man? Considering the cost in money, time, effort and architecture, if it only solves a single problem, that problem had better be a major one, like curing cancer.
Generally, my advice would be to be cautious of how a vendor markets itself. There are no machine learning vendors; there are only vendors that apply machine learning to solve specific problems. User and Entity Behavior Analytics is a great example where an entire market was erroneously named after the technique it uses to address specific use cases, rather than the use cases themselves. This has led to much confusion and should be a lesson for marketers. You can solve aspects of incident response, advanced threat detection, hunting and investigation using machine learning. Machine learning by itself solves nothing without being applied to distinct problems.