Have you ever stopped to think about how you go about deciding whether to try a new restaurant that you’ve never been to? Even if you don’t realize what you are doing, when you make this decision, you are likely collecting data around a number of different criteria, analyzing those data points, and then using that analysis to make a decision. Some of the criteria you evaluate might include:
● Does the restaurant serve the type of food that I want to eat?
● Is the restaurant located conveniently for me?
● Do the hours suit the time I want to eat?
● Am I willing to pay what the restaurant charges?
● Does the restaurant have good reviews?
● Is the restaurant clean?
These are just a few potential data points that a person might evaluate when deciding on whether to try a new restaurant. There are, of course, numerous other ones. Regardless of which data points are important to the decision maker, it is likely that the number of data points is somewhere between five and 10.
One or two data points would not be sufficient. For example, if I base my decision only on whether the restaurant has good reviews and serves the type of food that I want to eat, I may be disappointed when I show up and find it closed, or when I receive the bill and find out that it costs three times what I wanted to pay.
On the other hand, having 500 data points doesn’t make the decision-making process any easier either. Imagine if in addition to the six data points above, I had another 494 that I needed to evaluate. It would completely overwhelm me, and I would be unable to make effective use of nearly all of those data points.
I believe that this restaurant-choosing example can teach us a valuable lesson about better fraud decision-making.
If we think about it, detecting fraud is not about making a binary decision. If I look outside, either it is raining or it is not. That is binary. Fraud, on the other hand, involves probabilistic decision-making. In real time, I can be 10%, 50%, or 90% certain that something is fraud, though it is almost never the case that I can be 100% certain. Sure, I can be 100% certain that something was fraud long after it happened, but not in the moment as it is happening.
The reason for this is very simple. Fraud is business logic abuse. It is about using legitimate applications for fraudulent purposes. In other words, we are looking to understand the intent of the user as they interact with the application and journey through their session. That is not something that the traffic itself can tell us. We need to look beyond the traffic and understand the behavior of the user in the session, the resources they are requesting, and the device(s) and environment(s) from which they are operating.
Only when we have that deeper understanding, one that goes beyond the application layer data, can we make better decisions around fraud. When we think about it, that should be one of our main goals when it comes to improving our fraud programs: to improve the probability that we are making the right decision, both when identifying fraud and when identifying legitimate traffic.
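To make the idea of probabilistic decision-making concrete, here is a minimal Python sketch. It assumes that some upstream process has already produced an estimated fraud probability for a session, and it maps that estimate to one of three actions. The function name `decide`, the action labels, and both thresholds are hypothetical; real values would need to be tuned against actual outcomes.

```python
def decide(p_fraud: float, review_at: float = 0.5, block_at: float = 0.9) -> str:
    """Map an estimated fraud probability to an action.

    The thresholds are illustrative assumptions, not recommendations.
    """
    if not 0.0 <= p_fraud <= 1.0:
        raise ValueError("p_fraud must be between 0 and 1")
    if p_fraud >= block_at:
        return "block"
    if p_fraud >= review_at:
        return "review"
    return "allow"


# The three certainty levels mentioned earlier land in three different actions.
for p in (0.10, 0.50, 0.90):
    print(f"{p:.0%} -> {decide(p)}")
```

The point of the middle band is that an uncertain estimate need not force a yes-or-no answer in the moment.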
What are a few ways in which fraud teams can improve their probabilistic decision-making around fraud? While not an exhaustive list, here are a few ideas:
● Examine behavior: How is the user behaving in the session? Are they cutting and pasting fields that are typically typed by legitimate users? Are they using unusual key combinations? These and other behavioral data elements provide valuable insight into the user’s behavior in the session that can be leveraged to make better decisions around fraud.
● Look closely at requests: What resources is the user requesting? Latch onto the user’s journey through the session and understand whether that user is requesting any unusual, high-risk, or suspect resources that aren’t typically accessed. This is another way to strengthen fraud decisions.
● Understand devices: Who is this user? What device or devices are they coming from? Is what we are seeing expected? Fraudsters often have favorite devices that they use to access many accounts. Or, in some cases, fraudsters may try to access one account from a number of different devices. There are many other device-based examples as well. In any of these cases, reliable device identification, combined with an understanding of how to look for anomalous and unexpected activity from devices, can provide fraud teams with a tremendous amount of insight.
● Understand environments: Sometimes, something isn’t quite right about the environment that a user is coming from. Maybe it is a hosting or cloud environment, which is unusual for the typical home user. Or, perhaps there are mismatches between claimed versus actual environmental settings such as timezone offsets, character sets, languages, and other clues that something is off.
● Value quality over quantity: When looking to understand intent and improve decision-making around fraud, it isn’t necessarily the case that more is better. In fact, I would say that less is more when it comes to fraud. A small data set of highly actionable, reliable, valuable, and insightful intelligence elements can help the fraud team improve decision-making far more than several hundred data fields that provide little to no insight or context.
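As a rough illustration of how a handful of signals like the ones above might feed a single probability estimate, here is a small Python sketch. It adds a weight for each signal that is present to a baseline log-odds value and converts the total to a probability. Everything specific in it is an assumption made for illustration: the signal names, the weights, the baseline, and the choice of a simple additive log-odds model. In practice these would be derived from labeled historical data.

```python
import math
from typing import Mapping

# Hypothetical signals and weights (log-odds contributions); illustrative only.
SIGNAL_WEIGHTS = {
    "pasted_typically_typed_field": 1.2,   # behavior
    "unusual_resource_requested": 0.9,     # requests
    "device_seen_on_many_accounts": 1.6,   # devices
    "hosting_or_cloud_network": 1.1,       # environment
    "timezone_mismatch": 0.8,              # environment
}

# Assumed baseline: with no signals present, fraud is rare.
BASELINE_LOG_ODDS = -4.0


def fraud_probability(signals: Mapping[str, bool]) -> float:
    """Combine the signals that are present into one probability estimate."""
    log_odds = BASELINE_LOG_ODDS + sum(
        weight
        for name, weight in SIGNAL_WEIGHTS.items()
        if signals.get(name, False)
    )
    return 1.0 / (1.0 + math.exp(-log_odds))


session = {"device_seen_on_many_accounts": True, "timezone_mismatch": True}
print(f"estimated fraud probability: {fraud_probability(session):.3f}")
```

Keeping the list to five signals is deliberate. Each one should be something the fraud team can explain and act on, and no single signal is enough to push the estimate very high on its own.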
The above points can help fraud teams gain an increased understanding of what is happening in the user session. Insight into any one data point alone won’t help us understand the bigger picture, but together they help us assemble the puzzle and understand intent. That understanding of intent allows us to improve our decision-making around fraud. We gain more certainty when it comes to determining whether a given session and certain transactions are fraudulent or legitimate. This, in turn, increases our true positive rates and decreases our false positive rates.
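For readers who want to track those two rates, here is a short Python sketch of how they are computed from confirmed outcomes. The function name `detection_rates` and the sample data are made up for illustration. The definitions themselves are the standard ones: the true positive rate is the share of fraudulent sessions that were flagged, and the false positive rate is the share of legitimate sessions that were flagged.

```python
from typing import Sequence, Tuple


def detection_rates(
    is_fraud: Sequence[bool], was_flagged: Sequence[bool]
) -> Tuple[float, float]:
    """Return (true positive rate, false positive rate) for a set of sessions."""
    if len(is_fraud) != len(was_flagged):
        raise ValueError("inputs must have the same length")
    pairs = list(zip(is_fraud, was_flagged))
    tp = sum(1 for fraud, flagged in pairs if fraud and flagged)
    fn = sum(1 for fraud, flagged in pairs if fraud and not flagged)
    fp = sum(1 for fraud, flagged in pairs if not fraud and flagged)
    tn = sum(1 for fraud, flagged in pairs if not fraud and not flagged)
    tpr = tp / (tp + fn) if (tp + fn) else 0.0
    fpr = fp / (fp + tn) if (fp + tn) else 0.0
    return tpr, fpr


# Made-up example: five sessions with outcomes confirmed after the fact.
confirmed_fraud = [True, True, False, False, False]
flagged_live = [True, False, True, False, False]
tpr, fpr = detection_rates(confirmed_fraud, flagged_live)
print(f"true positive rate: {tpr:.2f}, false positive rate: {fpr:.2f}")
```

Because certainty about fraud usually arrives only after the fact, the `is_fraud` labels here would come from later confirmation, not from the moment of the decision.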