On May 25, 2018, two years after it was adopted by the European Union, the General Data Protection Regulation (GDPR) came into force. For those two years, companies were bombarded with offers of GDPR solutions from security firms, and publications with surveys claiming that only n% of firms are ready for, or even understand, GDPR.
In truth, however, the ‘data protection’ element of GDPR differs little from pre-existing European law. The real changes GDPR brings lie in the way user data is gathered, stored, processed, and made accessible to users; in breach disclosure; and in the severity of non-compliance fines.
That said, companies can learn from last year’s data protection non-compliance incidents to gain insight into next year’s potential GDPR non-compliance fines. One source is the statistics available from the Information Commissioner’s Office (ICO — the UK data protection regulator).
The ICO’s latest ‘Data security incident trends’ report was published on 14 May 2018. During Q4, the ICO levied just a single fine: £400,000 on Carphone Warehouse Ltd “after serious failures put customer data at risk.” There were, however, a total of 957 reported data security incidents. The ICO describes these as “a major concern for those affected and a key area of action for the ICO.”
An analysis of those incidents is revealing. Healthcare — a major worldwide criminal target for extortion and theft of PII — reported a total of 349 data security incidents in Q4. The most common incidents were not technology-related: 121 incidents involved data posted or faxed to the wrong recipient, or the loss or theft of paperwork.
The most frequent technology-related incidents were not down to hacking, but to simple email failures (49) involving data sent to the wrong recipient, or a failure to use BCC when sending email. There is, in short, an easily overlooked backdoor into GDPR non-compliance.
Data sent to the wrong recipient is commonly addressed by data labeling and data loss prevention (DLP) technologies. One problem is a high rate of both false positives and false negatives. Employees charged with labeling the data they generate frequently ‘over-label’; that is, they mark non-sensitive data as ‘sensitive’ out of an abundance of caution, which hampers workflows and wastes time. Conversely, genuinely sensitive data can remain unlabeled and still be sent to the wrong address.
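To see why static rules produce both kinds of error, consider a minimal sketch of a keyword-based sensitivity check. The patterns and messages below are hypothetical illustrations, not how any real DLP product works:

```python
import re

# Hypothetical, minimal rules-based DLP check: flag an email body as
# 'sensitive' if it matches any keyword or data pattern. Real DLP
# products are far more sophisticated; this only illustrates why
# static rules generate false positives and false negatives.
SENSITIVE_PATTERNS = [
    re.compile(r"\bconfidential\b", re.IGNORECASE),
    re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),  # US SSN-like pattern
]

def is_flagged(body: str) -> bool:
    """Return True if any sensitivity rule matches the message body."""
    return any(p.search(body) for p in SENSITIVE_PATTERNS)

# False positive: a harmless phrase trips the keyword rule.
print(is_flagged("Please keep the party confidential until Friday!"))  # True

# False negative: genuinely sensitive data in an unanticipated format.
print(is_flagged("Patient NHS number: 485 777 3456"))  # False
```

The rules can always be extended, but every new pattern risks new false positives, which is precisely the trade-off the article describes.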
In September 2017, The National Law Journal reported, “Wilmer, Cutler, Pickering, Hale and Dorr was caught Wednesday in an email mix-up that revealed secret U.S. Securities and Exchange Commission and internal investigations at PepsiCo, after a Wilmer lawyer accidentally sent a Wall Street Journal reporter privileged documents detailing a history of whistleblower claims at the company.” This was not just an embarrassment; had it involved any EU data, it would have been a serious breach of GDPR.
(While writing this article, the author received an email from a major cybersecurity vendor: “You may have accidentally received an email from us yesterday with the subject line “SUBJECT LINE”. Our server had a bad moment and sent the email to wrong people.” This was a benign error — but it could have been serious, and it further illustrates the problem.)
One start-up — UK-based Tessian — is seeking to close the email GDPR backdoor using machine learning. “What we’re doing,” co-founder and CEO Tim Sadler told SecurityWeek, “is helping organizations protect against the human threats. At our core, we prevent organizations sending highly sensitive emails to the wrong people.”
The difficulty with the email problem is that it doesn’t lend itself to a traditional rules-based solution — email is used too frequently, too easily, with too many subjects and to too many people. “The approach we have taken is machine learning,” explained Sadler. “We analyze historical communications patterns to understand the kind of information that is shared with different people in the user’s network. On outgoing emails we understand anomalies. We understand that it is unusual that this data is shared with that contact. This is an approach we have not seen elsewhere, but it is one that works very effectively.”
He claims that within 24 hours of analyzing a user’s email logs, a baseline of ‘normality’ can be produced. Anomalies against that baseline are flagged. Users are kept on board by being fully involved: flagged emails aren’t simply blocked. A full explanation of the system’s decision is relayed to the user, who can accept or override it; the user’s response is then added to the system’s machine learning knowledge. Using credit card fraud as an analogy, he said, “We don’t just block the card because of anomalous behavior, we explain why, we ask the user if he wants to unblock it — and we learn from the process.”
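The workflow Sadler describes — baseline, flag, explain, learn from the override — can be sketched in simplified form. All names and data below are hypothetical; Tessian's actual system models content and communication patterns, not merely recipient addresses:

```python
from collections import defaultdict

class RecipientBaseline:
    """Toy model of the flag-and-learn loop described above:
    build a per-sender baseline of historical recipients, flag mail
    to a recipient outside that baseline, and fold the user's
    override decision back into the baseline."""

    def __init__(self):
        self.known = defaultdict(set)  # sender -> recipients seen before

    def train(self, sender: str, recipient: str) -> None:
        self.known[sender].add(recipient)

    def check(self, sender: str, recipient: str) -> bool:
        """Return True if this recipient is anomalous for this sender."""
        return recipient not in self.known[sender]

    def user_override(self, sender: str, recipient: str) -> None:
        # User confirmed the flagged email was legitimate: learn from it.
        self.train(sender, recipient)

model = RecipientBaseline()
# Build a baseline from historical email logs (here, toy data).
for rcpt in ["alice@corp.com", "bob@corp.com"]:
    model.train("me@corp.com", rcpt)

print(model.check("me@corp.com", "bob@corp.com"))   # False: normal
print(model.check("me@corp.com", "bob@c0rp.com"))   # True: flagged
model.user_override("me@corp.com", "bob@c0rp.com")  # user accepts anyway
print(model.check("me@corp.com", "bob@c0rp.com"))   # False: learned
```

The essential point is the feedback loop: the system does not silently block, and every user decision updates the baseline, just as in Sadler's credit card analogy.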
The company was founded in 2013 by Tim Sadler, Ed Bishop and Tom Adams, and was originally known as CheckRecipient. In April 2017 it raised $2.7 million in seed funding, bringing its total seed funding to $3.8 million. The company was rebranded as Tessian in February 2018, a change that partly reflects its evolving and growing ambitions.
“Our belief at Tessian,” Sadler told SecurityWeek, “is that organizations’ security has moved on from perimeter firewalls, and even endpoint security. I think we are in a third phase here, where humans are the real endpoints of the organization. If you look at how hackers try to break into a company, they’re not so much hacking devices as hacking the humans.

“We are focused on building security for the human endpoint. In short, we are thinking not just about outbound email threats, but also inbound email threats; and in going beyond that to understand what are the other ways in which humans leak data within an enterprise.”
Sadler declined to go into details on Tessian’s future road map, but it is probably fair to say that a machine learning solution to BEC and general phishing threats is on the drawing board. For now, Tessian is almost alone in bringing machine learning to an email problem that, the historical data suggests, is likely to prove a major and often overlooked threat to GDPR compliance.