SecurityWeek

AI-based Document Classification Firm Concentric Emerges From Stealth

Concentric Emerges from Stealth with AI Document Classification Product and $7.5 Million Seed Funding

Unstructured documents — especially those that have been given wrong or no sensitivity classification — are among the most difficult assets for any enterprise to track and secure. Problems come from staff inappropriately sharing and insecurely storing documents. Ensuing threats go beyond the compliance concern of leaking personal data, and include the danger of sensitive commercial data falling into the wrong hands.

San Jose, California-based Concentric has emerged from stealth with the availability of a new deep learning solution called Semantic Intelligence. It uses language analysis to determine the sensitivity of individual documents, helping to solve and prevent this problem. At the same time, Concentric has raised $7.5 million in seed funding from Clear Ventures, Engineering Capital, Homebrew and Core Ventures. Concentric was founded in 2018.

In a separate report (PDF) published January 29, 2020, Concentric provides the results of analyzing 26 million unstructured documents from companies in the technology, financial and healthcare sectors. It found that each company has just short of 10 million unstructured documents, and that each employee owns almost 2,000 documents. Of these, 253 documents per employee are business critical, and 38 of those are at risk. Over 627,000 source code files and over 1 million trading files were also found.

Concentric did not simply find files that were at risk; it found files that had actually been put at risk. Per employee, five business critical documents were erroneously shared with an external party, 21 were improperly shared with other groups, nine were erroneously shared with internal users, and three were wrongly classified.

Manual classification of this volume of documents requires extensive staff training and is prone to error. Manual classification done in arrears is so costly and time-consuming that it is a project often delayed, sometimes indefinitely. Existing automated rule-based methods that search documents for key words or phrases lead to large numbers of false positives, causing many documents to be over-classified and reducing the general availability of data to the company.

Concentric brings deep learning language analysis that can analyze context. It can tell the difference, for example, between a personal email quoting the dollar-value of a home, and the dollar-figure quoted in sales or M&A documents.
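The false-positive problem with pattern rules can be illustrated with a minimal sketch. The rule, the function name and both sample texts below are hypothetical and are not taken from Concentric or from any specific product; they only show why a dollar-figure pattern cannot separate a personal email from a deal document.

```python
import re

# Hypothetical keyword-style rule: flag any text that contains a dollar amount.
DOLLAR_RULE = re.compile(
    r"\$\d[\d,]*(?:\.\d+)?(?:\s*(?:million|billion))?", re.IGNORECASE
)


def rule_based_is_sensitive(text):
    """Return True if the text matches the dollar-amount rule."""
    return bool(DOLLAR_RULE.search(text))


# Invented sample texts for illustration only.
personal_email = "We finally closed on the house, it came in at $450,000!"
deal_memo = "Proposed acquisition price: $7.5 million, subject to due diligence."

# The rule sees only the pattern, not the context, so both texts are flagged.
flags = [rule_based_is_sensitive(personal_email), rule_based_is_sensitive(deal_memo)]
```

Both entries in `flags` are true, so the personal email would be over-classified alongside the deal memo. Telling them apart requires looking at the surrounding context, which is what the deep learning approach claims to do.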

“Discovering and protecting unstructured data is a huge problem,” Concentric CEO and founder Karthik Krishnan told SecurityWeek. “The challenge is that this data is complex: contracts, NDAs, source code, design documents, and so on. Traditional methods of discovery have relied on using word patterns, but this lacks the context to be able to accurately classify the document. The result is that most companies don’t know where their high value assets are.” 

Meanwhile, he continued, “deep learning has progressed to the point where it can both solve problems at scale and do it with a degree of precision. What we have built is a system that uses a deep learning language model to develop a semantic level of understanding of the context. We can look at both the words and how they are used within the broader context of a document to understand the meaning. This allows us, in a completely unsupervised manner, to build thematic groups, putting contracts, design documents, NDAs into their own groups.”
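The idea of unsupervised thematic grouping can be sketched in a few lines. This is not Concentric's method: the article describes a deep learning language model, whereas the sketch below substitutes a simple bag-of-words cosine similarity as a stand-in. The function names, the similarity threshold and the sample documents are all assumptions made for illustration.

```python
import math
import re
from collections import Counter


def vectorize(text):
    """Turn text into a bag-of-words term-frequency vector."""
    return Counter(re.findall(r"[a-z]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse term-frequency vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def group_documents(docs, threshold=0.3):
    """Greedy grouping with no labels or rules: a document joins the first
    group whose seed (first member) is similar enough, else starts a new group."""
    vectors = {name: vectorize(text) for name, text in docs.items()}
    groups = []
    for name in docs:
        for group in groups:
            if cosine(vectors[name], vectors[group[0]]) >= threshold:
                group.append(name)
                break
        else:
            groups.append([name])
    return groups


# Invented sample documents for illustration only.
SAMPLE_DOCS = {
    "nda_acme.txt": "mutual non-disclosure agreement confidential information receiving party disclosing party",
    "nda_globex.txt": "non-disclosure agreement between disclosing party and receiving party regarding confidential information",
    "design_api.txt": "design document architecture overview api endpoints database schema",
    "design_db.txt": "design document database schema migration architecture notes",
}

groups = group_documents(SAMPLE_DOCS)
```

With these samples, the two NDAs land in one group and the two design documents in another, without any category being defined up front. A real system would replace the word-count vectors with embeddings from a language model so that documents using different wording for the same theme still group together.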

By then analyzing and comparing documents within their groups, he explained, the Semantic Intelligence product can understand “how the data has been identified or classified or shared across the business units to provide a risk-based view over that data. The idea is that business-critical data combined with how it has been shared, whether it has been shared with the right sets of people, provides a view into the risk. We could compare a design document with another design document and look for signs of risky sharing where a document might have been shared inappropriately. This is all autonomously derived without a single rule or regular expression or a policy function that needs to be defined up front. It’s all driven by the thematic groupings that we build using our deep learning models. The goal is to help companies discover and protect their unstructured data.”
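One way to picture the peer comparison Krishnan describes is to treat unusual sharing within a thematic group as a risk signal. The sketch below is an assumption about how such a check could look, not a description of the Semantic Intelligence product; the function name, the 50% threshold, the audience labels and the sample data are all hypothetical.

```python
from collections import Counter


def flag_risky_sharing(group, min_peer_fraction=0.5):
    """group maps each document name to the set of audiences it is shared with.

    An audience is flagged for a document when fewer than min_peer_fraction
    of the documents in the same thematic group are shared with it.
    """
    audience_counts = Counter(
        audience for audiences in group.values() for audience in audiences
    )
    total = len(group)
    flagged = []
    for doc, audiences in group.items():
        for audience in sorted(audiences):
            if audience_counts[audience] / total < min_peer_fraction:
                flagged.append((doc, audience))
    return flagged


# Invented sharing data for one thematic group of design documents.
DESIGN_DOCS = {
    "design_api.txt": {"engineering"},
    "design_db.txt": {"engineering", "product"},
    "design_auth.txt": {"engineering", "external:partner@example.com"},
    "design_ui.txt": {"engineering", "product"},
}

risky = flag_risky_sharing(DESIGN_DOCS)
```

Here only `design_auth.txt` is flagged, because it is the one design document shared with an external address while its peers stay within engineering and product. No rule names "external" as forbidden; the outlier emerges from comparing each document with the others in its group.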

Semantic Intelligence uncovers, categorizes and classifies the documents, and allows IT and security teams to monitor data security with timely information and risk visualizations that drill down into the at-risk documents. The solution also integrates with major third-party security and data stores to help customers leverage the security investments they already have in place.

“Businesses understand the importance of protecting their critical assets, and yet, despite their best efforts, an extreme amount of data is left unsecured, unidentified, misclassified and at risk,” said Krishnan. “Unstructured data is currently copious and dispersed, and it includes an alarming amount of business-critical information. It’s a target for cybercriminals and can be a pitfall for regulatory compliance, but securing it is incredibly difficult. It’s the data challenge of our digital generation that we’re laser-focused on solving.” 

Written By

Kevin Townsend is a Senior Contributor at SecurityWeek. He has been writing about high tech issues since before the birth of Microsoft. For the last 15 years he has specialized in information security; and has had many thousands of articles published in dozens of different magazines – from The Times and the Financial Times to current and long-gone computer magazines.
