Data encryption is one of the keys to data protection, but big data brings its own set of complications to cryptography.
At a session at the Cloud Security Alliance’s upcoming CSA Congress event this week in Orlando, Fujitsu’s Arnab Roy will outline the top 10 challenges in cryptography for big data, which he contends lie in the following areas: communication protocols; access policy-based encryption; big data privacy; key management; data integrity and poisoning concerns; searching and filtering encrypted data; secure data collection; secure collaboration; proof of storage; and the secure outsourcing of computation, so that cloud environments can compute on encrypted data without sacrificing end-to-end privacy.
Step one to addressing these concerns involves systematically striking the right balance between privacy and utility, he said.
“The advent of high volumes of sensitive data like retail, financial and medical has enabled a plethora of analytics techniques which generate information of high value for third-party organizations who desire to target the right demographics with their products,” said Roy. “In practice, such data is shared after sufficient removal of apparently unique identifiers by the processes of anonymization and aggregation. [But] this process is ad hoc, often based on empirical evidence and has led to many instances of ‘de-anonymization’ in conjunction with publicly-available data.”
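To make the de-anonymization risk concrete, here is a minimal Python sketch of a linkage attack. It is an illustration only, not something drawn from Roy's talk: the records, the names and the choice of ZIP code, birth year and sex as quasi-identifiers are all invented for the example. The idea is that a released record with no name attached can still be tied to a person when its combination of ordinary attributes is unique and also appears in a public list.

```python
from collections import Counter

# Hypothetical quasi-identifiers: fields that are not names or IDs,
# but that can single a person out when combined.
QUASI_IDENTIFIERS = ("zip", "birth_year", "sex")

# Invented "anonymized" release: names removed, sensitive field kept.
released = [
    {"zip": "32801", "birth_year": 1975, "sex": "F", "diagnosis": "asthma"},
    {"zip": "32801", "birth_year": 1975, "sex": "F", "diagnosis": "diabetes"},
    {"zip": "32803", "birth_year": 1982, "sex": "M", "diagnosis": "hypertension"},
]

# Invented public list (for example, a directory) that includes names.
public = [
    {"name": "Alice Example", "zip": "32801", "birth_year": 1975, "sex": "F"},
    {"name": "Bob Example", "zip": "32803", "birth_year": 1982, "sex": "M"},
]


def quasi_key(record):
    """Return the tuple of quasi-identifier values for a record."""
    return tuple(record[field] for field in QUASI_IDENTIFIERS)


def k_anonymity(records):
    """Size of the smallest group of records sharing the same quasi-identifiers."""
    counts = Counter(quasi_key(r) for r in records)
    return min(counts.values())


def link(released_records, public_records):
    """Re-identify released records whose quasi-identifiers are unique."""
    counts = Counter(quasi_key(r) for r in released_records)
    by_key = {quasi_key(r): r for r in released_records}
    matches = []
    for person in public_records:
        key = quasi_key(person)
        # Only an unambiguous (one-to-one) match counts as a re-identification.
        if counts.get(key) == 1:
            matches.append((person["name"], by_key[key]["diagnosis"]))
    return matches
```

In this toy data, `k_anonymity(released)` is 1 because one record is alone in its group, and `link(released, public)` ties that record to a named person. The two records that share the same attributes stay ambiguous, which is the protection that grouping and aggregation are meant to provide.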
This can be further complicated by cloud environments.
“Consider that a client wants to send all her sensitive data to a cloud: photos, medical records, financial records and so on,” he said. “She could send everything encrypted, but this wouldn’t be much use if she wanted the cloud to perform some computations on them, such as how much she spent on movies last month. With Fully Homomorphic Encryption (FHE), a cloud can perform any computation on the underlying plaintext all the while the results are encrypted. The cloud obtains no clue about the plaintext or the results.”
“In general, wherever there needs to be a trust boundary between data owners and computation-storage providers, this challenge arises naturally,” he continued. “The only solution which provides mathematical guarantees of privacy in this setting, without the requirement to trust a third party’s hardware, is provided by cryptography.”
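The movie-spending example can be sketched with a much simpler relative of FHE. The Python below is a toy version of the Paillier cryptosystem, which is only additively homomorphic: it lets an untrusted party add encrypted numbers, but it cannot run arbitrary computations the way a fully homomorphic scheme can. It is an illustrative assumption on our part, not the scheme Roy describes, and the hard-coded primes and function names are chosen for readability; the key size is far too small for real use. It needs Python 3.8 or later for modular inverses via `pow`.

```python
import math
import secrets

# Two well-known Mersenne primes, used only to keep the toy small.
# Real deployments use randomly generated primes of 1024 bits or more.
P = 2**31 - 1
Q = 2**61 - 1


def keygen(p=P, q=Q):
    """Return (public modulus n, private key (lam, mu))."""
    n = p * q
    lam = (p - 1) * (q - 1) // math.gcd(p - 1, q - 1)  # lcm(p-1, q-1)
    mu = pow(lam, -1, n)  # valid because the generator is g = n + 1
    return n, (lam, mu)


def encrypt(n, message):
    """Encrypt an integer 0 <= message < n under public modulus n."""
    n_sq = n * n
    while True:
        r = secrets.randbelow(n - 1) + 1  # random value in [1, n-1]
        if math.gcd(r, n) == 1:
            break
    return (pow(n + 1, message, n_sq) * pow(r, n, n_sq)) % n_sq


def add_encrypted(n, c1, c2):
    """Multiply ciphertexts, which adds the underlying plaintexts."""
    return (c1 * c2) % (n * n)


def decrypt(n, private_key, ciphertext):
    """Recover the plaintext integer from a ciphertext."""
    lam, mu = private_key
    x = pow(ciphertext, lam, n * n)
    return ((x - 1) // n * mu) % n


def encrypted_total(n, ciphertexts):
    """What the cloud would run: sum encrypted amounts without any key."""
    total = encrypt(n, 0)
    for c in ciphertexts:
        total = add_encrypted(n, total, c)
    return total
```

A client could encrypt each movie purchase in cents, for example `[encrypt(n, amount) for amount in (1250, 899, 1500)]`, hand the ciphertexts to the cloud, and decrypt only the value returned by `encrypted_total`. The cloud sees neither the individual amounts nor the sum.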
Access control is also one of the key challenges to protecting data. According to Roy, access controls should be enforced without depending on the host system.
“Traditionally access control to data has been enforced by systems – operating systems [and] virtual machines – which restrict access to data based on some access policy,” he said. “The data is still in plaintext. There are at least two problems to the systems paradigm: one, systems can be hacked; two, security of the same data in transit is a separate concern.”
“The other approach is to protect the data itself in a cryptographic shell depending on the access policy,” he explained. “Decryption is only possible by entities allowed by the policy. One might make the argument that keys can also be hacked. However, this exposes a much smaller attack surface. Although covert side-channel attacks are possible to extract secret keys, these attacks are far more difficult to mount and require sanitized environments. Also encrypted data can be moved around, as well as kept at rest, making its handling uniform.”
The good news, Roy said, is that encryption technology in both the research phase and in limited deployment can enable big data analytics and governance that “plain vanilla encryption techniques” have not, and emerging research is aimed squarely at addressing complex ownership characteristics, authentication and anonymity expectations.
“There are of course the challenges of retargeting existing cryptographic solutions to the ever increasing volume, variety and velocity and the infrastructural shift due to big data,” he said. “However, there are emergent problems for big data as well which cryptography research has started addressing.”
Roy’s presentation is scheduled for Dec. 5.