Bigger Data, Smaller Problems: Taking Hadoop Security to the Next Level

Securing Apache Hadoop – How Enterprises Can Pass Internal and External Security Requirements and Audits

Over the past several years, thousands of IT products and solutions have emerged to solve every imaginable enterprise business problem. We’ve seen a non-stop procession of apps, clouds, platforms, services, infrastructures and more spring to life. Many of these emergent technologies and solutions are showing promise and in some cases even ROI — with Apache Hadoop among them.

Apache Hadoop, the standard framework for Big Data processing, has swiftly gained momentum across the enterprise. Enterprises are already running production-grade clusters on Hadoop, along with a large ecosystem of tools built around it, including Pig, Hive, Sqoop and YARN, to name a few. The Hadoop technology stack is moving so quickly that some early components, such as MapReduce, are already being displaced by newer ones, such as Apache Spark.

As always, new challenges arise amid innovation. With security and compliance now dominating concern lists, many organizations are struggling to figure out how to scale security alongside their digital footprints. No IT security and risk professional wants to stall progress, but none wants to let the organization move at a pace that can lead to breaches, violations, lawsuits and job loss at even the highest leadership levels.

As a result, many enterprises — especially within highly regulated industries — aren’t able to move as quickly as they would like towards implementing Big Data projects and Hadoop. They don’t have to hesitate though, as many of the security and compliance challenges are now surmountable.

This article looks at how advancements in Hadoop security and compliance help the people responsible for both reduce the associated risks and shrink the overall problem.

Laying the Security Groundwork

While the Hadoop platform continues to evolve, there are many security capabilities enterprises can implement today. To ensure secure and compliant Hadoop usage, security and risk professionals should make sure that they start with these basics:

1. Implement basic security measures. Most basic security measures apply to the Hadoop platform as well. Always create users and groups, map users to groups, assign and lock down permissions by group and enforce strong passwords. Build user onboarding and off-boarding processes with periodic audit reports. Limit super users, apply fine-grained permissions on a need-to-know basis and avoid coarse-grained, broad-stroke permissions.

2. Seek executive sponsorship. Executive sponsorship is crucial — security professionals need to demonstrate why security is a good investment to reduce risk. Security projects are often not backed until after an incident or failed audit. Prepare well and present the security initiatives needed to lock down the enterprise’s Big Data platform.

3. Harden the OS and lock down the Java VM. Don’t forget that Hadoop runs on an operating system and most of the software runs in a Java VM. Lock down the OS and Java VM according to security best practices. For example: enable the built-in Linux firewall (iptables), disable remote root access, require SSH key-pair login, limit sudo, restrict root access, and shut down and remove services that are not required, just to name a few.
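Some of the OS hardening steps in item 3 can be checked automatically. The following is a minimal Python sketch, not a complete hardening audit, that scans the text of an OpenSSH server configuration for the two SSH-related steps (no remote root login, key-pair login only). The directive names are standard sshd_config keywords; the EXPECTED baseline and the function name audit_sshd_config are assumptions made for illustration. The sketch treats a missing directive as a finding and ignores Match blocks and the "keyword=value" form.

```python
# Directive -> value expected by this hypothetical hardening baseline.
EXPECTED = {
    "permitrootlogin": "no",          # disable remote root access
    "passwordauthentication": "no",   # force key-pair login
    "pubkeyauthentication": "yes",
}


def audit_sshd_config(text):
    """Return findings for directives that are missing or set differently."""
    seen = {}
    for raw in text.splitlines():
        # Drop comments and surrounding whitespace.
        line = raw.split("#", 1)[0].strip()
        if not line:
            continue
        parts = line.split(None, 1)
        if len(parts) != 2:
            continue
        key, value = parts[0].lower(), parts[1].strip().lower()
        # sshd uses the first value it obtains for each keyword.
        seen.setdefault(key, value)

    findings = []
    for key, want in EXPECTED.items():
        got = seen.get(key)
        if got != want:
            findings.append(f"{key}: expected {want!r}, found {got!r}")
    return findings
```

In practice the text would come from the node's sshd configuration file, and a non-empty list of findings would feed the periodic audit reports mentioned in item 1.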

Taking Hadoop Security to the Next Level

Once the security basics are covered, security and risk professionals should then dig deeper into securing Hadoop. Security capabilities are being added every day. Many features are available that can help enterprises pass internal and external security requirements and audits. Dive into these four main security areas to further secure Hadoop: perimeter, data, access and monitoring.

1. Build a perimeter. Hadoop now supports industry-standard Kerberos to block access by unauthenticated users. And, through integration with LDAP and Active Directory, Hadoop can tie into centralized user and identity management systems.

2. Encrypt data. Because of compliance requirements, including HIPAA, PCI and internal policies, security professionals need to protect data from more than just unauthorized users. Extend protection to data that would otherwise be readable in clear text: over the wire using SSL, and at rest using Linux encryption or the soon-to-be-available HDFS encryption.

3. Configure users and permissions. Set permissions for users, groups or roles by defining access control lists. Project Rhino, a separate effort that was started by Intel and contributed to open source, has merged efforts with the Apache Sentry project, which originated at Cloudera, for Hadoop security and role-based access control.

4. Monitor, audit, detect and resolve issues. A crucial component of any security model is the capability to monitor, measure and audit the security process. Ensure that the enterprise’s security model is working as expected and that any suspected or actual security breach or non-compliance is quickly detected and resolved.
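The perimeter and audit steps above can be partly automated with a configuration check. The following is a minimal Python sketch that parses the text of a Hadoop core-site.xml file and reports properties that fall short of a baseline. The three property names are Hadoop's own; the REQUIRED values are an assumed policy (Kerberos authentication, service-level authorization on, RPC protection set to "privacy"), and the function names are hypothetical. Note that hadoop.rpc.protection covers RPC traffic only, so it is one part of wire protection, not all of it.

```python
import xml.etree.ElementTree as ET

# Property -> value required by this assumed security baseline.
REQUIRED = {
    "hadoop.security.authentication": "kerberos",
    "hadoop.security.authorization": "true",
    "hadoop.rpc.protection": "privacy",
}


def read_properties(xml_text):
    """Parse Hadoop-style <property><name/><value/></property> entries."""
    root = ET.fromstring(xml_text)
    props = {}
    for prop in root.iter("property"):
        name = prop.findtext("name")
        value = prop.findtext("value")
        if name is not None:
            props[name.strip()] = (value or "").strip()
    return props


def check_core_site(xml_text):
    """Return {property: actual value} for each entry that misses the baseline.

    A property that is absent from the file is reported with the value None.
    """
    props = read_properties(xml_text)
    return {
        name: props.get(name)
        for name, want in REQUIRED.items()
        if props.get(name, "").lower() != want
    }
```

An empty result means the file meets this particular baseline; it does not by itself show that Kerberos, LDAP integration or encryption is working end to end, which still has to be verified on the running cluster.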

Hadoop is helping enterprises analyze and derive insights from data in ways they couldn’t before. Tapping into the benefits of Hadoop requires enterprises to secure their information assets to reduce risks that might cause problems down the road. The good news is, it’s possible today to ensure security and compliance in Hadoop, and continued innovation in the platform will let enterprises strengthen that security over time.

In the following columns I will explore these security layers in depth, covering the current, upcoming and future security capabilities of Hadoop.
