Hadoop servers that are not securely configured expose vast amounts of data, according to an analysis conducted using the Internet search engine Shodan.
A Shodan search uncovered nearly 4,500 servers with the Hadoop Distributed File System (HDFS), the primary distributed storage used by Hadoop applications. These servers were found to expose 5,120 TB (5.12 PB) of data.
By comparison, Shodan found 47,820 MongoDB servers, which are also known to expose a lot of data, but they account for only 25 TB of exposed data.
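To put the two sets of figures side by side, the short Python sketch below works out the average amount of exposed data per server. It uses only the numbers reported above and assumes decimal units (1 TB = 1,000 GB) and a round 4,500 for the "nearly 4,500" HDFS servers, so the results are rough estimates rather than measured values. The function and variable names are illustrative.

```python
# Figures as reported in the article. HDFS_SERVERS is an approximation,
# since the article says "nearly 4,500".
HDFS_SERVERS = 4_500
HDFS_EXPOSED_TB = 5_120
MONGODB_SERVERS = 47_820
MONGODB_EXPOSED_TB = 25


def average_exposed_gb(total_tb: float, servers: int) -> float:
    """Average exposed data per server in GB, assuming 1 TB = 1,000 GB."""
    return total_tb * 1_000 / servers


hdfs_avg_gb = average_exposed_gb(HDFS_EXPOSED_TB, HDFS_SERVERS)
mongodb_avg_gb = average_exposed_gb(MONGODB_EXPOSED_TB, MONGODB_SERVERS)
ratio = hdfs_avg_gb / mongodb_avg_gb

print(f"HDFS: about {hdfs_avg_gb:,.0f} GB per server")
print(f"MongoDB: about {mongodb_avg_gb:.2f} GB per server")
print(f"HDFS servers expose about {ratio:,.0f} times more data each")
```

On these assumptions, the average works out to roughly 1.1 TB per HDFS server versus about half a gigabyte per MongoDB server. An average says nothing about how the data is distributed across individual servers.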
Of the Hadoop servers that expose data, 1,900 are located in the United States and 1,426 in China. Next on the list are Germany and South Korea, with 129 and 115 servers, respectively. A majority of the HDFS instances spotted by Shodan are hosted in the cloud, mainly on Amazon (1,059 instances) and Alibaba (507).
Late last year, researchers started seeing ransom attacks aimed at unprotected MongoDB databases. Attackers either erased or stole data and asked victims to pay a ransom if they wanted to recover it. These types of attacks later began targeting Elasticsearch, CouchDB and Hadoop servers.
According to Shodan founder John Matherly, these ransom attacks are still being launched against both Hadoop and MongoDB installations, and a majority of the Internet-exposed MongoDB servers appear to have already been compromised.
When researchers first reported seeing attacks targeting HDFS installations, they pointed out that, in some cases, attackers erased most directories and created a single directory named “NODATA4U_SECUREYOURSHIT,” without asking for a ransom.
Shodan searches for the “NODATA4U_SECUREYOURSHIT” string show that, currently, there are more than 200 such HDFS clusters.
Matherly has shared detailed instructions on how to replicate the searches on Shodan for those who want to conduct their own investigations.

Eduard Kovacs (@EduardKovacs) is a contributing editor at SecurityWeek. He worked as a high school IT teacher for two years before starting a career in journalism as Softpedia’s security news reporter. Eduard holds a bachelor’s degree in industrial informatics and a master’s degree in computer techniques applied in electrical engineering.
