
Unprotected Hadoop Servers Expose 5 PB of Data: Shodan

Hadoop servers that are not securely configured expose vast amounts of data, according to an analysis conducted using the Internet search engine Shodan.

A Shodan search uncovered nearly 4,500 servers with the Hadoop Distributed File System (HDFS), the primary distributed storage used by Hadoop applications. These servers were found to expose 5,120 TB (5.12 PB) of data.

By comparison, Shodan found 47,820 MongoDB servers, which are also known to expose a lot of data, but they exposed only 25 TB in total.
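A rough per-server average makes the contrast concrete. The sketch below is only back-of-the-envelope arithmetic on the figures quoted above: it assumes exactly 4,500 HDFS servers (the analysis says "nearly 4,500"), decimal units (1 TB = 1,000 GB), and an even spread of data across servers, which real deployments will not have. The function and variable names are illustrative.

```python
# Figures quoted in the article; 4,500 is an assumed round value for "nearly 4,500".
HDFS_SERVERS = 4_500
HDFS_EXPOSED_TB = 5_120
MONGODB_SERVERS = 47_820
MONGODB_EXPOSED_TB = 25


def avg_tb_per_server(total_tb: float, servers: int) -> float:
    """Return the mean exposed data per server, in terabytes."""
    return total_tb / servers


hdfs_avg = avg_tb_per_server(HDFS_EXPOSED_TB, HDFS_SERVERS)
mongodb_avg = avg_tb_per_server(MONGODB_EXPOSED_TB, MONGODB_SERVERS)
ratio = hdfs_avg / mongodb_avg

# Assumes decimal units when converting TB to GB.
print(f"HDFS:    {hdfs_avg:.2f} TB per server")
print(f"MongoDB: {mongodb_avg * 1000:.2f} GB per server")
print(f"HDFS exposes roughly {ratio:,.0f}x more data per server")
```

Under these assumptions, the average exposed HDFS server holds a little over 1 TB, while the average exposed MongoDB server holds about half a gigabyte, a difference of more than three orders of magnitude.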

Of the Hadoop servers that expose data, 1,900 are located in the United States and 1,426 in China. Next on the list are Germany and South Korea, with 129 and 115 servers, respectively. Most of the HDFS instances spotted by Shodan are hosted in the cloud, mainly on Amazon (1,059 instances) and Alibaba (507).

Late last year, researchers started seeing ransom attacks aimed at unprotected MongoDB databases. Attackers either erased or stole data and asked victims to pay a ransom if they wanted to recover it. These types of attacks later began targeting Elasticsearch, CouchDB and Hadoop servers.

According to Shodan founder John Matherly, these ransom attacks are still being launched against both Hadoop and MongoDB installations, and a majority of the Internet-exposed MongoDB servers appear to have already been compromised.

When researchers first reported seeing attacks targeting HDFS installations, they pointed out that, in some cases, attackers erased most directories and created a single directory named “NODATA4U_SECUREYOURSHIT,” without asking for a ransom.

Shodan searches for the “NODATA4U_SECUREYOURSHIT” string show that more than 200 HDFS clusters currently contain a directory with this name.

Matherly has shared detailed instructions on how to replicate the searches on Shodan for those who want to conduct their own investigations.


Eduard Kovacs (@EduardKovacs) is a contributing editor at SecurityWeek. He worked as a high school IT teacher for two years before starting a career in journalism as Softpedia’s security news reporter. Eduard holds a bachelor’s degree in industrial informatics and a master’s degree in computer techniques applied in electrical engineering.