
Data Aggregator LocalBlox Exposes 48 Million Records

48 million records containing detailed personal information of tens of millions of people were exposed to the Internet after data-gathering company LocalBlox left a cloud storage repository publicly available.


The personal and business data search service gathered and scraped the exposed data from multiple sources, UpGuard security researchers discovered. The exposed information includes individuals’ names, physical addresses, and dates of birth, along with data scraped from LinkedIn, Facebook, Twitter, and more.

LocalBlox co-founder Ashfaq Rahman has already confirmed that the exposed information indeed belongs to the company.

Because the exposed information combines personal data with details on the people’s Internet usage, it builds “a three-dimensional picture of every individual affected,” UpGuard says.

Armed with this data, one would not only know who the affected individuals are, but also what they talk about, what they like, even what they do for a living. This information can be used to target users with ads or political campaigning, but can also expose them to identity theft, fraud, and social engineering scams.

The exposed data was stored in an Amazon Web Services S3 bucket that was configured for Internet access and was publicly downloadable. When UpGuard discovered the bucket on February 18, it contained a 151.3 GB compressed file that expanded to a 1.2 TB ndjson (newline-delimited JSON) file.

After downloading and analyzing the file, UpGuard determined that it belonged to LocalBlox. The company was informed of the issue on February 28 and the bucket was secured later that day.

The file was found to contain 48 million records, each in JSON format and separated by new lines. The security researchers also discovered that the real estate site Zillow was used in the data gathering process, “with information being somehow blended from the service’s listings into the larger data pool.”
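To illustrate the file layout described above, here is a minimal Python sketch of how a newline-delimited JSON file can be read one record at a time, without loading 1.2 TB into memory. The article does not say which compression format was used; gzip is assumed here purely for illustration, and the function names and sample records are hypothetical.

```python
import gzip
import io
import json


def iter_ndjson(stream):
    """Yield one parsed record per non-empty line of a text stream."""
    for line_number, line in enumerate(stream, start=1):
        line = line.strip()
        if not line:
            continue  # tolerate blank lines between records
        try:
            yield json.loads(line)
        except json.JSONDecodeError as exc:
            raise ValueError(f"invalid JSON on line {line_number}") from exc


def count_records(path_or_file):
    """Count records in a gzip-compressed NDJSON file (gzip is an assumption)."""
    # gzip.open accepts a filename or an already-open binary file object.
    with gzip.open(path_or_file, "rt", encoding="utf-8") as handle:
        return sum(1 for _ in iter_ndjson(handle))


# Hypothetical two-record sample; the real field names are not reproduced here.
sample = '{"name": "A"}\n\n{"name": "B"}\n'
records = list(iter_ndjson(io.StringIO(sample)))
compressed_count = count_records(io.BytesIO(gzip.compress(sample.encode("utf-8"))))
```

Because each line is a complete JSON document, a reader only ever holds one record in memory, which is what makes a file of this size practical to analyze.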

Exposed source fields revealed where the scraps of data were collected from.

“Some are fairly unambiguous, pointing to aggregated content, purchased marketing databases, or even information caches sold by payday loan operators to businesses seeking marketing data. Other fields are more mysterious, such as a source field labeled ‘ex’,” the security researchers note.
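As a rough sketch of how such source fields could be surveyed, the Python snippet below tallies the values of a per-record source field. The field name "source" and every sample value except "ex" are assumptions made for illustration; the article does not give the actual schema.

```python
from collections import Counter


def tally_sources(records, field="source"):
    """Count how often each value of a (hypothetical) source field occurs."""
    counts = Counter()
    for record in records:
        # Records without the field are grouped under a placeholder label.
        # Assumes the field holds a string, not a nested structure.
        counts[record.get(field, "<missing>")] += 1
    return counts


# Hypothetical records; only the label "ex" is taken from the article.
sample_records = [
    {"source": "ex"},
    {"source": "ex"},
    {"source": "purchased_db"},
    {},
]
source_counts = tally_sources(sample_records)
```

A tally like this is typically the first step in working out which labels are self-explanatory and which, like "ex", need further investigation.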

Some of the data came from Facebook and included data points such as pictures, skills, lastUpdated, companies, currentJob, familyAdditionalDetails, Favorites, and mergedIdentities, along with a field labeled allSentences, which suggests that the information was scraped from Facebook’s HTML rather than retrieved through an API.

The main issue that this incident reveals is the ease with which data can be scraped from Facebook.

“In the wake of the Facebook/Cambridge Analytica debacle, the importance of massive sets of psychographic data is becoming more and more apparent,” UpGuard notes.

Another issue this incident brings into the spotlight is that third parties often target data from popular websites and monetize the information in new ways, perhaps without the knowledge of the impacted individuals (and likely without the knowledge of the website, in this case Facebook, either).

LocalBlox says it is “the First Global Customer Intelligence Platform to search, combine and validate deep business and people profiles.” Thus, the exposed data represents the actual product the company offers: psychographic data that can be used to influence users.

There’s a clear business interest in this type of data harvesting, processing, and resale, meaning that massive and intrusive data sets exist for both companies and political parties to leverage when looking to influence people.

“What should be a wonder is that these datasets aren’t better secured and administered. This exposure was not the result of a clever hack, or well-planned scheme, but of a simple misconfiguration of an enterprise asset— an S3 storage bucket— which left the data open to the entire internet. The profitability gained by data must come with the responsibility of protecting its integrity and privacy,” UpGuard also points out.
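The article does not describe the exact setting that left the bucket open, so the following is only a sketch of one common check: scanning an S3 bucket policy document for statements that allow access to any principal. It works on the policy JSON alone and does not call AWS; the function names and the example bucket name are hypothetical, and statements carrying a Condition are skipped here for simplicity even though they can still grant broad access.

```python
import json


def _as_list(value):
    """Policy fields may be a single value or a list; normalize to a list."""
    return value if isinstance(value, list) else [value]


def is_wildcard_principal(principal):
    """Return True when the principal matches everyone ("*")."""
    if principal == "*":
        return True
    if isinstance(principal, dict):
        return any("*" in _as_list(v) for v in principal.values())
    return False


def public_allow_statements(policy_json):
    """List unconditional Allow statements that apply to any principal."""
    policy = json.loads(policy_json)
    flagged = []
    for statement in _as_list(policy.get("Statement", [])):
        if statement.get("Effect") != "Allow":
            continue
        if "Condition" in statement:
            continue  # simplification: conditioned statements need manual review
        if is_wildcard_principal(statement.get("Principal")):
            flagged.append(statement)
    return flagged


# Hypothetical policy that makes every object in a bucket world-readable.
open_policy = json.dumps({
    "Version": "2012-10-17",
    "Statement": [{
        "Effect": "Allow",
        "Principal": "*",
        "Action": "s3:GetObject",
        "Resource": "arn:aws:s3:::example-bucket/*",
    }],
})
flagged = public_allow_statements(open_policy)
```

A bucket can also be exposed through access control lists rather than a policy, so a check like this covers only part of the configuration surface.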


Written By

Ionut Arghire is an international correspondent for SecurityWeek.
