200 Million Sets of Japanese PII Emerge on Underground Forums

A dataset allegedly containing 200 million unique sets of personally identifiable information (PII) exfiltrated from several popular Japanese website databases emerged on underground forums, FireEye reports.

Advertised by a Chinese threat actor for around $150, the dataset contains names, credentials, email addresses, dates of birth, phone numbers, and home addresses. It was first spotted in December 2017.

The data appears sourced from a variety of Japanese websites, including those in the retail, food and beverage, financial, entertainment, and transportation sectors, and FireEye believes that the cybercriminals obtained it via opportunistic compromises.

The data, which the security researchers believe to be authentic, appears to have been acquired between May and June 2016, though data in one folder suggests some of it was obtained in May and July 2013, FireEye explains in a report shared with SecurityWeek.

Several actors commented on the advertisement, saying they were interested in purchasing the dataset, but commenters also left negative feedback, claiming they did not receive the advertised product.

The dataset contains “at least 200 million lines of data from a possible range of 11 to 50 Japanese websites,” and FireEye found that the data is highly varied and not available through public data sources.

Furthermore, analysis of the leak suggested that much of the data was genuine: most of the email addresses in a random sample of 200,000 had previously been seen in major leaks and are thus unlikely to have been fabricated.

“Since we did not observe most of the leaked data in any dataset as coming from one specific leak or on any publicly available website, this also indicates that the actor is unlikely to have bought or scraped the information from data leaks and resold it as a new product,” FireEye explains.

In another sample of 190,000 credentials, 36% contained duplicate values, the researchers say. Furthermore, a significant number of fake email addresses was observed, suggesting that the actual number of real and unique credentials and sets of PII is lower than advertised.
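As a rough illustration of how a duplicate rate like this could be measured, the Python sketch below counts how many records in a sample share their value with at least one other record. It is not FireEye's method, which the report does not describe. The function names and the four-entry sample are hypothetical, and "duplicate" is assumed to mean an exact match of the whole record.

```python
from collections import Counter


def duplicate_share(records):
    """Fraction of records whose exact value appears more than once in the sample."""
    if not records:
        return 0.0
    counts = Counter(records)
    # Count every record that belongs to a group of two or more identical values.
    duplicated = sum(n for n in counts.values() if n > 1)
    return duplicated / len(records)


def unique_count(records):
    """Number of distinct records once exact duplicates are collapsed."""
    return len(set(records))


# Made-up sample for illustration only; not taken from the leaked dataset.
sample = [
    "a@example.com:pw1",
    "b@example.com:pw2",
    "a@example.com:pw1",
    "c@example.com:pw3",
]

print(f"duplicate share: {duplicate_share(sample):.0%}")
print(f"unique records:  {unique_count(sample)} of {len(sample)}")
```

A high duplicate share means the number of distinct records is smaller than the raw line count, which is why the advertised total overstates the amount of usable data.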

Filenames in the dataset included “a Japanese food brand, an unnamed online handbag shop, an unnamed adult website, an unnamed shipping company, a gaming website, a beauty company, and other references,” the researchers reveal.

The exfiltrated data includes information usually associated with websites with customer login and profile information, and the actor appears to have had access only to data normally stored on servers connected to a website or web portal.

What the security researchers couldn’t verify, however, was that the exfiltrated data indeed came from the claimed sources. The actor might have labeled the files in the data leak using the names of Japanese websites, but the researchers believe the individual had little incentive to falsify the data sources.

The hacker appears to have been actively selling website databases on Chinese underground forums since at least 2013. FireEye experts found two personas likely tied to the individual through a common QQ address connected to a person living in China’s Zhejiang province.

The actor was observed selling data stolen from websites in China, Taiwan, Hong Kong, European countries, Australia, New Zealand, and North American countries.

However, because the actor has a “significant portion of negative reviews on underground forums,” the sold information could be fabricated or might have been sold before. The negative reviews claimed that the individual either did not deliver data or did not provide the expected product.

“Since much of this information has been previously leaked in large-scale data leaks, as well as the possibility that it has been previously sold, we anticipate that this dataset will not enable new large scale malicious activity against targeted entities or individuals with leaked PII,” FireEye says.

Written By

Ionut Arghire is an international correspondent for SecurityWeek.
