In an effort to contribute to making authentication more secure, a researcher has decided to publish 10 million username/password combinations that he has collected over the years from the Web.
The number of leaked passwords has increased significantly over the past few years. Specialized websites that allow users to check if their credentials have been compromised in major data breaches have already collected hundreds of millions of records. For example, Have I Been Pwned? has 175 million accounts and PwnedList has close to 390 million.
Leaked passwords have been used by many companies to determine the most common passwords and other trends. However, in many cases, only passwords are made available.
Security consultant, author, and researcher Mark Burnett has been collecting publicly available passwords for the past 15 years and now he has decided to make available 10 million of them, along with their associated usernames, to provide insight into user password selection. The expert believes the analysis of both usernames and passwords has been neglected, which is why he has published a “clean set of data” that others can study.
Burnett has stressed that the username and password combinations are unlikely to be abused. The researcher has removed the domain part from email addresses, keywords that could provide clues to the source of the credentials, information that could be linked to a particular individual, financial information, and accounts clearly belonging to government and military employees. Furthermore, the data comes from thousands of incidents that took place over the past 15 years, so the accounts cannot be tied to the companies they were stolen from.
The researcher has also pointed out that a majority of the passwords are likely invalid because most of the affected companies have already notified their customers and urged them to change their passwords following a breach.
Burnett said he was concerned about releasing the data, especially after the recent conviction of Barrett Brown, a journalist who was sentenced to five years in prison, partly for publishing a link to sensitive information stolen by hackers from the think tank Stratfor in 2011. Prosecutors charged Brown with trafficking in stolen authentication features.
Due to these recent events, Burnett published a lengthy blog post, which primarily focuses on justifying the release of the data.
“In the case of me releasing usernames and passwords, the intent here is certainly not to defraud, facilitate unauthorized access to a computer system, steal the identity of others, to aid any crime or to harm any individual or entity. The sole intent is to further research with the goal of making authentication more secure and therefore protect from fraud and unauthorized access,” the researcher wrote.
Burnett has noted that he shouldn’t be in any kind of trouble for publishing the data as current legislation only targets those who release passwords “knowingly and with intent to defraud.” However, changes proposed by the White House to the controversial Computer Fraud and Abuse Act (CFAA), which was used to prosecute Andrew “Weev” Auernheimer and Aaron Swartz, could make this illegal.
In the new CFAA, “with intent to defraud” might be replaced with “willfully” and the law would read: “knowingly and willfully traffics (as defined in section 1029) in any password or similar information, or any other means of access, knowing or having reason to know that a protected computer would be accessed or damaged without authorization in a manner prohibited by this section as the result of such trafficking.”
“I think this is completely absurd that I have to write an entire article justifying the release of this data out of fear of prosecution or legal harassment. I had wanted to write an article about the data itself but I will have to do that later because I had to write this lame thing trying to convince the FBI not to raid me,” Burnett said.