A new tech recruitment project scraped user data from GitHub and similar websites, then inadvertently leaked it online through a misconfigured MongoDB database.
Australian security expert Troy Hunt, the owner of the Have I Been Pwned service, was recently provided a 600 MB MongoDB backup file containing data from a tech recruitment website called GeekedIn. A closer analysis revealed that the file contained information on more than 8 million GitHub profiles, including names, email addresses, locations and other data.
However, just over one million of the exposed email addresses are valid, while the rest are represented by placeholder addresses and are associated with GitHub accounts with no public email address. The MongoDB database also included thousands of accounts apparently taken from BitBucket.
GeekedIn, announced by its developer in June, is a service that crawls code hosting websites, such as GitHub and BitBucket, and creates profiles for open-source projects and developers. The goal of the service is to help recruiters find developers who match their needs and help developers “enrich their CV.”
The data harvested by GeekedIn is publicly available on GitHub and it does not include any sensitive data such as passwords.
However, there are two problems. The first is that while GitHub allows users to scrape public data from its website, it prohibits the use of scraped information for commercial purposes, and GeekedIn was planning to charge recruiters and companies hundreds of euros per month to use the harvested data.
The second problem is that the data was stored in a MongoDB database that was not protected and could have been accessed by anyone. These types of incidents are increasingly common, with some organizations exposing the details of hundreds of millions of individuals due to misconfigured databases.
“As someone in the data breach myself, I don’t want my data being sold this way,” Hunt said. “And again, yes, you can go and pull this data publicly on a per-individual basis but the constant response I got from close confidants I shared this information with is that ‘it just feels wrong’. And it is wrong, not just the scraping of GitHub in the first place in order to commercialise our information, but then subsequently losing it via a MongoDB with no password and now having it float around the web in data breach trading circles.”
After being notified by Hunt, GeekedIn developers promised to take measures to secure the data. They have also taken the website offline.
Users affected by this incident can use the Have I Been Pwned service to find out exactly which of their information was leaked.

Eduard Kovacs (@EduardKovacs) is a contributing editor at SecurityWeek. He worked as a high school IT teacher for two years before starting a career in journalism as Softpedia’s security news reporter. Eduard holds a bachelor’s degree in industrial informatics and a master’s degree in computer techniques applied in electrical engineering.
