SecurityWeek

DNA Database Not So Anonymous On The Internet: Study

On the Internet, nobody knows you’re a dog — but it is getting increasingly easy for someone to figure it out.

As more of our personal data, and that of the people we know and are related to, gets posted online, the anonymity promised by the remove of a computer screen grows more elusive, according to a new study out Thursday in the US journal “Science.”

That’s what a team of scientists uncovered when they started playing Sherlock with a batch of genetic data posted online for researchers to use. The data was anonymous: the participants’ names were not published.

But using the information that was provided, including age and where they live, along with freely available Internet resources, the researchers were able to identify nearly 50 of the individuals in the genomic database.

“This is an important result that points out the potential for breaches of privacy in genomics studies,” said Whitehead Fellow Yaniv Erlich, who led the research team.

Erlich’s team started by analyzing certain genetic markers, called “short tandem repeats,” on Y chromosomes (Y-STRs) that tend to be passed down from father to son.

Because surnames in US culture are also passed from father to son, there is a strong link between these repeats and family names.

That information is used by genealogy web sites to help people find common ancestors and other family information. Men upload information about the Y-STRs they have to find others with similar ones — leaving a publicly searchable database of Y-chromosome data linked to family trees.

By comparing the Y-STRs of the study participants with those in the genealogy databases, the researchers found the last names of a number of the participants. They estimate it would be possible to identify last names for about 12 percent of Caucasian males this way.

Cross-referencing these with Internet record search engines, obituaries, genealogical websites, and public demographic data, the team was able to fully identify nearly 50 participants of the genomic study, including some female relatives.
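The two-step linkage described above can be illustrated with a toy sketch. It is not the researchers' actual method or code. Every name, record, repeat count, and function here is hypothetical, and the mismatch threshold is an assumption chosen only for illustration. The sketch first guesses a surname from close Y-STR matches in a made-up genealogy list, then narrows made-up public records by surname, age, and state.

```python
from collections import Counter

# Hypothetical genealogy entries: (surname, Y-STR profile as marker -> repeat count).
# Marker names are common Y-STR loci; the values are invented for this example.
GENEALOGY_DB = [
    ("Smith", {"DYS19": 14, "DYS390": 24, "DYS391": 11, "DYS393": 13}),
    ("Smith", {"DYS19": 14, "DYS390": 24, "DYS391": 10, "DYS393": 13}),
    ("Jones", {"DYS19": 15, "DYS390": 22, "DYS391": 10, "DYS393": 12}),
]

# Hypothetical public records: (full name, surname, age, state).
PUBLIC_RECORDS = [
    ("Adam Smith", "Smith", 62, "UT"),
    ("Brian Smith", "Smith", 35, "UT"),
    ("Carl Smith", "Smith", 62, "CA"),
    ("David Jones", "Jones", 62, "UT"),
]


def mismatches(a, b):
    """Count shared markers whose repeat counts differ; None if nothing is shared."""
    shared = a.keys() & b.keys()
    if not shared:
        return None
    return sum(a[m] != b[m] for m in shared)


def infer_surname(profile, db, max_mismatches=1):
    """Return the most common surname among close Y-STR matches, or None."""
    hits = Counter()
    for surname, entry in db:
        diff = mismatches(profile, entry)
        if diff is not None and diff <= max_mismatches:
            hits[surname] += 1
    return hits.most_common(1)[0][0] if hits else None


def candidates(surname, age, state, records):
    """Filter public records down to people matching surname, age, and state."""
    return [name for name, s, a, st in records if (s, a, st) == (surname, age, state)]


# An "anonymous" study participant: Y-STR profile plus published age and state.
anonymous_profile = {"DYS19": 14, "DYS390": 24, "DYS391": 11, "DYS393": 13}
surname = infer_surname(anonymous_profile, GENEALOGY_DB)
print(surname, candidates(surname, 62, "UT", PUBLIC_RECORDS))
# prints: Smith ['Adam Smith']
```

In this toy data the surname alone leaves three candidates. Adding age and state narrows the list to one, which shows why removing a single field such as age makes the linkage harder.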

“We show that if, for example, your Uncle Dave submitted his DNA to a genetic genealogy database, you could be identified,” said Melissa Gymrek, a member of the Erlich lab and first author of the Science paper.

“In fact, even your fourth cousin Patrick, whom you’ve never met, could identify you if his DNA is in the database, as long as he is paternally related to you.”

When Erlich’s team informed the NIH of what they had been able to do, the agency removed the participants’ ages from the database, to make it more difficult to identify them.

Erlich said that he and his research team had nothing nefarious in mind when they did the research — but that doesn’t mean someone else might not have more sinister motivations.

“Our aim is to better illuminate the current status of identifiability of genetic data,” he said.

“More knowledge empowers participants to weigh the risks and benefits and make more informed decisions when considering whether to share their own data.”

But he emphasized that genomic databases provide crucial information for researchers and scientific progress.

He said he hoped research like his would prompt better security measures to protect research subjects in future.

Privacy of genetic information — which can reveal predispositions to certain illnesses — is a major concern among the scientific community and the larger public in the US who fear such information could be misused by insurance companies or employers.

Written By

AFP 2023
