Privacy

Researchers Link “de-identified” Browsing History to Social Media Accounts

Researchers Demonstrate How “de-identified” Web Browsing Histories Can be Linked to Social Media Accounts

January 23, 2017

While the use of cookies and other mechanisms to track computers is widespread and well understood, it is often believed that the data collected is effectively de-identified; that is, the cookies track the browser on the computer, not the person using it.

This is the message often promulgated by the advertising industry: tracking cookies allow targeted advertising without compromising personal privacy. Now new research from academics at Stanford and Princeton universities demonstrates that this need not be so.

In the new study ‘De-anonymizing Web Browsing Data with Social Networks’ (due to be presented at the 2017 World Wide Web Conference in Perth, Australia, in April), the researchers show that de-identified web browsing histories can be linked to social media profiles using only publicly available data. Once the social media profile associated with a browsing pattern is known, the person is known.

The basic premise is that social media users are more likely to click on links posted by people they follow. This creates a distinctive pattern that persists in the browsing history. “An adversary can thus de-anonymize a given browsing history,” states the report, “by finding the social media profile whose ‘feed’ shares the history’s idiosyncratic characteristics.”
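The matching idea can be illustrated with a minimal sketch. This is not the researchers' statistical model: it is a simplified scoring heuristic, written under the assumption that each candidate profile's feed is already available as a set of URLs. The function name `rank_profiles`, the inverse-popularity weighting, and the sample data are all hypothetical and chosen only for illustration.

```python
import math
from collections import Counter


def rank_profiles(history, feeds):
    """Rank candidate profiles by how well their feed overlaps a browsing history.

    history: set of URLs found in the de-identified browsing history.
    feeds:   dict mapping a profile name to the set of URLs seen in its feed.
    Returns a list of (profile, score) pairs, best match first.
    """
    # Count how many feeds contain each URL. A link that appears in few
    # feeds is more "idiosyncratic" and therefore more identifying.
    popularity = Counter(url for feed in feeds.values() for url in feed)
    n_profiles = len(feeds)

    scores = {}
    for profile, feed in feeds.items():
        overlap = history & feed
        # Illustrative weighting (an assumption, not the paper's formula):
        # rarer links contribute more to the score.
        scores[profile] = sum(
            math.log(1 + n_profiles / popularity[url]) for url in overlap
        )
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


# Hypothetical data: three profiles and one anonymous history.
feeds = {
    "alice": {"news.example/a", "blog.example/b", "viral.example/c"},
    "bob": {"viral.example/c", "shop.example/d"},
    "carol": {"viral.example/c", "health.example/e"},
}
history = {"news.example/a", "blog.example/b", "viral.example/c"}

ranking = rank_profiles(history, feeds)
best_profile = ranking[0][0]
print(best_profile)
```

In this toy data the widely shared link appears in every feed and so says little about who the user is, while the two links seen only in one feed dominate that profile's score.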

The theory was tested against Twitter — chosen because it is largely public, has an accessible API, and wraps its links in the t.co shortener. Assuming an ‘adversary’ has access to browsing histories, he can then easily deduce (through timing or referrer information) which links came from Twitter. The pattern of those referrals from Twitter can then be used to identify the user concerned by matching it with users’ Twitter profile characteristics. The same approach could also be used against users with Facebook or Reddit accounts.
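As a rough illustration of the referrer-based step described above, the following sketch picks out visits whose referrer points at Twitter or its t.co shortener. The record layout (pairs of URL and referrer) and the function name `links_from_twitter` are assumptions made for this example; real tracking logs will differ, and the timing-based inference mentioned above is not covered here.

```python
from urllib.parse import urlparse


def links_from_twitter(visits):
    """Return the URLs whose referrer suggests the click came from Twitter.

    visits: iterable of (url, referrer) pairs; referrer may be None or empty.
    """
    selected = []
    for url, referrer in visits:
        if not referrer:
            continue  # no referrer recorded, so nothing to infer from it
        host = urlparse(referrer).hostname or ""
        # t.co is Twitter's link shortener; twitter.com covers the site itself.
        if host == "t.co" or host == "twitter.com" or host.endswith(".twitter.com"):
            selected.append(url)
    return selected


# Hypothetical browsing log.
visits = [
    ("https://news.example/story", "https://t.co/abc123"),
    ("https://shop.example/item", "https://search.example/?q=item"),
    ("https://blog.example/post", None),
    ("https://health.example/page", "https://mobile.twitter.com/home"),
]

twitter_links = links_from_twitter(visits)
print(twitter_links)
```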

“Users may assume they are anonymous when they are browsing a news or a health website,” comments Arvind Narayanan, an assistant professor of computer science at Princeton and one of the authors of the research, “but our work adds to the list of ways in which tracking companies may be able to learn their identities.”

The approach is not foolproof. Nevertheless, say the researchers, “given a history with 30 links originating from Twitter, we can deduce the corresponding Twitter profile more than 50 percent of the time.” In fact, in a test involving 374 volunteers who submitted web browsing histories, the method was able to identify more than 70 percent of those users by comparing their web browsing data to hundreds of millions of public social media feeds.

“All the evidence we have seen piling up over the years showing the strong limits of data anonymization, including this study,” comments Yves-Alexandre de Montjoye, an assistant professor at Imperial College London (not associated with the research), “really emphasizes the need to rethink our approach to privacy and data protection in the age of big data.”

The problem goes beyond simple user privacy, since it could be used to target persons of interest. “The idea would be to look at something such as my Twitter account (as in who I’m following) and to determine what links I’m seeing,” explains F-Secure security advisor Sean Sullivan. “And then, to find the ‘User X’ with the highest correlation between site visits and links seen. At which point, if I’m User X, I could be targeted by somebody who controls one of the sites visited.”

At a purely ‘commercial’ level, this could be used to target individuals with high-value goods. But it could also be used to find and target specific individuals prior to a network attack.

The researchers accept that their current methodology is not 100% accurate, but add that an “adversary may fruitfully make use of other fingerprinting information available through URLs, such as UTM codes. Thus, the main lesson of our paper is qualitative: we present multiple lines of evidence that browsing histories may be linked to social media profiles, even at a scale of hundreds of millions of potential users.”
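UTM codes are campaign-tracking parameters appended to a URL's query string, such as `utm_source` and `utm_medium`. The short sketch below shows how such parameters could be read out of a logged URL using Python's standard library; the function name `utm_parameters` and the sample URL are hypothetical, and the researchers do not describe how an adversary would actually use these values.

```python
from urllib.parse import urlparse, parse_qs


def utm_parameters(url):
    """Return the UTM tracking parameters found in a URL's query string."""
    query = parse_qs(urlparse(url).query)
    # parse_qs maps each key to a list of values; keep the first value only.
    return {key: values[0] for key, values in query.items() if key.startswith("utm_")}


# Hypothetical URL as it might appear in a browsing history.
sample = "https://news.example/story?utm_source=twitter&utm_medium=social&id=7"
params = utm_parameters(sample)
print(params)
```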

Furthermore, the paper claims, “our attack has no universal mitigation outside of disabling public access to social media sites, an act that would undermine the value of these sites.” It calls for “more research into privacy-preserving data mining of browsing histories.”

Written By Kevin Townsend

Kevin Townsend is a Senior Contributor at SecurityWeek. He has been writing about high tech issues since before the birth of Microsoft. For the last 15 years he has specialized in information security; and has had many thousands of articles published in dozens of different magazines – from The Times and the Financial Times to current and long-gone computer magazines.
