SecurityWeek

Cybercrime

Researchers Find Thousands of Twitter Amplification Bots in Just One Day

7,000 Twitter Amplification Bots Found in One Day’s Search

Researchers have examined Twitter looking for what are known as amplification bots. These accounts serve no purpose other than to amplify confidence in the content of a tweet and/or confidence in the tweeter. At one level of sophistication they can be used to influence public opinion on specific topics. At another level they can be used to increase followers for individual accounts. And in between they can be used by spammers, scammers and phishers.

The first step in discovering the extent of amplification bots is to develop an automated method for recognizing such a bot. Duo Security researchers Jordan Wright and Olabode Anise started with a dataset of 576 million tweets and needed to be able to distinguish between normal Twitter behavior and abnormal amplification behavior.

They began with a simple observation: the majority of tweets receive more likes than retweets. “It is reasonable to expect that the number of likes for a particular tweet would be higher than the number of retweets, since liking a tweet is a lower-impact action,” write the researchers.

A ‘like’ tells the author that the content of the tweet is appreciated, but does not post the tweet to the liker’s own followers. Likes are consequently well-used by people, but of little value for amplification by bot. The first task was to test this assumption.

To do so, the researchers filtered out tweets with fewer than 50 retweets. The purpose was to limit distortion from the large number of tweets with very little engagement: a tweet with one like and one retweet from the same follower, for example, would skew the overall ratio of retweets to likes.

They found “that half of the tweets in our dataset have nearly a 2:1 ratio of likes vs. retweets, while 80 percent of the tweets have at least more likes than retweets (greater than 1:1 ratio).” The remaining tweets, roughly 20 percent, have at least as many retweets as likes.

In one example, they found a tweet that had 969 retweets and just 164 likes — a massive reversal of the normal ratio of likes to retweets. “To put some numbers to how rare this is,” comment the researchers, “only 0.2 percent of tweets in our dataset had more than at least 900 retweets and a similar retweet-to-like-ratio.”
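The filtering and ratio steps described above can be sketched in a few lines of Python. This is a minimal illustration, not the researchers' code: the dictionary keys `likes` and `retweets`, the function names, and the sample figures (apart from the 969/164 tweet quoted above) are assumptions made for the example.

```python
from statistics import median

MIN_RETWEETS = 50  # threshold reported in the article


def like_retweet_ratios(tweets, min_retweets=MIN_RETWEETS):
    """Return likes/retweets for every tweet with at least min_retweets retweets."""
    return [
        t["likes"] / t["retweets"]
        for t in tweets
        if t["retweets"] >= min_retweets
    ]


def summarize(ratios):
    """Median likes-per-retweet and the share of tweets with more likes than retweets."""
    return {
        "median_ratio": median(ratios),
        "share_more_likes": sum(r > 1 for r in ratios) / len(ratios),
    }


# Hypothetical sample; only the 969-retweet, 164-like tweet comes from the article.
sample = [
    {"likes": 400, "retweets": 200},
    {"likes": 150, "retweets": 100},
    {"likes": 90, "retweets": 60},
    {"likes": 164, "retweets": 969},
    {"likes": 3, "retweets": 2},  # dropped: fewer than 50 retweets
]

ratios = like_retweet_ratios(sample)
summary = summarize(ratios)
```

On this scale the quoted tweet scores about 0.17 likes per retweet, or roughly 5.9 retweets for every like, which is why it stands out against a population where most tweets sit above 1.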

This is almost certainly a bot account. Examining the account’s timeline provides further clues: it did not appear to have authored a single original tweet, but contained many other retweets with a similarly high retweet-to-like ratio.

The researchers also examined the time distribution of retweets for suspected bots. The assumption is that a genuine account’s timeline will show tweets in roughly chronological order, while an amplification bot’s timeline will be more scattered. To test and measure this, the researchers computed the inversion count (a mathematical measure of how far a sequence deviates from its natural sorted order) for a genuine account and for the amplification bot already identified.

A genuine account should show a lower inversion number than an amplification account. “The inversion count for [the genuine account] timeline is 63, while the inversion count is 2028 for the amplification bot,” they found.
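For readers unfamiliar with the measure, an inversion is any pair of items that appears in the wrong relative order, and the inversion count is the number of such pairs. The sketch below counts them with a merge sort. It assumes the input is a list of tweet timestamps (or any comparable values) in the order they appear on the timeline, oldest expected first; the article does not say which timestamps or ordering the researchers actually used.

```python
def count_inversions(values):
    """Count pairs (i, j) with i < j and values[i] > values[j] using merge sort."""

    def sort_count(seq):
        if len(seq) <= 1:
            return seq, 0
        mid = len(seq) // 2
        left, inv_left = sort_count(seq[:mid])
        right, inv_right = sort_count(seq[mid:])
        merged = []
        inversions = inv_left + inv_right
        i = j = 0
        while i < len(left) and j < len(right):
            if left[i] <= right[j]:
                merged.append(left[i])
                i += 1
            else:
                # right[j] precedes every remaining item in left,
                # so each of those items forms one inversion with it.
                merged.append(right[j])
                j += 1
                inversions += len(left) - i
        merged.extend(left[i:])
        merged.extend(right[j:])
        return merged, inversions

    return sort_count(list(values))[1]


in_order = count_inversions([1, 2, 3, 4, 5])       # perfectly chronological
slightly_off = count_inversions([2, 1, 3, 5, 4])   # two adjacent swaps
reversed_order = count_inversions([5, 4, 3, 2, 1])  # worst case for five items
```

A perfectly ordered timeline scores 0, and the score grows as more tweets appear out of sequence, which is the gap the researchers report between 63 and 2028.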

Armed with these three clues to the likelihood of an amplification bot, the researchers developed a script to search through their Twitter dataset. It applied three criteria to each account: at least 90% of its tweets should be retweets; at least one-third of its retweets should be of tweets that had been amplified; and the inversion count on its timeline should be greater than 100.
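The three criteria translate naturally into a single test per account. The following is a hedged sketch rather than the researchers' script: the `AccountStats` structure and its field names are hypothetical, the inversion count is taken as already computed, and the article does not state the exact retweet-to-like ratio at which a tweet counts as amplified, so that classification is also assumed to have been done upstream.

```python
from dataclasses import dataclass


@dataclass
class AccountStats:
    """Hypothetical per-account summary; all fields are assumed precomputed."""
    total_tweets: int        # every item on the timeline
    retweets: int            # how many of those are retweets
    amplified_retweets: int  # retweets of tweets judged to be amplified
    inversion_count: int     # inversion count of the timeline


def is_amplification_bot(stats: AccountStats) -> bool:
    """Apply the three thresholds reported in the article."""
    if stats.total_tweets == 0 or stats.retweets == 0:
        return False
    # Integer arithmetic avoids floating-point edge cases at the thresholds.
    mostly_retweets = 10 * stats.retweets >= 9 * stats.total_tweets   # at least 90%
    often_amplified = 3 * stats.amplified_retweets >= stats.retweets  # at least one-third
    out_of_order = stats.inversion_count > 100                        # greater than 100
    return mostly_retweets and often_amplified and out_of_order


# Illustrative accounts; only the inversion counts 2028 and 63 come from the article.
suspect = AccountStats(total_tweets=200, retweets=190,
                       amplified_retweets=80, inversion_count=2028)
genuine = AccountStats(total_tweets=200, retweets=60,
                       amplified_retweets=5, inversion_count=63)

suspect_flagged = is_amplification_bot(suspect)
genuine_flagged = is_amplification_bot(genuine)
```

Requiring all three conditions at once is what keeps the test conservative: an account that only retweets, but does so in order and rarely touches amplified tweets, is not flagged.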

Running this search script over the dataset for just one day found 7,000 amplification bots. The determining criteria were deliberately set high to avoid false positives, so the true figure is likely to be higher. Had the search been conducted over a longer period, it would inevitably have found more. And finally, although the dataset comprised 576 million tweets, that is only about one day’s worth of all tweets.

This research does not attempt to estimate the total number of amplification bots on Twitter at any time, but it clearly shows that the number is vast and cannot be controlled by Twitter’s own processes. Twitter itself cannot keep up with the formation and use of amplification bots, so it is a fair assumption that everyone will sooner or later receive a tweet that looks to be popular but is not necessarily so. If the amplification is for malicious purposes, it may include a link that appears to be safe by virtue of the number of retweets it has received.

Since URL reputation lists cannot keep up with the generation and use of malicious URLs, there is no guarantee that either Twitter’s own filters or a company’s purchased filters will know that such a link is malicious. The temptation exists for the user to click a shortened URL in an interesting tweet that has apparently been validated by a large (falsely amplified) number of retweets.

Such is the speed and volume of Twitter that, short of banning its use, there is little that companies can do to prevent this problem. The best solution, as is so often the case, is user awareness. This research demonstrates how users can quickly recognize a suspected amplification tweet. The most immediate indicators are a high ratio of retweets to likes on the tweet itself, and retweeters whose timelines consist mostly of retweets rather than original tweets and are visibly not in chronological order.

Written By

Kevin Townsend is a Senior Contributor at SecurityWeek. He has been writing about high tech issues since before the birth of Microsoft. For the last 15 years he has specialized in information security; and has had many thousands of articles published in dozens of different magazines – from The Times and the Financial Times to current and long-gone computer magazines.