Every day, cyber threat intelligence firm RiskIQ hoovers up terabytes of internet data. It concentrates on the internet infrastructure and how it functions, gathering up domains, IP addresses, email addresses and web page materials. It does this on behalf of its customers. With booming cloud and social media, not only is there no longer a perimeter to defend, companies often don’t even know what they have to defend.
The attack surface is expanding, and attackers target company brands, suppliers and customers across the internet as well as companies’ own data centers. RiskIQ scans the internet to see what, where and how its customers might be vulnerable.
“We collect crawled web pages, mobile apps, social media profiles and more so that we can identify what our clients own online, so they in turn can identify any vulnerabilities or risks — down to, for example, criminal or malicious actors who may be attempting to masquerade as their business in an effort to go after their employees, or customers, and so on,” explained Brandon Dixon, RiskIQ VP of product.
“We use web crawlers,” he told SecurityWeek, “which we call ‘virtual users’ because they have been instrumented to be able to scroll through a page as if they were a normal internet-browsing user.” This instrumentation became necessary early in the company’s existence because malicious actors began to recognize RiskIQ’s crawlers and to design their own resources to block or divert them.
“We run about 2 billion virtual user requests every day,” continued Dixon. “The virtual users follow a natural path across the internet — so, for example, they might conduct a Google search on a keyword of importance to a customer of ours. When it finds a link of relevance, the virtual user clicks on the link, visits and interrogates the page, and visits the links contained on that page. For each page we visit, we grab and save the content, grab all the remote sources from which the page is constructed, and all of the cookies and session information.”
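The path Dixon describes (start from a search result, follow links, and record each page's content, remote sources and cookies) can be sketched as a breadth-first crawl. This is a minimal illustration only, not RiskIQ's implementation: the `Page` type, the `crawl` function and the injected `fetch` callable are all hypothetical names, and a real virtual user would drive an instrumented browser rather than read from an in-memory lookup.

```python
from collections import deque
from dataclasses import dataclass, field


@dataclass
class Page:
    """Hypothetical snapshot of what a virtual user sees on one page."""
    content: str
    links: list = field(default_factory=list)      # outbound links on the page
    resources: list = field(default_factory=list)  # remote sources the page loads
    cookies: dict = field(default_factory=dict)    # cookies / session information


def crawl(fetch, seed_url, max_pages=100):
    """Breadth-first crawl from seed_url.

    `fetch` is any callable mapping a URL to a Page, or None if the
    page cannot be retrieved (an assumption made for this sketch).
    """
    seen = {seed_url}
    queue = deque([seed_url])
    records = {}
    while queue and len(records) < max_pages:
        url = queue.popleft()
        page = fetch(url)
        if page is None:
            continue  # unreachable page: nothing to record
        # Save the content, the remote sources, and the cookies.
        records[url] = {
            "content": page.content,
            "resources": list(page.resources),
            "cookies": dict(page.cookies),
        }
        # Visit the links contained on that page, once each.
        for link in page.links:
            if link not in seen:
                seen.add(link)
                queue.append(link)
    return records
```

Passing `fetch` in as a parameter keeps the traversal logic separate from however pages are actually retrieved, so the same loop can be exercised against a small dictionary of canned pages.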
So far, RiskIQ has gathered approximately 6 petabytes of data. That’s 6,000,000 gigabytes. Some of the gathered data is held for 60 days before being aged out — but the metadata is stored forever.
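One way to picture that retention rule is a pass that drops the bulky collected content after 60 days while leaving the metadata in place. The sketch below assumes a hypothetical record layout (`collected_at`, `content`, `metadata`); the article does not say how RiskIQ actually stores or expires its data.

```python
from datetime import datetime, timedelta

# 60 days, as quoted in the article; the record layout is an assumption.
RETENTION = timedelta(days=60)


def age_out(records, now):
    """Return copies of records with content removed once past retention.

    Metadata is never touched, mirroring "the metadata is stored forever".
    """
    result = []
    for rec in records:
        if now - rec["collected_at"] > RETENTION:
            rec = {**rec, "content": None}  # copy, so the input is not mutated
        result.append(rec)
    return result
```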
The company also scans the internet and gathers mobile apps. “One of our other methods of collecting data,” said Dixon, “is to do weekly — effectively continuous — internet scans. We run an entire sweep of the IPv4 space, looking for IP addresses that are online, and services they may be running. We’ll scan up to 111 ports, in some cases allowing customers to specify a specific port.” As a result, RiskIQ is able to identify servers online, the services they’re running, and whether they are on or off.
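At its simplest, checking whether a service is listening means attempting a TCP connection to each port of interest. The following is a minimal sketch of that idea using Python's standard `socket` module; the function name `open_ports` is hypothetical, and an internet-wide sweep like the one Dixon describes would rely on far faster, purpose-built tooling rather than sequential connects.

```python
import socket


def open_ports(host, ports, timeout=0.5):
    """Return the subset of `ports` on `host` that accept a TCP connection."""
    found = []
    for port in ports:
        try:
            # A completed handshake means something is listening.
            with socket.create_connection((host, port), timeout=timeout):
                found.append(port)
        except OSError:
            # Refused, timed out, or unreachable: treat as not open.
            pass
    return found
```

For example, `open_ports("127.0.0.1", [22, 80, 443])` reports which of those three ports are listening on the local machine. A check like this should only be pointed at hosts one is authorized to probe.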
“We’re also downloading as many mobile app stores as possible,” he added; “including the Android store and whatever we can get our hands on from Apple — and a number of third-party global app stores and underground mobile app stores. Where possible we decompile the apps to see what permissions they use and if they call out to any blacklisted URLs.”
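Once an Android app has been decompiled, the two checks Dixon mentions can be approximated in a few lines: read the requested permissions from the decoded manifest, and search the extracted strings for URLs whose host is on a blocklist. This sketch assumes the manifest has already been decoded to plain XML by a separate tool; the function names and the blocklist are hypothetical, not RiskIQ's.

```python
import re
import xml.etree.ElementTree as ET
from urllib.parse import urlsplit

# Namespace Android uses for attributes such as android:name.
ANDROID_NS = "{http://schemas.android.com/apk/res/android}"
URL_RE = re.compile(r"https?://[^\s\"'<>]+")


def requested_permissions(manifest_xml):
    """List the permissions declared via <uses-permission> elements."""
    root = ET.fromstring(manifest_xml)
    names = (el.get(ANDROID_NS + "name") for el in root.iter("uses-permission"))
    return sorted(name for name in names if name)


def blocklisted_urls(strings, blocked_hosts):
    """Return URLs found in `strings` whose hostname is in `blocked_hosts`."""
    hits = []
    for text in strings:
        for url in URL_RE.findall(text):
            if urlsplit(url).hostname in blocked_hosts:
                hits.append(url)
    return hits
```

Matching on the parsed hostname rather than on a raw substring avoids flagging a benign URL that merely contains a blocked name somewhere in its path.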
The analysis of this data allows RiskIQ to provide its customers with an overview of threats to their wider internet estate. The analysis is performed in the company’s own pipeline. “Any time we collect data,” explained Dixon, “it enters that pipeline and we apply pretty complex proprietary policies that allow us to emit an event whenever something satisfies the policy. This could be something that our customers define as interesting, or something our research team has defined as interesting, or it could just be a generic feed including, for example, a known bad/malicious event or item.”
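The policies themselves are proprietary, but the general pattern, in which every collected record is tested against a set of named rules and each match produces an event, can be sketched generically. Everything here is illustrative: `run_policies`, the record fields and the example policies are assumptions, not RiskIQ's design.

```python
def run_policies(records, policies):
    """Yield one event for each (record, policy) pair that matches.

    `policies` maps a policy name to a predicate taking a record and
    returning True when the record satisfies that policy.
    """
    for record in records:
        for name, predicate in policies.items():
            if predicate(record):
                yield {"policy": name, "record": record}


# Hypothetical policies: one customer-defined, one generic known-bad feed.
EXAMPLE_POLICIES = {
    "customer_keyword": lambda r: "acme" in r["content"].lower(),
    "known_bad_host": lambda r: r["host"] in {"bad.example"},
}
```

Because a policy is just a named predicate, customer-defined rules, research-team rules and generic feeds can all sit in the same mapping and be evaluated in one pass.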
This is a huge mass of analyzed internet data. Each year, RiskIQ compiles some of that data to generate an ‘Evil Internet Minute’ report. “It brings that missing sense of scale to all the malicious things that happen on the internet,” Dixon told SecurityWeek. “People don’t generally have this amount of data available to them. We’re in a unique position. Not only do we observe these things, we can provide a pretty heavy statistic around what is happening online for the majority of people that we serve, and we collect from.”
That sense of scale is sobering. This year’s Evil Internet Minute report, published this month, depicts a range of bad things that happen on the internet every minute of every day. It shows that 0.17 blacklisted mobile apps are produced every minute (roughly one every six minutes), and that 0.21 new phishing domains are spun up every minute (roughly one every five minutes). There are 9.2 malvertising incidents every minute. New hosts running crypto mining malware appear at a rate of 0.05 per minute (one every 20 minutes). And four potentially vulnerable web components are discovered every minute.
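The "one every N minutes" figures are simply the reciprocals of the per-minute rates, which a few lines of arithmetic confirm. The rates below are the ones quoted in the report; the helper names are hypothetical.

```python
# Per-minute rates as quoted from the Evil Internet Minute report.
RATES_PER_MINUTE = {
    "blacklisted mobile apps": 0.17,
    "phishing domains": 0.21,
    "crypto mining hosts": 0.05,
    "Magecart incidents": 0.07,
}


def minutes_between(rate_per_minute):
    """Average gap in minutes between events occurring at the given rate."""
    if rate_per_minute <= 0:
        raise ValueError("rate must be positive")
    return 1.0 / rate_per_minute


def per_day(rate_per_minute):
    """Expected number of events in a 24-hour day (1,440 minutes)."""
    return rate_per_minute * 60 * 24


for name, rate in RATES_PER_MINUTE.items():
    print(f"{name}: one every {minutes_between(rate):.1f} min, "
          f"about {per_day(rate):.0f} per day")
```

The article's rounded intervals (six, five, 20 and 14 minutes) follow from these reciprocals.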
“When brands understand what they look like from the outside-in,” notes an associated blog post, “they can begin developing a digital threat management strategy that allows them to discover everything associated with their organization on the internet, both legitimate and malicious, and monitor it for potentially devastating cyber-attacks.”
Last month, researchers at RiskIQ connected some of the dots in this huge database and discovered that the Ticketmaster breach reported in June 2018 was just a small part of a major campaign, known as Magecart, designed to steal users’ payment details. Incidentally, the Evil Internet Minute notes that there are 0.07 new Magecart incidents every minute (about one every 14 minutes) somewhere on the internet.
San Francisco, Calif.-based RiskIQ raised $30.5 million in a Series C funding round led by Georgian Partners in November 2016. It brought the total raised by the firm to $65.5 million.