Internet Data Overview

RiskIQ's Security Intelligence Service Internet Data offering gives customers access to RiskIQ's high-volume API endpoints so they can ingest RiskIQ data directly into their security operations tools and technology.

Standard Data Sets

Passive DNS                                                    

Passive DNS is a system of record that stores DNS resolution data for a given domain or IP address. This historical resolution data set allows analysts to view which domains resolved to an IP address and vice versa.                 

RiskIQ offers API access to our Passive DNS repository in multiple ways so that analysts can correlate overlap between domains and IP addresses. Passive DNS data gives analysts insight into how a particular domain name or IP address changes over time and enables them to identify other related domains and IP addresses. When researching a suspicious or malicious event, passive DNS (PDNS) data can provide context for an attack or surface additional malicious domains and IP addresses.

Use Cases

  • Indicator of Compromise correlation
  • Historical resolution lookups                            
  • Time-based analysis
  • Fully qualified domain name lookups
  • SIEM event enrichment
  • Domain or IP enrichment to proactively hunt for threats
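
The correlation of domain and IP address overlap described above can be illustrated offline. The following is a minimal Python sketch, assuming resolution records have already been retrieved and flattened into (domain, IP address, first seen, last seen) tuples. The record shape, the function name, and the sample values are hypothetical and are not the format returned by RiskIQ's API.

```python
from collections import defaultdict
from datetime import date

# Hypothetical passive DNS records: (domain, ip, first_seen, last_seen).
RECORDS = [
    ("bad.example", "203.0.113.10", date(2020, 1, 1), date(2020, 3, 1)),
    ("evil.example", "203.0.113.10", date(2020, 2, 15), date(2020, 4, 1)),
    ("old.example", "203.0.113.10", date(2019, 1, 1), date(2019, 6, 1)),
    ("bad.example", "198.51.100.7", date(2020, 3, 2), date(2020, 5, 1)),
]


def related_domains(records, seed_domain):
    """Return domains that shared an IP with seed_domain in an overlapping time window."""
    # Index every resolution by IP address so lookups by IP are cheap.
    by_ip = defaultdict(list)
    for domain, ip, first, last in records:
        by_ip[ip].append((domain, first, last))

    related = set()
    for domain, ip, first, last in records:
        if domain != seed_domain:
            continue
        for other, o_first, o_last in by_ip[ip]:
            # Two windows overlap when each one starts before the other ends.
            if other != seed_domain and o_first <= last and first <= o_last:
                related.add(other)
    return related


print(related_domains(RECORDS, "bad.example"))
```

Requiring the time windows to overlap is one possible design choice: it filters out domains that used the same IP address long before or after the seed domain, which are less likely to be related.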


WHOIS

WHOIS is a protocol that lets anyone query for ownership information about a domain, IP address, or subnet. RiskIQ has a vast repository of WHOIS data, which is available to query for registrant information.

Attackers need to establish infrastructure to conduct their attack from and communicate with their malware. WHOIS data can provide an organization with insight into who is behind an attack campaign. Using domain registration information, an organization can unmask an attacker’s infrastructure by linking a suspicious domain to other domains registered using the same or similar information.

Use Cases

  • Identify additional domains registered using similar information
  • Determine the maliciousness of a given domain or IP address based on ownership records
  • SIEM event enrichment
  • Domain enrichment to proactively hunt for threats
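
Linking a suspicious domain to other domains registered with the same information amounts to a pivot on a shared registrant field. The sketch below assumes WHOIS records have already been parsed into dictionaries. The field names (`domain`, `registrant_email`) and the sample data are hypothetical and do not reflect RiskIQ's schema.

```python
from collections import defaultdict

# Hypothetical, already-parsed WHOIS records.
WHOIS_RECORDS = [
    {"domain": "bad.example", "registrant_email": "Actor@Mail.example"},
    {"domain": "evil.example", "registrant_email": "actor@mail.example "},
    {"domain": "shop.example", "registrant_email": "admin@shop.example"},
]


def normalize(value):
    """Trim whitespace and lowercase so trivially different values still match."""
    return value.strip().lower()


def pivot_on_registrant(records, seed_domain, field="registrant_email"):
    """Return other domains whose WHOIS field matches the seed domain's value."""
    # Build an index from normalized field value to the domains that use it.
    index = defaultdict(set)
    for rec in records:
        value = rec.get(field)
        if value:
            index[normalize(value)].add(rec["domain"])

    linked = set()
    for rec in records:
        if rec["domain"] == seed_domain and rec.get(field):
            linked |= index[normalize(rec[field])]
    linked.discard(seed_domain)
    return linked


print(pivot_on_registrant(WHOIS_RECORDS, "bad.example"))
```

The same function can pivot on any other field present in the records, such as a registrant name or phone number, by passing a different `field` argument.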

SSL Certificates

SSL certificates are files that digitally bind a cryptographic key to a set of user-provided details and help secure information transmitted over the internet. These certificates should be signed by a trusted third party to verify their authenticity, but malicious actors can also self-sign them. Beyond securing legitimate traffic, certificates can be used to encrypt data sent between command and control servers and machines infected with malware.

Threat actors often use similar information across different SSL certificates for their various infrastructure. RiskIQ collects SSL certificate data as we crawl the internet, and we can correlate malicious certificates we find with their signatures.

Use Cases

  • Determine if a domain or IP address is legitimate based on its certificate
  • Distinguish self-signed certificates from those issued by a third-party certificate authority
  • Identify IP clusters based on shared certificates
  • Identify additional certificates of interest based on shared properties
  • Surface connections among subject alternative names for certificates
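
Two of these use cases can be sketched in a few lines of Python. The example assumes certificate observations have been reduced to dictionaries with hypothetical keys (`ip`, `sha1`, `subject`, `issuer`); these are illustrative and are not RiskIQ field names. Comparing subject and issuer is only a heuristic for self-signing, since a rigorous check would verify the certificate's signature against its own public key.

```python
from collections import defaultdict

# Hypothetical certificate observations: which certificate was seen on which IP.
CERT_OBSERVATIONS = [
    {"ip": "203.0.113.10", "sha1": "aa11", "subject": "CN=c2.example", "issuer": "CN=c2.example"},
    {"ip": "203.0.113.11", "sha1": "aa11", "subject": "CN=c2.example", "issuer": "CN=c2.example"},
    {"ip": "198.51.100.7", "sha1": "bb22", "subject": "CN=shop.example", "issuer": "CN=Example CA"},
]


def looks_self_signed(cert):
    """Heuristic only: a self-signed certificate names itself as its issuer."""
    return cert["subject"] == cert["issuer"]


def ip_clusters(observations):
    """Group IP addresses by certificate fingerprint, keeping only shared certificates."""
    clusters = defaultdict(set)
    for obs in observations:
        clusters[obs["sha1"]].add(obs["ip"])
    return {fp: ips for fp, ips in clusters.items() if len(ips) > 1}


print(ip_clusters(CERT_OBSERVATIONS))
print([obs["ip"] for obs in CERT_OBSERVATIONS if looks_self_signed(obs)])
```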

Derived Data Sets

Built on terabytes of collected data from across the internet, RiskIQ extracts and analyzes internet data to create new data sets that aid in discovering, understanding, and mitigating digital threats.

These data sets give customers insight into web page attributes and associations based on RiskIQ’s vast crawling infrastructure, and give security analysts new data sets through which to investigate and track attacks against their organizations.

Our derived data sets provide customers with indexed access to this vast amount of crawl data to enrich and bring context to events, alerts, and investigations.

Host Attributes

Website Metadata and Trackers

RiskIQ gathers the full DOM during the loading process of pages that we crawl. We extract details such as website trackers, analytics codes, social network accounts, and other unique details. These values can provide insights into additional infrastructure that typically goes unnoticed by static data sets. RiskIQ's tracker data includes IDs from providers like Google, Yandex, Mixpanel, New Relic, Clicky, and more.

Use Case: Threat Hunting

  • Correlate malicious infrastructure using shared components and analytics trackers
  • Determine whether a tracking ID is associated with a legitimate organization or an actor
  • Correlate resources using the same analytics IDs
  • Identify unique web components threat actors use that can track them to other domains
  • Determine what was used to host the page
  • Determine what technology may have been loaded at the specific time of the crawl
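
Correlating resources that use the same analytics IDs can be sketched as a grouping step. The example below assumes tracker observations are available as (host, tracker type, tracker value) tuples. The tracker type label, the ID value, and the host names are made up for illustration.

```python
from collections import defaultdict

# Hypothetical tracker observations: (host, tracker_type, tracker_value).
TRACKER_OBSERVATIONS = [
    ("login-bank.example", "google_analytics", "UA-0000000-1"),
    ("secure-bank.example", "google_analytics", "UA-0000000-1"),
    ("bank.example", "google_analytics", "UA-1111111-1"),
    ("login-bank.example", "mixpanel", "abc123"),
]


def hosts_sharing_trackers(observations, seed_host):
    """Map each tracker used by seed_host to the other hosts that also use it."""
    hosts_by_tracker = defaultdict(set)
    for host, kind, value in observations:
        hosts_by_tracker[(kind, value)].add(host)

    shared = {}
    for tracker, hosts in hosts_by_tracker.items():
        # Keep only trackers that the seed host shares with at least one other host.
        if seed_host in hosts and len(hosts) > 1:
            shared[tracker] = hosts - {seed_host}
    return shared


print(hosts_sharing_trackers(TRACKER_OBSERVATIONS, "login-bank.example"))
```

A shared ID is a lead rather than proof of common ownership, because tracking code is sometimes copied along with the rest of a cloned page.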

Host Pairs

Host pairs are unique relationships between pages that are observed by RiskIQ when we crawl a web page. Each pair has a direction of child or parent and a cause that outlines the relationship connection. These values provide insight into redirection sequences, dependent requests or specific actions within a web page when it loads.

The connection could range from a top-level redirect (HTTP 302) to something more complex like an iframe or script source reference. What makes this data set powerful is the ability to understand relationships between hosts based on details from visiting the actual page. The host pairs data set relies on knowing website content, so it is likely to surface values that other sources like passive DNS and SSL certificates do not.

Use Case: Threat Hunting

  • Determine whether any of the connected artifacts have been blacklisted
  • Determine whether a domain is redirecting users to malicious content
  • Identify where users are being redirected from and to
  • Determine which redirections are taking place
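
Because each host pair has a direction and a cause, a set of pairs can be treated as a directed graph and walked to reconstruct a redirection sequence. The sketch below assumes pairs are available as (parent, child, cause) tuples. The cause labels such as `redirect` and `script.src` are hypothetical placeholders, not confirmed RiskIQ values.

```python
# Hypothetical host pairs: (parent, child, cause).
HOST_PAIRS = [
    ("news.example", "ads.example", "script.src"),
    ("ads.example", "hop.example", "redirect"),
    ("hop.example", "kit.example", "redirect"),
    ("news.example", "cdn.example", "img.src"),
]


def redirect_chain(pairs, start, causes=("redirect",)):
    """Follow parent-to-child pairs with a matching cause, starting from one host."""
    # Keep the first matching child per parent; a fuller version would branch.
    next_hop = {}
    for parent, child, cause in pairs:
        if cause in causes:
            next_hop.setdefault(parent, child)

    chain = [start]
    seen = {start}
    while chain[-1] in next_hop:
        nxt = next_hop[chain[-1]]
        if nxt in seen:  # Stop on a loop instead of walking forever.
            break
        chain.append(nxt)
        seen.add(nxt)
    return chain


print(redirect_chain(HOST_PAIRS, "ads.example"))
```

Each host in the resulting chain can then be checked against a blacklist or investigated further with the other data sets.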


Cookies

As RiskIQ virtual users crawl the internet, they capture everything that happens under the hood when they visit a website. This includes capturing any cookies that the site drops to track user behavior or note the status of the user’s machine. Cookies are yet another source of information that can tie pieces of infrastructure together across attack campaigns or connect seemingly unrelated assets. RiskIQ correlates cookie names and data with the infrastructure hosting the cookies, allowing analysts to pivot and find other sites with related cookies.

Threat actors often use cookies to track users who have already been delivered a malicious payload, so that they do not try to infect the same user again. Threat hunters who are investigating a cookie as a possible indicator of compromise can search the RiskIQ internet database for that cookie.

Use Case: Threat Hunting

  • Identify malicious actors squatting on cookies from legitimate websites
  • Connect other websites that are issuing the same cookies
  • Understand what other websites are tracking the same cookies
  • Understand how frequently a specific cookie is observed across the internet
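
How frequently a cookie is observed is a useful signal on its own: a cookie name seen on only a handful of hosts is a stronger pivot than one set by a widely used framework. The following sketch counts distinct hosts per cookie name, assuming observations are (host, cookie name) tuples. The cookie names and hosts are invented for illustration.

```python
from collections import defaultdict

# Hypothetical cookie observations: (host, cookie_name).
COOKIE_OBSERVATIONS = [
    ("kit-a.example", "x_visit_flag"),
    ("kit-b.example", "x_visit_flag"),
    ("kit-a.example", "session_id"),
    ("shop.example", "session_id"),
    ("blog.example", "session_id"),
    ("kit-a.example", "x_visit_flag"),  # Repeat sighting on the same host.
]


def cookie_prevalence(observations):
    """Count the distinct hosts on which each cookie name was observed."""
    hosts_by_cookie = defaultdict(set)
    for host, name in observations:
        hosts_by_cookie[name].add(host)
    return {name: len(hosts) for name, hosts in hosts_by_cookie.items()}


print(cookie_prevalence(COOKIE_OBSERVATIONS))
```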

Blacklist Lookup

RiskIQ analyzes terabytes of content on a daily basis to provide curated information to our customers. Our blacklist intelligence data provides customers with insights based on observations in RiskIQ data sets.

Our blacklist lookup endpoint allows customers to access RiskIQ intelligence in real time and on demand, making it easy to programmatically triage malicious and suspicious domains appearing in their environment.

Use Case: Real-time intelligence on domains and IP addresses

  • Ingest on-demand enrichment of domains and IP addresses into security operations tools via a queryable endpoint
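
Programmatic triage of this kind can be sketched without committing to any particular API. In the example below, the lookup is passed in as a function so that a real client can be substituted later. The `stub_lookup` function and its data are stand-ins invented for this sketch; the actual endpoint URL, authentication, and response fields are not shown here and would need to come from RiskIQ's API documentation.

```python
from typing import Callable, Dict, Iterable, List

# Stand-in data for the sketch; a real integration would query the endpoint instead.
STUB_LISTED = {"bad.example", "203.0.113.10"}


def stub_lookup(indicator: str) -> bool:
    """Hypothetical lookup: True when the indicator is on the blacklist."""
    return indicator in STUB_LISTED


def triage(indicators: Iterable[str], lookup: Callable[[str], bool]) -> Dict[str, List[str]]:
    """Normalize, de-duplicate, and bucket indicators by the lookup result."""
    buckets: Dict[str, List[str]] = {"listed": [], "not_listed": []}
    seen = set()
    for raw in indicators:
        indicator = raw.strip().lower()
        if not indicator or indicator in seen:
            continue  # Skip blanks and repeats so each indicator is looked up once.
        seen.add(indicator)
        key = "listed" if lookup(indicator) else "not_listed"
        buckets[key].append(indicator)
    return buckets


print(triage(["Bad.example", "ok.example", "bad.example ", "203.0.113.10"], stub_lookup))
```

De-duplicating before the lookup keeps the number of queries down when the same domain or IP address appears in many events.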