Domain Infringement Events

Domain Infringement events are available to External Threats customers subscribed to the Domain Threats module. They alert customers to third-party-owned domain and subdomain names that are confusingly similar to branded terms or trademarks. Registering such names is a common way for threat actors to spoof your organization’s identity, then use the domain to carry out phishing attacks or otherwise deceive Internet users into believing that your organization is the source of the content presented to them via that domain or host.

When such an infringing domain is found, a Domain Infringement event is created in the workspace. The event can be viewed in the events dashboard and events list inside the RiskIQ web application, in an email alert, or via the RiskIQ events API.

For a general introduction to events and other parts of the RiskIQ system, please see RiskIQ Platform Architecture.

This article describes:

  1. How to read and interpret the information presented in a Domain Infringement event (field definitions)
  2. Suggested best practices for Domain Infringement event management, including user workflow and tagging
  3. How it works: Domain Threats detection and system overview 

Example: an event showing a domain name related to the Wikimedia Foundation, but registered to a third party.

Reading Domain Infringement Events - Field Definitions

This is how Domain Infringement events are represented in the Events section of the RiskIQ web application. Clicking on a list item in the column on the left side of the screen brings up details for that event and user-initiated workflow actions in the panel on the right. 

Event List Item

  • Thumbnail screenshot of the webpage hosted on the domain that generated the event (if there is associated web content).
  • Event-Type: what kind of event it is (e.g. Domain Infringement).
  • Domain: the domain associated to the event (the term that matched your business logic may appear in a subdomain rather than in the registered domain itself).
  • Status: current status of the event.
  • Created: date the event was generated.
  • Updated: date of the last entry in the event history.
  • Active: Domain Infringement events are considered active as long as the domain contains or is confusingly similar to a configured brand keyword, and is neither owned by the organization (confirmed as an asset in your Inventory) nor otherwise whitelisted.
  • Tags (if any have been applied--not pictured above).

Event Header

At the top of each event's details is a header containing workflow actions. See Event User Actions for information on these options.

Summary Tab

The Summary page provides information for initially assessing the event and deciding how to act on it, including screenshots from the first and most recent crawls of the page if the event domain has live web content associated to it. The Summary tab is organized into multiple sections:

ATTRIBUTES

  • Domain: infringing domain name.
  • Initial URL: the URL used as the starting point of the crawl.
  • Final URL: the final destination where the crawl ended (if there was a redirect sequence, this differs from the initial URL).
  • Domain Expires: what date the domain expires, or “Not Registered” if the domain is not currently registered.
  • Live Website: whether the domain has live website content associated to it (a non-zero HTTP response code).
  • Parked Website: whether the domain is parked (as opposed to having a live website that is something other than a parking page).
  • Alexa: degree of web traffic indicated by the site’s Alexa rank (High = Top 1,000, Medium = Top 10,000, Low = 10,000+).
  • Exact Match: whether the domain name contains an exact match to a branded term (Yes) or a spelling variation (No).
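
As an illustration only, the Alexa thresholds listed above can be written as a small lookup. The function name is hypothetical, and treating rank 10,000 itself as Medium is an assumption, since the article leaves that boundary open.

```python
def alexa_traffic_level(rank):
    """Map an Alexa rank (1 = most visited) to the level shown in the event.

    Assumption: the boundaries are inclusive, so rank 1,000 is High and
    rank 10,000 is Medium. The article does not state which side they fall on.
    """
    if rank < 1:
        raise ValueError("rank must be a positive integer")
    if rank <= 1_000:
        return "High"
    if rank <= 10_000:
        return "Medium"
    return "Low"


for example_rank in (250, 4_200, 250_000):
    print(example_rank, alexa_traffic_level(example_rank))
```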

WHOIS

  • IP: IP address for the site associated to this event (if there is one).
  • Registrar: name of registrar for the domain associated to this event.
  • Registrar Email: email address to contact the registrar for the domain.
  • Registrar Phone: phone number to contact the registrar.
  • Registrant: name of the registrant for the domain associated to this event.
  • Registrant Email: email address to contact the registrant for the domain.
  • Registrant Phone: phone number to contact the registrant for the domain.
  • ASN: autonomous system number (ASN) associated to this event with the country of origin, and company (if there is one).
  • Hosting Provider: name, city, state, and country of the hosting provider for the site associated to this event (if there is one).
  • Name Servers: name servers for the domain associated to this event (click to expand if there are more than 5).
  • MX Records: mail exchanger records specifying the mail servers that can receive email on behalf of this domain name.
  • TXT Records: other text records associated to this domain's DNS entry. Typically this is an SPF record, which determines which mail servers are allowed to send email on behalf of this domain name. If there is a TXT record, the event contains a flag in the event header and list item noting that the domain is "Email-Capable".
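
The relationship between TXT records, SPF, and the "Email-Capable" flag can be sketched as follows. This is a minimal illustration with hypothetical function names, not the RiskIQ implementation. It assumes the TXT records have already been fetched as plain strings, and it relies on the fact that an SPF policy is a TXT record whose first term is "v=spf1".

```python
def spf_records(txt_records):
    """Return the TXT records that are SPF policies (first term is 'v=spf1')."""
    found = []
    for record in txt_records:
        terms = record.split()
        if terms and terms[0].lower() == "v=spf1":
            found.append(record)
    return found


def is_email_capable(txt_records):
    """Mirror the rule stated above: any TXT record sets the flag."""
    return len(txt_records) > 0


# Hypothetical TXT records for an infringing domain.
records = [
    "v=spf1 include:_spf.example.net -all",
    "site-verification=abc123",
]
print(is_email_capable(records))
print(spf_records(records))
```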

HISTORY

  • Timeline of changes made to the event with the date, time, and name of the user who took each action, including:
    • Status changes 
    • Emails sent (with recipients)
    • Notes added
    • Tags added/removed
    • Changes in ownership or priority

Site Details

If more detail is needed beyond what is shown in the Summary tab, this section provides any additional information available about the website associated to this event, including:

  • CName
  • Nameserver Information
  • ASN Information
  • Metro Code Information
  • Alexa Category and Exact Rank
  • Full WhoIs Record (includes raw response)
  • Full IP WhoIs Record (includes raw response)
  • Host Details
  • SSL Information
  • File Information

Classify Tab

This section details what about the page was flagged by the RiskIQ system in relation to your business logic. In the case of Domain Infringement events, typically this simply means displaying which of your brand keywords was flagged as similar to this domain (you can look for exact or punycode-replacement matches, regular expression matches, and/or fuzzy/algorithmic matches--see the Domain Threats System Overview section at the end of this article for more details). 


If you have custom classifiers added to your business logic, other types of classifier results might be displayed as well, including properties of the domain, hostname, or web page content (if there is content associated to the event). RiskIQ analyzes domains and hosts in terms of the following common features, each of which can be targeted individually within your classifiers as needed.

If you are a RiskIQ admin user looking for step-by-step instructions on creating or modifying Domain Infringement classifiers, policies, or projects, refer to Setting Up Domain Infringement Events.

Blacklist Tab

If this domain is included on blacklists, this tab shows which reputation sources have reported it, whether the blacklisting applies to the entire domain, one host within the domain, or just specific URLs on the domain, and for what type of activity (malware, phishing, or spam).

Enforcement Tab

If this event has been enforced, then this tab shows additional information about any threat mitigation actions this event has been involved in, such as the date, the user who initiated the action, the recipients of the request, and the full text of the sent messages and any responses received in return.

Crawls Tab

Whenever an infringing domain is found, RiskIQ automatically checks to see if there is any live website content hosted on it or redirected to by it. If any content is found, we send a virtual user to investigate and monitor the webpage over time.

When applicable, this section houses information on each instance a page was analyzed by virtual users. Users can select any of the times that RiskIQ analyzed the page associated to this event (a red arrow next to the timestamp indicates active, while grey signals inactive) to see details about the virtual user's interaction with the event page and the overall user session at that point in time.

Details provided about the crawl include: 

  • An overview providing metadata on the crawl and the screenshot taken by the virtual user
    • Global Unique Identifiers for the user session and the page within the user session
    • Date and time
    • Initial URL where the virtual user began the crawl
    • Browser used
    • Geographic location of the virtual user
    • Total number of pages visited during the user session
    • Total number of pages visited that returned error messages
    • URL of the event page
    • IP address
    • Response code and message returned by the event page
    • Page Content-type
    • Page Content length
    • Page response time
    • Window name
  • The original HTML response of the page
  • The rendered document object model after the page loaded in the user's browser
  • Files
  • Cookies
  • Links
  • Headers

Managing Domain Infringement Events - Workflow and Tagging Best Practices

User Review

The flow chart below describes a decision tree encompassing best practices for reviewing Domain Infringement events. It describes in more detail the 'User Review' step in the System overview diagram at the end of this article.

  • Green represents steps taken automatically by the RiskIQ system
  • Pink represents steps taken by a human user
  • Blue represents the result of an action (e.g. a status or tag label)


Tag Set

  1. Authorized Site
  2. Phishing/Malware
  3. Traffic Diverter
  4. Undetermined
  5. Other

Domain Threats System Overview

Detection

RiskIQ detects Domain Threats based on our passive DNS (PDNS) and WHOIS databases as well as our crawls. From these sources, we derive NOD (newly observed domains) and NOH (newly observed hosts) feeds, which we then compare to the keywords and policies configured in each workspace. For NOD, this process occurs hourly. For NOH, it occurs daily. We process each domain and hostname through several steps so that the comparison against workspace policies finds infringing subdomains/hosts as well as infringing domain names:

First, we decode any Punycode-encoded hostnames to their Unicode form and map homoglyphs to their ASCII equivalents. Many threat actors try to hide domain threats by creating domains that use non-ASCII characters that display similarly to ASCII characters, for example "riskiqbånk.com" instead of "riskiqbank.com". Since hostnames in DNS may only contain a limited set of characters (a subset of ASCII), these hostnames are encoded via Punycode. For example, "riskiqbånk.com" is encoded as "xn--riskiqbnk-c3a.com". A web browser may then decode this hostname and display it as "riskiqbånk.com" to a user. This step ensures that the domain or hostname we analyze is what a user would likely interpret it to be (in this case "riskiqbank.com").
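
The normalization step can be sketched with the Python standard library. This is only an approximation of the idea. The function names are hypothetical, and stripping diacritics via Unicode decomposition stands in for a real homoglyph table, which would also have to cover look-alikes from other scripts (for example, Cyrillic letters) that this sketch leaves untouched.

```python
import unicodedata


def decode_hostname(hostname):
    """Decode Punycode ("xn--") labels to the Unicode form a browser may display."""
    labels = []
    for label in hostname.lower().split("."):
        if label.startswith("xn--"):
            label = label[4:].encode("ascii").decode("punycode")
        labels.append(label)
    return ".".join(labels)


def fold_to_ascii(hostname):
    """Rough homoglyph mapping: decompose characters and drop combining marks."""
    decomposed = unicodedata.normalize("NFKD", hostname)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


# Build the encoded form of the article's example, then normalize it again.
encoded = "xn--" + "riskiqb\u00e5nk".encode("punycode").decode("ascii") + ".com"
displayed = decode_hostname(encoded)
print(encoded, "->", displayed, "->", fold_to_ascii(displayed))
```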

Next, we compare observed domains and hosts to the brand keywords configured in a customer's workspace, using exact matches, regular expression matches, and/or "fuzzy matches" as specified in the configuration. While basic string distance and other simple algorithms can find words that are roughly similar, they do not interpret a hostname as a human user would, which can result in both missed detections and false positives. For example, "friskyband.com" has a small word distance to "riskiqbank.com", but a human user would not think that "friskyband.com" was somehow associated with the brand "RiskIQ Bank."
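
To make the limitation concrete, here is a plain Levenshtein edit distance, used as a stand-in for "a basic string distance" (the article does not say which metric is meant). It reports only how many single-character edits separate two strings, not whether they read as the same words.

```python
def levenshtein(a, b):
    """Edit distance where insertions, deletions, and substitutions each cost 1."""
    previous = list(range(len(b) + 1))  # distances from "" to each prefix of b
    for i, char_a in enumerate(a, start=1):
        current = [i]  # distance from a[:i] to ""
        for j, char_b in enumerate(b, start=1):
            substitution_cost = 0 if char_a == char_b else 1
            current.append(min(
                previous[j] + 1,                      # delete char_a
                current[j - 1] + 1,                   # insert char_b
                previous[j - 1] + substitution_cost,  # substitute or match
            ))
        previous = current
    return previous[-1]


print(levenshtein("friskyband", "riskiqbank"))   # 4
print(levenshtein("riskiqbanck", "riskiqbank"))  # 1
```

Four edits across ten characters leaves most of the string intact, yet the two names read as entirely different words, so a distance threshold alone cannot separate a real typosquat from an unrelated name.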

In order to mimic most closely how a user would interpret a hostname, we augment simple word distance calculation with a dynamic programming approach to parse a domain into its most probable word segments.  For example, the parser would deduce that "riskiqbank.com" should be tokenized as "riskiq / bank / com."  Since many domain infringement attempts will take advantage of common misspellings and common "fat finger" typos, such as "riskiqbanck.com" or "riskqbank.com" or "riskiqvank.com," the parser will also look for slightly misspelled words when it parses a hostname.  
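
A minimal sketch of dynamic-programming word segmentation follows. It is not the RiskIQ parser. The vocabulary, the cost values, and the function name are assumptions chosen for illustration, and the tolerance for misspelled words described above is left out to keep the example short.

```python
def segment(text, vocabulary, unknown_penalty=2):
    """Split text into the cheapest sequence of tokens.

    Each vocabulary word costs 1. Each character that fits no word costs
    unknown_penalty, so segmentations made of real words are preferred.
    """
    n = len(text)
    best = [0] + [float("inf")] * n  # best[i] = minimum cost of text[:i]
    back = [0] * (n + 1)             # back[i] = start index of the last token
    for i in range(1, n + 1):
        # Fallback: treat text[i - 1] as a single unknown character.
        best[i] = best[i - 1] + unknown_penalty
        back[i] = i - 1
        for j in range(i):
            if text[j:i] in vocabulary and best[j] + 1 < best[i]:
                best[i] = best[j] + 1
                back[i] = j
    # Walk the back-pointers from the end to recover the tokens.
    tokens = []
    i = n
    while i > 0:
        tokens.append(text[back[i]:i])
        i = back[i]
    return tokens[::-1]


# Hypothetical vocabulary: brand keywords plus a few dictionary words.
vocabulary = {"risk", "riskiq", "iq", "bank", "ban", "frisky", "band", "com"}
hostname = "riskiqbank.com"
print([token for label in hostname.split(".") for token in segment(label, vocabulary)])
```

Charging one unit per word means the parser prefers "riskiq / bank" over "risk / iq / bank", since fewer and longer words cost less.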

Then an algorithm developed by the RiskIQ Data Science team uses several features of the parsed hostname to consider the context and decide whether a hostname is likely infringing. These features include whether the hostname contains the keywords we are looking for, whether they are misspelled, and how many other 'real' words (vs. random characters) are in the hostname. Notably, this approach can also handle infringements that "span" multiple parts of a domain, an additional layer of obfuscation that is becoming increasingly popular with threat actors (for example, "risk.iqbank.com").
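
The scoring algorithm itself is not described in this article, so the sketch below covers only the "spanning" case. It uses a hypothetical helper that reports whether a keyword becomes visible only once the dots between labels are ignored.

```python
def keyword_spans_labels(hostname, keyword):
    """True if keyword appears only when the dots between labels are ignored."""
    labels = hostname.lower().split(".")
    keyword = keyword.lower()
    if any(keyword in label for label in labels):
        return False  # ordinary match inside a single label, not a spanning one
    return keyword in "".join(labels)


print(keyword_spans_labels("risk.iqbank.com", "riskiq"))
print(keyword_spans_labels("riskiqbank.com", "riskiq"))
```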

System Overview

The diagram below follows a Domain Infringement event through the RiskIQ system, from the moment a domain or host is first observed, through event creation, monitoring, and enforcement procedures, to resolution and post-resolution monitoring if applicable.

  • Green represents steps taken automatically by the RiskIQ system
  • Pink represents steps taken by a human user (refer to User Review diagram above for a detailed view of this step)
  • Blue represents the result of an action (e.g. a status or tag label)

Monitoring and Auto-Resolution

  • Domain Infringement events are re-crawled roughly every 48 hours, or instantly whenever a WHOIS or DNS change is observed between scheduled checks. Additional samples can occur outside of this schedule based on normal, non-monitoring-related virtual user activity (for example, if the same pages also show up in searches for new pages).
    • Monitoring times are approximate: to balance load across the entire system, crawls may be slightly advanced or delayed to prevent load spikes.
  • Upon the first inactive sample of an event, an additional crawl will be scheduled 12 hours later to confirm whether it should resolve or the first crawl was an anomaly.
  • Events in the Monitor status automatically move to Tenacious as soon as a WHOIS or DNS change is observed.
  • An event will automatically resolve after 2 consecutive inactive samples and at least 1 hour of continuous inactive time.
  • Events change from Resolved to Tenacious if the next crawl is found to be active.
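
The auto-resolution rule above can be sketched as a check over an event's crawl samples. The function name and data layout are hypothetical. The sketch also assumes that "continuous inactive time" is measured from the first to the latest sample in the current run of inactive samples, which the article does not spell out.

```python
from datetime import datetime, timedelta


def should_auto_resolve(samples, min_inactive=timedelta(hours=1)):
    """Decide whether an event qualifies for automatic resolution.

    samples: list of (timestamp, is_active) tuples, oldest first.
    Requires at least 2 consecutive inactive samples at the end of the list,
    spanning at least min_inactive.
    """
    trailing_inactive = []
    for timestamp, is_active in reversed(samples):
        if is_active:
            break
        trailing_inactive.append(timestamp)
    if len(trailing_inactive) < 2:
        return False
    newest, oldest = trailing_inactive[0], trailing_inactive[-1]
    return newest - oldest >= min_inactive


start = datetime(2024, 1, 1)
history = [
    (start, True),                         # scheduled crawl, still active
    (start + timedelta(hours=48), False),  # first inactive sample
    (start + timedelta(hours=60), False),  # confirmation crawl 12 hours later
]
print(should_auto_resolve(history))
```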