Compromised vs. Attack Domains: Building Machine Learning Models to Identify Malicious Hosting Types

May 23, 2021

Note: This write-up is based on our USENIX Security 2021 paper on detecting compromised vs. attack domains. [paper]

Malicious websites come in all shapes and sizes. Every day, millions of users are tricked into visiting malicious websites crafted by Internet miscreants. These sites either impersonate a popular website (e.g., PayPal, Apple, Facebook) or trick you into downloading or installing malware.

How many malicious websites do we really observe every day?

For example, the figure above shows the number of unique malicious URLs observed on VirusTotal (VT), one of the most popular URL scanning services, which provides aggregated threat intelligence on URLs.

On average, we see 276K malicious URLs per week, and roughly half of them are newly observed, meaning that they are scanned in VT for the very first time.
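To make "newly observed" concrete, here is a minimal sketch of how weekly unique and first-seen URL counts could be computed. It assumes the scan records are already available as `(week, url)` pairs. That input format, the function name `weekly_url_counts`, and the toy URLs are all hypothetical. This is not the VirusTotal API and not necessarily how the numbers above were produced.

```python
from collections import defaultdict


def weekly_url_counts(scans):
    """Count unique and newly observed URLs per week.

    scans: iterable of (week, url) pairs, where week is any sortable label
    (hypothetical input format, assumed for illustration).
    Returns {week: (unique_urls, newly_observed_urls)}.
    """
    urls_by_week = defaultdict(set)
    for week, url in scans:
        urls_by_week[week].add(url)

    seen = set()  # every URL observed in an earlier week
    counts = {}
    for week in sorted(urls_by_week):
        urls = urls_by_week[week]
        new = urls - seen  # URLs scanned for the first time this week
        counts[week] = (len(urls), len(new))
        seen |= urls
    return counts


# Toy records, invented for illustration only.
scans = [
    (1, "http://a.example/login"),
    (1, "http://b.example/update"),
    (2, "http://a.example/login"),
    (2, "http://c.example/invoice"),
]
print(weekly_url_counts(scans))
```

In the toy data, week 2 has two unique URLs but only one newly observed URL, because `http://a.example/login` was already scanned in week 1.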

While VT is not exhaustive, it gives us a pretty good understanding of the malicious URL ecosystem.

A key question we ask in this work is: “Given a malicious website, can we tell how it is hosted?”

Let me elaborate more on this question.

Written by Mohamed Nabeel
