Why converting domain blacklists to IP format is impractical (so you can stop asking)

One can easily run bulk lookups against large domain lists to find the corresponding IP addresses, and foolishly attempt to convert the raw data into an IP address blacklist.

But I wouldn’t advise it.

I concluded long ago that converting domain blacklists to IP format for direct use with content filtering is impractical, for a variety of reasons. Yet I have been asked about this topic many times over the years, so in this article I’m going to try to outline the problems one encounters when attempting it. The lookup itself is the easy part: one can bulk resolve the host names on a domain blacklist to find the corresponding IP addresses.
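To show how little effort the lookup step takes, here is a minimal Python sketch. The function name bulk_resolve and the stub resolver are my own illustrative assumptions, not part of any published tooling; the only real library call is socket.gethostbyname_ex, which returns IPv4 addresses only. The sample uses reserved documentation names and addresses so it runs without touching the network.

```python
import socket


def bulk_resolve(domains, resolver=socket.gethostbyname_ex):
    """Return {domain: sorted list of IPv4 addresses}.

    Domains that fail to resolve map to an empty list.
    `resolver` is injectable so the sketch can be run offline.
    """
    results = {}
    for domain in domains:
        try:
            _name, _aliases, addresses = resolver(domain)
        except OSError:  # socket.gaierror and socket.herror are subclasses
            addresses = []
        results[domain] = sorted(set(addresses))
    return results


def stub_resolver(domain):
    """Hypothetical stand-in for DNS, using reserved documentation addresses."""
    table = {
        "bad-site.example": ["192.0.2.10"],
        "bakery.example": ["192.0.2.10"],
    }
    if domain not in table:
        raise socket.gaierror("name not found")
    return (domain, [], table[domain])


sample = bulk_resolve(
    ["bad-site.example", "bakery.example", "gone.example"],
    resolver=stub_resolver,
)
print(sample)
```

Dropping the resolver argument makes the same function query real DNS. Getting the addresses is not the hard part.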

But the conversion is impractical, and the primary reason is that multiple malicious ‘vhosts’ (virtual hosts) can, and often do, exist on the same IP address as multiple legitimate websites. It doesn’t take long to find examples in real-world blacklists that have been converted to IP format. When we look up a known malicious domain to see whether any other websites are hosted on the same IP address, we often find several legitimate websites sharing that IP with the malicious host.
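The shared-IP check described above can be sketched as a simple grouping step. This is an illustration under stated assumptions: find_collateral is a hypothetical helper, and the domain-to-IP mapping is made-up sample data using reserved documentation names and addresses, not entries from any real blacklist.

```python
from collections import defaultdict


def find_collateral(resolved, malicious):
    """Map each IP that hosts a malicious domain to the other domains on it.

    resolved:  {domain: ip} for every domain we know about
    malicious: set of domains taken from the domain blacklist
    """
    by_ip = defaultdict(set)
    for domain, ip in resolved.items():
        by_ip[ip].add(domain)

    collateral = {}
    for ip, domains in by_ip.items():
        if domains & malicious:            # the IP would be blacklisted
            innocent = domains - malicious  # everything else gets blocked too
            if innocent:
                collateral[ip] = sorted(innocent)
    return collateral


# Hypothetical sample data (reserved documentation names and addresses).
resolved = {
    "bad-site.example": "192.0.2.10",
    "bakery.example": "192.0.2.10",
    "blog.example": "192.0.2.10",
    "phish.example": "192.0.2.20",
    "news.example": "192.0.2.30",
}
malicious = {"bad-site.example", "phish.example"}

print(find_collateral(resolved, malicious))
# {'192.0.2.10': ['bakery.example', 'blog.example']}
```

In this sample, blacklisting 192.0.2.10 would also block two unrelated sites, while 192.0.2.20 hosts only the malicious domain and would be safe to list.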

One reason for this is that legitimate websites are often compromised and then used for malicious purposes, while still residing on the same physical web server or IP as other legitimate websites. Another is that many hosting services offer low-cost or value-added web hosting for multiple customers on the same network infrastructure or IP.

Squarespace and Wix are two providers that come to mind which offer such services. Note: when we looked up Wix and Squarespace domains against a list of 250,000+ known malicious hosts, we did not find any matches, which suggests the security teams at both providers are doing a good job of responding to abuse reports and quickly taking malicious hosts down. The number of domains hosted by each of these providers is not trivial.

Multiply that figure by however many other shared hosting providers are in operation, and the numbers become staggering. That translates directly into false positives (fp) for anybody attempting this type of domain-to-IP conversion. I cannot stress enough that this is not isolated to just a few hosts, and it would therefore escalate into a problem that is not easily controlled. It is a widespread occurrence for multiple hosts to share the same IP, and the scale of it compels me to challenge anybody to overcome this problem with this method, short of deploying some advanced classification AI to read each domain’s contents and determine if the IP should be included or excluded. But things are not always black and white, and neither is this problem.

And again, the reason is simple: if you blacklist an IP that hosts a malicious website, and a legitimate website exists on the same IP, you have collateral damage in the form of false positives. With domain blacklisting, by contrast, you can blacklist a malicious domain name without any negative impact on the legitimate domain. With IP blacklisting, you must weigh the risk against the consequences. Because of this, there is, and always will be, a degree of risk that comes with producing and maintaining quality IP blacklists. It is a problem that must be actively mitigated with more than just flat blacklists to achieve effective border security.
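The difference between the two approaches can be made concrete with a small comparison. This is only a sketch: the helper names and the host list are hypothetical, and a real filter would match on live traffic rather than a static list.

```python
# Hypothetical (domain, ip) pairs; one malicious site on a shared IP.
hosts = [
    ("bad-site.example", "192.0.2.10"),
    ("bakery.example", "192.0.2.10"),
    ("blog.example", "192.0.2.10"),
    ("news.example", "192.0.2.30"),
]
malicious_domains = {"bad-site.example"}


def blocked_by_domain(hosts, bad_domains):
    """Domains blocked when filtering on the domain name itself."""
    return {domain for domain, _ip in hosts if domain in bad_domains}


def blocked_by_ip(hosts, bad_domains):
    """Domains blocked when the domain list is first converted to IPs."""
    bad_ips = {ip for domain, ip in hosts if domain in bad_domains}
    return {domain for domain, ip in hosts if ip in bad_ips}


# A false positive is anything blocked that was never on the domain list.
domain_fp = blocked_by_domain(hosts, malicious_domains) - malicious_domains
ip_fp = blocked_by_ip(hosts, malicious_domains) - malicious_domains

print(sorted(domain_fp))  # []
print(sorted(ip_fp))      # ['bakery.example', 'blog.example']
```

Both filters stop the malicious site, but only the converted IP list takes the neighbouring legitimate sites down with it.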

Mitigation of false positives takes place during production of these lists, and it is the responsibility of the publishers and curators of this data to be actively aware of the inherent problems with both domain and IP classification, and to develop systems to discover, control, minimize, and eliminate them.

On closer analysis, this same problem shows why hostile actors have a strong incentive to seize control of high-confidence networks and hosts that are trusted by the world: it is precisely these trusted hosts that can be leveraged to great effect in malicious operations.
