Built-In Security: How to Leverage the Hosts File to Block Ads and More

The hosts file in servers and workstations will readily and dutifully block a, well, host of potentially dangerous content. Here's some info and advice on how to use the hosts file to block ads, malware, trackers and other unwanted content.

Tom Henderson

April 7, 2018

5 Min Read
[Image: Black hole. Credit: Wikimedia Commons]

It takes a village of technology to identify and block malware, adware and the like, but organizations have one tool right at their fingertips: the hosts file in servers and workstations. Organizations can use the hosts file to block ads based on its map of known malware, analytics, cryptomining, tracker and adware sites. The file can also prevent mistyped website addresses from being forwarded to browsers' default search engines, and it is easily edited prior to distribution so that it doesn't destabilize the hosts it protects. Better still, the same data can be readily adapted to fuel .htaccess files, the per-directory visitor access files used by the Apache web server (NGINX doesn't read .htaccess files, but its main configuration can enforce the same blocks).
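
As a rough illustration of that adaptation, here is a minimal sketch (not the article's tooling, and the file names are assumptions) that turns hosts-file entries into Apache mod_rewrite rules refusing requests referred by the blocked domains:

import re

with open("hosts") as src, open(".htaccess", "w") as dst:
    dst.write("RewriteEngine On\n")
    for line in src:
        entry = line.split("#")[0].split()        # strip comments, tokenize
        if len(entry) >= 2 and entry[0] in ("0.0.0.0", "127.0.0.1"):
            domain = re.escape(entry[1])          # dots become literal dots
            dst.write(f"RewriteCond %{{HTTP_REFERER}} {domain} [NC]\n")
            dst.write("RewriteRule .* - [F]\n")   # refuse (403) matches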

All major operating systems (though not their smartphone counterparts) support the use of the hosts file to block ads and, more generally, access to unwanted content and sites. The hosts file is an ancient (by today's standards) Unix mechanism that was in use before DNS became the accepted standard, sometimes aided by a tool called YP, or Yellow Pages (later renamed NIS, the Network Information Service).

Even though the hosts file has been around for a long time, it is still respected across a surprising number of platforms. Where it doesn't work very well is on smartphones, where users are usually blocked from modifying system files (for reasons only an ad-revenue-loving telco/carrier could love). Only rooted phones allow a hosts file to be placed where it will block outbound requests from errant keystrokes, embedded malware or viruses. The blocking provided by this file can also be added to routers, firewalls, and other devices and appliances that accept the file format.
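
For reference, on desktop platforms the file lives in well-known, fixed locations:

Linux/macOS:  /etc/hosts
Windows:      C:\Windows\System32\drivers\etc\hosts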

The best amalgamated, updated source file I've found comes from GitHub. The file is updated frequently, usually once a week, and a table of sources explains how the file was built. Placed in the proper directory or folder, almost any computer system will dutifully check the hosts file first, before ever sending a DNS query out its network interfaces. This allows the file to trap errant excursions to the sites it lists.
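
A minimal retrieval sketch, assuming the widely used StevenBlack/hosts aggregate (the article's own link isn't preserved here, so the URL is an assumption):

import urllib.request

# Assumed source: the StevenBlack/hosts aggregate on GitHub
URL = "https://raw.githubusercontent.com/StevenBlack/hosts/master/hosts"
urllib.request.urlretrieve(URL, "hosts.candidate")

# Review and edit hosts.candidate, then copy it over the live hosts file
# for your OS (see the locations above); that step needs admin rights.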

Should there be an entry in the hosts file matching a user request (desired or not), say, from a browser, the operating system will do the bidding of the file's directive, routing the request to a "null route" address. Then nothing happens, meaning no undesired excursion to the listed site. There are methods of getting around the hosts file, but it takes quite a bit of skill to do so.
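
A quick way to confirm the mechanism is working (a sketch; doubleclick.net is just a commonly blocked ad domain, so substitute any entry from your file):

import socket

# The system resolver consults the hosts file before DNS, so a blocked
# name should come back as the null-route address, not a real IP.
print(socket.gethostbyname("doubleclick.net"))   # expect 0.0.0.0 if blocked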

Understanding Hosts File Contents

The hosts file contents are simple to understand. There are two columns: on the left, the IP address a request will be routed to (most modern operating systems support IPv4 and IPv6); on the right, the hostname being matched. To block traffic, use the IPv4 null-route address 0.0.0.0 or the loopback address 127.0.0.1.

Therefore:

0.0.0.0     google.com

will simply forbid your system from getting to Google. However, goo.gl, gmail.com and other variations will each require a unique entry. It's for this reason (among others) that the hosts file we use contains nearly 1.5MB of "bad guys."
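
The format supports no wildcards, so each variation gets its own explicit line:

0.0.0.0     goo.gl
0.0.0.0     gmail.com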

The IPv4 address 0.0.0.0 is a blackhole address, while 127.0.0.1 is the localhost, or "loopback," address (the machine's internal networking address). Of the two, 0.0.0.0 fails faster: nothing can answer at a blackhole address, while a loopback entry may wait on the local machine to refuse the connection.

The IPv6 equivalents are just as easy: the all-zeros unspecified address, written 0:0:0:0:0:0:0:0 or simply ::, blackholes a request, while ::1 is the IPv6 loopback. Entered in the left column, either will trap an errant address request. Use IPv4 syntax or IPv6 syntax as appropriate when you add your own listings.
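
For example (the hostname is illustrative):

::      ads.example.com     # unspecified address: blackhole
::1     ads.example.com     # loopback: routes to the local machine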

Of course, the hosts file is not without its limitations.

Should you use the downloaded hosts file as is, several problems can easily crop up. For one thing, some sites will misbehave: Videos might not load, or pages will look strange because they were built to accommodate ads and certain script behavior. Sites financed by ads may also have trouble if they can't reach the systems that monitor page views, and the cited hosts file blocks many ad-serving hosts that users probably didn't realize were embedded in their web pages. Indeed, organizations that must support a wide variety of users and sites may find a blanket hosts file more trouble than it's worth.

Organizations will also have to consider the impact the size of the hosts file has on performance. The hosts file we use is now more than a megabyte and a half; in practice, a complex web page that pulls resources from 60 different hostnames forces 60 separate scans of that lookup table, slowing things down.
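
To get a rough feel for that cost, you can time the resolver yourself (a sketch; the hostname and loop count are arbitrary, and OS-level caching will move the numbers around):

import socket
import time

start = time.perf_counter()
for _ in range(60):                    # roughly one complex page load
    socket.gethostbyname("localhost")  # each call consults the hosts file
elapsed = time.perf_counter() - start
print(f"{elapsed:.3f} seconds for 60 lookups")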

With that said, a hosts-file-protected machine blocks an incredible number of malware sources, adware and tracking tools, which is why some admins apply the same blocklist at the firewall or router level.
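
One common way to do that on routers and small firewalls is dnsmasq, which can serve the same list to every client on the LAN. A sketch, assuming the blocklist has been copied to /etc/hosts.blocklist:

# In /etc/dnsmasq.conf -- dnsmasq reads /etc/hosts by default, and
# addn-hosts points it at additional hosts-format files:
addn-hosts=/etc/hosts.blocklist

# Restart dnsmasq, then point LAN clients at it for their DNS.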

And although hosts involved in zero-day attacks may not yet be covered by such blacklists, many sites that are potential troublemakers are. We do recommend that admins who want to deploy the file and/or its weekly updates take a moment to scan through the listings to make sure the file won't clobber resources certain websites depend on (MSN's ad trackers, for example) in a way that causes application misbehavior. For most users and admins, a little editing of the hosts file is necessary, but the system security benefits are worth it.
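
A minimal editing sketch, assuming a short list of hostnames your users' applications can't live without (the names and file paths here are illustrative):

# Comment out, rather than delete, any entry your sites depend on --
# the diff against next week's update stays readable that way.
KEEP = {"ads.msn.com"}   # illustrative: hosts to leave reachable

with open("hosts.candidate") as src, open("hosts.deploy", "w") as dst:
    for line in src:
        parts = line.split()
        if len(parts) >= 2 and not line.lstrip().startswith("#") and parts[1] in KEEP:
            dst.write("# " + line)   # neutralize the entry
        else:
            dst.write(line)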

What is your experience with hosts files as a security tool? Do you have any advice or context to add? Let us know in the comments section. 
