Ignoring 404 Not Found Errors

Another way to handle not found errors is simply to ignore them. It is important to emphasize right from the start that this is the online equivalent of sweeping dust under the rug instead of getting rid of it altogether.

As a result, we don't recommend this approach except in a few rare cases. For the majority of 404 errors on your website, you want to correct the broken links and implement redirects.

So, why would you sweep 404s under the rug, and how exactly do you do it (technically speaking)?

Why Would You Ignore A 404?

One of the best examples of a 404 you'd want to ignore is a not found error caused by spambots. Spambots may hit random URLs on your site (examplesite.com/12345abcdefg.htm) to test for exploits, and in doing so they might link to those random URLs. Google and Bing might then encounter these not found errors while they are crawling the web. However, people will rarely encounter these not found error pages.

Another example of a not found error you may want to ignore is a request for old control files you have removed from your website. For instance, say you deleted your old JavaScript and CSS files. People may never encounter these not found errors, but Google and Bing may still encounter links to that old JavaScript file located at /scripts/mysite.js.

The problem with these not found errors, then, isn't that they cost you business, since people will rarely encounter these 404s. The problem is that they waste Google's and Bing's time while they explore (or crawl) your site. Do you want Google and Bing exploring the good pages of your site or the junk? Clearly, you want them looking at the good pages. The junk gets in the way.

How To Ignore 404 Errors

Because the problem is Google and Bing encountering not found errors (not visitors), we want to signal to Google and Bing that they should avoid these pages on the site. The way you create that signal is via a robots.txt file. A robots.txt file is designed to communicate with robots (like Google and Bing) that explore your website. (Learn more about setting up robots.txt files.)

With a robots.txt file, you can tell robots to skip a specific page (for example, the old JavaScript file /scripts/mysite.js or the spam URL /12345abcdefg.htm):

User-agent: *
Disallow: /scripts/mysite.js
Disallow: /12345abcdefg.htm

Or, you can tell robots to skip an entire directory on your site. Perhaps you removed all of the JavaScript files in the /scripts/ directory and want to ignore any 404s resulting from that directory:

User-agent: *
Disallow: /scripts/

A Word Of Caution

Robots.txt files can help robots navigate your website. However, it is also easy to block a robot from seeing legitimate pages on your site. Because of this, we suggest you only change your robots.txt file if you know what you are doing. We also suggest that you test your robots.txt file before releasing any changes.
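
One simple way to sanity-check your rules before publishing is to run them through a robots.txt parser. Below is a minimal sketch using Python's built-in urllib.robotparser module with the example rules shown above; the URLs are hypothetical stand-ins for pages on your own site.

from urllib.robotparser import RobotFileParser

# Example rules matching the snippets above.
rules = """
User-agent: *
Disallow: /scripts/
Disallow: /12345abcdefg.htm
"""

parser = RobotFileParser()
parser.parse(rules.splitlines())

# The first two URLs should come back blocked; the last should stay crawlable.
for url in ["https://examplesite.com/scripts/mysite.js",
            "https://examplesite.com/12345abcdefg.htm",
            "https://examplesite.com/products/widgets.html"]:
    allowed = parser.can_fetch("*", url)
    print(url, "->", "allowed" if allowed else "blocked")

Running this prints which URLs a well-behaved robot would skip, so you can confirm your Disallow lines do what you expect without accidentally blocking legitimate pages.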

Managing Ignored 404s With SpringTrax

For any 404s you want to ignore, SpringTrax gives you the option of marking them as ignored within the account area. Once you ignore a 404, SpringTrax will stop reporting that not found error and stop sending you alerts about it. You can always undo an ignored URL as well. (For more, see our demo video about ignoring 404s.)



 
