
The Thing About 403/404 Errors and Google’s Webcrawlers

Jun 26th, 2023 - Google

Google has made it clear that it intends to manage the server bandwidth issues caused by its crawlers on its own terms. What does that mean for the keen digital marketer?

On February 17, 2023, Google posted about what it perceived as a surge in the use of 4xx errors (excluding 429) to throttle its indexing crawlers.

“The short version of this blog post is: please don’t do that; we have documentation about how to reduce Googlebot’s crawl rate. Read that instead and learn how to effectively manage Googlebot’s crawl rate.”

The issue, according to Google, is that 4xx codes exist to signal that a client’s request is bad, and Googlebot takes them at face value: pages that return these errors are removed from the index. If you were planning to use these error codes to throttle crawling and save bandwidth, you may want to think again.
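
Google’s own documentation describes the supported alternative: temporarily returning a 500, 503, or 429 response tells Googlebot to slow down without anything being dropped from the index. As a rough illustration (a minimal sketch assuming a Flask app; server_overloaded is a hypothetical placeholder for whatever load check your stack actually provides, and the Retry-After value is arbitrary), that might look like this:

```python
# Minimal sketch, assuming Flask. server_overloaded() is a hypothetical
# placeholder: wire it to your own bandwidth or CPU metrics.
from flask import Flask, Response

app = Flask(__name__)

def server_overloaded() -> bool:
    """Hypothetical load check for illustration only."""
    return False

@app.route("/<path:page>")
def serve(page):
    if server_overloaded():
        # 503 (or 429) means "temporary problem, come back later": Googlebot
        # slows its crawl but keeps the URL in the index.
        return Response("Service temporarily unavailable", status=503,
                        headers={"Retry-After": "3600"})
    return Response(f"Content for {page}", status=200)
```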

And what will this mean going forward for SEO? A few things:

  • It seems that Google is offering a sort of ‘competitive advantage’ to sites that adhere to its guidelines on minute details like error codes and their meanings. It has always been, and will continue to be, a good idea to ensure that each protocol or line of HTML is implemented properly, with the correct documentation to back up its usage.

  • In addition, Google is trying to improve the user-experience side of these errors. Inaccurate error codes can cause frustration, confusion, and distrust among site users, on top of the possibility of higher bounce rates for those pages. Google simply wants to prevent all of this from becoming an issue.

  • Another SEO aspect implicitly touched on by Google in this blog post is more frequent and more accurate indexing. As it stands, quite a few login-protected and sensitive pages can still be surfaced through a Google search. Google may be signaling that, going forward, it will apply harsher blocks or penalties to pages that misuse 4xx errors or that should not be findable through Google at all (see the sketch after this list for the honest alternatives).

  • There are other direct benefits for Google’s webcrawlers as well. They could more thoroughly index and understand the content and context of a website’s pages, and Google may be aiming to drop its checks for error-code misuse (as opposed to the quality checks mentioned above) in favor of faster crawling and lower bandwidth use. While it’s always difficult as an SEO expert to predict what Google will do with its search engine, the baseline messaging here clearly demonstrates an unwillingness to deal with the problem itself.
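
As for those login-protected and sensitive pages, the honest signals already exist: a 401 or 403 for content that requires credentials, and a noindex directive for public pages that should stay out of search results. Here is a rough sketch of both (again assuming Flask; is_authenticated is a hypothetical stand-in for a real auth check, and the routes are invented for illustration):

```python
# Sketch of accurate signals for private or unindexable pages, assuming Flask.
from flask import Flask, Response, request

app = Flask(__name__)

def is_authenticated(req) -> bool:
    """Hypothetical: treat any request carrying credentials as logged in."""
    return "Authorization" in req.headers

@app.route("/account")
def account():
    if not is_authenticated(request):
        # 401 is truthful: credentials are missing. Googlebot will not index
        # the page, and browsers can prompt the user to log in.
        return Response("Login required", status=401)
    return Response("Private account data", status=200)

@app.route("/internal-report")
def internal_report():
    # Publicly reachable but never meant for search results: say so with an
    # X-Robots-Tag header rather than masquerading as a 404.
    return Response("Internal report", status=200,
                    headers={"X-Robots-Tag": "noindex"})
```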


Of course, 5xx errors are not mentioned in this blog post. Google understands 5xx errors not as a sign of unavailable or inaccessible content, but as temporary server issues that will eventually be resolved; it treats them like unanswered questions at the end of an exam, questions it intends to come back to later. With that said, I hope this provides new insight and context around Google’s decision to formally discourage rate limiting through these error codes.
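
That exam analogy maps neatly onto ordinary retry logic. The sketch below is illustrative only (this is not Googlebot’s actual code): a fetcher that treats a 4xx answer as final but backs off and retries on a 5xx:

```python
# Illustrative only: not Googlebot's behavior, just the pattern the analogy
# describes. A 4xx answer is final; a 5xx answer earns a later retry.
import time
import urllib.error
import urllib.request

def fetch_with_retries(url: str, attempts: int = 3, backoff: float = 2.0):
    """Give up immediately on 4xx; wait and retry on 5xx or network errors."""
    for attempt in range(attempts):
        try:
            with urllib.request.urlopen(url) as resp:
                return resp.read()  # success: content is available to index
        except urllib.error.HTTPError as err:
            if 400 <= err.code < 500:
                # A 4xx answer is definitive: drop the URL, don't retry.
                return None
            # A 5xx answer is treated as a temporary server issue.
            time.sleep(backoff * (attempt + 1))
        except urllib.error.URLError:
            # Network-level failure: also worth another attempt.
            time.sleep(backoff * (attempt + 1))
    return None  # still failing; leave it for a later crawl
```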


Contact Us

Which error codes are the gold standard is just one of many tiny details in technical SEO that can go unnoticed. If you’d like to find out whether there are other oversights on your site, feel free to get in touch.

