How Unnecessary Redirects Are Slowing Down Your Website and Hurting Your SEO
HTTP redirects are a fundamental mechanism on the web. Navigate to some URL and the web server will respond by telling you to go to a different URL instead. This is useful when pages are no longer relevant, have been deleted, or have moved. Instead of getting a "404 Not Found" error, the user's browser will seamlessly continue navigation to the directed page.
There's another extremely common use of redirects, often unintentional, that hurts websites. These are unnecessary redirects that serve no purpose and are often the result of incorrect linking throughout a website. These redirects hurt performance (users get a slower navigation experience) and have a negative impact on SEO. In this article, we'll cover what exactly causes these redirects, what their impact is, and how to find and fix them.
What happens when your browser sends an HTTP request?
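Your web browser (whether it be on your computer, phone, or any other device) communicates via HTTP. HTTP messages are conventionally written out as text, which is how they are shown here, although HTTP/2 and later encode them in binary on the wire. Your browser will send an HTTP request, consisting of a path, HTTP method, and several headers.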
Here's a simplified HTTP request that contains the fundamentals:
> GET / HTTP/2
> Host: www.cloudtrellis.com
> User-Agent: Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/127.0.0.0 Safari/537.36
> Accept: text/html
Walking through the above request:
- The first line starts by indicating this is a GET method request (what your browser uses each time you type a URL in the address bar or click a link)
- The path is / (the "root" or home page of the specified site)
- The Host is www.cloudtrellis.com, or the domain that the request is being sent to
- The User-Agent is a descriptor of the browser we're using (in this case this is the user agent for a Chrome-based browser)
- The Accept indicates that we're looking for an HTML response (indicated by the text/html MIME type)
Every time you click on a link or type a URL in your address bar, your browser will send a request that contains all of the above for the respective page you are trying to visit.
The server will respond with a status code, headers, and the contents of the page. Here's a simplified response to the above request.
< HTTP/2 200
< date: Mon, 30 Sep 2024 16:51:31 GMT
< content-type: text/html; charset=utf-8
< content-length: 53901
... more headers ...
<!doctype html><html class="no-js" lang="en"><head>...
Looking at this response:
- The status code is 200, indicating that the page is available and accessible to you. If the page were missing, the server might respond with a 404 instead.
- The date of the response, the content type (we see that text/html matches what we requested), and the length.
- Additional headers will be included with most responses. These can serve various purposes, like defining a Content-Security-Policy or setting cookies.
- The actual HTML of the page, usually starting with <!doctype html>.
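As a rough illustration of the walkthrough above, here is a minimal Python sketch that splits the simplified response head into a status code and a dictionary of headers. The function name `parse_response_head` is our own, and the parser only handles the simplified text form shown in this article, not real wire traffic.

```python
def parse_response_head(raw: str):
    """Split a simplified response head into (status code, headers)."""
    lines = [line.strip() for line in raw.strip().splitlines() if line.strip()]
    # The status line looks like "HTTP/2 200": protocol version, then the code.
    status = int(lines[0].split()[1])
    headers = {}
    for line in lines[1:]:
        # Split on the first colon only, so values such as dates stay intact.
        name, _, value = line.partition(":")
        headers[name.strip().lower()] = value.strip()
    return status, headers


raw_head = """
HTTP/2 200
date: Mon, 30 Sep 2024 16:51:31 GMT
content-type: text/html; charset=utf-8
content-length: 53901
"""

status, headers = parse_response_head(raw_head)
print(status, headers["content-type"])
```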
What exactly happens when there is an HTTP redirect?
When a redirect occurs, we tend to get a very different response. Let's look at the request and response.
Sending a request to the URL https://www.cloudtrellis.com/signup-request:
> GET /signup-request HTTP/2
> Host: www.cloudtrellis.com
> User-Agent: Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/127.0.0.0 Safari/537.36
> Accept: text/html
Yields the following response:
< HTTP/2 301
< date: Mon, 30 Sep 2024 17:15:16 GMT
< content-type: text/plain; charset=utf-8
< content-length: 0
< location: https://www.cloudtrellis.com/request-demo
Taking a quick look at this response we see some key differences:
- The status code is 301, indicating that this is a permanent redirect
- There's no content (indicated by a 0 value for the content length)
- There's a location header telling us to visit a different URL (in this case: https://www.cloudtrellis.com/request-demo)
If your browser receives a response like the above when you click on a link, it will automatically send a second HTTP request to the URL indicated in the location header of the redirect. As the user, you only wait a short while longer and end up on the new page. You will see the URL change in your address bar as this happens.
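The follow-the-location behavior can be sketched in a few lines of Python. This is a simplified model, not how any particular browser is implemented: `FAKE_SERVER` is a hypothetical table of canned responses standing in for a real server, and the 200 response for /request-demo is an assumption made for illustration.

```python
from urllib.parse import urljoin

REDIRECT_STATUSES = {301, 302, 303, 307, 308}

# Hypothetical canned responses: URL -> (status code, headers).
FAKE_SERVER = {
    "https://www.cloudtrellis.com/signup-request": (
        301, {"location": "https://www.cloudtrellis.com/request-demo"}),
    "https://www.cloudtrellis.com/request-demo": (
        200, {"content-type": "text/html; charset=utf-8"}),  # assumed
}


def navigate(url, fetch, max_redirects=10):
    """Follow redirects until a non-redirect response; return (final_url, hops)."""
    hops = []
    for _ in range(max_redirects + 1):
        status, headers = fetch(url)
        if status in REDIRECT_STATUSES and "location" in headers:
            hops.append((status, url))
            # The location may be relative, so resolve it against the current URL.
            url = urljoin(url, headers["location"])
            continue
        return url, hops
    raise RuntimeError("too many redirects")


final_url, hops = navigate(
    "https://www.cloudtrellis.com/signup-request", FAKE_SERVER.__getitem__)
print(final_url, len(hops))
```

Each entry in `hops` represents one extra request the browser had to make before it could load the final page.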
What unnecessary redirects look like
Web servers such as Apache and Nginx are responsible for handling requests like the ones we looked at above. They are very particular about which requests will yield a 200 response, and will serve a given page only from its exact URL. They do offer some flexibility, however: when a similar (but not exactly correct) URL is requested, they typically respond with a 301 redirect.
Listed below are several examples of similar-but-not-exact URLs that will result in one or more redirects.
Example 1: www. mismatch
- Server expected URL: https://www.example.com/
- Browser requested URL: https://example.com/
- Problem: Missing www. subdomain
- Response: 301 Redirect to https://www.example.com/
Example 2: not using https
- Server expected URL: https://www.example.com
- Browser requested URL: http://www.example.com
- Problem: Request made over http:// instead of https://
- Response: 301 Redirect to https://www.example.com
Example 3: including a trailing slash when unwanted
- Server expected URL: https://www.example.com/signup
- Browser requested URL: https://www.example.com/signup/
- Problem: Extra trailing slash / at the end of the URL path
- Response: 301 Redirect to https://www.example.com/signup
Example 4: omitting a trailing slash when wanted
- Server expected URL: https://www.example.com/about/
- Browser requested URL: https://www.example.com/about
- Problem: Missing trailing slash / at the end of the URL path
- Response: 301 Redirect to https://www.example.com/about/
Example 5: capitalization issues
- Server expected URL: https://www.example.com/about/
- Browser requested URL: https://www.example.com/About/
- Problem: Capitalized letter used for lowercase path [1]
- Response: 301 Redirect to https://www.example.com/about/
Example 6: multiple problems at once
- Server expected URL: https://www.example.com/about/
- Browser requested URL: http://example.com/about
- Problem:
  - Missing www. subdomain
  - Request made over http:// instead of https://
  - Missing trailing slash / at the end of the URL path
- Response: 301 Redirect to https://example.com/about, then 301 Redirect to https://www.example.com/about, then 301 Redirect to https://www.example.com/about/
What's happening here? In each case, the URL that the server expects closely resembles the URL requested by the browser. However, in no case is the URL exact, and this always yields at least one 301 response. You can see in the last example that the URL that gets multiple aspects incorrect doesn't just result in a single 301 redirect, but a chain of several redirects.
Redirect chains not only add latency for users but also multiply the performance cost as each extra redirect introduces a full round trip between the server and the browser.
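To make the chain from Example 6 concrete, here is a small Python sketch that models a server correcting one problem per response. The rule order (scheme first, then www., then trailing slash) and the assumption that every path should end in a slash are ours, chosen to mirror Example 6. Real server configurations vary.

```python
from urllib.parse import urlsplit, urlunsplit


def next_redirect(url):
    """Return the URL this model server would redirect to, or None if canonical.

    Only one problem is fixed per response, which is what produces a chain.
    """
    scheme, host, path, query, fragment = urlsplit(url)
    if scheme == "http":
        scheme = "https"
    elif not host.startswith("www."):
        host = "www." + host
    elif not path.endswith("/"):
        path += "/"
    else:
        return None
    return urlunsplit((scheme, host, path, query, fragment))


def redirect_chain(url):
    """Collect every redirect target until the canonical URL is reached."""
    chain = []
    while (target := next_redirect(url)) is not None:
        chain.append(target)
        url = target
    return chain


chain = redirect_chain("http://example.com/about")
for target in chain:
    print("301 Redirect to", target)
```

A link that already points at https://www.example.com/about/ produces an empty chain, which is the goal for every internal link.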
How do these extra redirects appear on a website?
Typically the way that these types of redirects occur is by incorrectly formatted URLs being included in links throughout the site. For example, in your navigation bar you might have a tag that looks like the following:
<a href="/about">About us</a>
The browser interprets a leading slash (/) as a path relative to the root of the current domain, so the above href will result in a request to https://www.example.com/about (assuming the current page is served from https://www.example.com). However, as we saw above, this URL will likely result in the server responding with a 301 redirect to the version of the URL that has a trailing slash at the end, if that's what the server expects.
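Python's standard `urllib.parse.urljoin` applies the same resolution rules a browser uses for an href, so it can show which URL a link will actually request. The page URL below is a hypothetical example.

```python
from urllib.parse import urljoin

# Hypothetical page on which the navigation link appears.
page_url = "https://www.example.com/blog/some-post/"

without_slash = urljoin(page_url, "/about")
with_slash = urljoin(page_url, "/about/")

print(without_slash)  # https://www.example.com/about
print(with_slash)     # https://www.example.com/about/
```

If the server expects the trailing slash, only the second href avoids the extra redirect.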
Do these extra redirects make a difference?
The short answer is yes, these redirects do make a difference (in a bad way). The longer answer is that these redirects impact both the usability and the SEO of your site.
Performance implications of extra redirects
Unnecessary redirects can significantly degrade page load times. Each redirect requires a full round trip between the browser and the server. This extra back-and-forth adds time to each user navigation action, both upon the initial load of the site and when users move between pages in your site.
In our testing with a gigabit internet connection in the US connecting to a site hosted on Amazon Web Services (AWS), each round trip added between 160 and 200 milliseconds, even for a highly optimized web server with minimal load. This means that the network latency alone was largely responsible for this slowdown.
This slowdown is a best-case scenario. If the user had a slower internet connection, or the web server took longer to respond to the initial request (such as during times of very high traffic), the delay caused by each additional redirect would be even longer.
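As a back-of-the-envelope sketch, the per-redirect figures above can be multiplied out for a redirect chain. The 160 to 200 millisecond range comes from the test described in this section. Treating every hop as costing the same is a simplifying assumption.

```python
ROUND_TRIP_MS = (160, 200)  # per-redirect range measured in the test above


def added_latency_ms(redirects, round_trip_ms=ROUND_TRIP_MS):
    """Estimate the (low, high) delay added by a number of redirects."""
    low, high = round_trip_ms
    return redirects * low, redirects * high


print(added_latency_ms(1))  # (160, 200)
print(added_latency_ms(3))  # (480, 600)
```

Under these assumptions, a three-redirect chain adds roughly half a second before the page even begins to load.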
SEO implications of extra redirects
Search engines such as Google have a limited "crawl budget" when crawling large websites. In simple terms, a crawl budget is the total allocation of time and resources a search engine such as Google will dedicate to crawling your site.
Google's SEO guidelines on managing crawl budget say that crawl budget comes into play with:
- Large sites (1,000,000+ unique pages) whose content changes weekly
- Medium-sized sites (10,000+ unique pages) whose content changes daily
- Sites with a large proportion of pages that have been discovered but are currently not indexed
The guidelines also say that there are two primary factors influencing crawl budget:
- Crawl health: how fast your site is to respond to requests
- Google's crawling limits: the current availability of crawling machines
If your site is a large or medium site that needs updates to its content reflected quickly in search, it's probable that a large number of redirects will adversely affect your website's search presence. Unnecessary redirects use up part of this crawl budget without delivering real content, meaning fewer key pages are crawled or indexed by search engines.
For example, an ecommerce site with 25,000 unique product pages whose category pages all rely on redirects from /product/<product-id> to /product/<product-id>/<product-slug> (such as /product/12345 redirecting to /product/12345/dog-chew-toy) may find that Google is not updating search results as quickly as it would otherwise. This is because crawling 25,000 product pages effectively takes 50,000 requests, due to the additional redirect for each product URL. This may be relevant for time-limited promotions or other sales, or for the appearance of new products.
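The doubling in this example is simple arithmetic, sketched below. The function name is our own, and the model assumes each redirect costs the crawler exactly one additional request.

```python
def crawl_requests(pages, redirects_per_page):
    """Requests a crawler needs: one per redirect hop plus one for the page."""
    return pages * (1 + redirects_per_page)


print(crawl_requests(25_000, 0))  # 25000
print(crawl_requests(25_000, 1))  # 50000
```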
How to find and fix these unnecessary redirects
Fixing unnecessary redirects such as the ones we looked at above requires effectively identifying them. The problem is that the number of links on a given website tends to grow much faster than the number of pages, due to each page cross-linking to many others.
This rapid link growth becomes a problem particularly on blogs, documentation sites, or product sites with large numbers of pages, which often rely on heavy cross-linking to encourage user engagement. In our analysis, we've found that sites with as few as 40 pages can have well over 1,000 internal links. Although many of these are likely to be automatically generated and repeated across pages, manually searching for invalid links becomes an untenable burden. This becomes doubly true as sites change over time, with links on a given page being added or changed as the content evolves.
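To give a sense of what an automated check involves, here is a minimal Python sketch that extracts the links from one page and flags internal links that do not match a known canonical URL. It is a simplified illustration and not how Cloudtrellis works: `canonical_urls` is a hypothetical set you would have to supply yourself (for instance from a sitemap), and a flagged link might lead to a 404 rather than a redirect.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin, urlsplit


class LinkCollector(HTMLParser):
    """Collect the href of every <a> tag in an HTML document."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.hrefs.append(value)


def bare_host(url):
    """Return the host of a URL, lowercased and without a leading www."""
    host = urlsplit(url).netloc.lower()
    return host[4:] if host.startswith("www.") else host


def suspicious_links(html, page_url, canonical_urls):
    """Return (href, resolved URL) pairs for internal links that are not canonical."""
    collector = LinkCollector()
    collector.feed(html)
    flagged = []
    for href in collector.hrefs:
        absolute = urljoin(page_url, href)
        is_internal = bare_host(absolute) == bare_host(page_url)
        if is_internal and absolute not in canonical_urls:
            flagged.append((href, absolute))
    return flagged


# Hypothetical page content and canonical URL list.
html = ('<nav><a href="/about">About us</a> <a href="/about/">About</a> '
        '<a href="https://other.example.org/">Partner</a></nav>')
canonical_urls = {"https://www.example.com/", "https://www.example.com/about/"}

flagged = suspicious_links(html, "https://www.example.com/", canonical_urls)
print(flagged)
```

Running a check like this across every page, and repeating it whenever content changes, is where the manual approach stops scaling.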
With all this in mind, the best way to find and fix these types of unnecessary redirect issues is by using a tool like Cloudtrellis. Helping website operators easily find and fix issues like this is one of the key motivations for building this tool.
With Cloudtrellis you can:
- Schedule scans to run automatically, based on how frequently your site changes
- See the context of each unnecessary redirect (which page it's on, what the URL is, what the link text is, and what page it resolved to)
- Track issue statuses directly in the software, allowing you to monitor resolution progress
- Share scan results with your team, even if they don't have a Cloudtrellis account
Unnecessary redirects like the ones discussed in this article are only one of many types of issues Cloudtrellis can help you find and fix. If interested, request a demo or see our pricing.
[1] Most web servers are strict about capitalization in their default configuration, and are likely to return a 404 instead of a 301 in cases such as /About/ being requested when the actual path is /about/. However, it is somewhat common in website deployments to use Apache rewrite rules to normalize capitalization in paths to compensate for mixed capitalization in links. This is not recommended, as it tends to lead to the proliferation of the types of unnecessary redirects discussed in this article.