The folly of Google Analytics’ exit rates

Have the Web analytics folks designing automated metric reporting for Google Analytics missed a great opportunity to improve the content management of websites? Sure, they had to simplify things to make their automated reports broadly understandable. But simplification doesn’t fully justify their handling of the “Bounce Rate” and “% Exit” rate in the “Content Drilldown” section of Google Analytics.

The good news is that both rates share the same concept of “exit”: an exit is a departure from a site. A visitor is measured as having “departed” from a site if the visitor has no activity on that site for some arbitrary period of time, usually defined as 30 minutes. So a transit from page X to page Y on the same site is, for Google Analytics, not an “exit” from page X. Since the bounce rate and the % Exit rate refer only to site exits, that is how we’ll use the word “exit” here too.

The bad news is that the percent exiting a site via a particular web page (the % Exit rate) includes the percent bouncing from that same page (the Bounce Rate). As a result, the “% Exit” result can be so swamped by the volume of bounces (e.g., on those pages that receive a lot of traffic from external web pages) that it’s hard to evaluate the rate of non-bounce exits. In other words, Google Analytics leaves us unable to evaluate how often views of that web page lead to exits by people who arrived at the page from another page internal to the site. Suppose the exit rate of page X is 60%. And suppose the bounce rate of that same page is 70%. Then what is the exit rate for that page that isn’t coming from bounces? Google Analytics doesn’t directly reveal that.

How to solve this problem? If we’re only going to get two attrition rates in Google Analytics’ automated reporting, then the solution would seem to be to define the two metrics so that they don’t overlap. That solution would let us separately measure (and separately manage the budgets for) each of those two sources of attrition.

The first source of traffic attrition, correctly measured by the bounce rate, is immediate exits of visitors that were drawn to the page from an external web page or other external mechanism. Such externally-sourced visitor traffic is often driven by web advertising and SEO budgets. People generally bounce from a web page when its contents do not match the expectations set by the external page that led them to that page. Tracking bounce rates can help website managers align the content of their web pages to their promotion of those web pages.

The second source of traffic attrition is exits of visitors that got to the page via some other page within the same site. We’ll call this kind of exit an “internal-traffic exit.” Such internal traffic is often driven by different budgets, e.g., navigational and website-design budgets. People generally have an internal-traffic exit either when a transaction has been completed or when the content of the page did not match the expectations set by internal navigation. When internal-traffic exit rates are too high, navigation or internal links need to be reworked.

The trouble is that this second source of internally-driven attrition isn’t reported separately by Google Analytics. Google, in effect, rolls the first source of attrition (bounces) into the second source of attrition (internal-traffic exits) to get an overall measure of attrition, which it reports as the “exit rate.”

So how to separately measure the second source of attrition, the internal-traffic exits?

First, we need to understand the exact components of the bounce rate and the exit rate.

A web page’s bounce rate is a ratio. Every ratio has a numerator and a denominator. In the case of a bounce rate, the numerator is (a) the number of exits from a web page that was the first and only page of the visit to that domain. The denominator is (d) the number of views of that page that were part of visits that started on that page. The bounce rate is simply (a) divided by (d). If page X got 10 page views from visits that started on page X, and 6 of those page views belonged to visits that ended before visiting any other page on the site, then the bounce rate was 60%.

A web page’s “exit rate,” as reported by Google Analytics, is also a ratio. Both its numerator and denominator are more broadly defined than is the case with the bounce rate. The numerator of the exit rate includes not only the exits (a) included in the bounce rate, but also two other types of exits from that page (which we’ll continue calling “page X”): (b) exits from visits that started on page X and ended on page X, but included a viewing of at least one other page on the domain; and (c) exits from page X of visits that started on a different page of the domain.

Similarly, the denominator of the exit rate includes not only page views from visits that started on that page (d), but also includes (e) page views of page X from visits that started on any other page of the domain. In other words, the denominator (d) of the “bounce rate” is page views where page X was the “landing page,” while the denominator of the “exit rate” (d+e) is “all views of page X.”

When we put the broadly defined numerator and denominator of the exit rate together, the “exit rate” equals (a+b+c)/(d+e). Suppose a website had 30 page views of page X. Suppose 10 of those 30 page views came from visits that started on page X. Of those 10 page views, suppose 6 came from visits that ended before going to another page, 1 came from a visit that ended after going to another page and then returning to page X to exit, and 3 came from visits that ended by exiting from another page on the site. Of the 20 other page views of the original 30, suppose 5 page views were immediately followed by exits from page X. Then Google Analytics would report the (overall) “exit rate” of page X to be (6+1+5)/(10+20) or 40%.
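The two reported rates can be checked with a few lines of Python. This is only a sketch of the arithmetic described above, not Google Analytics’ own implementation; the function names are made up for illustration, and the counts a through e are the ones from the page X example.

```python
from fractions import Fraction

# Counts for page X, taken from the worked example in the text.
a = 6   # bounces: visits that started on X and viewed no other page
b = 1   # visits that started on X, viewed another page, then exited from X
c = 5   # exits from X by visits that started on a different page
d = 10  # views of X from visits that started on X
e = 20  # views of X from visits that started on another page


def bounce_rate(a, d):
    """Bounces divided by views of the page as a landing page: a / d."""
    return Fraction(a, d)


def exit_rate(a, b, c, d, e):
    """All exits from the page divided by all views of it: (a+b+c) / (d+e)."""
    return Fraction(a + b + c, d + e)


print(f"bounce rate: {float(bounce_rate(a, d)):.0%}")         # 60%
print(f"exit rate:   {float(exit_rate(a, b, c, d, e)):.0%}")  # 40%
```

Exact fractions are used so that the percentages come out without floating-point rounding surprises.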

Now that we understand the two rates that Google Analytics is already reporting, we are ready to carefully extract the bounce rate from the exit rate, to arrive at the missing “internal-traffic exit rate” that navigational web-designers need.

The internal-traffic exit rate would be a ratio that removes the bounce-rate activity from the overall exit rate. How to do that? Well, any time that a view of page X doesn’t result in a bounce (a), the visit will potentially end on that page after visiting some other page on the site. In other words, all views of page X, except for page views that result in bounces (a), are candidates for an “internal-traffic exit.” So our denominator (or “base”) of potential internal-traffic exits equals (d+e-a).

The internal-traffic exit numerator should similarly exclude “a” (bounces) from the overall exit rate’s numerator. So the numerator would be (b+c).

So, overall, the internal-traffic exit rate would be (b+c)/(d+e-a). Using the numbers in the example above, the internal-traffic exit rate would be (1+5)/(30-6) or 25%, well below the overall exit rate of 40% and the bounce rate of 60%. And, in accord with our intuition, this internal-traffic exit rate will increase when the exits of page X within multi-page visits increase faster than total views of page X in those multi-page visits that include page X.
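As a minimal sketch, the proposed metric can be computed from the same five counts. The function name is hypothetical, and returning None when every view was a bounce is an assumption added here, since the text does not cover that edge case.

```python
from fractions import Fraction


def internal_traffic_exit_rate(a, b, c, d, e):
    """Non-bounce exits over non-bounce views: (b + c) / (d + e - a)."""
    base = d + e - a
    if base == 0:
        # Assumption: if every view of the page was a bounce, the rate
        # is undefined, so return None rather than divide by zero.
        return None
    return Fraction(b + c, base)


# Counts from the page X example: a=6, b=1, c=5, d=10, e=20.
rate = internal_traffic_exit_rate(a=6, b=1, c=5, d=10, e=20)
print(f"internal-traffic exit rate: {float(rate):.0%}")  # 25%
```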

Last but not least, how does the bounce rate merge with the internal-traffic exit rate, in algebraic terms, to form the overall exit rate? The overall exit rate, as reported by Google Analytics, is, in effect, a muddy weighted average of the bounce rate and the internal-traffic exit rate: [a/d × d/(d+e)] + [(b+c)/(d+e-a) × (d+e-a)/(d+e)], which simplifies to (a+b+c)/(d+e). Because some of the page views (d minus a) are included in the numerators of the weights of both component rates, the weights sum to more than 100%. As a consequence, it’s possible for the bounce rate and exit rate to both be equal (say 50%), even when the internal-traffic exit rate is different (say 40%). But, since we have extracted an internal-traffic exit rate that can be measured, managed, and budgeted for separately from the bounce rate, do we even care about the overall exit rate for each web page?
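The decomposition can be verified numerically. In the sketch below, the counts are not from any real report: they are illustrative values chosen here so that the bounce rate and overall exit rate both come out at 50% while the internal-traffic exit rate is 40%. The function name is likewise hypothetical.

```python
from fractions import Fraction


def decompose(a, b, c, d, e):
    """Return the component rates, the overall rate, and the weight sum."""
    views = d + e
    bounce = Fraction(a, d)
    internal = Fraction(b + c, views - a)
    w_bounce = Fraction(d, views)            # weight on the bounce rate
    w_internal = Fraction(views - a, views)  # weight on the internal rate
    overall = bounce * w_bounce + internal * w_internal
    return bounce, internal, overall, w_bounce + w_internal


# Illustrative counts (an assumption, picked to reproduce 50% / 40% / 50%).
a, b, c, d, e = 5, 2, 8, 10, 20
bounce, internal, overall, weight_sum = decompose(a, b, c, d, e)

# The weighted form collapses to the reported exit rate (a+b+c)/(d+e).
assert overall == Fraction(a + b + c, d + e)

print(f"bounce {float(bounce):.0%}, internal {float(internal):.0%}, "
      f"overall {float(overall):.0%}, weights sum to {float(weight_sum):.0%}")
```

With these counts the weights sum to 7/6, or roughly 117%, which is why the overall rate does not behave like an ordinary weighted average of its two components.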