Published on June 8, 2020

Google Analytics: Cookieless Tracking Without GDPR Consent

This article presents a simple method to use Google Analytics without cookies. Going cookieless means there is no need for a cookie consent dialog. It also means much easier GDPR compliance. The only thing you need to do is to embed the Google Analytics tracking script in a different way, which should be possible with most web publishing platforms. Please bear in mind that my background is technical, not legal.

TL;DR:

  1. Remove your current Google Analytics script
  2. Add the cookieless Google Analytics implementation (see “Implementing Cookieless Google Analytics” below)

Why Website Analytics Without Cookies?

The EU cookie directive (as it is often called) requires user consent before tracking cookies may be stored on the user’s device. I am a big fan of online privacy, but that is the wrong approach. It leads to the situation we have today, where every single site harasses first-time users to consent to … whatever … or else … Unfortunately, we seem to be stuck with this legislation for the time being.

High time we take back control of the web’s UX. Cookie consent dialogs need to go away. Until the legislation changes we can only do one thing: get rid of tracking cookies (which is not so bad a result, to be honest).

A Browser Extension Solution

Getting rid of cookies involves changes to a website’s code. End users cannot modify the sites they visit, of course, but they can change how their browsers deal with the cookie consent harassment by installing a browser extension. I warmly recommend the extension “I don’t care about cookies”. It does exactly what one would expect it to do.

How Google Analytics Uses Cookies

The HTTP protocol is stateless: there is no concept of users or sessions. Consequently, identifying users is difficult and error-prone. Google Analytics tries to solve that problem by creating unique strings that persist between page hits. These unique strings are stored in cookies.

In other words, cookies are merely a storage mechanism for unique IDs. They are not the only way to persist data in a browser: local storage and similar techniques are available, too. Like cookies, however, all of them require user consent.

Google Analytics Scripts

The logic that generates the unique IDs required for visitor tracking lives in a JavaScript snippet that must be loaded on every page webmasters want to include in their analytics. As far as I am aware, three generations of Google Analytics scripts are currently in use.

ga.js

ga.js is the oldest of the three Google Analytics scripts and now considered legacy. It can still be used (for compatibility reasons) but Google recommends one of its successors for new projects. From a privacy point of view, ga.js is less than optimal because cookies cannot be disabled.

analytics.js

analytics.js is the successor of ga.js. It is more modern and, most importantly, allows cookies to be disabled.

For those planning to upgrade: Google provides a migration guide from ga.js to analytics.js.

gtag.js

gtag.js is the newest of the three Google Analytics scripts. It is not a successor of analytics.js, though, but a wrapper. In short, you only need gtag.js if you’re using other Google services in addition to Analytics.

Disabling Google Analytics Cookies: Simple Solution

Disabling cookies with Google Analytics is quite simple: make sure you’re using the analytics.js script and set the storage field to 'none' in the create command.

That works quite well, but it breaks part of Google Analytics’ functionality. Without cookies, Google Analytics’ client ID is not persisted between page views. As a consequence, every page view is counted as a new user and a new session, which inflates both counts significantly.
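
The simple variant described above might look roughly like the following sketch. It assumes the standard analytics.js loader is present elsewhere on the page (it is omitted here) and only sets up the usual command queue stub so the commands can be queued before the library arrives. YOUR-GA-TRACKING-CODE is a placeholder.

```javascript
// Command queue stub in the style of Google's loader snippet;
// analytics.js replays the queued commands once it has loaded.
const root = globalThis;
root.ga = root.ga || function () {
  (root.ga.q = root.ga.q || []).push(arguments);
};

// storage: 'none' tells analytics.js not to persist the client ID in a cookie.
ga("create", "YOUR-GA-TRACKING-CODE", { storage: "none" });
ga("send", "pageview");
```

Because no client ID is supplied here, analytics.js generates a fresh one on every page load, which is exactly what causes the inflated counts.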

Disabling Google Analytics Cookies: Advanced Solution

An advanced solution not only disables Google Analytics cookies but also changes the way client IDs are generated for the better (from a privacy point of view).

Our Requirements

  • We create the client ID for Google Analytics ourselves, taking over from Google’s script.
  • We don’t want to store the ID on the client, so we must be able to recreate it on every page a user visits.
  • We must not send personally identifiable information to Google Analytics.
  • The client ID should be different for every website (we don’t want to create a super-ID that can be used to track users across the internet).
  • The client ID should distinguish between multiple users sharing a public IP address (NAT or proxy).

How Others Are Doing It

  • Matomo
  • Plausible
  • Fathom

Our Solution

We don’t want to send personally identifiable information to Google Analytics, so we hash the raw data that goes into the client ID. We also don’t want Google to track us over long periods of time, so we regenerate the client ID every few days.

Given those premises, we create our client ID dynamically during page load by hashing: IP address + website domain + user agent + language + validity days.

Interestingly, the only way to get your IP address from within JavaScript in a browser is to ask a server. We use a little trick: since we have control over the code on the server, we stamp the client’s IP address into the script the server sends to the client.

The browser’s user agent is not reliable but that’s OK. We just want something that differs between browser types and, ideally, OS versions.

Validity days is the number of days before the hash changes (four in the script below). A shorter interval improves privacy but makes session tracking less reliable.
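
To make the effect of the validity interval more tangible, here is a small sketch of the same arithmetic the tracking script uses, wrapped in a function so it can be tried with fixed dates. The function name validityBucket and the sample dates are only for illustration.

```javascript
const MS_PER_DAY = 24 * 3600 * 1000;

// Number of the validity window a given moment falls into.
// Every moment inside the same window yields the same number.
function validityBucket(date, validityDays) {
  return Math.round(date.getTime() / MS_PER_DAY / validityDays);
}

// Sample moments (illustrative only)
const first = new Date(Date.UTC(2020, 5, 8, 0, 0));    // June 8, 2020, 00:00 UTC
const sameDay = new Date(Date.UTC(2020, 5, 8, 2, 0));  // two hours later
const later = new Date(first.getTime() + 4 * MS_PER_DAY); // four days later

console.log(validityBucket(first, 4) === validityBucket(sameDay, 4));
console.log(validityBucket(later, 4) === validityBucket(first, 4) + 1);
```

Since the bucket number is part of the hashed string, the client ID changes whenever the bucket does. A visitor who arrives shortly before a window boundary therefore gets a new ID soon afterwards.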

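As hash algorithm we use cyrb53, which seems to be high-quality and fast. Caveat: our implementation does not work in Internet Explorer because it relies on Math.imul and default parameter values, neither of which Internet Explorer supports.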
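
The following sketch shows how the pieces fit together outside a browser. It reuses the cyrb53 function from the script further down and passes the inputs in as parameters instead of reading them from the browser. The helper name buildClientId and all sample values (IP address, hosts, user agent, bucket number) are made up for illustration.

```javascript
// cyrb53 hash, identical to the function in the tracking script
const cyrb53 = function (str, seed = 0) {
  let h1 = 0xdeadbeef ^ seed,
    h2 = 0x41c6ce57 ^ seed;
  for (let i = 0, ch; i < str.length; i++) {
    ch = str.charCodeAt(i);
    h1 = Math.imul(h1 ^ ch, 2654435761);
    h2 = Math.imul(h2 ^ ch, 1597334677);
  }
  h1 = Math.imul(h1 ^ h1 >>> 16, 2246822507) ^ Math.imul(h2 ^ h2 >>> 13, 3266489909);
  h2 = Math.imul(h2 ^ h2 >>> 16, 2246822507) ^ Math.imul(h1 ^ h1 >>> 13, 3266489909);
  return 4294967296 * (2097151 & h2) + (h1 >>> 0);
};

// Join the raw inputs with semicolons and hash them into a hex string.
function buildClientId({ ip, host, userAgent, language, bucket }) {
  const source = [ip, host, userAgent, language, bucket].join(";");
  return cyrb53(source).toString(16);
}

// Hypothetical visitor; 203.0.113.7 is from a range reserved for documentation.
const visitor = {
  ip: "203.0.113.7",
  userAgent: "ExampleBrowser/1.0",
  language: "en-US",
  bucket: 4605,
};

const idSiteA = buildClientId({ ...visitor, host: "example.com" });
const idSiteB = buildClientId({ ...visitor, host: "example.org" });

console.log(idSiteA, idSiteB);
```

The same visitor gets the same ID on every page of one site without anything being stored, while the ID on a second site is unrelated because the domain is part of the hashed string.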

Implementing Cookieless Google Analytics

WordPress

If your site is running on WordPress head over to the plugin directory and install Cookieless Privacy-Focused Google Analytics.

Other Platforms

If you’re not running WordPress or don’t want to install my plugin, add the following JavaScript code to your pages’ HTML head section instead:

<script>
const cyrb53 = function(str, seed = 0) {
   let h1 = 0xdeadbeef ^ seed,
      h2 = 0x41c6ce57 ^ seed;
   for (let i = 0, ch; i < str.length; i++) {
      ch = str.charCodeAt(i);
      h1 = Math.imul(h1 ^ ch, 2654435761);
      h2 = Math.imul(h2 ^ ch, 1597334677);
   }
   h1 = Math.imul(h1 ^ h1 >>> 16, 2246822507) ^ Math.imul(h2 ^ h2 >>> 13, 3266489909);
   h2 = Math.imul(h2 ^ h2 >>> 16, 2246822507) ^ Math.imul(h1 ^ h1 >>> 13, 3266489909);
   return 4294967296 * (2097151 & h2) + (h1 >>> 0);
};

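// Client IP address, stamped in by the server (PHP string interpolation)
let clientIP = "{$_SERVER['REMOTE_ADDR']}";
// Counter that changes every 4 days, so the client ID expires with it
let validityInterval = Math.round(Date.now() / 1000 / 3600 / 24 / 4);
// Raw inputs to the hash, separated by semicolons
let clientIDSource = clientIP + ";" + window.location.host + ";" + navigator.userAgent + ";" + navigator.language + ";" + validityInterval;
// Hashed, hex-encoded client ID that is passed to Google Analytics
let clientIDHashed = cyrb53(clientIDSource).toString(16);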

(function(i,s,o,g,r,a,m){i['GoogleAnalyticsObject']=r;i[r]=i[r]||function(){
(i[r].q=i[r].q||[]).push(arguments)},i[r].l=1*new Date();a=s.createElement(o),
m=s.getElementsByTagName(o)[0];a.async=1;a.src=g;m.parentNode.insertBefore(a,m)
})(window,document,'script','https://www.google-analytics.com/analytics.js','ga');

ga('create', 'YOUR-GA-TRACKING-CODE', {
   'storage': 'none',
   'clientId': clientIDHashed
});
ga('set', 'anonymizeIp', true);
ga('send', 'pageview');
</script>

Please make sure to replace YOUR-GA-TRACKING-CODE with your actual Google Analytics tracking code.

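Note that the {$_SERVER['REMOTE_ADDR']} syntax is only evaluated inside a PHP double-quoted string or heredoc. If you paste the script into a plain PHP template instead, use <?php echo $_SERVER['REMOTE_ADDR']; ?> in its place. If your web server is not running PHP at all, you need to find another way of embedding the client’s IP address: replace {$_SERVER['REMOTE_ADDR']} with whatever is required by your platform.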
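
As one example of another platform, here is a rough sketch of how the IP address could be stamped into the page by a Node.js server. The placeholder token, the template, and the function names are assumptions made for this illustration. The handler merely follows the (req, res) signature of Node’s http module.

```javascript
// Placeholder token in the page template (hypothetical name)
const PLACEHOLDER = "__CLIENT_IP__";

// Minimal page template; a real page would contain the full tracking script.
const pageTemplate = "<script>let clientIP = " + PLACEHOLDER + ";</script>";

// Replace the placeholder with the IP as a quoted, escaped string literal.
function stampClientIP(template, ip) {
  return template.replace(PLACEHOLDER, () => JSON.stringify(ip));
}

// Request handler with the (req, res) signature of Node's http module.
function handleRequest(req, res) {
  const ip = req.socket.remoteAddress || "";
  res.writeHead(200, { "Content-Type": "text/html; charset=utf-8" });
  res.end(stampClientIP(pageTemplate, ip));
}
```

Passing handleRequest to http.createServer would serve the stamped page. Behind a reverse proxy, remoteAddress is the proxy’s address, so the forwarded client address would have to be used instead.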

Checking a Website’s Cookies

Don’t check cookies in your own browser: the results may differ from what first-time visitors get. Use an online cookie checker tool instead.

Alternatives to Google Analytics

While researching this topic I found several website tracking and analytics services that focus on privacy and GDPR compliance. My notes reflect my requirements: I need to be able to track downloads, which some vendors support through custom events that can be generated with client-side JavaScript.

Matomo (formerly Piwik)

  • Not to be confused with Piwik Pro
  • SaaS or on-premises
  • Mature product, lots of configuration options
  • Can be configured not to use cookies
  • Automatic download tracking (in my testing, nothing ever showed up in the Downloads report, though)
  • JavaScript event support nearly syntax-compatible with Google Analytics
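
To illustrate what “nearly syntax-compatible” means for download tracking, the following sketch builds the arguments for an event in both systems. The helper functions and the download path are hypothetical. The command shapes follow the analytics.js send/event command and Matomo’s trackEvent command as I understand them.

```javascript
// Arguments for a Google Analytics (analytics.js) event hit:
// category, action, and label are the three event dimensions.
function gaEventArgs(category, action, label) {
  return ["send", "event", category, action, label];
}

// Arguments for the corresponding Matomo command;
// Matomo calls the third dimension "name" instead of "label".
function matomoEventArgs(category, action, name) {
  return ["trackEvent", category, action, name];
}

// Hypothetical download link, used for illustration only
const href = "/downloads/example.zip";

const gaArgs = gaEventArgs("Downloads", "click", href);
const matomoArgs = matomoEventArgs("Downloads", "click", href);

// In a browser these would be dispatched as:
//   ga(...gaArgs);
//   _paq.push(matomoArgs);
console.log(gaArgs, matomoArgs);
```

The three dimensions map one to one, so switching from one system to the other mostly means changing the dispatch call.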

Fathom

  • Privacy-friendly analytics
  • Does not use cookies
  • Single-page dashboard
  • Supports events, but only one dimension (whereas GA and Matomo store three dimensions per event: category, action, and label)

Simple Analytics

  • Privacy-friendly analytics
  • Does not use cookies
  • Event support “highly experimental”

Plausible

  • Privacy-friendly analytics
  • Does not use cookies
  • Supports events, but only one dimension (whereas GA and Matomo store three dimensions per event: category, action, and label)

Minimal Cookieless Web Analytics

  • DIY solution by Christian Bär
  • Visitor identification through browser fingerprinting
  • Data storage in Azure
  • Visualization via a simple dashboard
