Same Origin Cloudflare Analytics

Reverse Engineering Cloudflare Web Analytics

⛅️

Cloudflare's new analytics offering, 'Cloudflare Web Analytics', promises a solution to the trade-off between information and privacy. Its free tier uses a traditional JS beacon script, loaded from Cloudflare's own domain static.cloudflareinsights.com, that reports back to cloudflareinsights.com. Cloudflare promises that the product is 'privacy-first', and while I have very little doubt that it is, it won't be long before ad-blockers and other privacy-conscious tools block it as a third-party resource.

Reverse Engineering the Beacon

The script Cloudflare loads is heavily minified and, while not intentionally obfuscated, is almost impossible to read. I tried looking for stray source maps and the like, but didn't find any. Instead I used de4js with the unreadable add-on, and then ran the result through Prettier. The unreadable extension uses data from JSNice and tries to rename variables back to their likely names using a statistical model [1]. This works pretty well but isn't infallible, so you have to keep your wits about you and not blindly trust that a variable named url is actually a URL, for example.

Once inside, the code showed a number of options that aren't documented (or at least not anywhere I could find).

There's the option to include custom tags in the beacon requests:

...

if (window.__cfBeaconCustomTag) {
    if ("object" != typeof window.__cfBeaconCustomTag || Array.isArray(window.__cfBeaconCustomTag)) console.warn('Invalid custom tag format. Please use the following format: { "first_key": "first_value", "second_key": "second_value" }');
    state.ct = applyChange(window.__cfBeaconCustomTag)
}

...
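As a rough sketch of how a page might use this option: the helper below mirrors the format check from the excerpt (a plain, non-array object). The helper name isValidCustomTag and the tag keys are my own, purely for illustration, and I'm assuming the tag has to be set before the beacon script loads so that it is there when the script reads it.

```javascript
// Mirrors the format check in the beacon excerpt: a non-array object is expected.
// The explicit null check is an addition; the excerpt relies on the outer
// truthiness test to rule null out.
function isValidCustomTag(tag) {
  return typeof tag === "object" && tag !== null && !Array.isArray(tag);
}

// Hypothetical tag values, for illustration only.
const customTag = { section: "blog", theme: "dark" };

// Guarded so the sketch also runs outside a browser.
if (typeof window !== "undefined" && isValidCustomTag(customTag)) {
  // Assumption: this runs before the beacon script is loaded.
  window.__cfBeaconCustomTag = customTag;
}
```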

This is cool but let's keep digging.

Here we find what we've been looking for: an option to configure the URL the beacon sends its Real User Metrics (RUM) to.

...

var initial = params.token
	? params.send && params.send.to
			? params.send.to
			: "https://cloudflareinsights.com/cdn-cgi/rum"
	: null;

...
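Pulled out into a standalone function, that ternary reads roughly as follows. The function name resolveEndpoint is mine, and the extra guard for a missing params object is an addition rather than something the excerpt shows.

```javascript
const DEFAULT_ENDPOINT = "https://cloudflareinsights.com/cdn-cgi/rum";

// Returns the URL the beacon would report to, or null when no token is set.
function resolveEndpoint(params) {
  // Added guard: the excerpt reads params.token directly.
  if (!params || !params.token) return null;
  // A custom destination wins over the default when send.to is present.
  return params.send && params.send.to ? params.send.to : DEFAULT_ENDPOINT;
}

console.log(resolveEndpoint({ token: "abc" }));
console.log(resolveEndpoint({ token: "abc", send: { to: "/cdn-cgi/rum" } }));
console.log(resolveEndpoint({}));
```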

Looking further up in the script, we can see that the params can be taken from the script tag or, interestingly for us, from window.__cfBeacon.


...
var t = document.currentScript || ("function" == typeof document.querySelector ? document.querySelector("script[data-cf-beacon]") : void 0);

...

var params = window.__cfBeacon;

...

var paramJson = t.getAttribute("data-cf-beacon");
if (paramJson) try {
    params = JSON.parse(paramJson)
}
...

So we can pick our own URL by setting window.__cfBeacon = {token: ANALYTICS_TOKEN, send: {to: ENDPOINT_URL}} before the beacon script runs.
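One thing worth noticing in those fragments is the order: window.__cfBeacon is read first, and a data-cf-beacon attribute, if present, then replaces it. Here is a small sketch of that precedence. The function name resolveParams is mine, and because the error handling is elided in the excerpt, keeping the window value on a parse failure is purely my assumption.

```javascript
// windowConfig stands in for window.__cfBeacon,
// attributeJson for the script tag's data-cf-beacon attribute.
function resolveParams(windowConfig, attributeJson) {
  let params = windowConfig;
  if (attributeJson) {
    try {
      // The attribute, when present, replaces the window-level config.
      params = JSON.parse(attributeJson);
    } catch (err) {
      // Assumption: fall back to the window config on invalid JSON.
    }
  }
  return params;
}

const fromWindow = { token: "ANALYTICS_TOKEN", send: { to: "/cdn-cgi/rum" } };
console.log(resolveParams(fromWindow, null));
console.log(resolveParams(fromWindow, '{"token": "OTHER_TOKEN"}'));
```

So if you go the window.__cfBeacon route, it seems safest to leave the data-cf-beacon attribute off the script tag entirely.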

cdn-cgi

My first thought was to change this to a Cloudflare Worker proxy, sitting on the same origin. However, this isn't even necessary...

Cloudflare's CDN includes a 'secret' /cdn-cgi directory on sites it proxies. The best-known use of this is the /cdn-cgi/trace document, which details information about the user's interaction with the Cloudflare network:

fl=21f439
h=cloudflare.com
ip=2b00:24e3:6444:5301:fcf9:4321:29f7:abcd
ts=1609427929.302
visit_scheme=https
uag=Mozilla/5.0 (Macintosh; Intel Mac OS X 11_1_0) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/89.0.4356.6 Safari/537.36
colo=LHR
http=http/2
loc=GB
tls=TLSv1.3
sni=plaintext
warp=off
gateway=off
gateway_account_id=nil
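Since the trace document is just key=value lines, it is easy to turn into an object. This is a minimal sketch: the function name parseTrace is mine, and the sample string is a shortened copy of the output above.

```javascript
// Parses the key=value lines of a /cdn-cgi/trace response into an object.
function parseTrace(body) {
  const result = {};
  for (const line of body.split("\n")) {
    const idx = line.indexOf("=");
    if (idx === -1) continue; // skip blank or malformed lines
    // Split on the first "=" only, so values may themselves contain "=".
    result[line.slice(0, idx).trim()] = line.slice(idx + 1).trim();
  }
  return result;
}

const sample = "fl=21f439\nh=cloudflare.com\ncolo=LHR\nloc=GB\ntls=TLSv1.3";
const trace = parseTrace(sample);
console.log(trace.colo, trace.loc);

// In a browser on a Cloudflare-proxied site, something like this should work:
// fetch("/cdn-cgi/trace").then((r) => r.text()).then(parseTrace).then(console.log);
```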

It turns out that a lot of Cloudflare's services make use of this path, including their email protection and Rocket Loader scripts, as well as many other things [2].

More pertinent to our needs, as seen in the default URL, /cdn-cgi/rum exists, and it's live on all Cloudflare-proxied sites - same origin 👌.

✨ Take Home Message

window.__cfBeacon = {token: ANALYTICS_TOKEN, send: {to: '/cdn-cgi/rum'}}

will provide us with a same-origin destination for our requests.

To stop the initial script request from being third-party, I suggest simply serving the beacon.min.js file yourself, as I haven't found it exposed in the same way.
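Putting the two pieces together, the markup for a page could be generated along these lines. Treat it as a sketch under assumptions: beaconSnippet and the /js/beacon.min.js path are hypothetical (use wherever you host your copy of the file), and the token is assumed to be a plain string that is safe to embed in an inline script.

```javascript
// Builds the two script tags: the config first, then the self-hosted beacon.
function beaconSnippet(token, scriptPath) {
  const config = { token: token, send: { to: "/cdn-cgi/rum" } };
  return [
    // Config is set before the beacon script executes.
    "<script>window.__cfBeacon = " + JSON.stringify(config) + ";</script>",
    // No data-cf-beacon attribute, so the script reads window.__cfBeacon.
    '<script defer src="' + scriptPath + '"></script>',
  ].join("\n");
}

console.log(beaconSnippet("ANALYTICS_TOKEN", "/js/beacon.min.js"));
```

A self-hosted copy won't pick up changes Cloudflare makes to the script, so it would need refreshing from time to time.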

Worth it?

I'm in two minds about solving this. On one hand, it's somewhat futile, as it only defeats DNS-based blocking; uBlock or other HTTP-level blockers could quite easily write a rule to block this. On the other hand, Cloudflare Web Analytics is the only free, limitless offering I've found that doesn't track users (not even using their IP address) [3] and doesn't require cookies, so maybe it is worth trying.

There are many ways to get around simple pattern blocking, such as Cloudflare Workers or literally any other proxy service, but I want a simple stack and don't want to be bound by usage limits more than necessary. Optimistic, I know!

Cloudflare Web Analytics is still in beta, so it's quite possible these features will be documented or removed 😢 in the future.


  1. http://www.nice2predict.org/
  2. https://www.google.com/search?q="/cdn-cgi/"
  3. https://www.cloudflare.com/en-gb/web-analytics/