The web is the web…


Recently we found ourselves at an event in Madrid, Spain. We were (once again) discussing cookies, how to manage them, and why they are an issue for the modern-day system administrator. While geeking out on the topic, we decided to run a quick test of what happens when we browse some of the most popular websites across the globe.

Which websites?

So what we needed first was an overview of the most popular websites in the world. A quick Google search led us to https://www.alexa.com/topsites, which lists the top 500 sites on the web (their words).
We decided to pick the category “News” and use its top 50 sites, listed here: https://www.alexa.com/topsites/category/Top/News.

What data?

That gave us 50 websites to browse and record data from. So, what did we want to record? We simply wanted to see which cookies were set on our machine by just browsing to the home page of each site, with no interaction and without ever accepting or declining cookies.

How to record it?

To keep life simple, we chose a virtual machine running Windows Server 2016 and a script that launched each of the 50 news sites in Internet Explorer. We chose Internet Explorer because it is still one of the most used browsers in corporate environments and one most people are familiar with.

What did that look like?

Here is a short video showing our session. The script logs on a new user (one who has never logged on before), launches each page, waits for the page to load and then moves on to the next URL.
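For readers who want to picture the loop, here is a minimal Python sketch of the "open, wait, move on" part. It is not the script we used (that one drove Internet Explorer on a freshly logged-on Windows user). The function name `visit_sites`, the fixed `wait_seconds` pause standing in for "wait for the page to load", and the use of Python's `webbrowser` module are all assumptions made for illustration.

```python
import time
import webbrowser
from typing import Callable, Iterable, List


def visit_sites(
    urls: Iterable[str],
    open_page: Callable[[str], object] = webbrowser.open,
    wait_seconds: float = 15.0,
    sleep: Callable[[float], None] = time.sleep,
) -> List[str]:
    """Open each URL in turn, pause, then move on to the next one.

    Assumption: a fixed pause is used in place of real page-load detection.
    No clicks are sent, so no cookie banner is accepted or declined.
    """
    visited = []
    for url in urls:
        open_page(url)       # launch the home page in the default browser
        sleep(wait_seconds)  # crude stand-in for "wait for the page to load"
        visited.append(url)
    return visited


# Example call (hypothetical URLs, not the sites from our test):
# visit_sites(["https://news.example.com", "https://daily.example.org"])
```

Passing in `open_page` and `sleep` keeps the loop easy to try out without actually launching a browser.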

What can we learn from this?

Looking at our Cookies folder, we can see we accumulated 281 “cookie” files in the \Low folder:

Wait, 281 cookies? That’s not that bad, is it?

If we take a proper look at these 281 cookie files with our ‘Avanite Cookie Parser Utility’, we see that the 281 files contain 993 cookies, made up of 642 different types of cookie:
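The gap between "files" and "cookies" exists because one cookie file can hold several cookie records. The sketch below is not the Avanite Cookie Parser Utility. It assumes the legacy Internet Explorer text layout, where each record starts with the cookie name on its own line and ends with a line containing only `*`. The helper names and the sample values are made up for illustration.

```python
from collections import Counter
from typing import Iterable, List


def cookie_names(file_text: str) -> List[str]:
    """Return the cookie names found in one cookie file.

    Assumption: records are newline-separated fields, the first field is the
    cookie name, and a line holding only "*" closes the record.
    A trailing record without the closing "*" is ignored.
    """
    names = []
    record = []
    for raw_line in file_text.splitlines():
        line = raw_line.strip()
        if line == "*":
            if record:
                names.append(record[0])
            record = []
        elif line or record:
            # Skip blank lines between records, keep blank fields inside one.
            record.append(line)
    return names


def count_cookies(files: Iterable[str]) -> Counter:
    """Count how often each cookie name occurs across all files."""
    counts = Counter()
    for text in files:
        counts.update(cookie_names(text))
    return counts


# Two made-up files: the first holds two cookies, the second holds one.
file_a = "\n".join([
    "_gid", "value-1", "example.com/", "1024", "1", "2", "3", "4", "*",
    "_cb", "value-2", "example.com/", "1024", "1", "2", "3", "4", "*",
])
file_b = "\n".join([
    "_gid", "value-3", "example.org/", "1024", "1", "2", "3", "4", "*",
])

counts = count_cookies([file_a, file_b])
total_cookies = sum(counts.values())  # cookies across all files
distinct_names = len(counts)          # different types of cookie
top_entries = counts.most_common(5)   # names with the highest counts
```

In this toy input, two files yield three cookies of two different types, which is the same kind of relationship as our 281 files, 993 cookies and 642 types.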

So what are these cookies?

If you’ve never actively looked at web cookie data, most of it probably won’t tell you much. But if we look closer, we see a count and a name for each entry. What we are interested in are the names with the highest counts.

We can look up these cookies using a site such as https://www.cookiebot.co.uk and find out what they are actually for.

Taking the top entry from the list above we can see:

To save you the trouble of looking up every individual cookie, we’ve collected the basic information available about each of the top entries:

Top 15 items
| Cookie name | Purpose category | Details |
|---|---|---|
| _gid | Performance | This cookie name is associated with Google Universal Analytics. This appears to be a new cookie and as of Spring 2017 no information is available from Google. It appears to store and update a unique value for each page visited. |
| __cfduid | Strictly Necessary | Cookie associated with sites using CloudFlare, used to speed up page load times. According to CloudFlare it is used to override any security restrictions based on the IP address the visitor is coming from. It does not contain any user identification information. |
| _gads | Targeting/Advertising | This cookie is associated with the DoubleClick for Publishers service from Google. Its purpose is to do with the showing of adverts on the site, for which the owner may earn some revenue. |
| _cb_ls | Performance | This cookie is set on websites using real time analytics software by Chartbeat. |
| _cb | Performance | This cookie is set on websites using real time analytics software by Chartbeat. |
| _chartbeat2 | Performance | Set on a site using the Chartbeat real-time analytics platform. Used to distinguish between new and returning visitors. |
| _cb_svref | Performance | This cookie is set on websites using real time analytics software by Chartbeat. |
| __qca | Targeting/Advertising | This is a cookie usually associated with Quantcast, a digital advertising company. They provide website rankings, and the data they collect is also used for audience segmentation and targeted advertising. |
| uid | Performance | This cookie provides a uniquely assigned, machine-generated user ID and gathers data about activity on the website. This data may be sent to a 3rd party for analysis and reporting. |
| _parsely_visitor | Tracking | Anonymous user identifier used to track new vs. returning visitors. |
| _parsely_session | Tracking | Anonymous user identifier used to track behavior within the current session. |
| s_vi | Performance | Adobe Site Catalyst cookie, used to identify unique visitors, with an ID and timestamp. |
| tuuid | Targeting/Advertising | This cookie is mainly set by bidswitch.net to make advertising messages more relevant to the website visitor. |
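
To see how the entries listed above split by purpose, here is a small Python tally. The mapping is copied from the table. Treating everything other than “Strictly Necessary” as non-essential is our own simplification for this sketch, not an official classification.

```python
from collections import Counter

# Cookie name -> purpose category, copied from the table above.
CATEGORY_BY_COOKIE = {
    "_gid": "Performance",
    "__cfduid": "Strictly Necessary",
    "_gads": "Targeting/Advertising",
    "_cb_ls": "Performance",
    "_cb": "Performance",
    "_chartbeat2": "Performance",
    "_cb_svref": "Performance",
    "__qca": "Targeting/Advertising",
    "uid": "Performance",
    "_parsely_visitor": "Tracking",
    "_parsely_session": "Tracking",
    "s_vi": "Performance",
    "tuuid": "Targeting/Advertising",
}

# How many listed cookies fall into each category.
category_counts = Counter(CATEGORY_BY_COOKIE.values())

# Assumption: anything not marked "Strictly Necessary" counts as non-essential.
non_essential = sum(
    count
    for category, count in category_counts.items()
    if category != "Strictly Necessary"
)

for category, count in category_counts.most_common():
    print(f"{category}: {count}")
print(f"Non-essential: {non_essential} of {len(CATEGORY_BY_COOKIE)}")
```

Only one of the listed cookies falls under “Strictly Necessary”; the rest are performance, tracking or advertising cookies.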

Wait, what now?

Most of these cookies provide no obvious benefit to the user, yet they are placed on our file system automatically, with no interaction other than opening a home page. How do we explain this? It makes us wonder what the new EU GDPR legislation was all about, and whether everyone is aware of all the data that’s being collected…

The web is the web and it isn’t changing any time soon (at least in our humble opinion).

Our test was fully automated: we never interacted with a single website and never accepted or declined cookies on any of the sites.

Thanks to Guy Leech (@GuyRLeech) for providing the script used for automatically opening and closing the websites!

Mike Cobussen (Workspace Specialist)
