Index of questions
- Are you saying that Google’s moves to protect user privacy are just smoke and mirrors?
- Google claims that they care about users' privacy, which is why they are removing third-party cookies and trying to kill the UA string. They also claim that Safari and Firefox have already gone the same route before Google decided to follow suit with Chrome.
- Google claims that User-Agent Client-Hints are a better and more modern way to serve the use-cases currently served by the User-Agent string. Is Google lying?
- If Google is so involved in anti-competitive practices as you claim on your site, why isn’t the US government (or other governments, for that matter) looking really closely into this?
- Why aren’t other organizations vocal about Google’s behaviour?
- Google has been working with W3C and other standards organizations to get consensus on their UA freeze proposal. Isn’t that enough of a safeguard for everyone else?
- Working with W3C and other standards bodies means that the process to adopt new protocols and change existing ones is vetted and fair.
- What's the difference between User-Agent freeze and User-Agent reduction?
- Interesting website, but why aren’t you revealing your identity?
Are you saying that Google’s moves to protect user privacy are just smoke and mirrors?
Yes, this is exactly what we are saying.
Someone has named this “privacy washing”: companies artfully depicting themselves
as the good guys of the ecosystem when, in fact, their final goal is simply
to grow and make more money for shareholders at the expense of everyone else. Observing how Google has been operating within standards
bodies to drive its agenda is a constant reminder of this. Google is clearly engaged in anti-competitive behaviour. The changes they
intend to make deliberately damage publishers and make their competitors’ lives harder by exploiting their dominant position.
Google is cynically hiding behind political weasel-words and a distorted view of reality to justify its position against all evidence.
Google claims that they care about users' privacy, which is why they are removing third-party cookies
and trying to kill the UA string.
They also claim that Safari and Firefox have already gone the same route before Google decided to follow suit with Chrome.
"Privacy gaslighting" someone called it.
We don't think it's an exhageration.
It is correct that other browsers have done it before, but Safari will still tell you whether it’s an iPhone, an iPad or an iPod touch,
in addition to the iOS and Safari versions. In general, publishers were never organized enough to get together and counteract the trend of removing features
that were key to their business models. In the case of Firefox, publishers reasoned that Firefox was a minor browser, very focused on user privacy,
and acting against it wouldn’t have reflected well on any single publisher from a marketing perspective (nobody likes to be cast as a villain of the
internet ecosystem). Apple blocked third-party cookies a few years ago. Ad-tech companies and publishers have felt the pain, but they still
decided to refrain from reacting.
Now the Mountain View giant is doing it. Google can afford to forgo third-party cookies because they have first-party access to user identity and they are
well aware that removing third-party cookies will screw their competitors (i.e. other ad-tech players) with little impact on their own business. This will force
publishers and service providers to embrace Google even more than before, as it will be the only game in town.
One interesting note here, to show how much weight is carried by (apparently small) technical details: Google is a first party, but it runs
its business from multiple domains (such as google.com, gmail.com and youtube.com) for historical reasons. Wouldn’t it be nice (for them!) if they
could come up with a mechanism to share cookies (i.e. user identity) across those domains?
Yes. In fact, here they are, trying to convince W3C (specifically the TAG, the Technical Architecture Group) to accept the concept of
“first-party sets”, i.e. to
allow certain third-party origins to be redefined as first-party!
This way, you may think that your YouTube persona stays separated from your Gmail persona, but that wouldn’t be the case.
So much respect for users’ privacy!
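For the curious, here is a rough sketch of how such a scheme would work. The manifest location and field names below follow our reading of early drafts of the proposal and are assumptions for illustration, not a definitive rendering:

```typescript
// Hypothetical sketch of a First-Party Sets declaration and lookup.
// The /.well-known/ path and field names follow early drafts of the
// WICG proposal; the actual mechanism may differ.

interface FirstPartySet {
  owner: string;      // the "primary" domain of the set
  members: string[];  // domains to be treated as the same first party
}

// What google.com might serve at /.well-known/first-party-set:
const exampleSet: FirstPartySet = {
  owner: "google.com",
  members: ["youtube.com", "gmail.com"],
};

// Once a browser accepts such a set, a cookie check like this one
// would treat youtube.com as first-party in a gmail.com context:
function isSameParty(requestDomain: string, topLevelDomain: string, set: FirstPartySet): boolean {
  const inSet = (d: string) => d === set.owner || set.members.includes(d);
  return inSet(requestDomain) && inSet(topLevelDomain);
}

console.log(isSameParty("youtube.com", "gmail.com", exampleSet)); // true: "same party"
```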
Fortunately, other group members
in W3C TAG didn’t fall for it.
As a workaround, when you log into Gmail,
Google will go through a chain of redirects across all of its web properties to make sure that users are properly tracked.
They care about privacy, you know.
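For illustration, here is a hypothetical sketch of such a redirect chain (domains, paths and parameters are invented, not Google’s actual endpoints). The point is that every hop is a first-party context for its own domain, so each domain gets to set its own cookie for the same signed-in identity:

```typescript
// Hypothetical sketch of login via a cross-domain redirect chain.
// Domains and paths are invented for illustration only.
import * as http from "http";

const hops = ["accounts.example-a.com", "example-b.com", "example-c.com"];

function redirectHop(res: http.ServerResponse, hop: number): void {
  // Each domain in the chain sets its own first-party cookie for
  // the same signed-in identity...
  res.setHeader("Set-Cookie", `uid=alice; Domain=${hops[hop]}; Secure; HttpOnly`);
  const next =
    hop + 1 < hops.length
      ? `https://${hops[hop + 1]}/sync?uid=alice` // ...then bounces the browser onward...
      : "https://mail.example-a.com/inbox";       // ...until it finally lands in the inbox.
  res.writeHead(302, { Location: next });
  res.end();
}
```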
Google claims that User-Agent Client-Hints are a better and more modern way to serve the use-cases currently served by the User-Agent string.
Is Google lying?
This is a great example of “the Devil is in the details”. If we had to compress the answer into a YES or a NO, the short version would be
“YES. Google is lying”,
but this is worth a closer look: if Google refrained from touching the User-Agent string, then Client-Hints headers (UA-CH) would be purely additive, and this
would be good for publishers and content providers.
The use cases for User-Agent strings and Client-Hints overlap, but they are not the same. This means that one can come up with examples where
UA-CH are better suited, but there are other cases where User-Agent-based device detection is the only viable option.
If a device with a 640×960 screen is used in landscape mode, Client-Hints can do a better job at telling you that 960×640 is more suited for tailoring
your UX. On the other hand, there are plenty of other cases where Client-Hints may not be quite up to the task. As we wrote in the main page,
the User-Agent string is needed to support many use cases ranging from UX (user experience) optimization to image/video resampling to content security
policy to fraud detection to Javascript polyfills to detecting webviews to bug workarounds to analytics and more.
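To make the comparison concrete, here is a sketch of the two surfaces as seen from a web page. navigator.userAgentData is the JavaScript side of UA-CH and is currently only available in Chromium-based browsers; the snippet is illustrative, not exhaustive:

```typescript
// Sketch: the two ways a page can learn about the device/browser.
// navigator.userAgentData is the JS surface of UA Client Hints and is
// only implemented in Chromium-based browsers; TypeScript's DOM typings
// may not include it yet, hence the cast.

const uaData = (navigator as any).userAgentData;

if (uaData) {
  // Low-entropy values are available synchronously...
  console.log(uaData.brands);   // e.g. [{ brand: "Chromium", version: "96" }, ...]
  console.log(uaData.mobile);   // true / false
  console.log(uaData.platform); // e.g. "Android"

  // ...but anything more detailed must be requested explicitly,
  // and the browser is free to withhold it.
  uaData
    .getHighEntropyValues(["model", "platformVersion", "fullVersionList"])
    .then((values: Record<string, unknown>) => console.log(values));
} else {
  // Everywhere else (and for the server-side use cases listed above),
  // the User-Agent string is still the only source of truth.
  console.log(navigator.userAgent);
}
```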
The way Client-Hints have been designed, the User-Agent will be frozen (killed, effectively) and websites will need to explicitly piggyback the request
for UA-CH information on the HTTP response, with the hope (but not the certainty) that the information that was once available through the User-Agent string
will be sent by the browser (in the form of UA-CH headers). This is not OK. First, the de-facto absence of a meaningful User-Agent string makes many
existing use cases impossible, or a lot more cumbersome to support. Secondly, there is no guarantee that those headers will be delivered,
as their presence may depend on user settings. And if the very first request from a device fails for some reason, engineers will never be able to
debug it, because they’ll never find out what the device was.
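Here is a minimal sketch of that handshake, using Node’s built-in http module (the header names come from the UA-CH proposal; everything else is illustrative):

```typescript
// Minimal sketch of the Accept-CH handshake with Node's http module.
// On the first request from a browser there are no Sec-CH-UA-* headers;
// the server can only ask for them and hope they arrive next time.

import * as http from "http";

http
  .createServer((req, res) => {
    // Ask the browser to send these hints on subsequent requests.
    res.setHeader("Accept-CH", "Sec-CH-UA, Sec-CH-UA-Model, Sec-CH-UA-Platform-Version");

    const model = req.headers["sec-ch-ua-model"]; // Node lowercases header names
    if (model === undefined) {
      // First request (or the browser declined): the detailed device
      // information this server wanted simply is not there.
      res.end("Hello, unknown device");
    } else {
      res.end(`Hello, ${model}`);
    }
  })
  .listen(8080);
```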
Also, there is no safeguard to keep Google from further restricting the amount of device information provided in the future. Google has the habit of
changing things without having to respond to anyone for its choice. Back in the 90’s Microsoft was heavily criticized for a lot less.
Once a company gets away with unilaterally redefining internet protocols, avoiding further abuse becomes a lot harder.
HTTP was designed such that each request should contain enough information on its own to fulfill the request. In the UA-CH proposal,
if a server seeks detailed information about the browser or device making a request, it must add an Accept-CH header in its response
and await the next request to obtain the Sec-CH-UA hints. But a server cannot tell the difference between a new request from a previously
unseen user agent and a request where a user agent has opted to withhold the sought Sec-CH-UA hints for its own reasons. This effectively
forces the server to maintain state about whether (or not) each client has been seen previously in order to decide what to do.
This goes against the design principles of the HTTP protocol. “Automated recognition of user agents for the sake of tailoring responses”
is a specific use case outlined in the HTTP specification, but this use case becomes more difficult for implementers under the User-Agent Client Hints proposal.
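A small sketch of what this forces onto server authors; the helper below is hypothetical and exists only to show the ambiguity:

```typescript
// Sketch of the ambiguity UA-CH introduces. "seenBefore" is a
// hypothetical per-client store the server would not need if each
// request were self-describing, as HTTP intended.

const seenBefore = new Set<string>(); // e.g. keyed by session or IP (imperfect!)

function handleMissingHints(clientKey: string): string {
  if (!seenBefore.has(clientKey)) {
    // Case 1: a genuinely new client that was never asked for hints.
    seenBefore.add(clientKey);
    return "first visit: reply with Accept-CH and wait for the next request";
  }
  // Case 2: a client we already asked, which chose to withhold the hints.
  // From the headers alone, this case looks identical to Case 1.
  return "hints withheld: serve a generic response";
}
```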
W3C and Google claim that the User-Agent string leaks “entropy”,
which is data-science lingo for the claim that User-Agent strings, particularly when used in combination with IP addresses, can be used to build
pseudo user IDs (i.e. a way for ad-tech companies to recognize users and associate them with a user profile).
Google has claimed this and W3C has taken it as a given without proper investigation. For many months,
Google failed to provide evidence that User-Agent strings are actually being used that way,
despite the fact that this is
the main problem the proposal claims to solve:
“The primary goal of User Agent Client Hints is to reduce the default entropy available to the network for
passive fingerprinting.”
Only recently was Google able to dig up a
2012 paper by Microsoft researchers that
documented how UA strings (and IP addresses) could be combined to emulate user IDs with 80% accuracy. The fact is that that paper is outdated:
User-Agent strings in 2021 work very differently from how they worked in 2012 (more on this below). If that research were repeated today with
today’s top UA strings by popularity and users’ IP addresses, the accuracy with which users could be identified would be lower than 50%, and probably
lower than 20 or 30%.
If anything, this paper proves how Google is able to slyly misrepresent existing research to drive their agenda, hoping that the world around them won’t notice.
To provide more detail on why Microsoft’s research is outdated and does not apply today, here are a few examples of what User-Agent strings looked like
in 2012, a time when every .NET runtime version and every toolbar installed in the browser would blithely add itself to the browser UA string:
Mozilla/4.0 (compatible; MSIE 7.0; Windows NT 6.1; WOW64; Trident/5.0; SLCC2; .NET CLR 2.0.50727; .NET CLR 3.5.30729; .NET CLR 3.0.30729; Media Center PC 6.0; InfoPath.2; .NET4.0C; MSOffice 12)
Mozilla/4.0 (compatible; MSIE 8.0; Windows NT 5.1; Trident/4.0; .NET CLR 1.1.4322; .NET CLR 2.0.50727; .NET CLR 3.0.04506.30; .NET CLR 3.0.4506.2152; .NET CLR 3.5.30729; InfoPath.2; .NET4.0C; .NET4.0E; MS-RTC LM 8)
Mozilla/4.0 (compatible; MSIE 7.0; Windows NT 6.1; Trident/5.0; SLCC2; .NET CLR 2.0.50727; .NET CLR 3.5.30729; .NET CLR 3.0.30729; Media Center PC 6.0; 4399Box.1335; 4399Box.1335; SE 2.X MetaSr 1.0)
Mozilla/4.0 (compatible; MSIE 8.0; Windows NT 5.1; Trident/4.0; .NET CLR 1.1.4322; .NET CLR 2.0.50727; .NET CLR 3.0.4506.2152; .NET CLR 3.5.30729; .NET4.0C; BOIE8;ENUS)
Mozilla/4.0 (compatible; MSIE 8.0; Windows NT 5.1; Trident/4.0; InfoPath.2; .NET CLR 2.0.50727; .NET CLR 3.0.04506.30; .NET CLR 3.0.4506.2152; .NET CLR 3.5.30729)
Mozilla/4.0 (compatible; MSIE 8.0; Windows NT 5.1; Trident/4.0; .NET CLR 1.1.4322; InfoPath.2; .NET CLR 2.0.50727; .NET CLR 3.0.4506.2152; .NET CLR 3.5.30729; MS-RTC LM 8; UserABC123; UserABC123)
Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; SV1; .NET CLR 1.1.4322; .NET CLR 2.0.50727; .NET CLR 3.0.4506.2152; .NET CLR 3.5.30729)
Mozilla/4.0 (compatible; MSIE 8.0; Windows NT 5.1; Trident/4.0; GTB7.3; .NET CLR 2.0.50727; .NET CLR 3.0.4506.2152; .NET CLR 3.5.30729; .NET4.0C; .NET4.0E)
Of course, with so much information tacked onto the UA string, users would easily end up with strings that were unique to them. But this is not the case today,
and Google knows that very well.
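To put a number on “entropy”: in this debate it means the bits of identifying information a header carries, computed as H = −Σ pᵢ·log₂(pᵢ) over the distribution of its values across users. A toy sketch of why 2012-style strings identified users while today’s homogeneous strings do not (the counts are invented for illustration, not real measurements):

```typescript
// Toy illustration of "entropy" as used in the fingerprinting debate.
// The counts below are invented; real numbers would have to come from
// actual traffic measurements.

function shannonEntropyBits(counts: number[]): number {
  const total = counts.reduce((a, b) => a + b, 0);
  return -counts
    .map((c) => c / total)
    .reduce((h, p) => h + (p > 0 ? p * Math.log2(p) : 0), 0);
}

// 2012-style: thousands of near-unique UA strings, one user each.
const ua2012 = new Array(10_000).fill(1);
// Today: a handful of very common UA strings shared by everyone.
const uaToday = [4_000, 3_000, 2_000, 1_000];

console.log(shannonEntropyBits(ua2012).toFixed(1));  // ~13.3 bits: near-unique IDs
console.log(shannonEntropyBits(uaToday).toFixed(1)); // ~1.8 bits: barely identifying
```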
If Google is so involved in anti-competitive practices as you claim on your site, why isn’t the US government
(or other governments, for that matter) looking really closely into this?
Because they are. Look at this
report by the US congressional committee on competition in digital markets.
OK, it's a long document. What about a quote, then:
“Google has an outsized role in the formal stakeholder standards-making processes. Other market participants believe that Google is significantly overrepresented in the W3C web platform incubator community group.”
Of course governments tend to move relatively slowly on matters of this kind for a few reasons:
- It’s a complex field and Google is really good at hiding its interest in the details of highly technical specifications and software implementations.
- Politicians move very cautiously, as they constantly balance the expected outcomes against the possibility that the position of a US company is weakened internationally.
- We are sure Google is very good at lobbying.
Why aren’t other organizations vocal about Google’s behaviour?
They are (here, here, here, here, and here, just to provide some examples). You should consider that a lot of companies have depended, and still depend,
on Google to support themselves. Even for those which don’t, having Google against you is hardly
conducive to good business. Hence the (relatively) little willingness to stick one’s head out when it comes to standing up against Google.
But trust us: a lot of companies are appalled by how Google used the internet to become the juggernaut that it is, and by how shamelessly it is now
exploiting that position to wage a scorched-earth war against competitors and whoever is not willing to be assimilated.
Google has been working with W3C and other standards organizations to get consensus on their UA freeze proposal. Isn’t that enough of a safeguard for
everyone else?
The UA freeze is a move that Google is hell-bent on making anyway, no matter what anyone in any standards body is going to do or say.
Make no mistake about this. Having said that, Google can afford to send a bunch of people to work within standards bodies and drive its agenda. And this is
exactly what they are doing. Even a relatively large company in the ad-tech world can only afford to send two or three people. Smaller companies
will dedicate 50% of someone’s time to monitoring what is going on in one of the many standards bodies. Not Google. Google can afford to have many
brilliant engineers spend a lot of their time making sure that things go the way Google wants. Here is a list of the standards bodies in which Google is
involved for the purpose of driving Client-Hints and getting the User-Agent string killed with the “blessing” of standards bodies.
- Web Platform Incubator Community Group (WICG)
The W3C’s Web Platform Incubator Community Group provides a lightweight venue for proposing and discussing new web platform features. Each proposal is discussed in its own repository within the group's GitHub account.
The WICG doesn’t have any ability to author internet standards. Successful proposals are intended to transition to a W3C working group and ultimately become a “W3C Recommendation”.
Two of the four chairs of this group are from Google (Yoav Weiss and Chris Wilson).
- World Wide Web Consortium (W3C)
The W3C is the main standards organization for the World Wide Web. As such it is more concerned with document standards and browser APIs than core protocols such as HTTP.
- W3C Technical Architecture Group (TAG)
The mission of the TAG is stewardship of the Web architecture. Its charter is as follows:
- to document and build consensus around principles of Web architecture and to interpret and clarify these principles when necessary;
- to resolve issues involving general Web architecture brought to the TAG;
- to help coordinate cross-technology architecture developments inside and outside W3C.
- IETF HTTP Working Group (HTTPWG)
- IETF is an open standards organization, which develops and promotes voluntary Internet standards, in particular the standards that comprise the Internet protocol suite (TCP/IP).
- Within IETF, the HTTP Working Group maintains and develops the Hypertext Transfer Protocol, i.e. the core protocol of the World Wide Web.
Google heavily dominates all of these groups; in the WICG, for example, they have more participants than the next 18 companies combined. Here’s an overview of the key specifications with which Google is trying to force the UA freeze:
- HTTP Client Hints:
- Core Client-Hints specification (within HTTPWG, the IETF HTTP Working Group).
- Intended status: experimental.
- Reached RFC status in Q1 2021: https://www.rfc-editor.org/rfc/rfc8942.html
- Key people/authors: Ilya Grigorik and Yoav Weiss, both from Google at the time of the RFC creation.
- User Agent Client Hints:
- Builds on the Client-Hints specification to cover User-Agent data.
- Status: proposal (not on the W3C standards track).
- Key people/authors: Mike West and Yoav Weiss, both from Google (at the time of writing).
- Permissions Policy:
- Assuming UA-CH is widely available, this document specifies how Client-Hints in HTTP requests are handled (for example, they may be removed based on user settings or other policies). This is going to be disruptive for ads & analytics.
- W3C Web Application Security Working Group.
- Status: working draft. On track to become a W3C standard.
- Key people/authors: Ian Clelland from Google.
- Client-Hint Reliability
(Web Platform Incubator Community Group (WICG)):
- Google is aware that, the way Client-Hints are designed, a lot of use cases will be hindered. For this reason, their engineers are creating specs as they go to propose solutions/workarounds. This is an additional spec to work around the first-visit issue with UA-CH.
- Status: proposal.
- Key people/authors: David Benjamin and Aaron Tagliaboschi, both from Google.
Every single author involved works for Google, mostly for the Chrome team specifically. This is a set of standards almost entirely designed by front-end
(browser) people from a single company.
Google is undermining the standards process that brought us the web by doing its own thing and using its browser and web property
dominance to force through a change that benefits them and hurts everyone else.
Working with W3C and other standards bodies means that the process to adopt new protocols and change
existing ones is vetted and fair.
This is what we would all like to believe, but unfortunately this has not always been the case, and certainly not within the
WICG (which should really be renamed WIGG, the Web Incubator Google Group,
as it is probably the most Google-dominated group of all, along with the HTTPWG).
The really big issue here is that Google uses its dominance of the browser market and web properties to impose its changes.
Google will propose a standard and implement it in Chrome before there has been any broad review.
After that, other browsers will look bad if they don’t follow. It’s becoming a Google proprietary web.
AMP has been
a good example of this.
Here are two examples (but there are many) of how key issues were closed just because those
participating on behalf of Google thought so:
- Discussion on how replacing the UA string with Client-Hints should require support by a more representative set of stakeholders. This is a long but very instructive read that well represents the dismissive tone and attitude by Google.
- Valid point raised by Ronan Cremin on how the negotiation part of Client-Hints surreptitiously introduces a significant change in foundational HTTP design principles (statelessness). That discussion was closed at the whim of a single Google engineer.
As an aside, an important discussion is going on here. If you feel strongly about Google’s attempt to freeze the UA string, this is the place where impacted stakeholders should make their voices heard.
What's the difference between User-Agent freeze and User-Agent reduction?
None, really. It is just another attempt by Google to create smoke and mirrors. They hope
that by giving the UA freeze
a new name, their attempt to change everyone's web protocols will go unnoticed. We are here to prevent exactly that.
Interesting website, but why aren’t you revealing your identity?
As you can imagine, the authors of this website have jobs in the internet ecosystem. We know enough to understand the issue and articulate it, and we
also know that standing up against the Google giant is probably not a good idea for our future careers. Please focus on what we wrote instead.
There’s plenty of evidence showing how Google is exploiting its dominant position to gain even more power and become even more of a monopolist:
the equivalent of a tyrant in the modern digital world. The sooner publishers get together and react collectively, the better for everyone involved.