Episode 66: FLoC of Sheep

Third-party cookies are on the way out and Google says it has found a privacy-preserving way of replacing them, using a technology called Federated Learning of Cohorts. Is such a thing even possible? And what are the potential problems we, as web users, are facing here?

On this episode of The Private Citizen, we will be looking into Google’s upcoming advertising technology Federated Learning of Cohorts (FLoC).

Next week’s episode might be slightly off-schedule and there might not be a live stream of the recording, but I will do my best to get the content to you. Got some work coming up that is hard to plan around, so please bear with me…

This podcast was recorded with a live audience on my Twitch channel. Details on when future recordings take place can usually be found on my personal website. Recordings of these streams get saved to a YouTube playlist for easy watching on demand after the fact.

Google is Fighting a Rearguard Action

It’s clear that users don’t want to be tracked. Whenever you give them the chance to opt out of it, they will. Advertisers know this. So the people selling ads know that they are scumbags: They are selling a thing that the people affected by the thing don’t want to exist.

Because people don’t want to be tracked, we are now starting to have regulation and legislation that curtails this – especially in Europe (e.g. the GDPR). And we also have a push by browser makers and interest groups to get rid of third-party cookies, the main mechanism by which users are tracked across the web.

Google, whose whole revenue comes from tracking people, sees this. So they have now come up with Federated Learning of Cohorts (FLoC), a technology to replace third-party cookies. In fact, Google is fighting a rearguard action to maintain its grip on user information on the web.

Google’s very announcement of the technology exposes how biased they are.

It’s difficult to conceive of the internet we know today – with information on every topic, in every language, at the fingertips of billions of people – without advertising as its economic foundation.

How FLoC Works

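In essence, with FLoC, your browser takes the list of all websites you visited in the last seven days (only the domains, not the full URLs) and hashes it using something called a SimHash. Unlike a cryptographically secure hash, a SimHash is a locality-sensitive hash: similar inputs produce similar outputs, so two SimHashes can be compared to see how similar the originating data sets were. The hash is also much smaller than the browsing history it was computed from, so it should not be possible to simply reverse the process and read the list of sites back out of it.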
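
To make the idea of a comparable hash a bit more concrete, here is a minimal sketch of a common textbook variant of SimHash in Python. This is not Chrome's actual implementation: the function names, the 64-bit fingerprint size, the use of SHA-256 as the per-domain hash and the example domains are all assumptions made for illustration.

```python
import hashlib


def simhash(domains, bits=64):
    """Illustrative SimHash over a set of domain names (not Chrome's code)."""
    if not 1 <= bits <= 64:
        raise ValueError("bits must be between 1 and 64")
    votes = [0] * bits
    for domain in set(domains):
        # Hash each domain to a 64-bit integer (assumption: SHA-256, truncated).
        digest = hashlib.sha256(domain.encode("utf-8")).digest()
        value = int.from_bytes(digest[:8], "big")
        # Every domain votes +1 or -1 on each bit position.
        for i in range(bits):
            votes[i] += 1 if (value >> i) & 1 else -1
    # A bit of the fingerprint is set if the majority of domains voted for it.
    fingerprint = 0
    for i, vote in enumerate(votes):
        if vote > 0:
            fingerprint |= 1 << i
    return fingerprint


def hamming_distance(a, b):
    """Number of differing bits; smaller tends to mean more similar inputs."""
    return bin(a ^ b).count("1")


if __name__ == "__main__":
    # Hypothetical browsing histories, reduced to domains.
    alice = ["news.example", "knitting.example", "cats.example", "recipes.example"]
    bob = ["news.example", "knitting.example", "cats.example", "gardening.example"]
    carol = ["cars.example", "poker.example", "crypto.example", "golf.example"]
    print("alice vs bob:  ", hamming_distance(simhash(alice), simhash(bob)))
    print("alice vs carol:", hamming_distance(simhash(alice), simhash(carol)))
```

Because every domain only contributes votes to a fixed number of bits, histories that share most of their domains tend to end up with fingerprints that differ in only a few bits. That is the property that allows grouping users into cohorts without looking at the underlying list of sites.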

Google uses these special hashes to create cohorts of web users – people with similar interests – that can be sold to advertisers to advertise against. This process happens on a user’s PC. The cohort ID will automatically be made available to any website the user visits. Web admins, in turn, have to explicitly opt out if they don’t want their websites to be included in these calculations. Some websites that Google deems sensitive are excluded by default, but it is Google who decides what is sensitive and what isn’t.

Details of how Google aims to eliminate the ability by anyone (including Google) to tell specific things about a user by his or her cohort can be found in this research paper: Measuring Sensitivity of Cohorts Generated by the FLoC API, Medina et al.

The term “research paper” is to be taken lightly here, however. This is essentially Google people advertising Google technology to the world; a kind of highbrow propaganda.

cf. FLoC API specification

Google’s initiative is controversial because, well, they are Google. But also because they forced a large number of Chrome users to test it for them without explicitly telling them (which would be illegal under the GDPR, which is why it isn’t tested in Europe) and made webmasters using their Google Ads service complicit in this. They are also forcing privacy-conscious administrators to add special code to their web infrastructure if they don’t want their visitors to be profiled based on their website.
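
As far as I can tell, the special code in question is an HTTP response header: during the FLoC trial, the documented way for a site to opt out was to send `Permissions-Policy: interest-cohort=()` with its pages. The following is a minimal sketch of how that could be done for a Python web application using WSGI middleware. The function names are hypothetical, and most admins would probably set this header in their web server configuration instead.

```python
OPT_OUT_DIRECTIVE = "interest-cohort=()"


def add_floc_opt_out(headers):
    """Return a copy of a WSGI header list with the FLoC opt-out directive set."""
    result = []
    found = False
    for name, value in headers:
        if name.lower() == "permissions-policy":
            found = True
            # Keep an existing policy and append the directive if it is missing.
            if "interest-cohort" not in value:
                value = f"{value}, {OPT_OUT_DIRECTIVE}"
        result.append((name, value))
    if not found:
        result.append(("Permissions-Policy", OPT_OUT_DIRECTIVE))
    return result


def floc_opt_out_middleware(app):
    """Wrap a WSGI app so that every response carries the opt-out header."""
    def wrapped(environ, start_response):
        def patched_start_response(status, headers, exc_info=None):
            return start_response(status, add_floc_opt_out(headers), exc_info)
        return app(environ, patched_start_response)
    return wrapped
```

Wrapping an existing WSGI application is then a one-liner along the lines of `application = floc_opt_out_middleware(application)`.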

Tracking vs. Profiling

FLoC isn’t tracking in the way that third-party cookies are. It anonymises users – although given what we know about metadata analysis (especially using machine learning algorithms) it remains to be seen how effective this will be. But declaring it benign because of this is short-sighted, in my view.

Google’s new technology is in essence profiling users. And that is exactly what advertisers were doing before with tracking cookies. They aren’t interested in what you specifically are up to on the web. They are trying to profile you and put you in a cohort. And that is exactly what FLoC does. It’s essentially a prejudice generator.

Profiling can also be very sinister. Large-scale profiling of whole populations into cohorts was essentially pioneered by the Cheka and its founder Felix Dzerzhinsky to build, and support for decades, a mass murdering regime that discriminated against people based on their upbringing, religion, past associations and work status. And you don’t have to be an ideologue hopped-up on the idea of discriminating against people based on largely imaginary “social classes” to use profiling of people for shady means.

We should also keep in mind that Google and other large adtech companies can still figure out who you are by other means if you use their services. Google analyses your email if you use Gmail. Facebook knows who you talk to and what about.

FLoC is probably a good step insofar as, if – and this is a big if – it actually enables us to get rid of third-party cookies and similar technologies, it will make life a bit harder for the intelligence agencies who are spying on us. I don’t think it will change much in the way of how targeted advertising works. And it could in future be used for shady things we can’t conceive of yet.

Although I don’t agree with everything the EFF said in their attack on FLoC, I do agree with one thing: The alternative to third-party cookies should not be another advertising technology. The alternative should be just to get rid of these means of targeting advertisements altogether. People don’t want this crap. Period.

Producer Feedback

Iwan Currie sends me boots-on-the-ground feedback on Ubiquiti via the podcast’s Matrix room:

I manage my network from the Unifi app on my iPhone, and all of a sudden I lost the connection and couldn’t see my network at all, I tried logging in to the cloud portal from a PC – I could get to my dashboard but nothing was there. I tried logging in locally and that failed, so I assumed it was a fault with my UDM, I hard reset it and built the network again, for a time I was able to access the management interface locally using the cloud credentials, but not via the cloud, then some days later it was visible in the cloud again. The comms from Ubiquiti was shockingly bad. Basically it just amounted to “minor breach, change your password”.

It is amazing to me how I can discuss important civil liberty, privacy and security matters on this show and then the thing that generates almost the most feedback ever is a throwaway line of how I, personally, don’t like Microsoft Office for the work I do. Well, I guess that’s just the way it is.

And so, we have some more feedback on MS Office from Barry Williams from down under:

With regard to Office 365, IIRC there are two options. Anybody can use the web version for free, but if you have a paid subscription (or education license etc) you can either use the web client or install the desktop version. With the desktop version you can choose to save on your local PC or Onedrive (which gives you collaboration etc). When you do save to Onedrive it will (unless you disable it) save a local copy to your PC. In that case you can still work on the document if the servers are down and it will sync when they are back up.

As an IT teacher I did look into using a HTML5 slide deck but the fact that I often need to throw slides together multiple times a week I have just got used to using the WYSIWYG PowerPoint and while it would be a great distraction from getting actual work done to learn a new way I cannot at this point. I am sure if I spent some time finding the best way to create slides perhaps with Markdown I could actually do this quicker (I did have a quick look) but I currently do not have the motivation to do this.

With regard to not using Excel I agree Excel is far to much overused. Firstly, you say people are trained in their organisation to use Excel. Often they are not trained and just figure it out on the job and it often goes horribly wrong. It is claimed that 90% in production Excel spreadsheets have errors and there have been several catastrophic ones. I did consider teaching my students data visualisation in Python instead of Excel but somebody actually needs to teach them Excel, however I will show them it is an option next time.

We also had some other people saying they think Excel is excellent software. Let’s leave it at: I don’t agree and you will not change my mind, alright?

If you have any thoughts on the things discussed in this or previous episodes, please feel free to contact me. In addition to the information listed there, we also have an experimental Matrix room for feedback. Try it out if you have an account on a Matrix server. Any Matrix server will do.

Toss a Coin to Your Podcaster

I am a freelance journalist and writer, volunteering my free time because I love digging into stories and because I love podcasting. If you want to help keep The Private Citizen on the air, consider becoming one of my Patreon supporters.

You can also support the show by sending money via PayPal, if you prefer.

This is entirely optional. This show operates under the value-for-value model, meaning I want you to give back only what you feel this show is worth to you. If that comes down to nothing, that’s OK with me. But if you help out, it’s more likely that I’ll be able to keep doing this indefinitely.

Thanks and Credits

I like to credit everyone who’s helped with any aspect of this production and thus became a part of the show. This is why I am thankful to the following people, who have supported this episode through Patreon and PayPal and thus keep this show on the air: Georges, Butterbeans, Michael Mullan-Jensen, Jonathan M. Hethey, Niall Donegan, Dave, Steve Hoos, Shelby Cruver, Vlad, Jackie Plage, 1i11g, Philip Klostermann, Jaroslav Lichtblau, Kai Siers, ikn, Michael Small, Fadi Mansour, Joe Poser, Dirk Dede, Bennett Piater, Matt Jelliman, David Potter, Larry Glock, Mika, Martin, Dave Umrysh, tobias, MrAmish, RikyM, drivezero, m0dese7en, avis, Jonathan Edwards, Barry Williams, Sandman616, Neil, Captain Egghead, Rizele, D, Iwan Currie and noreply.

Many thanks to my Twitch subscribers: Mike_TheDane, Galteran, m0dese7en_is_unavailable, Flash_Gordo, centurioapertus, indiegameiacs, Sandman616, redeemerf and buttrbeans.

I am also thankful to Bytemark, who are providing the hosting for this episode’s audio file.

Podcast Music

The show’s theme song is Acoustic Routes by Raúl Cabezalí. It is licensed via Jamendo Music. Other music and some sound effects are licensed via Epidemic Sound. This episode’s ending song is Walk Ahead by Blood Red Sun.