When Whitfield Diffie, Ronald Rivest, Steven M. Bellovin, Peter Neumann, Matt Blaze and Bruce Schneier come together to publish a paper on the security and privacy implications of client-side scanning, we should listen up.

Several highly regarded cryptography experts have published a paper explaining why client-side interception of end-to-end encrypted communications is a bad idea. The Private Citizen is on the scene to report on it.

This podcast was recorded with a live audience on my Twitch channel. Details on the time of future recordings can usually be found on my personal website. Recordings of these streams get saved to a YouTube playlist for easy watching on demand after the fact.

Why Client-Side Scanning is a Bad Idea

In a recently published paper, a number of prominent security experts explain why client-side scanning, i.e. intercepting end-to-end encrypted communications at their source, is dangerous and disadvantageous to society as a whole.

These experts include: Whitfield Diffie of Diffie-Hellman fame (who co-invented public-key crypto), Ronald Rivest who co-invented RSA, Steven M. Bellovin who co-invented encrypted key exchange and is credited with inventing the firewall, Josh Benaloh who invented the Benaloh cryptosystem, Jon Callas who is one of the founders of PGP Inc. and a co-founder of Silent Circle, Peter Neumann who is the editor of the RISKS Digest, Carmela Troncoso who was the main author of the original DP-3T paper, as well as Bruce Schneier, Matt Blaze and Ross Anderson, all well-known crypto and security experts. Professors Matthew Green and Nicolas Papernot, among others, helped with the editing.

In the introduction to their paper, the group remarks:

Our increasing reliance on digital technology for personal, economic, and government affairs has made it essential to secure the communications and devices of private citizens, businesses, and governments. This has led to pervasive use of cryptography across society. Despite its evident advantages, law enforcement and national security agencies have argued that the spread of cryptography has hindered access to evidence and intelligence. Some in industry and government now advocate a new technology to access targeted data: client-side scanning (CSS). Instead of weakening encryption or providing law enforcement with backdoor keys to decrypt communications, CSS would enable on-device analysis of data in the clear. If targeted information were detected, its existence and, potentially, its source, would be revealed to the agencies; otherwise, little or no information would leave the client device. Its proponents claim that CSS is a solution to the encryption versus public safety debate: it offers privacy—in the sense of unimpeded end-to-end encryption—and the ability to successfully investigate serious crime.

In this report, we argue that CSS neither guarantees efficacious crime prevention nor prevents surveillance. Indeed, the effect is the opposite. CSS by its nature creates serious security and privacy risks for all society while the assistance it can provide for law enforcement is at best problematic. There are multiple ways in which client-side scanning can fail, can be evaded, and can be abused.

Its proponents want CSS to be installed on all devices, rather than installed covertly on the devices of suspects, or by court order on those of ex-offenders. But universal deployment threatens the security of law-abiding citizens as well as law-breakers. Technically, CSS allows end-to-end encryption, but this is moot if the message has already been scanned for targeted content. In reality, CSS is bulk intercept, albeit automated and distributed. As CSS gives government agencies access to private content, it must be treated like wiretapping. In jurisdictions where bulk intercept is prohibited, bulk CSS must be prohibited as well.

Although CSS is represented as protecting the security of communications, the technology can be repurposed as a general mass-surveillance tool. The fact that CSS is at least partly done on the client device is not, as its proponents claim, a security feature. Rather, it is a source of weakness. As most user devices have vulnerabilities, the surveillance and control capabilities provided by CSS can potentially be abused by many adversaries, from hostile state actors through criminals to users' intimate partners. Moreover, the opacity of mobile operating systems makes it difficult to verify that CSS policies target only material whose illegality is uncontested.

The introduction of CSS would be much more privacy invasive than previous proposals to weaken encryption. Rather than reading the content of encrypted communications, CSS gives law enforcement the ability to remotely search not just communications, but information stored on user devices.

Introducing this powerful scanning technology on all user devices without fully understanding its vulnerabilities and thinking through the technical and policy consequences would be an extremely dangerous societal experiment. Given recent experience in multiple countries of hostile-state interference in elections and referenda, it should be a national-security priority to resist attempts to spy on and influence law-abiding citizens. CSS makes law-abiding citizens more vulnerable with their personal devices searchable on an industrial scale. Plainly put, it is a dangerous technology. Even if deployed initially to scan for child sex-abuse material, content that is clearly illegal, there would be enormous pressure to expand its scope. We would then be hard-pressed to find any way to resist its expansion or to control abuse of the system.

The ability of citizens to freely use digital devices, to create and store content, and to communicate with others depends strongly on our ability to feel safe in doing so. The introduction of scanning on our personal devices – devices that keep information from to-do notes to texts and photos from loved ones – tears at the heart of privacy of individual citizens. Such bulk surveillance can result in a significant chilling effect on freedom of speech and, indeed, on democracy itself.

Further into their work, the authors make an interesting point that connects this scanning with the Culture Wars that are currently being fought. Citizens' privacy might not be invaded to make them safe or to stop terrorists and organised crime, but to censor their speech (even on private devices):

Many online service providers that allow users to send arbitrary content to other users already perform periodic scanning to detect objectionable material and, in some cases, report it to authorities. Targeted content might include spam, hate speech, animal cruelty, and, for some providers, nudity. Local laws may mandate reporting or removal. For example, France and Germany have for years required the takedown of Nazi material, and the EU has mandated that this be extended to terrorist material generally in all member states. In the US, providers are required to report content flagged as CSAM to a clearinghouse operated by the National Center for Missing and Exploited Children (NCMEC), while in the UK a similar function is provided by the Internet Watch Foundation (IWF).

Historically, content-scanning mechanisms have been implemented on provider-operated servers. Since the mid-2000s, scanning has helped drive research in machine-learning technologies, which were first adopted in spam filters from 2003. However, scanning is expensive, particularly for complex content such as video. Large machine-learning models that run on racks of servers are typically complemented by thousands of human moderators who inspect and classify suspect content. These people not only resolve difficult edge cases but also help to train the machine-learning models and enable them to adapt to new types of abuse.

One incentive for firms to adopt end-to-end encryption may be the costs of content moderation. Facebook alone has 15,000 human moderators, and critics have suggested that their number should double. The burden of this process is much reduced by end-to-end encryption as the messaging servers no longer have access to content. Some moderation is still done based on user complaints and the analysis of metadata. However, some governments have responded with pressure to re-implement scanning on user devices.

Client-side scanning is basically a wiretap because it makes available content that the victim assumed to be private.

Moving scanning from the server to the client pushes it across the boundary between what is shared (the cloud) and what is private (the user device). By creating the capability to scan files that would never otherwise leave a user device, CSS thus erases any boundary between a user’s private sphere and their shared (semi-)public sphere. It makes what was formerly private on a user’s device potentially available to law enforcement and intelligence agencies, even in the absence of a warrant. Because this privacy violation is performed at the scale of entire populations, it is a bulk surveillance technology.

Considering the many different ways we use our phones, and the apps on those phones, these days, it might even be argued that spying on this content is as close to spying on people’s thoughts as you can get.

Despite this, client-side scanning might actually be less useful than other forms of surveillance when the people it targets suspect they are being watched and have prepared for it.

Both distributors and consumers of targeted material may seek to defeat a CSS system by making it useless for enforcement. This can be done in broadly two ways: first, by ensuring that targeted material of interest to them evades detection (i.e., by increasing the rate of false negatives), and second, by tricking the CSS system into flagging innocuous content, thereby flooding it with false alarms (i.e., by increasing the rate of false positives).

Such attacks are not new. They have been carried out for years on server-side scanners such as spam filters, but a move to client-side scanning brings one telling advantage to adversaries. The adversary can use its access to the device to reverse engineer the mechanism. As an example, it took barely two weeks for the community to reverse engineer the version of Apple’s NeuralHash algorithm already present in iOS 14, which led to immediate breaches. Apple has devoted a major engineering effort and employed top technical talent in an attempt to build a safe and secure CSS system, but it has still not produced a secure and trustworthy design.
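
To make the two attack strategies described above a bit more concrete, here is a minimal Python sketch. It is a toy illustration and is not taken from the paper: it uses a simple "average hash" over tiny grayscale images, which is an assumption chosen for brevity and is not Apple's NeuralHash. The function names, the 4x4 images and the matching threshold are all hypothetical. The idea is only to show how a matcher of this kind can be pushed towards a false negative (a barely changed image is no longer flagged) and a false positive (a very different image is flagged).

```python
def average_hash(image):
    """Toy perceptual hash: one bit per pixel, 1 if brighter than the image mean."""
    pixels = [p for row in image for p in row]
    mean = sum(pixels) / len(pixels)
    return tuple(1 if p > mean else 0 for p in pixels)


def hamming_distance(a, b):
    """Number of positions at which two equal-length hashes differ."""
    return sum(x != y for x, y in zip(a, b))


def is_flagged(image, target_hash, threshold=2):
    """Hypothetical matcher: flag an image whose hash is close to the target hash."""
    return hamming_distance(average_hash(image), target_hash) <= threshold


# A low-contrast 4x4 "targeted" image (grayscale values 0-255).
target = [
    [126, 130, 126, 130],
    [130, 126, 130, 126],
    [126, 130, 126, 130],
    [130, 126, 130, 126],
]
target_hash = average_hash(target)

# Evasion (false negative): nudge every pixel by only 5 brightness levels
# so that each one crosses the mean. Every bit of the hash flips.
evasive = [[p + 5 if p < 128 else p - 5 for p in row] for row in target]

# Flooding (false positive): a stark black-and-white checkerboard whose
# pixel values differ greatly from the target but whose hash is identical.
decoy = [
    [0, 255, 0, 255],
    [255, 0, 255, 0],
    [0, 255, 0, 255],
    [255, 0, 255, 0],
]

print("target flagged: ", is_flagged(target, target_hash))
print("evasive flagged:", is_flagged(evasive, target_hash))
print("decoy flagged:  ", is_flagged(decoy, target_hash))
```

Real perceptual hashes are far more robust than this toy, but the point stands: whoever can inspect the matching algorithm on their own device can search for inputs like these, and that is the reverse-engineering advantage the authors are describing.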

At the end of their paper, the authors conclude that “CSS cannot be deployed safely”.

CSS has been promoted as a magical technological fix for the conflict between the privacy of people’s data and communications and the desire by intelligence and law enforcement agencies for more comprehensive investigative tools. A thorough analysis shows that the promise of CSS solutions is an illusion.

Technically, moving content scanning from the cloud to the client empowers a range of adversaries. It is likely to reduce the efficacy of scanning, while increasing the likelihood of a variety of attacks.

Economics cannot be ignored. One way that democratic societies protect their citizens against the ever-present danger of government intrusion is by making search expensive. In the US, there are several mechanisms that do this, including the onerous process of applying for a wiretap warrant (which for criminal cases must be essentially a “last resort” investigative tool) and imposition of requirements such as “minimization” (law enforcement not listening or taping if the communication does not pertain to criminal activity). These raise the cost of wiretapping.

By contrast, a general CSS system makes all material cheaply accessible to government agents. It eliminates the requirement of physical access to the devices. It can be configured to scan any file on every device.

It is unclear whether CSS systems can be deployed in a secure manner such that invasions of privacy can be considered proportional. More importantly, it is unlikely that any technical measure can resolve this dilemma while also working at scale. If any vendor claims that they have a workable product, it must be subjected to rigorous public review and testing before a government even considers mandating its use.

This brings us to the decision point. The proposal to preemptively scan all user devices for targeted content is far more insidious than earlier proposals for key escrow and exceptional access. Instead of having targeted capabilities such as to wiretap communications with a warrant and to perform forensics on seized devices, the agencies' direction of travel is the bulk scanning of everyone’s private data, all the time, without warrant or suspicion. That crosses a red line. Is it prudent to deploy extremely powerful surveillance technology that could easily be extended to undermine basic freedoms?

Were CSS to be widely deployed, the only protection would lie in the law. That is a very dangerous place to be. We must bear in mind the 2006 EU Directive on Data Retention, later struck down by the European Court of Justice, and the interpretations of the USA PATRIOT Act that permitted bulk collection of domestic call detail records. In a world where our personal information lies in bits carried on powerful communication and storage devices in our pockets, both technology and laws must be designed to protect our privacy and security, not intrude upon it. Robust protection requires technology and law to complement each other. Client-side scanning would gravely undermine this, making us all less safe and less secure.

Producer Feedback

Producer Barry Williams said (via Discord):

While I 100% agree that people should use correct language, at what point does the term SARS-CoV-2 cause more misunderstandings than COVID-19? I am not sure about German but English is a fluid language and common use dictates definition. Certain dictionaries have added a sub definition to “literally” to include the exact opposite, as in common use “I literally broke my leg off”.

This podcast is the only time I have heard the term SARS-CoV-2 and I was happy for the education. However, when I go to use this term correctly in place of COVID-19 it feels it may cause confusion.

To which astralc responded:

I think it is important to use the correct term – is it the virus or the disease? Also every word has extra meaning behind it (in a newspeak way). It may be from propaganda or groupthink, but it still have the extra meaning – “Pandemic! comply or you’re killing grandma!” vs “A virus mutation close to SARS”. It is especially important when reporting on medical issues (incl. treatments). It the same as when the media started to use HIV instead of AIDS – not sure about the reason (California bullshit like “remove the stigma”?).

If using the correct name is not important, what’s wrong with “China Virus” & “Wuhan Disease”? They still describe the same thing, but have the extra meaning of the true origin.

If you have any thoughts on the things discussed in this or previous episodes, please feel free to contact me. In addition to the information listed there, we also have an experimental Matrix room for feedback. Try it out if you have an account on a Matrix server. Any Matrix server will do.

Toss a Coin to Your Podcaster

I am a freelance journalist and writer, volunteering my free time because I love digging into stories and because I love podcasting. If you want to help keep The Private Citizen on the air, consider becoming one of my Patreon supporters.

You can also support the show by sending money via PayPal, if you prefer.

This is entirely optional. This show operates under the value-for-value model, meaning I want you to give back only what you feel this show is worth to you. If that comes down to nothing, that’s OK with me. But if you help out, it’s more likely that I’ll be able to keep doing this indefinitely.

Thanks and Credits

I like to credit everyone who’s helped with any aspect of this production and thus became a part of the show. This is why I am thankful to the following people, who have supported this episode through Patreon and PayPal and thus keep this show on the air:

Georges, Steve Hoos, Butterbeans, Jonathan M. Hethey, Michael Mullan-Jensen, Dave, Michael Small, 1i11g, Jaroslav Lichtblau, Jackie Plage, Philip Klostermann, Vlad, ikn, Bennett Piater, Kai Siers, tobias, Fadi Mansour, Rhodane the Insane, Joe Poser, Dirk Dede, m0dese7en, Sandman616, David Potter, Mika, Rizele, Martin, avis, MrAmish, Dave Umrysh, drivezero, RikyM, Cam, Barry Williams, Jonathan, Captain Egghead, RJ Tracey, Rick Bragg, D, Robert Forster, Superuser, Noreply and astralc.

Many thanks to my Twitch subscribers: Mike_TheDane, jonathanmh_com, Sandman616, BaconThePork, m0dese7en_is_unavailable, l_terrestris_jim, Galteran, redeemerf, buttrbeans and jj_guevara.

I am also thankful to Bytemark, who are providing the hosting for this episode’s audio file.

Podcast Music

The show’s theme song is Acoustic Routes by Raúl Cabezalí. It is licensed via Jamendo Music. Other music and some sound effects are licensed via Epidemic Sound. This episode’s ending song is Breathe In, Breathe Out by Mattias Tell.