Episode 20: The Happy Plumbers Who Know Everything About You

Almost a quarter of US consumers have given access to their bank account to a company they have probably never heard of. This shadowy company, which is collecting all of this data on financial transactions, is called Plaid, and they are coming for your bank account next.

Welcome to The Private Citizen! This time we look behind the scenes of modern app payment transactions and at what kind of data is harvested by the companies involved, specifically the financial services provider Plaid. We’ll see why this industry is waging a covert war on cash and what it means for people who value their privacy.

This is one of those topics that at first glance only seem to affect people in the US. But all other listeners should pay attention, too, because the history of tech tells us that this stuff sooner or later gets implemented locally for us as well. If we react to it now, we might be able to stop some of it. And at the very least we will be aware of what’s going on before everyone else is.

What is Plaid?

Plaid is a relatively unknown financial services company that Forbes characterises as “Fintech’s Happy Plumbers”. The company was created almost by accident. Originally, the two founders wanted to create a financial planning tool and needed a way to simplify accessing the bank accounts of their users.

Many businesses still depended on one-penny micro-transactions to verify customer bank accounts. Others uploaded PDFs of paper statements and typed in the data manually. Perret and Hockey sought to create an application programming interface, or API, to perform the same function with only a bank customer’s online user name and password.

The financial planning app failed, but the two had the idea to sell their bank login system as a service.

Setting up shop in New York in the spring of 2012, with Perret as CEO and Hockey as CTO, the pair scored a stroke of luck. Venmo’s engineering chief was in the process of cutting the cost of a peer-to-peer money transfer for making payments. The solution was settling transactions in big batches; while Venmo customers would transact instantly, the actual payment was delayed a day. Plaid helped remove the risk: Venmo would know in real time that the sender had a sufficient bank balance. Venmo’s validation helped the startup take off among other fintech customers that were looking to emulate Venmo’s success. Some now-well-known apps would sign up months before they became household names.

Today Plaid’s reach extends across tens of millions of end users and thousands of apps, which account for hundreds of billions in spending and financial planning. The company’s revenue was $40 million last year, according to Forbes' estimate, and its cash-flow is close to breakeven.

Why is Plaid Worth $5.3 Billion?

On 13 January 2020, Visa announced that it would acquire Plaid for $5.3 billion. Most people have never heard of the company, even though many people in the US actively use its services.

On January 13, all major news outlets reported that Visa would pay $5.3 billion to purchase a FinTech company named Plaid – a company that most consumers have never even heard of. How can a consumer-focused financial company that consumers have never heard of be worth so much? Without realizing it, many of us actually have used Plaid’s services. After all, Plaid is one of the biggest U.S. data aggregators.

Unbeknownst to most consumers, in recent years Plaid has become a key financial industry player using its screen scraping technology and API software to enable start-ups’ apps to get data, which they need in order to successfully operate, from our bank accounts. Plaid’s services are ubiquitous; they’re used by FinTech companies such as Venmo, mobile investing app Robinhood, and cryptocurrency exchanges Coinbase and Gemini.

But Plaid is much more than an enabling tech company. Serving as an intermediary between FinTech startups and banks, Plaid is a gatekeeper to consumer financial data, and its services are critical to FinTech companies. FinTech companies do not bypass companies like Plaid because it is too challenging, costly, and time-consuming for them to connect to customer accounts in thousands of different U.S. banks. Accordingly, they interact with only a few data aggregators, like Plaid, and have them do that work for them.

So how do companies like Plaid help other apps get access to our banking data? After all, we’ve never interacted with Plaid before, right?

To fill their role as this necessary intermediary, data aggregators first need to gain access to consumer financial information that is kept at the banks. This happens when we download and sign up for a FinTech app on our smartphones, and the app requires that we enter our bank account login and password information. We’re providing our bank account login and password to a data aggregator who uses those credentials to access the information and provide services to the FinTech app we’re using.

Frequently, data aggregators store the login credentials and then use the credentials to persistently log into the consumers’ bank accounts and copy all the data, ranging from transaction information, to account numbers, to personally identifiable data. The aggregators then attempt to put that financial information about each consumer under one roof – the consumer’s “dashboard,” which can exhibit one’s investments, savings, insurance policies, credit balances, tax planning, budgeting, and even data on home value / mortgage.

Historically, our consumer data was jealously protected by the highly regulated banks. However, online shopping using confidential financial accounts’ credentials, and downloading personal finance apps on our smartphones have changed all that. Understanding the importance of such data, many believe that consumers’ ability to control their data has become a modern imperative. That notion is tightly linked to the concept of open banking – an initiative that lets customers control and share their banking financial data. But with no regulation or adopted standards of ethical gathering and use of data, consumers’ privacy and their accounts’ safety are jeopardized. In the EU, the legal status of third parties’ rights to access consumers’ financial data is anchored in the new Payment Services Directive II, but that is not the case in the U.S. The American approach to open banking has been a market-based one, in which, without consumers noticing, data aggregators have become a significant player.

But why is this worth billions to Visa? Here’s what a guy from within the industry thinks:

In screen scraping, I ask the consumer for credentials, log in on behalf of them and extract the contents of the HTML. In this process, Plaid most likely obtains explicit consumer permissions to use the data. This is perhaps their biggest asset, as most consumer banks usually maintain permission within a defined set of banking services. Thus Plaid’s APIs are used (by Apps or permissioned data users) to query against this warehouse (or request real time updates from screen scraping).
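To make the term "screen scraping" a bit more concrete, here is a minimal sketch of the "extract the contents of the HTML" step. It is not Plaid's code and I don't know how their scraper is built. The page markup, the class names and the `scrape_transactions` helper are all made up for illustration, and the login step is left out entirely.

```python
from html.parser import HTMLParser

# Hypothetical markup: a stand-in for a page a bank might show after login.
SAMPLE_PAGE = """
<table id="transactions">
  <tr><td class="date">2020-05-01</td><td class="payee">Coffee Shop</td><td class="amount">-3.50</td></tr>
  <tr><td class="date">2020-05-02</td><td class="payee">Employer Inc</td><td class="amount">2500.00</td></tr>
</table>
"""


class TransactionScraper(HTMLParser):
    """Collects the text of every <td> cell, grouped by table row."""

    def __init__(self):
        super().__init__()
        self.rows = []      # one dict per finished table row
        self._row = None    # row currently being read
        self._field = None  # class name of the cell currently being read

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self._row = {}
        elif tag == "td" and self._row is not None:
            self._field = dict(attrs).get("class")

    def handle_data(self, data):
        # Only keep text that sits inside a labelled cell of an open row.
        if self._field is not None and self._row is not None:
            self._row[self._field] = data.strip()

    def handle_endtag(self, tag):
        if tag == "td":
            self._field = None
        elif tag == "tr" and self._row is not None:
            self.rows.append(self._row)
            self._row = None


def scrape_transactions(page):
    """Turns the HTML of a (hypothetical) statement page into a list of records."""
    scraper = TransactionScraper()
    scraper.feed(page)
    return scraper.rows
```

Calling `scrape_transactions(SAMPLE_PAGE)` gives back one dictionary per table row. The point is that whoever holds the login can read anything the website would show to the account holder.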

Plaid is a Bad Actor

They do what? Screen scraping? Yep, you heard correctly. You give Plaid your banking credentials and then they turn around and log into your bank’s website (or API) for you. Thing is, they can do this anytime they want. And your bank can’t tell it isn’t you. Since Plaid actually has your banking credentials, they have the same access to your bank account as you do. In fact, Plaid is currently being sued because they do this.

Plaid’s software is used by more than 2,000 apps to link consumer financial accounts, and about 1 in 4 people in the U.S. have an account linked via Plaid, the suit says. Plaid uses that access to deceptively obtain bank account information from users, accessing information back up to five years, averaging 3,700 transactions per consumer, the suit says. The app also allegedly gathers information on accounts maintained for others such as relatives and children, and has amassed data from over 200 million distinct financial accounts.

The suit, filed Monday, alleges that when a user enters their bank login information on an app that uses Plaid, the credentials, including security layers such as security questions and answers and one-time passwords, are transmitted directly to Plaid, rather than to the bank. Plaid then uses that information to access the consumer’s bank account multiple times a day, gathering private information and then selling it, the suit says.

A login screen with your bank’s branding is actually controlled by and connected to Plaid, the suit says, which uses bank logos to provide a false sense of comfort for users. Additionally, the privacy policy is not meaningfully presented to users, the suit claims.

“Plaid disputes these baseless allegations, and plans to vigorously defend itself against the lawsuit,” according to a company statement. “Plaid firmly believes that consumers should have permission-based access to and control over their financial data, and embodies these principles in our practices. To be clear, Plaid does not obtain consumers’ personal information without their consent, nor does Plaid sell or rent consumers’ personal information.”

Now, they presumably announce somewhere in their terms of service that they are doing this. But seeing as a majority of their users seem to have no idea they are even using the service, that can hardly count as obtaining consent in a reasonable argument.

→ cf. Cottle et al. vs. Plaid, United States District Court for the Northern District of California

The War on Cash

Have you ever asked yourself why everyone wants you to pay digitally these days? And how all of these fintech companies make money? The war on cash is just another part of surveillance capitalism. Everyone wants your data.

We’ve become accustomed to the grim fact that nearly every major advertiser, website, and personal device maker collects and monitors users’ data to some extent. Some do it for their own purposes. Others do it in the service of various algorithmic spymasters, such as Facebook or Google, which analyze vast arrays of personal information – from social media likes to GPS locations – to serve up relevant ads. But to understand shopping behavior with certainty, you need credit card data. Over the past decade, consumer purchases have quietly become one of the most sought-after and lucrative data sets, used by Wall Street and Madison Avenue alike to infer shoppers’ tastes, budgets, and plans.

These transactions have given rise to a complex data-selling ecosystem. At the heart of it are credit card processing networks, including Visa, American Express, and Mastercard, the latter of which took in $4.1 billion in 2019 – a quarter of its annual revenue – from leveraging its warehouse of transaction data for services that include marketing analytics as well as reward programs and fraud detection. And then there are the banks, retailers, payment processors, and software companies that empower online transactions. Few disclose their methods; some actively obfuscate their work; all vow that personal data is anonymized and aggregated, and therefore secure.

Let that sink in: Mastercard makes a quarter of its revenue from analysing payment data.

Companies have been tapping into transaction data to sell us more things as early as the 1990s, when credit card giants such as American Express analyzed purchases to tailor special offers to cardholders. Marketers with more limited vantage points, meanwhile, pooled the data from their own cash registers to get a better view of their customers. The landscape changed dramatically when fintech startups came knocking a decade later. Banks were at first wary of sharing data and working with them, largely because of the 1999 Gramm-Leach-Bliley Act, which mandates penalties on financial institutions that put customer data, including names, birthdays, addresses, and other personal identifiable information, at risk. To solve this, the startups implemented a sophisticated system that erases personal details and replaces them with randomly generated pseudonyms that act like ID codes: They are unintelligible on their own, but can later be matched up with individual customer files. This substitution system (also known as tokenization) is now standard.

Chip cards, contactless payment systems such as Apple Pay, online payment methods, and other internet banking technologies rely on it to connect with one another. They even form daisy chains: If an e-commerce app needs to accept credit cards, it uses software provided by a payment processor like Stripe. If a financial services app such as Acorns wants to link to customers’ bank accounts, it can use an API from Plaid, which automates logins. If a wealth-management app wants to give users a dashboard view of their credit card, savings, and investment accounts, it can use software from a company called Yodlee.

Today, any American who has bought something online has almost certainly had their data passed along by their card company and middleware startups. And some of those middlemen profit from what they see by selling information to marketers, hedge funds, and other brokers.
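The substitution system described in the quote above can be sketched in a few lines. This is only an illustration of the general idea of tokenization, assuming a simple in-memory lookup table. The `Tokenizer` class, the `pseudonymize` helper and the record fields are hypothetical and not taken from any payment company's software.

```python
import secrets


class Tokenizer:
    """Replaces customer identifiers with random tokens and remembers the mapping."""

    def __init__(self):
        self._token_by_customer = {}
        self._customer_by_token = {}

    def tokenize(self, customer_id):
        # Reuse an existing token so one customer always maps to one pseudonym.
        if customer_id not in self._token_by_customer:
            token = secrets.token_hex(8)  # 16 random hex characters
            self._token_by_customer[customer_id] = token
            self._customer_by_token[token] = customer_id
        return self._token_by_customer[customer_id]

    def detokenize(self, token):
        # Only whoever holds the mapping table can reverse the substitution.
        return self._customer_by_token[token]


def pseudonymize(transactions, tokenizer):
    """Returns copies of the records with the name swapped for a token."""
    return [
        {
            "customer": tokenizer.tokenize(t["name"]),
            "shop": t["shop"],
            "amount": t["amount"],
        }
        for t in transactions
    ]
```

Note what stays in the record: the shop and the amount are untouched, and every purchase by the same person carries the same token. The name is gone, but the purchase history is still linked together.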

Of course, the “anonymisation” doesn’t work.

Tokenization “effectively created a loophole,” says Yves-Alexandre de Montjoye, who heads the computational privacy group at Imperial College London, and who has advised the European Commission on privacy issues. By removing names and other details, companies can argue “that it’s not personal data; it’s ‘anonymized,’” he says. But it isn’t so anonymous. In 2015, de Montjoye and colleagues at MIT took a data set containing three months’ worth of credit card transactions by 1.1 million unnamed people, and found that, 90% of the time, they could identify an individual if they knew the rough details (the day and the shop) of four of that person’s purchases. In other words, a combination of a few receipts, tweets, and Instagram photos of you dining out is enough to reveal your other purchases.
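Here is a toy illustration of why a handful of known purchases can be enough. It is not the method from the study, just the basic idea: keep only the pseudonyms whose history contains every purchase you already know about. The data set, the token names and the `matching_customers` function are made up.

```python
# Made-up "anonymized" data: names are replaced by tokens, day and shop remain.
DATA = [
    {"customer": "tok_a", "day": "Mon", "shop": "Bakery"},
    {"customer": "tok_a", "day": "Tue", "shop": "Cinema"},
    {"customer": "tok_a", "day": "Wed", "shop": "Pharmacy"},
    {"customer": "tok_b", "day": "Mon", "shop": "Bakery"},
    {"customer": "tok_b", "day": "Tue", "shop": "Gym"},
    {"customer": "tok_c", "day": "Mon", "shop": "Bakery"},
    {"customer": "tok_c", "day": "Tue", "shop": "Cinema"},
    {"customer": "tok_c", "day": "Thu", "shop": "Bookshop"},
]


def matching_customers(transactions, known_points):
    """Returns the tokens whose history contains every known (day, shop) pair."""
    history = {}
    for t in transactions:
        history.setdefault(t["customer"], set()).add((t["day"], t["shop"]))
    wanted = set(known_points)
    return sorted(c for c, points in history.items() if wanted <= points)
```

With one known purchase, `("Mon", "Bakery")`, all three tokens still match. Add `("Tue", "Cinema")` and only `tok_a` and `tok_c` remain. Add `("Wed", "Pharmacy")` and `tok_a` is the only candidate left, at which point every other purchase under that token belongs to a known person.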

All of this is happening under a veil of secrecy. Credit card companies may acknowledge that they make money from analyzing transactions, but they are vague about what data they actually share.

And, naturally, all of this doesn’t work if you pay cash. Which is why nobody wants you to pay cash anymore.

Feedback

This time in the feedback section, we’ll talk some inside baseball as I got some comments about the recent infrastructure change for the podcast’s website.

Ron (a long-time listener from back in the day when I produced Linux Outlaws) asks:

I thought I heard that you were moving your website to a different host because GoDaddy was taking over the former. I am wondering if you could tell me if there are reasons that I should not use GoDaddy, and if so if you can suggest a few inexpensive alternatives.

Martin says I wouldn’t have had to switch to Netlify to publish my Hugo blog from GitHub. He considers it “the standard” to use GitHub Actions to compile the page and then deploy via SFTP. GitLab can compile and host Hugo sites via GitLab Pages.

If you also have thoughts on the things discussed here, please feel free to contact me.

Toss a Coin to Your Podcaster

I am a freelance journalist and writer, volunteering my free time because I love digging into stories and because I love podcasting. If you want to help keep The Private Citizen on the air, consider becoming one of my Patreon supporters.

You can also support the show by sending money via PayPal, if you prefer.

This is entirely optional. This show operates under the value-for-value model, meaning I want you to give back only what you feel this show is worth to you. If that comes down to nothing, that’s OK with me, pard. But if you help out, it’s more likely that I’ll be able to keep doing this indefinitely.

Thanks and Credits

I like to credit everyone who’s helped with any aspect of this production and thus became a part of the show.

Aside from the people who have provided feedback and research and are credited as such above, I’m thankful to Raúl Cabezalí, who composed and recorded the show’s theme, a song called Acoustic Routes. I am also thankful to Bytemark, who are providing the hosting for this episode’s audio file.

But above all, I’d like to thank the following people, who have supported this episode through Patreon or PayPal and thus keep this show on the air: Niall Donegan, Michael Mullan-Jensen, Jonathan M. Hethey, Georges Walther, Dave, Eric gPodder Test, Butterbeans, Kai Siers, Mark Holland, Steve Hoos, Shelby Cruver, Fadi Mansour, Vlad, Matt Jelliman, Joe Poser, Jackie Plage, 1i11g, ikn, Dave Umrysh, Dirk Dede, David Potter, Vytautas Sadauskas, RikyM, drivezero, Mika, Jonathan Edwards, Barry Williams, Silviu Vulcan, S.J., Daniel B. and Bennett Piater.
