In the early 2000s, the emerging technology of “e-mail” was thrust into crisis. It cost almost nothing to send an email, and spammers took advantage of this to flood everyone’s inboxes with malware and ads for penis pills. It was a legitimate threat to the usability of the protocol. In response, software engineers and politicians converged on the problem, working groups were formed, conferences were held, legislation was passed, spam filters were developed, and within a few years email was mostly functional again.
In hindsight this outcome feels like a statistical and economic inevitability—like, of course we can write software to distinguish “BUY!! PENIS!! PILLS!!” from legitimate mail and of course we’re not gonna let some script kiddies derail a massive engine of productivity. But at the time everything was uncertain. Bill Gates thought we would use digital stamps (wrong), lawyers thought liability would fix everything (wrong), and police were playing whack-a-mole by tossing individual spammers in jail (lol).
Wikipedia has a decent summary of how things unfolded, but you can’t really absorb the vibe unless you read primary source documents. Fortunately, many of these are accessible without a technical background. Some of my favorites include:
The Spammer’s Compendium (2003) catalogs basic spam techniques from this era, and also features some amazing clip art.
LWN article on the 2004 MIT Spam Conference. There are some wild (for the time) ideas in here, including microlending and hash-based proof-of-work.
A Plan for Spam (2002) outlines the Naive Bayesian approach which became the basis of effective spam filtering software. This is incredibly primitive compared to modern machine learning—the Bush years were a more innocent time.
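The core of that Bayesian approach really does fit in a few dozen lines, which is part of why it spread so fast. Here’s a toy sketch (the corpus, equal priors, and Laplace smoothing are my simplifications for illustration, not Graham’s exact recipe):

```python
import math
from collections import Counter

def train(spam_docs, ham_docs):
    # Count how often each word appears in spam vs. legitimate mail.
    spam_counts = Counter(w for d in spam_docs for w in d.lower().split())
    ham_counts = Counter(w for d in ham_docs for w in d.lower().split())
    vocab = set(spam_counts) | set(ham_counts)
    # Laplace smoothing: +1 everywhere so unseen words don't zero things out.
    spam_total = sum(spam_counts.values()) + len(vocab)
    ham_total = sum(ham_counts.values()) + len(vocab)
    return {w: ((spam_counts[w] + 1) / spam_total,
                (ham_counts[w] + 1) / ham_total) for w in vocab}

def spam_score(model, message):
    # "Naive" = pretend words are independent and multiply their
    # probabilities. Work in log-space to avoid underflow; assume
    # equal priors on spam vs. ham.
    log_spam = log_ham = 0.0
    for w in message.lower().split():
        if w in model:
            p_spam, p_ham = model[w]
            log_spam += math.log(p_spam)
            log_ham += math.log(p_ham)
    return 1 / (1 + math.exp(log_ham - log_spam))

model = train(
    spam_docs=["buy pills now", "cheap pills buy now"],
    ham_docs=["meeting notes attached", "lunch meeting tomorrow"],
)
print(spam_score(model, "buy cheap pills"))   # well above 0.5
print(spam_score(model, "meeting tomorrow"))  # well below 0.5
```

Real filters of the era added tricks on top (weighting rare words, only scoring the most “interesting” tokens), but this is the statistical skeleton that killed the penis pill economy.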
By the mid 2000s there was a legal framework in place to deter spammers and statistical spam filtering was good enough that we were mostly free from Viagra ads. Targeted phishing and identity theft remained, but these couldn’t bring down the entire ecosystem in the same way as spam because they don’t scale. A few office buildings worth of scammers are a nuisance, but not an existential threat.
all watched over by language models of loving grace
Fast forward to 2022. When we browse the web today, our default assumption is that most of the text we read is written by humans, and most of the images and video we see depict events that happened in real life. Of course, we know that bots and AI-generated images and deepfakes exist, but we can usually recognize them on sight and they are still relatively rare. This will all be inverted within a few years. When recent advances in machine learning are more widely available, the majority of content that exists will be AI-generated, including images, audio, music, and short video.
The most transformative technology here is not the ability to generate multimedia, but the ability to generate text. LLMs (large language models, e.g. ChatGPT) are shockingly good at communicating like human beings, they have an enormous amount of knowledge about culture and history and sexuality, they can reason and be creative, and soon they will be able to browse the internet and use other software tools. I encourage you to play around with them to get a feel for their capabilities. In the summer of 2022, I used GPT-3 to post on my Twitter account for a few months—it understood my style and even wrote some pretty complex arguments which I agreed with but had never discussed before. I don’t think anyone noticed the difference.
The long-term impact of LLMs is uncertain, and I’m generally optimistic about the creativity they will unlock. But in the short term, one of the first applications will be the automation of identity scams at an unprecedented scale. If you want a picture of this future, you need only look at pornstar twitter.
Scams are ubiquitous on porn-adjacent social media. Scammers create fake accounts, claiming to be pornstars and sex workers, and then defraud fans by offering fake services or reselling content. These accounts are so numerous and persistent that even large porn studios will accidentally link/tag a scammer’s account instead of a performer’s real profile. With LLMs, scammers will automate the ‘labor’ of interacting with their marks, and with image/audio/video tools they will be able to post convincing deepfakes on demand. Even a video call will no longer be a reasonable assurance that you are talking to a human being!
As LLMs become productionized, porn-adjacent social media will be the first grotesque laboratory of these fully or partially automated scams, as porn typically exists in a TOS gray area and performers have failed to quash the existing fake account problem. There are already catfish findommes on Twitter offering AI services, and (I suspect but cannot confirm) others using LLMs without disclosing this.
the monkey in the machine and the machine in the monkey
We saved email in the 2000s by employing statistical methods to distinguish spam from non-spam. Can we do the same with generated media?
Despite what some frankly misleading tools may tell you, identifying LLM-generated text is extremely difficult, and may remain an unsolved problem. Some smart folks at OpenAI are working on tweaking the outputs of GPT models to make them cryptographically identifiable, which is neat but not a general-purpose solution. Model training keeps getting cheaper, and open-source and even crowdsourced LLMs aren’t far behind proprietary models—if anyone can train a model without cryptographic signatures, then it doesn’t really matter whether OpenAI has this feature or not. In the image and video domains, automated deepfake detection is quite poor, and state-of-the-art image models are already open source. Overall, we will not be able to verify human beings based on content alone.
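For the curious, here’s roughly how one published watermarking scheme works (the “green list” idea from academic work on LLM watermarks; this is a toy illustration of the concept, not OpenAI’s actual design, and the key and tokens are made up):

```python
import hashlib

SECRET_KEY = b"example-key"  # hypothetical shared secret

def is_green(prev_token: str, token: str) -> bool:
    # A keyed hash of (previous token, candidate token); the low bit
    # splits the vocabulary into "green" and "red" halves at each position.
    h = hashlib.sha256(SECRET_KEY + prev_token.encode() + b"|" + token.encode())
    return h.digest()[0] % 2 == 0

def watermark_choice(prev_token, candidates):
    # Stand-in for biasing the model's logits: prefer a green candidate
    # whenever one exists among roughly-equivalent next tokens.
    for tok in candidates:
        if is_green(prev_token, tok):
            return tok
    return candidates[0]

def green_fraction(tokens):
    # Detection: count how many tokens landed on their green list.
    hits = sum(is_green(p, t) for p, t in zip(tokens, tokens[1:]))
    return hits / max(len(tokens) - 1, 1)
```

Since about half of all tokens are green by chance, ordinary text scores near 0.5 while watermarked text scores near 1.0, and only someone holding the key can run the check. Which is exactly the limitation above: a model trained without this machinery produces text with nothing to detect.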
Instead, tech companies will likely pivot to multi-factor authentication, especially SMS and app-based authentication which are harder to spoof. We may also see a cottage industry of identity verification services using government ids or biometrics to verify the provenance of accounts. Whether these services will be friendly to sex workers is unfortunately an open question.
It’s also worth recalling what didn’t work in the spam wars of the 2000s. The first comment on that 2004 LWN article bemoans that no one is discussing PGP, a cryptographic system which can sign messages and build a “web of trust” among users who vouch for each other. I adore this comment because it could have been written at any time between 1998 and 2022. PGP still hasn’t achieved widespread usage in email because it’s incredibly cumbersome to use, and webs of trust are easily infiltrated by bad actors. This is unfortunate, as a PGP-like solution would not require centralized control and would be harder to censor. For now, though, cryptography has failed to deliver the goods.
digital ghosts trying and failing to embrace
If the history of spam is any indication, the internet is far too useful to die, and in a few years we will have reasonably effective bot detection via some combination of the solutions above. But what should adult content creators do in the interim, before tech companies and the law catch up to LLM scams? I have a few ideas, but keep in mind this is all guesswork.
Do things that don’t scale. Anyone can make scam accounts, but it still costs money to maintain a good personal website—therefore, having one is a harder-to-fake signal of authenticity. Unfortunately, social media platforms will be quick to capitalize on this and it might be in your best interest to pay up.
Work together. Cryptographic webs of trust are difficult to implement, but informal ones can fill the same role. Studio sites including links to performer social media (and keeping these links up to date!) would be a simple way to build an informal network.
Influence standards. The adult industry’s power is early adoption, as we’ve seen with VHS vs Betamax, Blu-ray vs HD DVD, and VR. When verification services or protocols arrive, adult performers may be some of their first customers, which could help everyone else standardize around a unified solution.
Use machine learning tools. It’s already possible to partially automate the emotional labor of existing online. As LLMs artificially increase the demand for emotional labor, they can also increase the supply. For example, you can use an LLM to filter out negative/tedious replies, or summarize feedback instead of reading it directly, or answer fan questions.
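As a concrete (if crude) sketch of the filtering idea: the function below routes replies into buckets so hostility never reaches you at all. `classify_reply` is a keyword stub standing in for a real model call, and the word list and labels are invented for illustration.

```python
HOSTILE_WORDS = {"ugly", "scam", "fake"}  # made-up stand-in list

def classify_reply(text: str) -> str:
    # Stub for a model call. A real classifier would be an LLM prompted
    # with something like "label this reply: hostile / question / other",
    # judging the whole message rather than matching keywords.
    words = set(text.lower().split())
    if words & HOSTILE_WORDS:
        return "hostile"
    if "?" in text:
        return "question"
    return "other"

def triage(replies):
    # Questions get surfaced for a (possibly model-drafted) answer;
    # hostile replies are quietly dropped; the rest can be summarized.
    buckets = {"hostile": [], "question": [], "other": []}
    for r in replies:
        buckets[classify_reply(r)].append(r)
    return buckets

buckets = triage([
    "you're a scam",
    "when is the next stream?",
    "love your work",
])
```

The point isn’t the code, it’s the shape: the model sits between you and the feed, and you decide which buckets you ever look at.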
We may need LLMs to make the social media experience palatable, but they will also alter the relationship between creator and audience. It may be difficult to maintain (para)social relationships when machine learning models are writing on our behalf—do you really feel a connection with someone if you write them a message and it gets summarized and replied to by an LLM?
What happens next is hard to predict. We may see a retreat to closed platforms like Discord or private forums as human authenticity becomes an even more valuable commodity. Or we may simply adapt to a new internet where human beings are a rare oddity in an ocean of bots. The future is coming, goonfriends—be ready.