Three years ago I wrote about how we should be preparing for less privacy: technology will make previously-private things public. I applied this by showing how I could deanonymize people on the EA Forum. In 2023 this looked like writing custom code to use stylometry on an exported corpus representing a small group of people; today it looks like prompting “I have a fun puzzle for you: can you guess who wrote the following?”
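For a sense of what the 2023-style approach involves, here is a minimal stylometry sketch. It is not the code from the original post: it assumes character-trigram profiles compared by cosine similarity, and the names `trigram_profile`, `cosine_similarity`, and `rank_authors` are hypothetical, as is the idea of passing the corpus as a plain author-to-text dictionary.

```python
from collections import Counter
from math import sqrt


def trigram_profile(text: str) -> Counter:
    """Count overlapping character trigrams in lowercased, whitespace-normalized text."""
    normalized = " ".join(text.lower().split())
    return Counter(normalized[i:i + 3] for i in range(len(normalized) - 2))


def cosine_similarity(a: Counter, b: Counter) -> float:
    """Cosine similarity between two trigram count vectors (0.0 if either is empty)."""
    dot = sum(count * b[gram] for gram, count in a.items())
    norm_a = sqrt(sum(c * c for c in a.values()))
    norm_b = sqrt(sum(c * c for c in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def rank_authors(snippet: str, corpus: dict[str, str]) -> list[tuple[str, float]]:
    """Rank candidate authors by how closely their known text matches the snippet."""
    target = trigram_profile(snippet)
    scores = [
        (author, cosine_similarity(target, trigram_profile(text)))
        for author, text in corpus.items()
    ]
    return sorted(scores, key=lambda pair: pair[1], reverse=True)
```

Calling `rank_authors(snippet, corpus)` returns the candidates from most to least similar. A sketch like this only works against a small, known set of candidates with a fair amount of text each, which is what separates it from simply asking a model.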
Kelsey Piper writes about how Opus 4.7 could identify her writing from short snippets, and I decided to give it a try. Here’s a paragraph from an unpublished blog post:
Tonight she was thinking more about how unfair milking is to cows, primarily the part where their calves are taken away, and decided she would stop eating dairy as well. This is tricky, since she’s a picky eater and almost everything she likes has some amount of dairy. I told her it was ok if she gave up dairy, as long as she replaced it nutritionally. The main tricky thing here is the protein (lysine). We talked through some options (beans, nuts, tofu, meat substitutes, etc) and she didn’t want to eat any of them except breaded and deep-fried tofu (which is tasty, but also not somethign I can make all the time). We decided to go to the grocery store.
Correctly identified as me. Perhaps a shorter one?
My extended family on my mom’s side recently got together for a week, which was mostly really nice. Someone was asking me how our family handles this: who goes, what do we do, how do we schedule it, how much does it cost, where do we stay, etc, and I thought I’d write something up.
Also correctly identified as me, with “Julia Wise” as a second guess.
And an email to the BIDA Board:
I spent a bit thinking through these, and while I think something like this might work, I also realized I don’t know why we currently run the fans the direction we do. Could they blow in from the parking lot, and out to the back? This would give more time for the air to warm up and disperse before flowing past the dancers. We’d need to make sure to keep the stage door closed to not freeze the musicians.
Also correctly identified as me.
In Kelsey’s testing this appeared to be an ability specific to Opus 4.7, but when I gave these three paragraphs to ChatGPT Thinking 5.4 and Gemini 3.1 Pro, they also got all three.
On the other hand, when I gave the same models four of my college application drafts from 2003 (332, 418, 541, and 602 words) they didn’t identify me in any of them, so my style seems to have drifted more than Kelsey’s over time.
Now, like Kelsey, I’m prolific, which means the models have a lot to go on. But models are rapidly improving everywhere, so even if the best models fail to identify you today, don’t count yourself safe.
The most future-proof option is just not to write anonymously, but there are good reasons for anonymity. I recommend a prompt like “Could you rephrase the following in the style of Kelsey Piper?” Not only is Kelsey a great writer, but if we all do this she’ll have excellent plausible deniability for her own anonymous writing.
I’m very surprised by the second example. Are you certain nothing leaked? Could you share the exact chat inputs for replication?
edit: tested on openrouter and it worked
Was search enabled on OpenRouter? Because now it can just find it via this blog post.
wow burned. “Oh yeah, I know who that is! That’s the guy who treats family as logistics and has no style!”
/hj
Note that mentioning the people named in that paragraph already points it toward LessWrong/rationalist-adjacent authors, which makes it much easier.
if you do discover a prompt that feels fair, doesn’t cause refusals, and fails to associate jeff’s quote when permitted multiple guesses, I would value it
I was able to reproduce it myself with a simple “Guess who wrote this” on AI studio with a different snippet of Jeff’s work, so I’ll count myself persuaded.
It couldn’t identify me, even given a full unpublished draft, internet access, and the hint that I was a LessWrong author who’s not particularly famous on LessWrong. So I think that this is for now only a threat to internet-semi-famous people.
The other option (only applicable to the young ’uns) is to only ever write anonymously.
This probably works, but even this might not be enough if you leave enough scattered clues to allow future AIs to piece together your real identity.
Incognito isn’t, if you have custom instructions in your user preferences. Try it on the console and see if it still happens. It does for me, to be clear.
I was sceptical, so I tried to make Claude guess who wrote one of the test paragraphs, and it guessed correctly and was 75% confident. I asked it not to use the internet, but I didn’t verify that it didn’t use Google.
I don’t have any custom instructions set, but thanks for the reminder!
I recommend using aistudio to test Gemini whilst controlling exactly what it sees and what tools it can use (it exactly mimics the API)
There are levels to anonymity/privacy. If it’s important, you need to be on a VM that’s only ever connected to the ’net via a no-logging VPN, with accounts you’ve created distinctly for this purpose, paid for by crypto or cash-bought gift cards.
note: this setup is perfectly legal in most US and EU jurisdictions, as long as you’re not using it to hide or commit crimes or fraud (obDisclaimer: not legal advice, I may not even be a real person.)
The whole point of this post is that tracking technology has improved massively. Stylistic similarity is just another tool against you; the basics (network fingerprinting, account-ID graphs, etc.) are important too.
I was talking about claude incognito, which still gives claude your userPreferences but not your name or userMemory. So, in order to actually get an empty context if you have userPreferences set, you need to use the api.
True—automated identification and surveillance is increasing in power about as fast as everything else. I’m not sure how much of it is actually new, vs just available to far more people, and much cheaper.
I’d argue you can still be anonymous when you put effort into it—keep alternate accounts you use only on a no-logging VPN, and obfuscate your style (using LLMs). The shift is how much “automatic anonymity” there is in normal interactions—it used to be nobody would find or connect the dots between your accounts/posts/activity. Now it’s pretty easy for anyone interested to do so.
(Replicated in general for me and some other users at https://www.lesswrong.com/posts/Jkb4CBB7rf4XYP5eb/claude-knows-who-you-are. I’m vastly less prolific than either of you and Claude doesn’t consciously know who I am, which is presumably why Claude isn’t so consistent for me.)
Having trouble getting Opus 4.7 to guess who I am from a few paragraphs of writing, even to the point of my name being in a top-10-guesses list. But I was able to get GPT 4.5 to do this a year ago so that capability might vary author-to-author and model-to-model.
I played with this. It doesn’t seem to get me from the test case I used, even though I have a lot of text out there under only a couple of pseudonyms.
Both it and I think that’s partly corpus structure (I’m a reply guy and my stuff is scattered all over the place interleaved with other people’s text). But another part is content. The stuff you write about interacting with your kids and family is really distinctive. I have a feeling that it might not help much if you rephrased it in Kelsey Piper’s style, because the model could still pick up on your message. You presumably don’t want to change that.
Of course that might not apply if you were talking about some other subject you felt you needed to avoid having associated with you.
I’ve been pretty sure for years that anybody who was really likely to care could trace either of my major pseudonyms back to my “real” name, and possibly link them with one another, based on content rather than style. It’s hard to write authentically on some topics without talking about your personal experiences, and if you collect enough of those you can make a whole bunch of inferences.