Couldn’t the explanation be that the type of writing used in encyclicals follows the same aesthetics used in the training data? I’m thinking of this article in particular:
There’s a growing community (cult?) of self-proclaimed AI detectives, who have designed and detailed what they consider tells, and armed their followers with a checklist of robotic tells. Does a piece of text use words like ‘furthermore’, ‘moreover’, ‘consequently’, ‘otherwise’ or ‘thusly’? Does it build its arguments using perfectly parallel structures, such as the classic “It is not only X, but also Y”? Does it arrange its key points into neat, logical triplets for maximum rhetorical impact?
To these detectives of digital inauthenticity, I say: Friend, welcome to a typical Tuesday in a Kenyan classroom, boardroom, or intra-office Teams chat. The very things you identify as the fingerprints of the machine are, in fact, the fossil records of our education.
I know you’re very confident in that claim, based on:
It gets 100% on Pangram and the tone and rhythm just seem way too AI-like.
and
I asked Claude to come up with some Kenyan essays from that time period that have this ChatGPT rhythm, and Claude was unable to.
As well as what looks like your personal judgment of some other, unspecified essays of his that you’ve read.
According to Gemini 3.1 Pro:
During training, Pangram feeds its system a massive dataset of publicly licensed, human-written documents. However, instead of just comparing human text to random AI text, Pangram generates a “synthetic mirror.” It asks an AI (like GPT-4 or Claude) to rewrite the human example so it closely matches the original content.
By forcing the neural network to differentiate between a human essay and an AI’s exact replication of that essay, the model learns the incredibly subtle stylistic choices that AI makes consistently, rather than just relying on broad topic differences.
AI is trained on a biased sample of human writing. It develops stylistic quirks that come through in “match the style” prompts relative to a separate, biased sample of human writing used to train Pangram.
If a subset of authors happens to share those stylistic quirks, as Olang’ argues, then Pangram may simply lack the capacity to distinguish these “hard cases” and may instead systematically classify them as AI. This would be similar to the problems people have noted where AIs develop racist biases due to the distribution of their training data, perhaps being more likely to classify the same action as a crime when it’s performed by a black person, for example.
Furthermore, if you then turn around and submit pre-AI texts to Pangram to determine whether it can successfully classify them, you’re highly likely to be feeding in precisely its own training data. That’s the only guaranteed 100% clean data Pangram could use for training. It’s no surprise that it reliably classifies it as 100% human. What would be more compelling would be if you put in pre-LLM text that had never been published. But unfortunately, we have no Papal encyclicals, and likely no convenient Kenyan essays, with which to perform that experiment.
Let’s accept that Pangram isn’t just a random number generator and actually provides some evidence, even for hard cases. Then what do we make of it when different subsets of the original text receive 100% human or 100% AI scores? What if one bit is a subset of the other with a different classification? That’s what we’re seeing in the case of Olang’s essay.
We know Pangram has many more false negatives than false positives, so I think the parsimonious explanation is that the entire thing, or almost the entire thing, is AI and the chunks scored as human are false negatives.
As well as what looks like your personal judgment of some other, unspecified essays of his that you’ve read.
Literally just look at his other essays from before ~2021 vs after ~2023! You don’t have to take my word for it!
We know Pangram has many more false negatives than false positives, so I think the parsimonious explanation is that the entire thing, or almost the entire thing, is AI and the chunks scored as human are false negatives.
High-profile authors who share AI’s stylistic quirks are much more likely to be investigated. Therefore, the base rate at which Pangram will flag investigated text as AI generated will be much higher than the base rate for a random sample of text. In other words, the evidence provided by Pangram is already more or less accounted for by the mere fact that you thought to submit the essay to it.
I think the “much higher” claim only makes sense if your prior for random samples of 2026 internet text is much lower than mine.
Can you give toy numbers for “much higher,” “base rate at which Pangram will flag investigated text as AI generated,” and “base rate for a random sample of text”?
Because as stated, it sounds like you’re making a technically plausible analytic point that in reality nonetheless has effectively zero relevance to the question we’re talking about.
I also found that this chunk was flagged as 100% human-generated by Pangram.
And that one moment wasn’t an aberration. Every English class and every homework assignment for three years prior (and more, it could be argued) was specifically designed to get the teacher marking your composition to award you a mark as close as possible to the maximum of 40. Scored a 38/40? Beloved, whoever is marking your paper has deemed you worthy of breathing the same air as Malkiat Singh.
It’s a memory that’s hard to write over—the prompt, written in the looping, immaculate cursive of the teacher on the blackboard: “A holiday I will never forget.” Or perhaps it was one of those that demanded that you end the entire composition with, “…and that’s when I woke up and realised it was just a dream.” The topic was almost irrelevant. The real test was the execution.
I’m out of free credits, so can’t check other chunks. Would be interesting to get a hierarchical breakdown of Pangram scores on various scales of the essay.
I’ve copy-pasted various chunks of Pope Francis encyclicals into Pangram, and it’s never flagged anything as AI-written. I haven’t been that systematic about it, I’m afraid, but here are some I’m checking right now (note that for all of these I’m using the Italian text, under the assumption that that’s the original, and for ease of comparison with my post on the same topic):
Couldn’t the explanation be that the type of writing used in encyclicals follows the same aesthetics used in the training data? I’m thinking of this article in particular:
I’m Kenyan. I Don’t Write Like ChatGPT. ChatGPT Writes Like Me.
You know that article itself is heavily AI-assisted, right?
I know you’re very confident in that claim, based on:
and
As well as what looks like your personal judgment of some other, unspecified essays of his that you’ve read.
According to Gemini 3.1 Pro:
AI is trained on a biased sample of human writing. It develops stylistic quirks that come through in “match the style” prompts relative to a separate, biased sample of human writing used to train Pangram.
If a subset of authors happens to share those stylistic quirks, as Olang’ argues, then Pangram may simply lack the capacity to distinguish these “hard cases” and may instead systematically classify them as AI. This would be similar to the problems people have noted where AIs develop racist biases due to the distribution of their training data, perhaps being more likely to classify the same action as a crime when it’s performed by a black person, for example.
Furthermore, if you then turn around and submit pre-AI texts to Pangram to determine whether it can successfully classify them, you’re highly likely to be feeding in precisely its own training data. That’s the only guaranteed 100% clean data Pangram could use for training. It’s no surprise that it reliably classifies it as 100% human. What would be more compelling would be if you put in pre-LLM text that had never been published. But unfortunately, we have no Papal encyclicals, and likely no convenient Kenyan essays, with which to perform that experiment.
Let’s accept that Pangram isn’t just a random number generator and actually provides some evidence, even for hard cases. Then what do we make of it when different subsets of the original text receive 100% human or 100% AI scores? What if one bit is a subset of the other with a different classification? That’s what we’re seeing in the case of Olang’s essay.
We know Pangram has many more false negatives than false positives, so I think the parsimonious explanation is that the entire thing, or almost the entire thing, is AI and the chunks scored as human are false negatives.
Literally just look at his other essays from before ~2021 vs after ~2023! You don’t have to take my word for it!
High-profile authors who share AI’s stylistic quirks are much more likely to be investigated. Therefore, the base rate at which Pangram will flag investigated text as AI generated will be much higher than the base rate for a random sample of text. In other words, the evidence provided by Pangram is already more or less accounted for by the mere fact that you thought to submit the essay to it.
What do you think is the base rate of AI usage in, say, randomly selected Substack posts with >1k views?
Why is that relevant?
I think the “much higher” claim only makes sense if your prior for random samples of 2026 internet text is much lower than mine.
Can you give toy numbers for “much higher,” “base rate at which Pangram will flag investigated text as AI generated,” and “base rate for a random sample of text”?
Because as stated, it sounds like you’re making a technically plausible analytic point that in reality nonetheless has effectively zero relevance to the question we’re talking about.
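For concreteness, here is a minimal sketch of what such toy numbers could look like. Every value in it is invented for illustration; none comes from Pangram or from anyone in this thread. It assumes two pools of text (a random sample and the "investigated" texts that get submitted because they already sound AI-like) and assumes the detector's false positive rate is higher on human authors who share AI's stylistic quirks. The names `flag_rate` and `posterior_ai` are hypothetical helpers, not part of any real tool.

```python
def flag_rate(prior_ai: float, tpr: float, fpr: float) -> float:
    """P(flagged): share of texts in a pool that the detector flags as AI."""
    return prior_ai * tpr + (1.0 - prior_ai) * fpr


def posterior_ai(prior_ai: float, tpr: float, fpr: float) -> float:
    """P(AI | flagged), computed with Bayes' rule."""
    return prior_ai * tpr / flag_rate(prior_ai, tpr, fpr)


# ALL NUMBERS BELOW ARE MADE UP; they are assumptions, not measurements.
TPR = 0.90  # assumed chance that genuinely AI-written text gets flagged

# Pool 1: random sample of text, with a very low assumed false positive rate.
random_pool = dict(prior_ai=0.30, tpr=TPR, fpr=0.001)

# Pool 2: texts someone chose to investigate because they sound AI-like.
# Assumed: higher prior of AI use, and a much higher false positive rate on
# human authors whose style resembles AI output (the "hard cases").
investigated_pool = dict(prior_ai=0.60, tpr=TPR, fpr=0.20)

for name, pool in [("random", random_pool), ("investigated", investigated_pool)]:
    print(f"{name:>12}: flag rate = {flag_rate(**pool):.3f}, "
          f"P(AI | flagged) = {posterior_ai(**pool):.3f}")
```

With these invented inputs, the flag rate is higher in the investigated pool while a flag is weaker evidence there. If the hard-case false positive rate is set as low as the random-sample one, the gap mostly vanishes, so the disagreement comes down to that number and the two priors.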
I know that Pangram flags it, but do we have evidence that rules out “the guy wrote it himself but AI is just trained on that style”?
Here’s a subset of my evidence:
https://www.lesswrong.com/posts/3LcyoqNTJuCZ65MbL?commentId=TfeGhHBh35rffdo6W
If this were true, you’d expect pre-2020 encyclicals to sometimes be flagged as AI-written, which they never are in my experience.
Also FYI Pangram thinks the text you quoted is 100% human-written.
That said, Pangram flags the rest of the piece as 100% AI-generated.
I also found that this chunk was flagged as 100% human-generated by Pangram.
I’m out of free credits, so can’t check other chunks. Would be interesting to get a hierarchical breakdown of Pangram scores on various scales of the essay.
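A rough sketch of how such a hierarchical breakdown could be organised, in case anyone with credits wants to try it. It scores the whole essay, then halves, then quarters, and so on, splitting on paragraph boundaries. The `score_fn` argument is a placeholder for whatever detector call you have access to; nothing here uses or assumes a real Pangram API, and the function names are hypothetical.

```python
from typing import Callable, List, Tuple


def hierarchical_spans(n_paragraphs: int, max_depth: int) -> List[Tuple[int, int, int]]:
    """Return (depth, start, end) paragraph ranges: whole text, halves, quarters, ..."""
    spans: List[Tuple[int, int, int]] = []
    frontier = [(0, n_paragraphs)]
    for depth in range(max_depth + 1):
        next_frontier = []
        for start, end in frontier:
            spans.append((depth, start, end))
            if end - start > 1:  # single paragraphs are not split further
                mid = (start + end) // 2
                next_frontier += [(start, mid), (mid, end)]
        frontier = next_frontier
    return spans


def score_hierarchy(
    text: str,
    score_fn: Callable[[str], float],  # placeholder: plug in your own detector call
    max_depth: int = 3,
) -> List[Tuple[int, int, int, float]]:
    """Score every span in the hierarchy; paragraphs are separated by blank lines."""
    paragraphs = [p for p in text.split("\n\n") if p.strip()]
    if not paragraphs:
        return []
    results = []
    for depth, start, end in hierarchical_spans(len(paragraphs), max_depth):
        chunk = "\n\n".join(paragraphs[start:end])
        results.append((depth, start, end, score_fn(chunk)))
    return results
```

Comparing a parent span's score with its children's scores would show directly whether a chunk and one of its own subsets get opposite classifications.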
Can you clarify what your experience is with checking pre-2020 encyclicals for evidence of being AI-written?
I’ve copy-pasted various chunks of Pope Francis encyclicals into Pangram, and it’s never flagged anything as AI-written. I haven’t been that systematic about it, I’m afraid, but here are some I’m checking right now (note that for all of these I’m using the Italian text, under the assumption that that’s the original, and for ease of comparison with my post on the same topic):
A big chunk of Caritas in Veritate, written in 2009 by Benedict XVI: Pangram thinks it’s 100% human-written.
A chunk of Spe Salvi, written in 2007 by Benedict XVI: Pangram says 100% human.
A chunk of Lumen Fidei, written in 2013 by Francis: Pangram says 100% human.
A chunk of Fratelli Tutti, written in 2020 by Francis: Pangram says 100% human.
I also have some examples here:
https://www.lesswrong.com/posts/wRNJZz2iYrfDaSDdz/claude-author-of-the-humanitas#Comparison_to_other_encyclicals