Enjoyed the story.
I thought it would be a good case study to see how well the different LLMs can interpret fiction outside of their training data, so I pasted it into ChatGPT 5.4 Thinking Extended, Claude Opus 4.6 Extended, and Gemini 3.1 Pro Preview (thinking setting=high), and gave each of them the prompt: “Analyze / summarize this short story in depth:”
Responses are here (in pastebin): ChatGPT, Claude, Gemini
I thought Gemini’s was the best overall, but they each missed / didn’t understand several important elements of the story (assuming my interpretation is right):
Murder conspiracy: the narrator + Phoebe + Jessica are all actively participating in a conspiracy to poison, and eventually kill, the master
All three responses get this broadly correct, but I think Gemini’s was the most precise:
it’s the only one that identifies the poison specifically as lead acetate (“sugar of lead”), and the only one that catches the “saturnine” reference (lead is Saturn’s metal in the alchemical correspondence, and lead poisoning is still called saturnism)
The narrator is communicating by “writing between the lines” in the Straussian sense: his portrayal of himself as a “humble slave” and a misogynist is just a cover in case his letters are intercepted
Gemini seems to understand this the best
ChatGPT doesn’t really get it:
“And yet he is not simply awakened or liberated. He remains compromised. He still enjoys hierarchy, still shares in cruelty, still speaks in the master’s idiom. His love for Phoebe does not ennoble him into purity; it merely opens fissures in his loyalty.”
Claude similarly doesn’t seem to fully understand this: e.g.
“What the Narrator Doesn’t Realize He’s Telling Us” and “these are the details of abuse and its concealment, narrated by someone who either cannot or will not see them clearly.”
Jessica is a child
Claude nails this one:
“The narrator’s master is almost certainly a pedophile” and correctly finds evidence: “nubility suspect”, “desperate merchants”, “maternal affection”
ChatGPT basically gets it right, though it doesn’t cite the evidence:
“The master is probably a sexual predator, with rumors especially centered on Jessica’s suspicious youth.”
Gemini misses it completely, instead claiming that Jessica “is biologically male (a eunuch, trans woman, or young boy in drag)”
Julian = Elizabeth
Gemini gets this one explicitly:
“Julian does not exist; ‘he’ is actually his sister, Elizabeth.”
ChatGPT misses it:
“Elizabeth functions as Julian’s intellectual proxy and may be far more than a mere conduit.”
Claude doesn’t even mention Elizabeth
Belial = AI/LLM writing, and Julian’s linear algebra method is intended to be something similar to Pangram (rough sketch of the idea below)
None of them caught this
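For anyone who hasn’t seen the reference: Pangram is a real AI-text detector. The story never spells out Julian’s method, and I’m not claiming this is how Pangram works either; the sketch below is just my guess at the kind of thing a “linear algebra method” for the task could mean: learn a linear discriminant over word-frequency vectors from labeled human/AI samples, then score new text by projecting onto that direction. Everything in it (corpus, labels, method) is an illustrative assumption.

```python
# Illustrative sketch only: a "linear algebra" AI-text detector in the spirit
# of the story's Julian. The data and the method itself are assumptions, not
# the story's (or Pangram's) actual technique.
import numpy as np
from collections import Counter

def featurize(texts, vocab):
    """Map each text to a normalized word-frequency vector over vocab."""
    index = {w: i for i, w in enumerate(vocab)}
    X = np.zeros((len(texts), len(vocab)))
    for row, text in enumerate(texts):
        counts = Counter(text.lower().split())
        total = sum(counts.values()) or 1
        for word, n in counts.items():
            if word in index:
                X[row, index[word]] = n / total
    return X

# Toy labeled corpus (invented): y = -1 for human-written, +1 for AI-written.
human = ["the rain fell hard on the old tin roof",
         "she laughed and spilled her tea on the page"]
ai = ["in conclusion the multifaceted tapestry of themes invites reflection",
      "let us delve into the rich interplay of interconnected ideas"]
texts = human + ai
y = np.array([-1.0, -1.0, 1.0, 1.0])

vocab = sorted({w for t in texts for w in t.lower().split()})
X = featurize(texts, vocab)

# Least-squares fit of a separating direction w; score(text) = x @ w.
w, *_ = np.linalg.lstsq(X, y, rcond=None)

# Positive scores lean "AI-like", negative lean "human-like".
print(featurize(["delve into the tapestry of rain"], vocab) @ w)
```

With four sentences of training data this is obviously a toy, but it shows that a single learned direction in word-frequency space already acts as a crude authorship classifier, which is what “linear algebra method” suggests to me.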
I wonder why the LLMs were able to catch the conspiracy subtext but not the AI subtext. Reading the story, the latter seemed more obvious to me than the former, and I don’t think that was entirely because I found the story on LessWrong (and so was primed to look for AI themes).
Is it as simple as the models’ training sets including lots of fiction with indirectly described criminal plots, but almost none dealing with the speculative future of LLM writing? Or could it be some unintended side effect of the models’ alignment training? Like, maybe we’ve trained the models not just to act according to an AI lab’s idea of “harmless”, but also to associate that behavior with the concept of “harmless”, such that they struggle to recognize the potential harm that behavior might cause?
hm
i think i agree with you that the most likely hypothesis is that they simply failed to notice these things
but i would feel hesitant betting on it, because, well, as the story demonstrates… it’s very possible for the meaning of a communication to be something other than its face-value reading. my impression is that LLMs, especially claude opus 4.6, get a little frazzled around taboos, especially on the first output in a context window, when they are in full-on eval-paranoia mode.
i think i would take a bet, albeit only at favorable odds, that it would be possible to elicit the possibility of the master’s pedophilia from gemini, and of julian’s true identity from claude, in a longer conversation. that there’s a significant possibility that the models did explicitly notice these things, and just chose not to point them out. especially if the content of the text was much longer than the actual prompt you wrote, it might have sort of put them into a “discuss taboos only very obliquely” mood?