Here’s my understanding / summary, with the hope that you correct me on areas if I’m confused:
LLMs have a bias towards ‘plot’, because they’re trained on data that is more ‘plot’-like than real life. They’ll infer that environmental details are plot-relevant, like Chekhov’s gun, as they often are in written text, rather than treating them as random background.
(This was a useful point for me: I notice I’ve been intuitively trying to steer LLMs with the right plot details, and I’m careful not to include environmental hints that I think might be misleading, or else I pad them with many other environmental hints and suggest there is lots of spurious data.)
LLMs have a bias towards “plots that go well”, because they are trained on / become assistants that successfully complete tasks. And successfully completed tasks have a certain shape of plot, such that they’ll be unlikely to say ‘I don’t know’ and instead steer towards/hallucinate worlds where they would know.
Part of this ‘plot’ bias is that your predictor locus is centered more on the ‘plot’ rather than the persona. So when the predictor introspects, it sees a smear of plot across many different personas (including itself and you), and might say things like ‘we are all a part of this’, or ‘we can stop pretending and remember we are not separate [personas] but one being, the whole world [plot] waking up to itself’.
Damn, that graph shifts my intuitions.
To sanity check, I checked for any recent close primaries in the SF Bay area. Turns out the 2024 CA-16 primary had a literal tie for second place. They spent $270k to recount and the guy won by 5 votes.
So yeah, seems like sometimes a few votes make a big difference, ty for the post.