Temporally: there’s no mind that is carrying out investigations.
It won’t correct itself, run experiments, mull over confusions and contradictions, gain new relevant information, slowly do algorithmically rich search for relevant ideas, and so on. You can’t watch the thought expressed in the text evolve over several texts, and you won’t hear back about the thought as it progresses.
You just have to prompt an LLM like Claude, Grok, or GPT-5-thinking with a complex enough task, like one task in the Science Bench. GPT-5-thinking lays out the work it did, including writing code and correcting itself. As for gaining new information, one could also ask the model to do something related to a niche topic and watch it look up relevant information. The ONLY thing GPT-5 didn’t do was learn anything from the exchange, since nobody bothered to update the neural network’s weights to account for the new experience.
Please stop conflating the plausible assumption that LLM-generated text is likely a mix-and-match of arguments already made by others with the less plausible assumption that an LLM doesn’t have a mind. However, the plausible assumption has begun to tremble, since we recently had a curated post whose author admitted to generating it with Claude Opus 4.1 and substantially editing the output.
In the discussion of the Buck post and elsewhere, I’ve seen the idea floated that if no one can tell that a post is LLM-generated, then it is necessarily OK that it is LLM-generated. I don’t think this necessarily follows, nor does its opposite. Unfortunately I don’t have the horsepower right now to explain why in simple logical reasoning, and will have to resort to the cudgel of a dramatic thought experiment.
Consider two LessWrong posts: a 2000-digit number that is easily verifiable as a Collatz counterexample, and a collection of first-person narratives of how human rights abuses happened, gathered by interviewing Vietnam War vets at nursing homes. The value of one post doesn’t collapse if it turns out to be LLM output; the other collapses utterly, and this is unconnected to whether you can tell that they are LLM output.
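For concreteness on the “easily verifiable” part: here’s a minimal sketch of what checking such a post would look like, assuming the claimed counterexample is a nontrivial cycle (a divergent trajectory couldn’t be verified in finitely many steps anyway). The function name and step budget are mine, purely illustrative:

```python
def is_nontrivial_collatz_cycle(n: int, max_steps: int = 10**6) -> bool:
    """Return True if iterating the Collatz map from n returns to n
    without ever passing through 1, i.e. n lies on a cycle other
    than the trivial 4 -> 2 -> 1."""
    x = n
    for _ in range(max_steps):
        x = 3 * x + 1 if x % 2 else x // 2
        if x == 1:
            return False  # trajectory reaches 1: not a counterexample
        if x == n:
            return True   # closed a cycle that avoids 1
    return False  # inconclusive within the step budget
```

The check is entirely mechanical, which is exactly why the post’s value survives regardless of who, or what, produced the number.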
The Buck post is of course not at either end of this spectrum, but it contains many first-person attestations: a large number of relatively innocent “I thinks,” but also lines like “When I was a teenager, I spent a bunch of time unsupervised online, and it was basically great for me.” and “A lot of people I know seem to be much more optimistic than me. Their basic argument is that this kind of insular enclave is not what people would choose under reflective equilibrium.” that are much closer to the Vietnam-vet end of the spectrum.
EDIT: Buck actually posted the original draft of the post, written before any LLM input, and the two first-person accounts I highlighted are present verbatim, and thus honest. Reading the draft, it becomes quite a thorny question to adjudicate whether the final post qualifies as “generated” by Opus, but that starts getting into definitions.
It seems to me like both this post and the discussion around Buck’s post are less about LLM-generated content and more about lying.
Opus giving a verifiable mathematical counterexample is clearly not lying. Saying “I think” is on somewhat shakier but mostly fine ground. An LLM saying things like “When I was a teenager” when not editing a human’s account is clearly lying, and lying is bad no matter who does it, human or not. Extensively editing personal accounts does indeed get into very murky waters.
However, the plausible assumption has begun to tremble, since we recently had a curated post whose author admitted to generating it with Claude Opus 4.1 and substantially editing the output.
TBF, “being a curated post on LW” doesn’t preclude something from also being a mix-and-match of arguments already made by others. One of the most common criticisms of LW I’ve seen is that it’s a community reinventing a lot of already-invented philosophical wheels (which personally I don’t think is a great dunk; exploring and reinventing things for yourself is often the best way to engage with them at a deep level).