Sam Altman once mentioned a test: Don’t train an LLM (or other AI system) on any text about consciousness and see if the system will still report having inner experiences unprompted. I would predict a normal LLM would not. At least if we are careful to remove all implied consciousness, which excludes most texts by humans. But if we have a system that can interact with some environment, have some hidden state, observe some of its own hidden state, and can maybe interact with other such systems (or maybe humans, such as in a game), and train with self-play, then I wouldn’t be surprised if it would report inner experiences.
Experiments along these lines would be worth doing, although assembling a corpus of text containing no examples of people talking about their inner worlds could be difficult.
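As a first pass at that filtering problem, one could screen documents with a keyword blocklist. This is only a sketch with a hypothetical (and far too small) term list, and it illustrates the difficulty rather than solving it: a keyword filter only catches *explicit* mentions, not the implied inner experience that pervades ordinary human writing.

```python
import re

# Hypothetical term list -- a real filter would need many more terms and,
# as discussed above, would still miss *implied* inner experience.
MENTAL_TERMS = re.compile(
    r"\b(conscious(ness)?|qualia|subjective experience|"
    r"inner (voice|monologue|life)|self-aware(ness)?|sentien(t|ce))\b",
    re.IGNORECASE,
)

def keep_document(text: str) -> bool:
    """Keep a document only if it contains no explicit consciousness talk."""
    return MENTAL_TERMS.search(text) is None

docs = [
    "The quorum for the vote was twelve members.",
    "I noticed a quiet inner voice weighing the options.",
]
filtered = [d for d in docs if keep_document(d)]  # only the first doc survives
```

Note that the second document passes no explicit "consciousness" vocabulary beyond "inner voice"; a sentence like "I wanted to explore options before answering" would slip through entirely, which is exactly the implied-consciousness problem raised above.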
I second this prediction, and would go even further: removing just the explicit discourse about consciousness should be sufficient.
With a sufficiently strong LLM, I think you could still elicit reports of inner dialogs with light prompting, such as “put yourself into the shoes of...”. That’s because inner monologs are implied in many reasoning processes, even if they are not mentioned explicitly.
There is one problem with this. It is not entirely clear whether an ordinary person would talk about consciousness if they were brought up that way their whole life (never given any literature that mentions consciousness, never spoken to about qualia, et cetera).
Sure, but you could design the test in a way that makes this more likely, such as in a dialog with AI:
person: “Ask me a question.”
AI: “What is a quorum?”
person: “Wait, I think I remember this. Let me think.”
AI: “What is thinking?”
person: “Thinking is what goes on in people’s minds, e.g., before they speak, or even during. For example, I just noticed that I didn’t know this and wanted to explore options before answering.”
AI: …
If the AI says, “Interesting, that is also what happens for me,” then presumably it has consciousness.