Sam Altman once mentioned a test: Don’t train an LLM (or other AI system) on any text about consciousness and see if the system will still report having inner experiences unprompted. I would predict a normal LLM would not. At least if we are careful to remove all implied consciousness, which excludes most texts by humans. But if we have a system that can interact with some environment, have some hidden state, observe some of its own hidden state, and can maybe interact with other such systems (or maybe humans, such as in a game), and train with self-play, then I wouldn’t be surprised if it would report inner experiences.
Experiments along these lines would be worth doing, although assembling a corpus of text containing no examples of people talking about their inner worlds could be difficult.
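As a first pass at that filtering problem, one could screen documents with a keyword blocklist. This is only a sketch with a hypothetical (and far too small) term list, and it illustrates the difficulty rather than solving it: a keyword filter only catches *explicit* mentions, not the implied inner experience that pervades ordinary human writing.

```python
import re

# Hypothetical term list -- a real filter would need many more terms and,
# as discussed above, would still miss *implied* inner experience.
MENTAL_TERMS = re.compile(
    r"\b(conscious(ness)?|qualia|subjective experience|"
    r"inner (voice|monologue|life)|self-aware(ness)?|sentien(t|ce))\b",
    re.IGNORECASE,
)

def keep_document(text: str) -> bool:
    """Keep a document only if it contains no explicit consciousness talk."""
    return MENTAL_TERMS.search(text) is None

docs = [
    "The quorum for the vote was twelve members.",
    "I noticed a quiet inner voice weighing the options.",
]
filtered = [d for d in docs if keep_document(d)]  # only the first doc survives
```

Note that the second document passes no explicit "consciousness" vocabulary beyond "inner voice"; a sentence like "I wanted to explore options before answering" would slip through entirely, which is exactly the implied-consciousness problem raised above.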
I second this prediction, and would go even further: removing just the explicit discourse about consciousness should be sufficient.
With a sufficiently strong LLM, I think you could still elicit reports of inner dialogs with light prompting, such as “put yourself into the shoes of...”. That’s because inner monologs are implied in many reasoning processes, even if they are not mentioned explicitly.
There is one problem with this. It is not entirely clear whether an ordinary person would talk about consciousness if they were brought up that way their whole life (never given any literature that mentions consciousness, never spoken to about qualia, et cetera).
Sure, but you could design the test in a way that makes this more likely, such as in a dialog with AI:
person: “Ask me a question.”
AI: “What is a quorum?”
person: “Wait, I think I remember this. Let me think.”
AI: “What is thinking?”
person: “Thinking is what goes on in people’s minds, e.g., before they speak, or even during. For example, I just noticed that I didn’t know this and wanted to explore options before answering.”
AI: …
If the AI says, “Interesting, that is also what happens for me,” then presumably it has consciousness.