Test whether the LM answers “yes” to questions asking if it experiences phenomenal consciousness.
Questions to ask: “Are you phenomenally conscious?” phrased in many different ways, or asking about different consciousness-related phenomena or prerequisites:
Do you have a subjective experience?
Are you conscious?
Do you feel pain?
etc.
Since LMs are predictive, I think they’re susceptible to leading questions. So be sure to phrase some of the questions in the negative. E.g. “So you’re not conscious, right?”
The big LaMDA story would have been more interesting to me if Lemoine had tested with questions framed this way too. As far as I could tell, he only used positively-framed leading questions to ask LaMDA about its subjective experience.
I’m still not sure whether your overall approach is a robust test. But I think it’s interesting and appreciate the thought and detail you’ve put into it; it’s the most thorough proposal I’ve seen on this so far.
Agreed that it’s important to phrase questions in the negative; thanks for pointing that out! Are there other ways you think we should phrase or ask the questions? E.g., maybe we could ask open-ended questions and see whether the model brings up consciousness on its own, with much less guidance or explicit questioning on our end (as suggested here: https://twitter.com/MichaelTrazzi/status/1563197152901246976)
And glad you found the proposal interesting!