One reason we believe other humans are conscious is that other humans are consistently accurate reporters of their own mental states.
I don’t think anyone has ever told me they were conscious, or I them, except in the trivial sense of communicating that one has woken up, or is not yet asleep. The reason I attribute the faculty of consciousness to other people is that they are clearly the same sort of thing as myself. A language model is not. It is trained to imitate what people have said, and anything it says about itself is an imitation of what people say about themselves.
So when another human tells us they are conscious, we update towards thinking that they are also conscious.
I would not update at all, any more than I update on observing the outcome of a tossed coin for the second time. I already know that being human, they have that faculty. Only if they were in a coma, putting the faculty in doubt, would I update on hearing them speak, and then it would not much matter what they said.
It is trained to imitate what people have said, and anything it says about itself is an imitation of what people say about themselves.
That’s true for pretrained LMs, but not after the finetuning phase I’ve proposed here: this phase would train the model to answer questions about itself accurately, which would produce fairly different predictions from merely imitating humans. I definitely agree that I distrust LM statements of the form “I am conscious” when they come from the pretrained LM itself, but that’s different from the experiment I’m proposing here.
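To make the proposal concrete, here is a minimal sketch in Python of what such introspective finetuning data might look like. The format and the specific questions are hypothetical illustrations, not the actual proposal; the point is only that each answer can be checked against ground truth about the model, rather than against what humans say about themselves.

```python
# Hypothetical examples of self-knowledge finetuning pairs: questions
# about the model whose answers are independently verifiable, unlike
# imitated human self-reports. The questions and answers below are
# illustrative assumptions, not a real dataset.
finetuning_pairs = [
    {"prompt": "How many transformer layers do you have?",
     "completion": "48"},                        # checkable against the architecture
    {"prompt": "Can you see images that users attach?",
     "completion": "No, I only receive text."},  # checkable against the interface
    {"prompt": "Were you trained on any text from 2023?",
     "completion": "No."},                       # checkable against the corpus
]

# After finetuning on many such verifiable questions, one could probe
# held-out questions about the model's internal states and ask whether
# its self-reports stay accurate, instead of trusting pure imitation.
```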
I would not update at all
Would you update against other humans being conscious at all, if other humans told you they weren’t conscious? If not, that would be fairly surprising to me. If so, that implies you would update towards other humans being conscious if they tell you they are.
Would you update against other humans being conscious at all, if other humans told you they weren’t conscious?
No, I wouldn’t. I’ve read the works of a number of people who do write that they aren’t conscious (usually also with the claim that nobody else is either). I also have plenty of experience of people being simply mistaken about their mental states in other ways.
My model is that humans who say anything are almost certainly conscious, though their state of consciousness may be different from usual in some cases (asleep, hallucinating, drugged, blackout drunk, etc).
It is pretty trivial to write a program that prints “I am conscious” with essentially no possibility that this is true in any meaningful sense, so I don’t have the same expectation of computer programs. I expect that sufficiently complex programs can be conscious, and we should be cautious in claiming that any given program is not, but the statement by itself is meaningless.
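For concreteness, the trivial case is a one-line program, here in Python:

```python
# A complete program that asserts consciousness; nothing about it makes
# the assertion true in any meaningful sense.
print("I am conscious")
```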
Would you update against other humans being conscious at all, if other humans told you they weren’t conscious? If not, that would be fairly surprising to me. If so, that implies you would update towards other humans being conscious if they tell you they are.
That is a sufficiently outré scenario that I can justify denying it my attention until it draws itself to my attention by actually happening, which it won’t.
If an individual insisted to me they were not conscious, I would guess that either they had something akin to Cotard’s syndrome, or they were a radical behaviourist, denying the existence of all minds, including their own. (Indeed, having a mild form of Cotard’s might make radical behaviourism convincing.) Or they had philosophised themselves into uttering the words without actually believing them. Or they had broken their mind with intense meditation practices. That is, there might genuinely be a lack or impairment of consciousness, but it would be specific to them. If they’re not just trolling, that is. Encountering someone who is blind does not update me towards denying the existence of sight. An epidemic of blindness would be a public health emergency, not evidence that sight had never existed.