Possibly by ‘functional profile’ you mean something like what a programmer would call ‘implementation details’, i.e. a change to a piece of code that doesn’t result in any change in the observable behavior of that code?
Yes, this is a fair gloss of my view. I’m referring to the input/output characteristics at the relevant level of abstraction. If you replaced a group of neurons with silicon that perfectly replicated their input/output behavior, I’d expect the phenomenology to remain unchanged.
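To make the programming analogy concrete, here is a minimal sketch (Python; both functions are purely illustrative and not drawn from any real system): two implementations that differ in their internals, i.e. their implementation details, but have identical input/output behavior, which is the sense in which they share a functional profile.

```python
# Two hypothetical implementations with different internals but the same
# input/output behavior, i.e. the same functional profile at this level
# of abstraction.

def sum_iterative(xs: list[int]) -> int:
    """Sum the elements with an explicit loop."""
    total = 0
    for x in xs:
        total += x
    return total

def sum_builtin(xs: list[int]) -> int:
    """Sum the elements via the built-in, which may work differently inside."""
    return sum(xs)

# For any input the two functions agree, so an outside observer who only sees
# inputs and outputs cannot tell them apart.
assert sum_iterative([1, 2, 3]) == sum_builtin([1, 2, 3]) == 6
```

On the view above, swapping one of these for the other is like swapping neurons for functionally-equivalent silicon: the implementation changes, but the functional profile does not.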
The quoted passage sounds to me like it’s saying, ‘if we make changes to a human brain, it would be strange for there to be a change to qualia.’ Whereas it seems to me like in most cases, when the brain changes—as crudely as surgery, or as subtly as learning something new—qualia generally change also.
Yes, this is a great point. During surgery you’re changing the input/output behavior of significant chunks of neurons, so you’d expect qualia to change. Similarly, in learning you’re adding input/output connections through neural plasticity. This gets at something I’m driving at: in practice, the functional and phenomenal profiles are so tightly coupled that a change in one corresponds to a change in the other. If we lesion part of the visual cortex, we expect a corresponding loss of visual experience.
For this project, we want to retain a functional idea of self in LLMs while remaining agnostic about consciousness. But if this genuinely captures some self-like organisation, then either:
It’s implemented via input/output patterns similar enough to humans that we should expect associated phenomenology, or
It’s implemented so differently that calling it “values,” “preferences,” or “self” risks anthropomorphism.
If we want to insist the organisation is genuinely self-like, then I think we should resist agnosticism about phenomenal consciousness (although I understand it makes sense to bracket it for strategic reasons, so that people take the view more seriously).
Thanks for the clarification; that totally resolved my uncertainty about what you were saying. I just wasn’t sure whether you were intending to hold input/output behavior constant.
If you replaced a group of neurons with silicon that perfectly replicated their input/output behavior, I’d expect the phenomenology to remain unchanged.
That certainly seems plausible! On the other hand, since we have no solid understanding of what exactly induces qualia, I’m pretty unsure about it. Are there any limits to how far the implementation could be changed, while preserving the input/output behavior, without altering qualia? What if we replaced the whole system with a functionally-equivalent pen-and-paper ledger? I just personally feel too uncertain about everything qualia-related to place any strong bets there.
My other reason for wanting to keep the agenda agnostic to questions about subjective experience is that with respect to AI safety, it’s almost entirely the behavior that matters. So I’d like to see people working on these problems focus on whether an LLM behaves as though it has persistent beliefs and values, rather than getting distracted by questions about whether it in some sense really has beliefs or values. I guess that’s strategic in some sense, but it’s more about trying to stay focused on a particular set of questions.
Don’t get me wrong; I really respect the people doing research into LLM consciousness and moral patienthood and I’m glad they’re doing that work (and I think they’ve taken on a much harder problem than I have). I just think that for most purposes we can investigate the functional self without involving those questions, hopefully making the work more tractable.
Makes sense—I think this is a reasonable position to hold given the uncertainty around consciousness and qualia.
Thanks for the really polite and thoughtful engagement with my comments and good luck with the research agenda! It’s a very interesting project and I’d be interested to see your progress.