I think what this is showing is that Chalmers’s definition of “dispositional attitudes” has a problem: it lacks any notion of the amount and kind of computational labour required to turn ‘dispositional’ attitudes into ‘occurrent’ ones. That’s why he ends up with AI systems having an uncountably infinite number of dispositional attitudes.
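To make that concrete, here’s a toy sketch (my own illustration, not anything from Chalmers’s paper): an agent that explicitly stores a handful of beliefs and can derive unboundedly many more on demand, where each derived belief costs inferential work that the “dispositional attitude” framing simply doesn’t track.

```python
# Toy illustration (hypothetical, not from the paper): explicitly stored
# beliefs vs. beliefs an agent would assent to only after doing inference.

class ToyAgent:
    def __init__(self, stored_beliefs):
        # Occurrent-style beliefs: finitely many, explicitly represented.
        self.stored = set(stored_beliefs)

    def occurrently_believes(self, p):
        # Cheap lookup: the belief is already represented.
        return p in self.stored

    def dispositionally_believes(self, p):
        # Chalmers-style dispositional belief: would the agent assent to p
        # if asked? Here that requires inferential work, e.g. recognising
        # that "q or r" follows from a stored q by disjunction introduction.
        if self.occurrently_believes(p):
            return True
        if " or " in p:
            left, _, right = p.partition(" or ")
            return self.occurrently_believes(left) or self.occurrently_believes(right)
        return False


agent = ToyAgent({"Paris is in France"})
print(agent.occurrently_believes("Paris is in France"))                      # True, no work needed
print(agent.dispositionally_believes("Paris is in France or pigs can fly"))  # True, but only after inference
# There is one such "p or q" for every conceivable q, so counting all of them
# as beliefs, with no accounting for the inferential work, is exactly how you
# end up attributing an unbounded set of dispositional attitudes to the agent.
```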
This pretty much matches my sense so far, although I haven’t had time to finish reading the whole thing. I wonder whether this is due to the fact that he’s used to thinking about human brains, where we’re (AFAIK) nowhere near being able to identify the representation of specific concepts, and so we might as well use the most philosophically convenient description.
Clearly ANNs are able to represent propositional content, but I haven’t seen anything that makes me think that’s the natural unit of analysis.
I could imagine his lens potentially being useful for some sorts of analysis built on top of work from mech interp, but not as a core part of mech interp itself (unless it turns out that propositions and propositional attitudes really are the natural decomposition for ANNs, I suppose, but even then that would seem like a happy coincidence rather than something Chalmers has identified in advance).
Clearly ANNs are able to represent propositional content, but I haven’t seen anything that makes me think that’s the natural unit of analysis.
Well, we (humans) categorize our epistemic state largely in propositional terms, e.g. in beliefs and suppositions. We even routinely communicate by uttering “statements”—which express propositions. So propositions are natural to us, which is why they are important for ANN interpretability.
Well, we (humans) categorize our epistemic state largely in propositional terms, e.g. in beliefs and suppositions.
I’m not too confident of this. It seems to me that a lot of human cognition isn’t particularly propositional, even if nearly all of it could in principle be translated into that language. For example, I think a lot of cognition is sensory awareness, or imagery, or internal dialogue. We could contort most of that into propositions and propositional attitudes (eg ‘I am experiencing a sensation of pain in my big toe’, ‘I am imagining a picnic table’), but that doesn’t particularly seem like the natural lens to view those through.
That said, I do agree that propositions and propositional attitudes would be a more useful language to interpret LLMs through than eg activation vectors of float values.
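For what it’s worth, the standard way of bridging those two levels of description is something like a linear probe: ask whether a particular proposition is linearly readable from the activation vectors, without claiming propositions are the network’s native format. A minimal sketch, assuming you already have activations and labels (the arrays below are random stand-ins, so the score is meaningless, it just shows the shape of the workflow):

```python
# Minimal probing sketch (assumed setup: a [n_examples, d_model] array of
# residual-stream activations and a 0/1 label for whether each input asserts
# some proposition p, e.g. "the Eiffel Tower is in Paris").
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
d_model, n = 64, 200

# Stand-in data; in practice these would come from a real model's activations.
acts = rng.normal(size=(n, d_model))
labels = rng.integers(0, 2, size=n)

# Fit a linear probe and check how well the proposition can be read off.
probe = LogisticRegression(max_iter=1000).fit(acts, labels)
print("probe accuracy on its own training data:", probe.score(acts, labels))
```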
I wonder whether this is due to the fact that he’s used to thinking about human brains, where we’re (AFAIK) nowhere near being able to identify the representation of specific concepts, and so we might as well use the most philosophically convenient description.
I don’t think this description is philosophically convenient. Believing p and believing things that imply p are genuinely different states of affairs in a sensible theory of mind. Thinking through concrete mech interp examples of the former vs. the latter makes the sense in which they differ less abstract, but I think I would have objected to Chalmers’s definition even back before we knew anything about mech interp. It would just have been harder for me to articulate what exactly is wrong with it.
Something that Chalmers finds convenient, anyhow. I’m not sure how else we could view ‘dispositional beliefs’ if not as a philosophical construct; surely Chalmers doesn’t imagine that ANNs or human brains actively represent ‘p-or-q’ for all possible q.
To be fair here, from an omniscient perspective, believing p and believing things that imply p are genuinely the same thing in terms of results, but from a non-omniscient perspective the difference matters.