Quasi-beliefs
This shortpost is just a reference post for the following point:
It’s very easy for conversations about LLM beliefs or goals or values to get derailed by questions about whether an LLM can genuinely be said to believe something, or to have a goal, or to hold a value. These are valid questions! But there are other important questions about LLMs that touch on these subjects, which don’t turn on whether an LLM belief is a “real” belief. It’s not productive for those discussions to be so frequently derailed.
I’ve taken various approaches to this problem in my writing, but David Chalmers, in his recent paper ‘What We Talk to When We Talk to Language Models’ (pp 3–6), introduces a useful piece of terminology. He proposes that we use terms like ‘quasi-belief’ to set those questions aside, to denote that the point we’re making doesn’t rely on LLM beliefs being ‘real’ beliefs in some deep sense:
The view I call quasi-interpretivism says that a system has a quasi-belief that p if it is behaviorally interpretable as believing that p (according to an appropriate interpretation scheme), and likewise for quasi-desire. This definition of quasi-belief is exactly the same as interpretivism’s definition of belief. The only difference is that where standard interpretivism offers these definitions as a theory of belief, quasi-interpretivism does not. It offers them simply as a stipulative theory of quasi-belief. Quasi-interpretivism does not say anything about whether LLMs have beliefs and desires. But it does make it plausible to say that LLMs have quasi-beliefs and quasi-desires, on the grounds that LLMs are at least interpretable in the right way. Even if quasi-beliefs and quasi-desires fall short of being genuine beliefs and desires, they can still play some of the key roles of beliefs and desires in explaining behavior. For example, if an LLM quasi-believes that giving a certain solution would be the most helpful thing it could do to solve a problem, and it quasi-desires to do the most helpful thing it can, then other things being equal, it will [give that solution].
[emphasis mine]
I expect to largely adopt this terminology going forward, linking back to this shortpost as needed. For convenience, I expect to extend it slightly to include terms like ‘quasi-goal’ and ‘quasi-value’. Also, unless otherwise specified, if I use a term like ‘quasi-belief’, later occurrences of ‘belief’ in the same text should be read as ‘quasi-belief’.
You might like my quick take from a week ago https://www.lesswrong.com/posts/ydfHKHHZ7nNLi2ykY/jan-betley-s-shortform?commentId=fEh8jnfTrfkQFf3mD
Ah, yep, totally! I actually searched to see if anyone else had ~written this, but I think maybe shortposts don’t show up as search results.
There’s also @eleni-angelou’s The Intentional Stance, LLMs Edition from April 2024; like you, she points to the connection to Dennett.
IIRC Eric Schwitzgebel wrote something in a similar vein (not necessarily about LLMs, though he has been interested in this sort of stuff too, recently). I’m unable to dig out the most relevant reference atm but some related ones are:
https://faculty.ucr.edu/~eschwitz/SchwitzAbs/PragBel.htm
https://eschwitz.substack.com/p/the-fundamental-argument-for
https://faculty.ucr.edu/~eschwitz/SchwitzAbs/Snails.htm (relevant not because it talks about beliefs (I don’t recall it does) but because it argues for the possibility of an organism being “kinda-X” where X is a property that we tend to think is binary)
Also: https://en.wikipedia.org/wiki/Alief_(mental_state)
I’d guess this terminology is fairly applicable to humans too?
I’m having trouble seeing why someone would want to apply it to humans, since it’s generally not in question that humans can have real beliefs and real desires. But I guess if there were uncertainty about whether some particular person has real beliefs, we could set that uncertainty aside by talking about their quasi-beliefs[1].
In the interest of having a somewhat forced concrete example: maybe we’ve started to suspect that our friend Dan is a p-zombie, and we often debate that, but right now we just want to talk about whether he’s figured out that we’re planning a surprise party for him. So we set aside the p-zombie issue by talking about whether Dan quasi-believes our story that we only bought confetti in case there was a confetti shortage coming up.
I would guess the type signature of human beliefs and goals and desires is at least fairly often closer to the LLM quasi-x than to the crisp mathematical idealizations of those concepts.
Humans are kinda a world model with a self-character; I think distancing LLMs from this by implying that LLM beliefs, goals, and desires are super different brings people’s beliefs further from tracking reality.
I think that in ordinary usage, whatever sort of things humans have, that’s what we mean when we say ‘belief’, ‘goal’, etc. Insofar as anyone thinks those are crisp mathematical abstractions, that seems like a separate and additional claim. I worry that saying ‘humans don’t actually have beliefs’ makes it pretty unclear what ‘belief’ even means[1].
As James points out in another comment, the ‘quasi-’ framing is solely intended to set aside questions about whether LLM beliefs (etc) are ‘real’ beliefs and whether they’re fundamentally the same as human beliefs, not to take a stance that they’re not. Chalmers: ‘Quasi-interpretivism does not say anything about whether LLMs have beliefs and desires’. There are a lot of interesting and safety-relevant discussions to be had about what LLMs believe in a practical sense (eg ‘Does this model believe that Paris is in France or Germany?’), and I see this terminology as basically just a way to prevent such discussions from being counterproductively derailed by questions about whether a model can actually believe anything at all.
Maybe it’s suggesting a highly deflationary stance, in the same way that illusionists think humans aren’t actually conscious? But consciousness is a highly abstract and contested topic, whereas there’s a pretty ordinary and uncontested sense in which humans believe things, have desires, etc.
Seems worthwhile as a way to simplify conversations with people who seem to be confused, but I think this isn’t a reality-mapping exercise and probably makes it harder to see the structure of reality, which is kinda sad even if it’s useful for talking with some people?
I agree that the terminology is useful to bracket metaphysical discussion of LLM mental states, but I’d just caution us as a community to use the term ‘quasi-belief’ really carefully. Specifically, I could see it being employed to import heavyweight metaphysical assumptions that aren’t justified, or are only lightly argued for.
Concretely, there are two potential ways to use it:
1) I don’t know if LLMs have genuine beliefs, and it’s not load-bearing for my argument, so let me bracket the conversation by using the term ‘quasi-belief.’
2) LLMs don’t have genuine beliefs; instead, they have ‘quasi-beliefs’.
I think 1) is totally fine and is the intended usage. 2) is only fine if it’s backed up with some solid argument.
To be sure, your post and the Chalmers paper use it correctly as 1), but I could see its meaning slipping to 2) as it gets more widely deployed.
I agree entirely that ‘quasi-belief’ is solely a way of setting aside those questions and shouldn’t be taken as a claim about the answers, much less as a load-bearing argument in its own right.
I have also seen conversations get derailed based on such disagreements.
I expect to largely adopt this terminology going forward
May I ask to which audience(s) you think this terminology will be helpful? And what particular phrasing(s) do you plan on trying out?
The quote above from Chalmers is dense and rather esoteric, so I would hesitate to use its particular terminology with most people (the ones likely to get derailed as discussed above). Instead, I would seek out simpler language. As a first draft, perhaps I would say:
Let’s put aside whether LLMs think on the inside. Let’s focus on what we observe—are these observations consistent with the word “thinking”?
Good point that the Chalmers quote isn’t going to be helpful to everyone. In practice, I’m mostly imagining giving a quick informal sense of what I mean by eg ‘quasi-thinking’, or even just having a parenthetical aside with a link back to this post if people want to dive deeper. For example, I might write something like:
It seems clear that LLMs believe (or quasi-believe) most of the facts presented in synthetic document fine-tuning.
I think you’re right in pointing to observable consequences in your paraphrase. In informal discussion, I’ve found it useful to say things like
When I say ‘the model has goal X’, I don’t mean to make a claim about whether the model ‘really’ has goals in some deep sense; I just mean that for practical purposes the model consistently behaves as if it has goal X.
I’ve edited the original post slightly to give a plainer meaning before the Chalmers quote.