Chris_Leong comments on Transformers Represent Belief State Geometry in their Residual Stream

Chris_Leong 17 Apr 2024 0:55 UTC
31 points
14
“The structure of synchronization is, in general, richer than the world model itself. In this sense, LLMs learn more than a world model” given that I expect this is the statement that will catch a lot of people’s attention.
Just in case this claim caught anyone else’s attention, what they mean by this is that it contains:
• A model of the world
• A model of the agent’s process for updating its belief about which state the world is in
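To make the two bullets concrete, here is a minimal sketch. It assumes the world is a toy 2-state hidden Markov model, which is my own illustrative choice and not something taken from the post. The matrices `TRANSITION` and `EMISSION` and both function names are hypothetical. The matrices play the role of the world model, while `update_belief` plays the role of the belief-updating process that an observer needs on top of it.

```python
# Hypothetical toy world model: a 2-state hidden Markov model.
# TRANSITION[s][s2] = P(next state = s2 | current state = s)
TRANSITION = [[0.9, 0.1],
              [0.2, 0.8]]
# EMISSION[s][t] = P(token = t | current state = s)
EMISSION = [[0.8, 0.2],
            [0.3, 0.7]]


def predict_next_token(belief):
    """Next-token distribution implied by a belief over hidden states."""
    n_states = len(belief)
    n_tokens = len(EMISSION[0])
    return [sum(belief[s] * EMISSION[s][t] for s in range(n_states))
            for t in range(n_tokens)]


def update_belief(belief, token):
    """Bayes update on the observed token, then one step of world dynamics."""
    n_states = len(belief)
    # Reweight each state by how well it explains the observed token.
    weighted = [belief[s] * EMISSION[s][token] for s in range(n_states)]
    total = sum(weighted)
    posterior = [w / total for w in weighted]
    # Push the posterior through the transition matrix to get the
    # belief about the state that will emit the next token.
    return [sum(posterior[s] * TRANSITION[s][s2] for s in range(n_states))
            for s2 in range(n_states)]


if __name__ == "__main__":
    belief = [0.5, 0.5]  # start maximally uncertain about the hidden state
    for token in [0, 0, 1]:
        print(predict_next_token(belief), "observed", token)
        belief = update_belief(belief, token)
    print("final belief:", belief)
```

Under this reading, the world model is just the two matrices, whereas the set of beliefs reachable via `update_belief` is a separate (and typically larger) structure that the observer also has to track.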
- snewman 1 May 2024 21:54 UTC
  3 points
  0
  Parent
  I am trying to wrap my head around the high-level implications of this statement. I can come up with two interpretations:
  1. What LLMs are doing is similar to what people do as they go about their day. When I walk down the street, I am simultaneously using visual and other input to assess the state of the world around me (“that looks like a car”), running a world model based on that assessment (“the car is coming this way”), and then using some other internal mechanism to decide what to do (“I’d better move to the sidewalk”).
  2. What LLMs are doing is harder than what people do. When I converse with someone, I have some internal state, and I run some process in my head – based on that state – to generate my side of the conversation. When an LLM converses with someone, instead of maintaining internal state, it needs to maintain a probability distribution over possible states, make next-token predictions according to that distribution, and simultaneously update the distribution.
  (2) seems more technically correct, but my intuition dislikes the conclusion, for reasons I am struggling to articulate. …aha, I think this may be what is bothering me: I have glossed over the distinction between input and output tokens. When an LLM is processing input tokens, it is working to synchronize its state to the state of the generator. Once it switches to output mode, there is no functional benefit to continuing to synchronize state (what is it synchronizing to?), so ideally we’d move to a simpler neural net that does not carry the weight of needing to maintain and update a probability distribution of possible states. (Glossing over the fact that LLMs as used in practice sometimes need to repeatedly transition between input and output modes.) LLMs need the capability to ease themselves into any conversation without knowing the complete history of the participant they are emulating, while people have (in principle) access to their own complete history and so don’t need to be able to jump into a random point in their life and synchronize state on the fly.
  So the implication is that the computational task faced by an LLM which can emulate Einstein is harder than the computational task of being Einstein… is that right? If so, that in turn leads to the question of whether there are alternative modalities for AI which have the advantages of LLMs (lots of high-quality training data) but don’t impose this extra burden. It also raises the question of how substantial this burden is in practice, in particular for leading-edge models.
  - AlbertGarde 13 May 2024 17:59 UTC
    2 points
    0
    Parent
    You are drawing a distinction between agents that maintain a probability distribution over possible states and those that don’t, and you’re putting humans in the latter category. It seems clear to me that all agents are always doing what you describe in (2), which I think clears up what you don’t like about it.
    It also seems like humans spend varying amounts of energy on updating probability distributions vs. predicting within a specific model, but I would guess that LLMs can learn to do the same on their own.
    - snewman 13 May 2024 18:05 UTC
      2 points
      0
      Parent
      As I go about my day, I need to maintain a probability distribution over states of the world. If an LLM tries to imitate me (i.e. repeatedly predict my next output token), it needs to maintain a probability distribution, not just over states of the world, but also over my internal state (i.e. the state of the agent whose outputs it is predicting). I don’t need to keep track of multiple states that I myself might be in, but the LLM does. Seems like that makes its task more difficult?
      Or to put an entirely different frame on the whole thing: the job of a traditional agent, such as you or me, is to make intelligent decisions. An LLM’s job is to make the exact same intelligent decision that a certain specific actor being imitated would make. Seems harder?
      - Brent 23 May 2024 21:05 UTC
        1 point
        0
        Parent
        I agree with you that the LLM’s job is harder, but I think that has a lot to do with the task being given to the human vs. LLM being different in kind. The internal states of a human (thoughts, memories, emotions, etc.) can be treated as inputs in the same way vision and sound are. A lot of the difficulty will come from the LLM being given less information, similar to how a human who is blindfolded will have a harder time performing a task where vision would inform what state they are in. I would expect that if an LLM were given direct access to the same memories, physical sensations, emotions, etc. of a human (making the task more equivalent), it could have a much easier time emulating them.
        Another analogy for what I’m trying to articulate: imagine a set of twins swapping jobs for the day. They would have a much harder time imitating each other than being themselves. Similarly, a human will have a harder time trying to make the same decisions an LLM would make than the LLM will have just being itself. The extra modelling of missing information will always make things harder. Going back to your Einstein example, this has the interesting implication that an LLM emulating Einstein may face a harder computational task than an LLM simply being a more intelligent agent than Einstein.
        - snewman 24 May 2024 0:19 UTC
          1 point
          0
          Parent
          I think we’re saying the same thing? “The LLM being given less information [about the internal state of the actor it is imitating]” and “the LLM needs to maintain a probability distribution over possible internal states of the actor it is imitating” seem pretty equivalent.