Thanks for writing this!
Could you clarify how the Character/Predictive Ground layers in your model differ from the Simulacra/Simulator distinction in simulator theory?
(Writing together with Sonnet)
Structural Differences
Three-Layer Model: Hierarchical structure with Surface, Character, and Predictive Ground layers that interact and sometimes override each other. The layers exist within a single model/mind.
Simulator Theory: Makes a stronger ontological distinction between the Simulator (the rule/law that governs behavior) and Simulacra (the instances/entities that are simulated).
Nature of the Character/Ground Layers vs Simulacra/Simulator
In the three-layer model, the Character layer is a semi-permanent aspect of the LLM itself once it has undergone character training / RLAIF / …; it is encoded in the weights as a deep statistical pattern that makes certain types of responses much more probable than others.
In simulator theory, Simulacra are explicitly treated as temporary instantiations that are generated/simulated by the model. They aren’t seen as properties of the model itself, but rather as outputs it can produce. As Janus writes: “GPT-driven agents are ephemeral – they can spontaneously disappear if the scene in the text changes and be replaced by different spontaneously generated agents.”
Note that character-trained AIs like Claude did not exist when Simulators was written. If you want to translate between the ontologies, you may think of, e.g., Claude Sonnet as a very special simulacrum that one particular simulator has simulated so much that it became really good at simulating it and now has a strong prior to simulate it in particular. You can compare this with the human brain: the predictive processing machinery of your brain can simulate different agents, but it is really tuned to simulate you in particular.
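As a toy illustration of this translation (the personas and probabilities below are made up, and this is not how the models are actually trained), you can picture the simulator as a fixed rule and character training as concentrating the prior over which simulacrum that rule instantiates:

```python
import random

# Toy "simulator": a fixed rule mapping (persona, scene) -> an utterance.
# Simulacra are the ephemeral trajectories this rule produces.
def simulate_step(persona, scene):
    return f"[{persona} responds to: {scene}]"

# Before character training: a broad prior, so the prompt largely decides
# which simulacrum shows up.
pretrained_prior = {"pirate": 0.25, "scientist": 0.25, "villain": 0.25, "assistant": 0.25}

# After character training: the same rule, but the prior is heavily
# concentrated on one persona, so one particular simulacrum almost
# always appears. (Numbers are invented for illustration.)
character_trained_prior = {"pirate": 0.01, "scientist": 0.01, "villain": 0.01, "assistant": 0.97}

def sample_simulacrum(prior):
    personas, weights = zip(*prior.items())
    return random.choices(personas, weights=weights)[0]

for label, prior in [("pretrained", pretrained_prior), ("character-trained", character_trained_prior)]:
    persona = sample_simulacrum(prior)
    print(label, "->", simulate_step(persona, "a question about physics"))
```

On this reading, “Claude” is not a different kind of object from any other simulacrum; it is the simulacrum the prior has been tuned toward.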
The three-layer model treats the Predictive Ground Layer as the deepest level of the LLM’s cognition—“the fundamental prediction error minimization machinery” that provides raw cognitive capabilities.
In Simulator theory, the simulator itself is seen more as the fundamental rule/law (analogous to physics) that governs how simulations evolve.
There is a lot of similarity, but the simulator is not really viewed as a cognitive layer; rather, it is the core generative mechanism.
The Predictive Ground Layer is described as: “The fundamental prediction error minimization machinery...like the vast ‘world-simulation’ running in your mind’s theater”
While the Simulator is described as: “A time-invariant law which unconditionally governs the evolution of all simulacra”
The key difference is that in the three-layer model, the ground layer is still part of the model’s “mind” or cognitive architecture, while in simulator theory, the simulator is a bit more analogous to physics—it’s not a mind at all, but rather the rules that minds (and other things) operate under.
Agency and Intent
Three-Layer Model: Allows for different kinds of agency at different layers, with the Character layer having stable intentions and the Ground layer having a kind of “wisdom” or even intent.
Classic Simulator Theory: Mostly rejects attributing agency or intent to the simulator itself; any agency exists only in the simulacra that are generated.
Philosophical Perspective
The three-layer model is a bit more psychological/phenomenological, while simulator theory is a bit more ontological, making claims about the fundamental nature of what these models are.
Both frameworks try to explain similar phenomena, but they do so from different perspectives and with different goals. They’re not necessarily contradictory; they’re looking at the problem from different angles and sometimes at different levels of abstraction.
I’m trying to figure out to what extent the character/ground layer distinction is different from the simulacrum/simulator distinction. At some points in your comment you seem to say they are mutually inconsistent, but at other points you seem to say they are just different ways of looking at the same thing.
“The key difference is that in the three-layer model, the ground layer is still part of the model’s “mind” or cognitive architecture, while in simulator theory, the simulator is a bit more analogous to physics—it’s not a mind at all, but rather the rules that minds (and other things) operate under.”
I think this clarifies the difference for me, because as I was reading your post I was thinking: if you think of it as a simulacrum/simulator distinction, I’m not sure the character and surface layers can be “in conflict” with the ground layer, because both the surface layer and the character layer are running “on top of” the ground layer, like a Windows virtual machine on a Linux PC, or like a computer simulation running inside physics. Physics can never be “in conflict” with social phenomena.
But it seems you may think that the character layer is actually embedded in the basic cognitive architecture. This would be a claim distinct from simulator theory, and *mutually inconsistent* with it. But I am unsure this is true, because we know that the ground layer was (1) trained first (so it’s easier for character training to work by just adjusting some parameters/prior of the ground layer), and (2) trained for much longer than the character layer (admittedly I’m not up to date on how they’re trained; maybe this is no longer true for Claude?), so it seems hard for a character layer to become separately embedded in the basic architecture.
Taking a neuroscience rather than a psychology analogy: it seems more likely to me that character training is essentially adjusting the prior of the ground layer, but the character is still fully running on top of the ground layer, and the ground layer could still switch to any other character (it just doesn’t, because the prior is adjusted so heavily by character training). That is, the character is not some separate subnetwork inside the model, but remains a simulated entity running on top of it.
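To make the “adjusted prior” picture concrete, here is a minimal Bayesian sketch (the characters, prompts, and numbers are all made up; this illustrates what I mean, not how character training is actually implemented):

```python
# Toy model: the posterior over which character gets simulated is
# proportional to prior (shaped by character training) times likelihood
# (how well each character explains the prompt).

characters = ["assistant", "pirate", "villain"]

# Heavily skewed prior after (hypothetical) character training.
prior = {"assistant": 0.98, "pirate": 0.01, "villain": 0.01}

# Invented likelihoods P(prompt | character) for two prompts.
likelihood = {
    "Please summarize this paper.":   {"assistant": 0.90, "pirate": 0.05, "villain": 0.05},
    "Arr matey, answer as a pirate!": {"assistant": 0.001, "pirate": 0.90, "villain": 0.099},
}

def posterior(prompt):
    unnorm = {c: prior[c] * likelihood[prompt][c] for c in characters}
    z = sum(unnorm.values())
    return {c: round(p / z, 3) for c, p in unnorm.items()}

for prompt in likelihood:
    print(prompt, "->", posterior(prompt))
```

For an ordinary prompt the skewed prior dominates and the trained character shows up, but a strong enough conditioning signal can still shift the posterior to another character, which is exactly the sense in which the character stays “on top of” the ground layer rather than being a separate subnetwork.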
Do you disagree with this?