I think I see the logic. Were you thinking of making the model good at answering questions whose correct answer depends on the model itself, like “When asked a question of the form x, what proportion of the time would you tend to answer y?”
The previous remark about being a microscope into its dataset seemed benign to me, e.g., if the model were already good at answering questions like “What proportion of datapoints satisfying predicate X satisfy predicate Y?”
But perhaps you also argue that the latter induces some small amount of self-awareness → situational awareness?
Were you thinking of making the model good at answering questions whose correct answer depends on the model itself, like “When asked a question of the form x, what proportion of the time would you tend to answer y?”
I’m not an author of this post, so I don’t know.
I think one of the biggest dangers of this kind of self-awareness is that it lets models know their level of accuracy in particular areas. Right now, they may be overconfident or underconfident in their abilities, which makes their plans less effective when actually implemented: if a model is overconfident, a plan that relies on the overestimated ability will simply fail; if it is underconfident, it is not using all of its capabilities.
Why?
By giving it more information about itself, which is self-awareness, and a big part of situational awareness.