A Scanner Brightly comments on Emergent Introspective Awareness in Large Language Models

A Scanner Brightly 31 Oct 2025 12:11 UTC
2 points
0
This paper makes me really curious:
1. Could models trained to introspect “daydream”, or boost features of their choosing, à la “don’t think about aquariums”?
2. As a continuation of #1: could models be given a tool to boost their own features to assist with specific tasks (for example, features related to reasoning or hard problem solving), and would this lead to meaningful or useful results?
3. Perhaps less fanciful than the above: could smaller models be given permanent feature boosts to help focus them on narrow tasks?
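To make "boosting a feature" concrete, here is a minimal sketch of the usual activation-steering idea: add a scaled copy of a feature's direction to a hidden activation. Everything in it is an assumption for illustration, not the paper's method: the function names, the 3-dimensional toy vectors, and the "aquarium" direction are all hypothetical, and a real setup would apply this inside a model's forward pass.

```python
import math


def _norm(vec):
    """Euclidean length of a vector given as a list of floats."""
    return math.sqrt(sum(x * x for x in vec))


def feature_strength(activation, feature_direction):
    """Projection of the activation onto the unit-length feature direction."""
    n = _norm(feature_direction)
    if n == 0:
        raise ValueError("feature_direction must be non-zero")
    return sum(a * f for a, f in zip(activation, feature_direction)) / n


def boost_feature(activation, feature_direction, strength):
    """Return activation + strength * unit(feature_direction)."""
    if len(activation) != len(feature_direction):
        raise ValueError("activation and feature_direction must have equal length")
    n = _norm(feature_direction)
    if n == 0:
        raise ValueError("feature_direction must be non-zero")
    return [a + strength * f / n for a, f in zip(activation, feature_direction)]


# Toy values only: a hypothetical hidden state and a hypothetical feature direction.
hidden = [0.5, -1.0, 2.0]
aquarium = [3.0, 0.0, 4.0]

steered = boost_feature(hidden, aquarium, strength=2.0)
before = feature_strength(hidden, aquarium)
after = feature_strength(steered, aquarium)
print(f"feature strength before: {before:.2f}, after: {after:.2f}")
```

In this toy framing, question 2 amounts to letting the model pick `feature_direction` and `strength` itself, and question 3 amounts to fixing them once and applying the same boost on every forward pass.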
- Expertium 1 Nov 2025 8:36 UTC
  2 points
  0
  You might be interested in https://www.lesswrong.com/posts/dvbRv97GpRg5gXKrf/run-time-steering-can-surpass-post-training-reasoning-task