ajskateboarder comments on Latent Introspection (and other open-source introspection papers)

ajskateboarder 8 Apr 2026 16:12 UTC
1 point
(It’s also possible that this capability is being surfaced as a consequence of base-model training, but just isn’t ever useful for the base-model next token prediction training objective directly, so it gets buried even in base models.)
FWIW, there is work suggesting that this capability largely emerges from post-training/preference optimization, though that may depend on the training pipeline; it looks like only one model was studied.