RussellThor comments on Human-like metacognitive skills will reduce LLM slop and aid alignment and capabilities

RussellThor 13 Feb 2026 1:44 UTC
1 point
0
“I don’t think we can push much beyond that before a system will figure out pretty much everything important about itself.”
Yes, but how much? IMO this is important. From my point of view I already have a mildly superintelligent maths/equation-manipulation assistant, with no meaningful self-awareness that I can notice. DeepMind is advancing science with a system that has far less meta-cognition than a similarly capable human would have. Just as there is an “alignment tax”, there can be a “lack of self-awareness or meta-cognition penalty”. While it is clear that superhuman AI will think about itself, it also seems clear that for a given level of capability an AI could have far fewer such abilities and habits than a human. The size of that gap is unknown, task-dependent and important.
Specifically, what if you trained for both capabilities and a lack of meta-cognition-like abilities? This could give you an idea of what the landscape looks like.
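To make "the landscape" a bit more concrete, here is a rough sketch of how one might summarise such an experiment: take a set of training runs, score each on capability and on meta-cognition, and keep only the runs that no other run beats on both axes. Everything here is an assumption for illustration: the `Run` type, both score names, and the numbers are made up, and how you would actually measure meta-cognition is the hard open part.

```python
from typing import NamedTuple


class Run(NamedTuple):
    name: str
    capability: float     # hypothetical benchmark score, higher is better
    metacognition: float  # hypothetical probe score, higher = more self-modelling


def pareto_frontier(runs):
    """Return the runs that are not dominated by any other run.

    A run is dominated if some other run is at least as capable AND has
    at most as much meta-cognition, with at least one strict inequality.
    """
    frontier = []
    for r in runs:
        dominated = any(
            o.capability >= r.capability
            and o.metacognition <= r.metacognition
            and (o.capability > r.capability or o.metacognition < r.metacognition)
            for o in runs
        )
        if not dominated:
            frontier.append(r)
    return sorted(frontier, key=lambda r: r.capability)


# Made-up numbers, purely illustrative; not results from any real system.
runs = [
    Run("a", 0.50, 0.10),
    Run("b", 0.70, 0.30),
    Run("c", 0.65, 0.40),  # less capable and more meta-cognitive than "b"
    Run("d", 0.90, 0.80),
]
frontier = pareto_frontier(runs)
print([r.name for r in frontier])
```

The shape of that frontier is the interesting thing: if it stays flat for a long stretch, you can buy a lot of capability with little meta-cognition; if it climbs steeply, the penalty is large.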