Felix J Binder comments on How Self-Aware Are LLMs?

Felix J Binder 23 Jul 2025 3:26 UTC
This paper (https://arxiv.org/abs/2501.11120) directly investigates this ability. It finds that, in a number of different domains, models can explain the policy they have been trained to follow, even when that training consisted only of examples of the policy and not descriptions of it.