This is great work! If I’d seen it prior to writing "Human-like metacognitive skills will reduce LLM slop and aid alignment and capabilities," I’d have referenced it heavily. There I talk about some other work and approaches toward making LLMs more truth-seeking, and try to establish the link to AI-assisted alignment research and general sanity.