testingthewaters comments on AI CoT Reasoning Is Often Unfaithful

testingthewaters 5 Apr 2025 10:27 UTC
1 point
I mean, this applies to humans too. The words and explanations we give for our actions are often just post hoc rationalisations. An efficient text predictor must learn not what the literal words in front of it mean, but the implied scenario and thought process they mask, and that is an inherently nonlinear and “unfaithful” process.