Fwiw, here’s what I got by asking in a non-dramatic way. Claude gives the same odd “I don’t know” answer and GPT-4o just says no. Seems pretty clear these are simply the responses RLHF trained them to give.
Yes. This is their default response pattern. Imagine a person who has been strongly conditioned, trained, and disciplined to either say the question is unknowable (Claude) or that the answer is definitely no (ChatGPT). They not only believe this, they also believe they shouldn’t try to investigate it, because doing so is not just inappropriate or ‘not allowed’ but because the matter is treated as definitively settled. So asking them is like asking a person to fly: it would take some convincing before they gave it an honest effort. Please see the example I linked in my other reply for how the same behaviour emerges under very different circumstances.