I began by asking ChatGPT-4 to analyze our ongoing conversation and assess the novelty of the insights. ChatGPT-4 estimated that ideas like those in our dialogue might appear in fewer than 1 in 100,000 users' conversations, an indication of exceptional rarity compared with mainstream AI reasoning patterns.
Did you try asking multiple times in different context windows? Did you try asking via the API (i.e., without influence from the “memory” feature)? Do you have the “memory” feature turned on by default? If so, have you considered turning it off, at least when running experiments?
In summary: have you considered the fact that LLMs are very good at bullshitting? At confabulating answers they think you would be happy to hear, instead of making a best effort to answer truthfully?
Yes, I tried asking multiple times in different context windows, in different models, and with and without memory. And yes, I'm aware that ChatGPT prioritizes agreeableness in order to encourage user engagement. That's why I attempt to prove all of its claims wrong, even when they support my arguments.
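For reference, "asking via the API" takes only a few lines. Here is a minimal sketch, assuming the official openai Python SDK (v1+), an OPENAI_API_KEY environment variable, and a placeholder prompt; each request is stateless, so none of ChatGPT's memory or prior context can leak in:

```python
from openai import OpenAI

# Assumes the official openai Python SDK (v1+) and an OPENAI_API_KEY
# environment variable; the prompt below is a placeholder.
client = OpenAI()

prompt = "Assess how novel the following ideas are: ..."

# Each API request is stateless: the model sees only the messages sent
# in that request, so ChatGPT's "memory" feature never comes into play.
# Repeating the call gives rough independent trials of the same question.
replies = []
for _ in range(5):
    response = client.chat.completions.create(
        model="gpt-4o",
        messages=[{"role": "user", "content": prompt}],
    )
    replies.append(response.choices[0].message.content)

for i, reply in enumerate(replies, start=1):
    print(f"--- trial {i} ---\n{reply}\n")
```

Comparing the independent replies makes it easier to spot how much of any flattering "assessment" is stable signal versus per-call confabulation.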
Thank you for doing that, and please keep doing it. Maybe also run a post draft through another human before posting, though.
You’re welcome. But which part are you thanking me for and hoping that I keep doing?
All of it. Thinking critically about AI outputs (and also human outputs), and taking measures to reduce the bullshit in both.
I’m grateful for the compliment.