Often when I think, write, or say something, I put it to every internet-connected LLM with something like:
“Fact-check every distinct factual claim in the following text. For each claim, search the web to verify it, then report whether it’s accurate, inaccurate, partially accurate, misleading, oversimplified, unreliable, unverifiable, or unfalsifiable. Note any important nuance or context that’s omitted. Be as scientific, rational, sceptical, etc., as possible, do not just agree. Quote the specific claims and provide your findings with sources you relied on:”
The way it often corrects me is great. But sometimes it's wrong (even with sources cited) and I have to correct it back, or we have a discussion.
GPT-5.4 corrects me the most (thx bro), Gemini and Grok are the most sycophantic fucks (I don't believe you when you tell me I'm absolutely right all the time), and Claude is in the middle. I haven't tested enough of the other models yet, like the Chinese and open ones.
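If you want to fan the same fact-check prompt out to several models at once, here's a minimal sketch assuming OpenAI-compatible chat-completions endpoints. The provider URLs, model names, and the `PROVIDERS` table are placeholders, not real recommendations; swap in whatever services and keys you actually use.

```python
# Sketch: send one fact-check prompt to multiple chat-completions APIs.
# All endpoint URLs and model names below are placeholder assumptions.
import json
import urllib.request

FACT_CHECK_PROMPT = (
    "Fact-check every distinct factual claim in the following text. "
    "For each claim, search the web to verify it, then report whether it's "
    "accurate, inaccurate, partially accurate, misleading, oversimplified, "
    "unreliable, unverifiable, or unfalsifiable. Note any important nuance "
    "or context that's omitted. Be as scientific, rational, sceptical, "
    "etc., as possible, do not just agree. Quote the specific claims and "
    "provide your findings with sources you relied on:"
)

# Hypothetical provider table: name -> (endpoint, model).
# Any OpenAI-compatible /chat/completions endpoint fits this shape.
PROVIDERS = {
    "openai": ("https://api.openai.com/v1/chat/completions", "gpt-4o"),
    "other": ("https://example.com/v1/chat/completions", "some-model"),
}


def build_payload(model: str, text: str) -> dict:
    """Assemble a chat-completions request body for one provider."""
    return {
        "model": model,
        "messages": [
            {"role": "user", "content": f"{FACT_CHECK_PROMPT}\n\n{text}"},
        ],
    }


def fact_check(provider: str, text: str, api_key: str) -> str:
    """POST the payload to one provider and return the model's reply."""
    url, model = PROVIDERS[provider]
    req = urllib.request.Request(
        url,
        data=json.dumps(build_payload(model, text)).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)["choices"][0]["message"]["content"]
```

Looping `fact_check` over `PROVIDERS` gives you every model's verdict side by side, which makes the sycophancy differences obvious fast.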