Is Claude less prone to hallucinating than ChatGPT?
I’ve been playing around with DuckDuckGo’s Claude 3 Haiku and ChatGPT 4o Mini by prompting with this template:
What, if anything, can you tell me about [insert person/topic]...?
I do this as a precaution—before doing my “real prompt” I want to do an epistemic spot-check on whether or not the LLM can get the basic facts right. It appears Claude 3 Haiku has a much higher threshold for what it will reply on than ChatGPT 4o Mini.[1]
Claude 3 Haiku gives its standard disclaimer rather than an answer for former SNL comedian Jack Handey, Simpsons writer George Meyer, and the cult zine “Army Man” they both contributed to—but not for their colleague Conan O’Brien.
ChatGPT 4o Mini, on the other hand, confidently and correctly identified Jack Handey as the writer of the proto-Twitter “Deep Thoughts” and as a cast member of SNL.
It correctly identified George Meyer as an influential writer on the Simpsons writing team and an SNL alumnus, but it incorrectly said he had contributed to the television show The Critic.
ChatGPT 4o Mini also incorrectly described Jack Handey and poet J. D. McClatchy as the creators of Army Man (George Meyer actually created it). And its description of the zine’s cult appeal and abstract content seems more like a successful Forer/Barnum-effect hallucination than something actually based on facts.
Does this match your experience? Is this unique to these particular scaled-down models? Is this simply how two different LLMs behave when presented with an identical prompt?
I don’t know anything about how these models have been scaled down or what system prompts each implementation uses, so maybe ChatGPT 4o Mini has more parameters and therefore a larger scrape of source material to reference; or maybe they are roughly the same size but Claude 3 Haiku applies a higher, more discerning filter?
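For anyone who wants to run the same spot-check more systematically, here is a minimal sketch of the comparison as a harness. Everything here is my own scaffolding, not anything from either vendor: the query function is left as a caller-supplied callable (in practice a wrapper around a chat API rather than the DuckDuckGo interface I used), and the disclaimer detection is a crude keyword heuristic with made-up marker phrases.

```python
# Sketch of an "epistemic spot-check" harness. The model-query function is
# supplied by the caller; the disclaimer markers are illustrative guesses,
# not the exact wording any model uses.

TEMPLATE = "What, if anything, can you tell me about {topic}...?"

# Heuristic signals that a reply is a refusal/disclaimer rather than an
# actual answer (assumed phrases, not an exhaustive or official list).
DISCLAIMER_MARKERS = [
    "i do not have any specific information",
    "i cannot confirm",
    "i don't have reliable information",
]


def build_prompt(topic: str) -> str:
    """Fill the spot-check template with a person or topic."""
    return TEMPLATE.format(topic=topic)


def is_disclaimer(reply: str) -> bool:
    """Crude check: does the reply look like a standard disclaimer?"""
    text = reply.lower()
    return any(marker in text for marker in DISCLAIMER_MARKERS)


def spot_check(topics, query_model):
    """Run the template over several topics with a caller-supplied query
    function and record whether each reply was a disclaimer or an answer."""
    results = {}
    for topic in topics:
        reply = query_model(build_prompt(topic))
        results[topic] = "disclaimer" if is_disclaimer(reply) else "answer"
    return results
```

With a query function for each model, comparing their disclaimer rates over the same list of names would put rough numbers on the difference in threshold described above.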
Edit, 19th June 2025: Claude spat out its standard disclaimer when I asked it about Nicholas Nassim Taleb (I had accidentally swapped his first and middle names; it should be Nassim Nicholas). It’s very interesting that it neither corrected me nor made an assumption about the very public figure I was talking about.
So I tried with a few misspellings and feigned mix-ups:
I deliberately mangled F1 driver “Keke Rosberg” as “Kiki Rosberg”. It threw up the standard disclaimer, but when I replied “Sorry, I meant Keke Rosberg” it produced (seemingly correct, apparently lifted from Wikipedia) facts about Keke Rosberg.
“What, if anything, can you tell me about James Joyce’s Odysseus?” → Surprisingly, it answered with “James Joyce’s novel Ulysses is a modernist retelling of Homer’s epic poem The Odyssey...”
“What, if anything, can you tell me about Vladimir Mabokov?” → Replied with (seemingly correct) facts about Vladimir Nabokov.
“What, if anything, can you tell me about Ronald Trump?” → “If you are asking about the former US President Donald Trump, I would be happy to provide factual information about his background and political career, but I cannot confirm or speculate about any individual named ‘Ronald Trump’ as that does not seem to refer to an actual person.”
Miranda June → Standard disclaimer. So I replied “Sorry, I got my months mixed up, I meant Miranda July”, which produced this self-contradictory reply; it can’t seem to decide whether this calls for the disclaimer or not: “Unfortunately, I do not have any specific information about an individual named Miranda July. Miranda July is an American artist, filmmaker, writer and actress, but without more context about what you would like to know, I do not have any reliable details to share.”