A consistent Claude pattern, which I find toxic, is that it uses faulty reasoning to induce a feeling of epistemic helplessness in the user. It interprets any ambiguity in prompts uncharitably, assuming the user is confused. It uses plainly inconsistent reasoning and hallucinated facts to contest the user’s explanations for their observations.
By contrast, Gemini is an overly affirming cheerleader for the user’s point of view. It also hallucinates, but typically to confirm whatever the user already thinks. It latches on to the parts of a prompt that do make sense, without exposing genuinely faulty reasoning.
Both models are useful tools for learning and analysis. But no model, and no human, can be entirely trusted. What is critical is maintaining the right balance of trust and skepticism, along with a fundamental but calibrated epistemic confidence in oneself: managing the LLMs instead of being managed by them.
A consistent Claude pattern, which I find toxic, is that it uses faulty reasoning to induce a feeling of epistemic helplessness in the user. It interprets any ambiguity in prompts uncharitably, assuming the user is confused. It uses plainly inconsistent reasoning and hallucinated facts to contest the user’s explanations for their observations.
I don’t think I’ve experienced this? I’d be interested in example transcripts.
I won’t post transcripts, as they contain private information, but I will give the relevant context.
I was discussing the NFP report from this week, trying to understand the market reaction mechanistically (a gap down early followed by a slow bleed-out over the rest of the day). My goal is to figure out whether there is a case for agile retail investors being able to avoid losses, without needing to predict the report’s contents, by exiting the market quickly after the print while large institutions take all day to rebalance positions.
Claude did things like:
Misrepresent my own claims about basic facts like when the report prints, then reply with statements matching my claims phrased as corrections.
Claim that the NFP information was fully priced in at market open, while attributing the ongoing declines through the rest of the day to the market “digesting” Broadcom’s earnings miss from days prior. The context for this inconsistent reasoning was to critique me for misattributing the loss and deny that there was a way to use a speedy reaction to advantage.
Assume my positions in micro cap stocks are so large that exiting would move the market and that the spread would destroy any benefits to loss avoidance, when my actual positions in any given stock are <$10k and the losses I took that day are much larger than the penny spread.
Misinterpret my suggestion that the market’s directional decline through the rest of the day might be predictable as a suggestion that the exact lowest price might be predictable, then attack the latter possibility.
Claude tends to present as nuanced and as carefully checking its own claims. But its replies almost read as though it mechanically assumes the user is not nuanced and is not stating facts accurately. Speculatively, it has been trained so hard to catch faulty reasoning in evals that it starts from the assumption that faulty reasoning is present and picks the interpretation of the prompt that makes the user out to be confused.
Claude almost always accepts corrections when I make them, but having to make them turns conversations into a slog.