This is interesting. Gemini 2.5 Pro has recently become my favorite model, especially over Opus (this from a long-time Claude user). I would not be surprised if I like it so much because of its lower task horizon, since it's the one model I trust to not be uselessly sycophantic right now.
Agentic coding abilities are different from general chatbot abilities. Gemini is IMO the best chatbot there is (just in terms of understanding context well if you wish to analyze text, learn things, etc.). Claude, on the other hand, is dead last among the big 3 (a steep change from a year ago), and my guess is Anthropic isn't trying much anymore (focusing on… agentic coding instead).
Hm, I notably would not trust Claude to agentically code for me either. I went from heavily using Claude Code to occasionally asking Gemini questions, and I think that has been a big improvement.
Given METR's other work, the obvious hypothesis is that Claude Code is mostly just better at manipulating me into thinking it can easily do what I want.
What's the correlation between task horizon and useless sycophancy?
I don't know; subjectively it seems large, and it seems plausible they could be related.