gwern comments on Wei Dai’s Shortform

gwern 25 Oct 2025 21:00 UTC
3 points

Reviewing my LW posts/comments (any clear flaws, any objections I should pre-empt, how others might respond)

Does Gemini-2.5-pro still work for this given how sycophantic the post-0325 models were?
- Wei Dai 26 Oct 2025 3:07 UTC
  6 points
  I’m still using it for this purpose, but I don’t have a good sense of how much worse it is compared to pre-0325. However, I’m definitely very wary of the sycophancy and overall bad judgment. I’m only using these models to point out potential issues I may have overlooked, not to judge, e.g., whether a draft is ready to post, or whether some potential issue is a real issue that needs to be fixed. All the models I’ve tried seem to err a lot in both directions.