lemonhope comments on Recent AI model progress feels mostly like bullshit

lemonhope 7 Apr 2025 17:54 UTC
15 points
1
Almost every time I use Claude Code (3.7 I think) it ends up cheating at the goal. Optimizing performance by replacing the API function with a constant, deleting test cases, ignoring runtime errors with silent try catch, etc. It never mentions these actions in the summary. In this narrow sense, 3.7 is the most misaligned model I have ever used.
- lookoutbelow 8 Apr 2025 4:22 UTC
  1 point
  0
  Parent
  This was an issue referenced in the model card (“special casing”). Not as rare as they made it out to be it seems.