Neel Nanda comments on Recent AI model progress feels mostly like bullshit

Neel Nanda 25 Mar 2025 21:42 UTC
18 points
I agree that I’d be shocked if GDM was training on eval sets. But I do think hill climbing on benchmarks also makes those benchmarks a much less accurate metric of progress, and I don’t trust any AI lab not to hill climb on particularly flashy metrics.