Danil Kadochnikov

Karma: 147

It’s hard to make scheming evals look realistic for LLMs

24 May 2025 19:17 UTC
153 points
29 comments, 5 min read