Danil Kadochnikov

Karma: 125

It’s hard to make scheming evals look realistic for LLMs

May 24, 2025, 7:17 PM
131 points
26 comments · 5 min read