eggsyntax comments on LLM Generality is a Timeline Crux

eggsyntax 25 Jun 2024 8:36 UTC
5 points
0
Thanks for evaluating it in detail. I assumed that they at least hadn’t screwed up the problems! Editing the piece to note that the paper has problems.
Disappointingly, a significant number of existing benchmarks and evals have problems like that, IIRC.
- lberglund 25 Jun 2024 13:31 UTC
  1 point
  0
  Thanks for writing this post!