p.b. comments on p.b.’s Shortform

p.b. 5 Nov 2025 9:12 UTC
2 points
Hmm, actually, none of these checks can distinguish between tasks that are truly unsolvable and tasks that are merely unsolvable for further scaled-up models of the current kind (given the framework and compute used in the evaluations).