Twm Stone

Karma: 108

Estimating No-CoT Task-Completion Time Horizons of Frontier AI Models

10 Jun 2026 17:58 UTC
148 points
1 comment · 4 min read

FLAKE-Bench: Outsourcing Awkwardness in the Age of AI

1 Apr 2025 17:08 UTC
45 points
0 comments · 2 min read