ariana_azarbal

Karma: 653

Estimating No-CoT Task-Completion Time Horizons of Frontier AI Models

10 Jun 2026 17:58 UTC
247 points
20 comments, 4 min read

Confusion around the term reward hacking

ariana_azarbal, 20 Mar 2026 16:13 UTC
60 points
6 comments, 5 min read

Recontextualization Mitigates Specification Gaming Without Modifying the Specification

14 Oct 2025 0:53 UTC
144 points
15 comments, 10 min read