Yoav

Karma: 13

Evaluating Oversight Robustness with Incentivized Reward Hacking

20 Apr 2025 16:53 UTC
7 points
2 comments · 15 min read