Damon Falck

Karma: 38

A Conceptual Framework for Exploration Hacking

12 Feb 2026 16:33 UTC
25 points
2 comments · 9 min read

Exploration hacking: can reasoning models subvert RL?

30 Jul 2025 22:02 UTC
22 points
4 comments · 9 min read