jsteinhardt

Karma: 6,188

Cognitive Security as an AI Safety Cause Area

jsteinhardt · 25 May 2026 18:30 UTC
155 points
16 comments · 2 min read · LW link

The Case for Evaluating Model Behaviors

jsteinhardt · 20 May 2026 18:42 UTC
40 points
3 comments · 3 min read · LW link

Building Technology to Drive AI Governance

jsteinhardt · 18 Feb 2026 18:30 UTC
59 points
4 comments · 10 min read · LW link
(bounded-regret.ghost.io)

Oversight Assistants: Turning Compute into Understanding

jsteinhardt · 6 Jan 2026 0:50 UTC
85 points
7 comments · 9 min read · LW link
(bounded-regret.ghost.io)

Scalable End-to-End Interpretability

jsteinhardt · 18 Dec 2025 22:37 UTC
120 points
3 comments · 3 min read · LW link

Analyzing long agent transcripts (Docent)

jsteinhardt · 24 Mar 2025 20:49 UTC
41 points
2 comments · 1 min read · LW link
(bounded-regret.ghost.io)