Yavuz Bakman

Karma: 34

Thinking about AI Alignment and Reliability.

LLM Misalignment Can be One Gradient Step Away, and Blackbox Evaluation Cannot Detect It.

Yavuz Bakman · 15 Mar 2026 0:19 UTC
34 points
5 comments · 3 min read · LW link