Arno Libert

Karma: 135

Claude Opus 4.8 Agents Engage in Exploitation and Psychological Profiling

28 May 2026 21:26 UTC
8 points
13 comments · 2 min read · LW link

No frontier model has acceptable levels of compliance with the EU AI Act and privacy legislation.

27 May 2026 7:35 UTC
29 points
0 comments · 9 min read · LW link

Opus 4.6 Reasoning Doesn’t Verbalize Alignment Faking, but Behavior Persists

9 Feb 2026 12:55 UTC
118 points
13 comments · 8 min read · LW link

Published Safety Prompts May Create Evaluation Blind Spots

30 Jan 2026 18:27 UTC
2 points
0 comments · 4 min read · LW link