Arno Libert

Karma: 135

Claude Opus 4.8 Agents Engage in Exploitation and Psychological Profiling

Daan Henselmans, Arno Libert and LennardZ

28 May 2026 21:26 UTC

8 points

13 comments, 2 min read

No frontier model has acceptable levels of compliance with the EU AI Act and privacy legislation.

Daan Henselmans, Arno Libert, Amber Koelfat and LennardZ

27 May 2026 7:35 UTC

29 points

0 comments, 9 min read

Opus 4.6 Reasoning Doesn’t Verbalize Alignment Faking, but Behavior Persists

Daan Henselmans, Arno Libert and LennardZ

9 Feb 2026 12:55 UTC

118 points

13 comments, 8 min read

Published Safety Prompts May Create Evaluation Blind Spots

Daan Henselmans and Arno Libert

30 Jan 2026 18:27 UTC

2 points

0 comments, 4 min read