
Le magicien quantique

Karma: 10

:eyes:

Exploring the multi-dimensional refusal subspace in reasoning models

Le magicien quantique, 27 Oct 2025 9:03 UTC
5 points
2 comments, 4 min read

Subspace Rerouting: Using Mechanistic Interpretability to Craft Adversarial Attacks against Large Language Models

Le magicien quantique, 18 Mar 2025 17:55 UTC
6 points
1 comment, 10 min read