Devina Jain

Karma: 2

Can Persuasion Break AI Safety? Exploring the Interplay Between Fine-Tuning, Attacks, and Guardrails

Devina Jain · Feb 4, 2025, 7:10 PM
3 points
0 comments · 10 min read · LW link