Oscar Obeso

Karma: 246

Math undergrad at ETH Zurich.

More info: oscarbalcells.com

Refusal in LLMs is mediated by a single direction

27 Apr 2024 11:13 UTC
184 points
75 comments · 10 min read · LW link

Refusal mechanisms: initial experiments with Llama-2-7b-chat

8 Dec 2023 17:08 UTC
79 points
7 comments · 7 min read · LW link