Advik

Karma: 18

I like breaking models.

Researching red-teaming as part of the MATS 10.0 program with UK AISI. Dabbled a bit with mechanistic interp, blue-teaming and reverse engineering; I’m especially interested in discovering vulnerabilities and flaws in generative models and the science of post-training.

https://www.a3v1k.com/ :)

Brittle model organisms obstructs deception elicitation work

22 Jun 2026 10:48 UTC
19 points
2 comments · 7 min read · LW link