Quentin FEUILLADE--MONTIXI

Karma: 320

I am a former 42.fr student and a SERI MATS 3 scholar. I am interested in studying AI with a behavioral approach (Model Ethology). I worked for METR (called ARC Evals at the time) and did independent red teaming for OpenAI and Anthropic. I am a former co-founder of PRISM Eval. I am currently working on a programming language for AI systems to help developers build more robust AI systems.

Three Properties for Alignment (and Why We’re Not Training Them)

Quentin FEUILLADE--MONTIXI

16 Mar 2026 20:26 UTC

8 points

5 comments · 3 min read · LW link

The Topology of LLM Behavior

Quentin FEUILLADE--MONTIXI

28 Feb 2026 0:36 UTC

28 points

8 comments · 5 min read · LW link

(weavemind.ai)

Emergence, The Blind Spot of GenAI Interpretability?

Quentin FEUILLADE--MONTIXI

10 Aug 2024 10:07 UTC

16 points

7 comments · 3 min read · LW link

Studying The Alien Mind

Quentin FEUILLADE--MONTIXI and Niki Dupuis

5 Dec 2023 17:27 UTC

80 points

10 comments · 15 min read · LW link

Scalable And Transferable Black-Box Jailbreaks For Language Models Via Persona Modulation

Soroush Pour, rusheb, Quentin FEUILLADE--MONTIXI, Arush and scasper

7 Nov 2023 17:59 UTC

38 points

2 comments · 2 min read · LW link

(arxiv.org)

The Stochastic Parrot Hypothesis is debatable for the last generation of LLMs

Quentin FEUILLADE--MONTIXI and Pierre Peigné

7 Nov 2023 16:12 UTC

52 points

21 comments · 6 min read · LW link

Preface to the Sequence on LLM Psychology

Quentin FEUILLADE--MONTIXI

7 Nov 2023 16:12 UTC

33 points

0 comments · 2 min read · LW link

PICT: A Zero-Shot Prompt Template to Automate Evaluation

Quentin FEUILLADE--MONTIXI

17 Feb 2023 23:16 UTC

17 points

1 comment · 11 min read · LW link

Using PICT against PastaGPT Jailbreaking

Quentin FEUILLADE--MONTIXI

9 Feb 2023 4:30 UTC

26 points

0 comments · 9 min read · LW link