Marius Hobbhahn

Karma: 4,539

I’m the co-founder and CEO of Apollo Research: https://www.apolloresearch.ai/
I mostly work on evals, but I am also interested in interpretability. My goal is to improve our understanding of scheming and build tools and methods to detect it.

I previously did a Ph.D. in ML at the International Max Planck Research School in Tübingen, worked part-time with Epoch, and did independent AI safety research.

For more, see https://www.mariushobbhahn.com/aboutme/

I subscribe to Crocker’s Rules.

Detecting Strategic Deception Using Linear Probes

6 Feb 2025 15:46 UTC
100 points
7 comments
2 min read
(arxiv.org)

Catastrophe through Chaos

Marius Hobbhahn
31 Jan 2025 14:19 UTC
174 points
16 comments
12 min read

What’s the short timeline plan?

Marius Hobbhahn
2 Jan 2025 14:59 UTC
342 points
49 comments
23 min read

Ablations for “Frontier Models are Capable of In-context Scheming”

17 Dec 2024 23:58 UTC
112 points
1 comment
2 min read

Frontier Models are Capable of In-context Scheming

5 Dec 2024 22:11 UTC
203 points
24 comments
7 min read