Marius Hobbhahn

Karma: 3,153

I’m the co-founder and CEO of Apollo Research: https://​​​​
I mostly work on evals, but I am also interested in interpretability.

I was previously doing a Ph.D. in ML at the International Max Planck Research School in Tübingen, worked part-time with Epoch, and did independent AI safety research.

For more see https://​​​​aboutme/​​

I subscribe to Crocker’s Rules.

Analyzing DeepMind’s Probabilistic Methods for Evaluating Agent Capabilities

22 Jul 2024 16:17 UTC
54 points
0 comments · 16 min read

[Interim research report] Evaluating the Goal-Directedness of Language Models

18 Jul 2024 18:19 UTC
29 points
0 comments · 11 min read

Me, Myself, and AI: the Situational Awareness Dataset (SAD) for LLMs

8 Jul 2024 22:24 UTC
99 points
27 comments · 5 min read

Apollo Research 1-year update

29 May 2024 17:44 UTC
92 points
0 comments · 7 min read

The Local Interaction Basis: Identifying Computationally-Relevant and Sparsely Interacting Features in Neural Networks

20 May 2024 17:53 UTC
103 points
4 comments · 3 min read

We need a Science of Evals

22 Jan 2024 20:30 UTC
66 points
13 comments · 9 min read

A starter guide for evals

8 Jan 2024 18:24 UTC
44 points
2 comments · 12 min read

Experiences and learnings from both sides of the AI safety job market

15 Nov 2023 15:40 UTC
109 points
4 comments · 18 min read