Marius Hobbhahn

Karma: 2,887

I recently founded Apollo Research: https://www.apolloresearch.ai/

I was previously doing a Ph.D. in ML at the International Max Planck Research School in Tübingen, worked part-time with Epoch, and did independent AI safety research.

For more, see https://www.mariushobbhahn.com/aboutme/

I subscribe to Crocker’s Rules

We need a Science of Evals

22 Jan 2024 20:30 UTC
63 points
13 comments
9 min read
LW link

A starter guide for evals

8 Jan 2024 18:24 UTC
44 points
2 comments
12 min read
LW link
(www.apolloresearch.ai)

Experiences and learnings from both sides of the AI safety job market

Marius Hobbhahn
15 Nov 2023 15:40 UTC
109 points
4 comments
18 min read
LW link

Theories of Change for AI Auditing

13 Nov 2023 19:33 UTC
59 points
0 comments
18 min read
LW link
(www.apolloresearch.ai)

Understanding strategic deception and deceptive alignment

25 Sep 2023 16:27 UTC
58 points
16 comments
7 min read
LW link
(www.apolloresearch.ai)

There should be more AI safety orgs

Marius Hobbhahn
21 Sep 2023 14:53 UTC
175 points
25 comments
17 min read
LW link

Apollo Research is hiring evals and interpretability engineers & scientists

Marius Hobbhahn
4 Aug 2023 10:54 UTC
25 points
0 comments
2 min read
LW link

Announcing Apollo Research

30 May 2023 16:17 UTC
215 points
11 comments
8 min read
LW link

Solving the Mechanistic Interpretability challenges: EIS VII Challenge 2

25 May 2023 15:37 UTC
71 points
1 comment
13 min read
LW link

Solving the Mechanistic Interpretability challenges: EIS VII Challenge 1

9 May 2023 19:41 UTC
119 points
1 comment
10 min read
LW link

Should we publish mechanistic interpretability research?

21 Apr 2023 16:19 UTC
105 points
40 comments
13 min read
LW link

Clarifying mesa-optimization

21 Mar 2023 15:53 UTC
37 points
6 comments
10 min read
LW link

Reflection Mechanisms as an Alignment Target—Attitudes on “near-term” AI

2 Mar 2023 4:29 UTC
20 points
0 comments
8 min read
LW link

More findings on maximal data dimension

Marius Hobbhahn
2 Feb 2023 18:33 UTC
27 points
1 comment
11 min read
LW link

More findings on Memorization and double descent

Marius Hobbhahn
1 Feb 2023 18:26 UTC
53 points
2 comments
19 min read
LW link

The role of Bayesian ML in AI safety—an overview

Marius Hobbhahn
27 Jan 2023 19:40 UTC
30 points
6 comments
10 min read
LW link

The next decades might be wild

Marius Hobbhahn
15 Dec 2022 16:10 UTC
175 points
42 comments
41 min read
LW link
1 review

Predicting GPU performance

14 Dec 2022 16:27 UTC
60 points
26 comments
1 min read
LW link
(epochai.org)

Theories of impact for Science of Deep Learning

Marius Hobbhahn
1 Dec 2022 14:39 UTC
21 points
0 comments
11 min read
LW link

Announcing AI safety Mentors and Mentees

Marius Hobbhahn
23 Nov 2022 15:21 UTC
62 points
7 comments
10 min read
LW link