evhub (Evan Hubinger)

Karma: 6,031

I (Evan Hubinger) am a Research Fellow at MIRI. Broadly, I work on inner alignment for prosaic machine learning.

See: “What I’ll be doing at MIRI.”

Pronouns: he/him/his

Email: evanjhub@gmail.com

Selected work:

Attempts at Forwarding Speed Priors

24 Sep 2022 5:49 UTC
16 points
0 comments · 18 min read · LW link

Toy Models of Superposition

evhub, 21 Sep 2022 23:48 UTC
54 points
2 comments · 5 min read · LW link
(transformer-circuits.pub)

Path dependence in ML inductive biases

10 Sep 2022 1:38 UTC
41 points
13 comments · 10 min read · LW link

Monitoring for deceptive alignment

evhub, 8 Sep 2022 23:07 UTC
113 points
7 comments · 9 min read · LW link

Sticky goals: a concrete experiment for understanding deceptive alignment

evhub, 2 Sep 2022 21:57 UTC
34 points
12 comments · 3 min read · LW link

AI coordination needs clear wins

evhub, 1 Sep 2022 23:41 UTC
131 points
14 comments · 2 min read · LW link