
AXRP Episode 11 - Attainable Utility and Power with Alex Turner

DanielFilan · 25 Sep 2021 21:10 UTC
17 points
1 comment · 52 min read · LW link

Cognitive Biases in Large Language Models

Jan · 25 Sep 2021 20:59 UTC
11 points
2 comments · 12 min read · LW link
(universalprior.substack.com)

Cartesian Frames and Factored Sets on ArXiv

Scott Garrabrant · 24 Sep 2021 4:58 UTC
36 points
0 comments · 1 min read · LW link

[AN #165]: When large models are more likely to lie

rohinmshah · 22 Sep 2021 17:30 UTC
23 points
0 comments · 8 min read · LW link
(mailchi.mp)

Redwood Research’s current project

Buck · 21 Sep 2021 23:30 UTC
111 points
14 comments · 15 min read · LW link

David Wolpert on Knowledge

alexflint · 21 Sep 2021 1:54 UTC
33 points
3 comments · 13 min read · LW link

Announcing the Vitalik Buterin Fellowships in AI Existential Safety!

DanielFilan · 21 Sep 2021 0:33 UTC
64 points
2 comments · 1 min read · LW link
(grants.futureoflife.org)

AI, learn to be conservative, then learn to be less so: reducing side-effects, learning preserved features, and going beyond conservatism

Stuart_Armstrong · 20 Sep 2021 11:56 UTC
12 points
1 comment · 3 min read · LW link

[Book Review] “The Alignment Problem” by Brian Christian

lsusr · 20 Sep 2021 6:36 UTC
55 points
12 comments · 6 min read · LW link

Testing The Natural Abstraction Hypothesis: Project Update

johnswentworth · 20 Sep 2021 3:44 UTC
81 points
15 comments · 8 min read · LW link

The theory-practice gap

Buck · 17 Sep 2021 22:51 UTC
113 points
12 comments · 6 min read · LW link

Investigating AI Takeover Scenarios

Sammy Martin · 17 Sep 2021 18:47 UTC
22 points
1 comment · 27 min read · LW link

Goodhart Ethology

Charlie Steiner · 17 Sep 2021 17:31 UTC
10 points
4 comments · 15 min read · LW link

Immobile AI makes a move: anti-wireheading, ontology change, and model splintering

Stuart_Armstrong · 17 Sep 2021 15:24 UTC
31 points
3 comments · 2 min read · LW link

Jitters No Evidence of Stupidity in RL

1a3orn · 16 Sep 2021 22:43 UTC
75 points
18 comments · 3 min read · LW link

Economic AI Safety

jsteinhardt · 16 Sep 2021 20:50 UTC
35 points
3 comments · 5 min read · LW link

How truthful is GPT-3? A benchmark for language models

Owain_Evans · 16 Sep 2021 10:09 UTC
54 points
24 comments · 6 min read · LW link

[AN #164]: How well can language models write code?

rohinmshah · 15 Sep 2021 17:20 UTC
13 points
7 comments · 9 min read · LW link
(mailchi.mp)

Oracle predictions don’t apply to non-existent worlds

Chris_Leong · 15 Sep 2021 9:44 UTC
10 points
25 comments · 3 min read · LW link

Measurement, Optimization, and Take-off Speed

jsteinhardt · 10 Sep 2021 19:30 UTC
47 points
4 comments · 13 min read · LW link