The “Measuring Stick of Utility” Problem

johnswentworth · 25 May 2022 16:17 UTC
41 points
18 comments · 3 min read · LW link

RL with KL penalties is better seen as Bayesian inference

25 May 2022 9:23 UTC
39 points
3 comments · 12 min read · LW link

autonomy: the missing AGI ingredient?

nostalgebraist · 25 May 2022 0:33 UTC
52 points
10 comments · 6 min read · LW link

The No Free Lunch theorems and their Razor

Adrià Garriga-alonso · 24 May 2022 6:40 UTC
30 points
2 comments · 9 min read · LW link

Complex Systems for AI Safety [Pragmatic AI Safety #3]

24 May 2022 0:00 UTC
21 points
0 comments · 21 min read · LW link

Bits of Optimization Can Only Be Lost Over A Distance

johnswentworth · 23 May 2022 18:55 UTC
15 points
14 comments · 2 min read · LW link

AXRP Episode 15 - Natural Abstractions with John Wentworth

DanielFilan · 23 May 2022 5:40 UTC
29 points
0 comments · 57 min read · LW link

Gradations of Agency

Daniel Kokotajlo · 23 May 2022 1:10 UTC
28 points
0 comments · 5 min read · LW link

Adversarial attacks and optimal control

Jan · 22 May 2022 18:22 UTC
15 points
6 comments · 8 min read · LW link
(universalprior.substack.com)

[Short version] Information Loss --> Basin flatness

Vivek Hebbar · 21 May 2022 12:59 UTC
11 points
0 comments · 1 min read · LW link

Information Loss --> Basin flatness

Vivek Hebbar · 21 May 2022 12:58 UTC
36 points
25 comments · 7 min read · LW link

How RL Agents Behave When Their Actions Are Modified? [Distillation post]

PabloAMC · 20 May 2022 18:47 UTC
20 points
0 comments · 8 min read · LW link

We have achieved Noob Gains in AI

phdead · 18 May 2022 20:56 UTC
107 points
21 comments · 7 min read · LW link

Maxent and Abstractions: Current Best Arguments

johnswentworth · 18 May 2022 19:54 UTC
33 points
2 comments · 3 min read · LW link

How to get into AI safety research

Stuart_Armstrong · 18 May 2022 18:05 UTC
41 points
5 comments · 1 min read · LW link

Gato’s Generalisation: Predictions and Experiments I’d Like to See

Oliver Sourbut · 18 May 2022 7:15 UTC
39 points
3 comments · 10 min read · LW link

Actionable-guidance and roadmap recommendations for the NIST AI Risk Management Framework

17 May 2022 15:26 UTC
24 points
0 comments · 3 min read · LW link

[Intro to brain-like-AGI safety] 15. Conclusion: Open problems, how to help, AMA

Steven Byrnes · 17 May 2022 15:11 UTC
56 points
8 comments · 14 min read · LW link

Proxy misspecification and the capabilities vs. value learning race

Sam Marks · 16 May 2022 18:58 UTC
12 points
1 comment · 4 min read · LW link

Optimization at a Distance

johnswentworth · 16 May 2022 17:58 UTC
56 points
11 comments · 4 min read · LW link