Parameter counts in Machine Learning

Jsevillamol, 19 Jun 2021 16:04 UTC
31 points
5 comments, 7 min read, LW link

Knowledge is not just precipitation of action

alexflint, 18 Jun 2021 23:26 UTC
15 points
3 comments, 7 min read, LW link

Non-poisonous cake: anthropic updates are normal

Stuart_Armstrong, 18 Jun 2021 14:51 UTC
22 points
4 comments, 1 min read, LW link

[Question] Pros and cons of working on near-term technical AI safety and assurance

alenglander, 17 Jun 2021 20:17 UTC
11 points
1 comment, 2 min read, LW link

[AN #152]: How we’ve overestimated few-shot learning capabilities

rohinmshah, 16 Jun 2021 17:20 UTC
20 points
6 comments, 8 min read, LW link
(mailchi.mp)

Reward Is Not Enough

Steven Byrnes, 16 Jun 2021 13:52 UTC
76 points
14 comments, 10 min read, LW link

[Question] Open problem: how can we quantify player alignment in 2x2 normal-form games?

TurnTrout, 16 Jun 2021 2:09 UTC
21 points
54 comments, 1 min read, LW link

Vignettes Workshop (AI Impacts)

Daniel Kokotajlo, 15 Jun 2021 12:05 UTC
44 points
1 comment, 1 min read, LW link

Knowledge is not just digital abstraction layers

alexflint, 15 Jun 2021 3:49 UTC
16 points
4 comments, 5 min read, LW link

Looking Deeper at Deconfusion

adamShimi, 13 Jun 2021 21:29 UTC
39 points
11 comments, 15 min read, LW link

Avoiding the instrumental policy by hiding information about humans

paulfchristiano, 13 Jun 2021 20:00 UTC
31 points
2 comments, 2 min read, LW link

Answering questions honestly given world-model mismatches

paulfchristiano, 13 Jun 2021 18:00 UTC
29 points
1 comment, 16 min read, LW link
(ai-alignment.com)

Finite Factored Sets: Orthogonality and Time

Scott Garrabrant, 10 Jun 2021 1:22 UTC
31 points
1 comment, 4 min read, LW link

Knowledge is not just mutual information

alexflint, 10 Jun 2021 1:01 UTC
16 points
2 comments, 4 min read, LW link

A naive alignment strategy and optimism about generalization

paulfchristiano, 10 Jun 2021 0:10 UTC
41 points
3 comments, 3 min read, LW link
(ai-alignment.com)

“Decision Transformer” (Tool AIs are secret Agent AIs)

gwern, 9 Jun 2021 1:06 UTC
34 points
4 comments, 1 min read, LW link
(sites.google.com)

AXRP Episode 8 - Assistance Games with Dylan Hadfield-Menell

DanielFilan, 8 Jun 2021 23:20 UTC
9 points
0 comments, 71 min read, LW link

The Inside View #3: Evan Hubinger - homogeneity in takeoff speeds, learned optimization and interpretability

Michaël Trazzi, 8 Jun 2021 19:20 UTC
28 points
0 comments, 55 min read, LW link

Survey on AI existential risk scenarios

8 Jun 2021 17:12 UTC
54 points
10 comments, 7 min read, LW link

The reverse Goodhart problem

Stuart_Armstrong, 8 Jun 2021 15:48 UTC
14 points
22 comments, 1 min read, LW link