DanielFilan

Karma: 4,862

Bottle Caps Aren’t Optimisers

DanielFilan · 31 Aug 2018 18:30 UTC
76 points
21 comments · 3 min read · LW link · 1 review
(danielfilan.com)

Announcing the Vitalik Buterin Fellowships in AI Existential Safety!

DanielFilan · 21 Sep 2021 0:33 UTC
64 points
2 comments · 1 min read · LW link
(grants.futureoflife.org)

Test Cases for Impact Regularisation Methods

DanielFilan · 6 Feb 2019 21:50 UTC
58 points
5 comments · 12 min read · LW link
(danielfilan.com)

AXRP Episode 9 - Finite Factored Sets with Scott Garrabrant

DanielFilan · 24 Jun 2021 22:10 UTC
56 points
2 comments · 58 min read · LW link

Mechanistic Transparency for Machine Learning

DanielFilan · 11 Jul 2018 0:34 UTC
54 points
9 comments · 4 min read · LW link

Security Mindset and Takeoff Speeds

DanielFilan · 27 Oct 2020 3:20 UTC
54 points
23 comments · 8 min read · LW link
(danielfilan.com)

Announcing AXRP, the AI X-risk Research Podcast

DanielFilan · 23 Dec 2020 20:00 UTC
54 points
6 comments · 1 min read · LW link
(danielfilan.com)

A Personal Rationality Wishlist

DanielFilan · 27 Aug 2019 3:40 UTC
53 points
54 comments · 4 min read · LW link
(danielfilan.com)

An Analytic Perspective on AI Alignment

DanielFilan · 1 Mar 2020 4:10 UTC
53 points
45 comments · 8 min read · LW link
(danielfilan.com)

Challenge: know everything that the best go bot knows about go

DanielFilan · 11 May 2021 5:10 UTC
48 points
93 comments · 2 min read · LW link
(danielfilan.com)

A second example of conditional orthogonality in finite factored sets

DanielFilan · 7 Jul 2021 1:40 UTC
46 points
0 comments · 2 min read · LW link
(danielfilan.com)

Cognitive mistakes I’ve made about COVID-19

DanielFilan · 27 Dec 2020 0:50 UTC
45 points
3 comments · 2 min read · LW link
(danielfilan.com)

A simple example of conditional orthogonality in finite factored sets

DanielFilan · 6 Jul 2021 0:36 UTC
43 points
3 comments · 5 min read · LW link
(danielfilan.com)

AXRP Episode 4 - Risks from Learned Optimization with Evan Hubinger

DanielFilan · 18 Feb 2021 0:03 UTC
41 points
10 comments · 86 min read · LW link

What’s the chance a smart London resident dies of a Russian nuke in the next month?

DanielFilan · 10 Mar 2022 19:20 UTC
40 points
8 comments · 4 min read · LW link
(danielfilan.com)

[LINK] Scott Aaronson on Integrated Information Theory

DanielFilan · 22 May 2014 8:40 UTC
38 points
11 comments · 1 min read · LW link

Insights from ‘The Strategy of Conflict’

DanielFilan · 4 Jan 2018 5:05 UTC
37 points
13 comments · 7 min read · LW link

AXRP Episode 12 - AI Existential Risk with Paul Christiano

DanielFilan · 2 Dec 2021 2:20 UTC
36 points
0 comments · 125 min read · LW link

Verification and Transparency

DanielFilan · 8 Aug 2019 1:50 UTC
34 points
6 comments · 2 min read · LW link
(danielfilan.com)

AXRP Episode 7 - Side Effects with Victoria Krakovna

DanielFilan · 14 May 2021 3:50 UTC
34 points
6 comments · 43 min read · LW link