[Question] Updates on FLI’s Value Alignment Map?

Fer32dwt34r3dfsz · 17 Sep 2022 22:27 UTC
17 points · 4 comments · 1 min read · LW link

Most sensible abstraction & feature set for a systems language?

Jasen Qin · 17 Sep 2022 19:49 UTC
−1 points · 5 comments · 10 min read · LW link

Sparse trinary weighted RNNs as a path to better language model interpretability

Am8ryllis · 17 Sep 2022 19:48 UTC
19 points · 13 comments · 3 min read · LW link

Apply for mentorship in AI Safety field-building

Akash · 17 Sep 2022 19:06 UTC
9 points · 0 comments · 1 min read · LW link
(forum.effectivealtruism.org)

Refine’s Third Blog Post Day/Week

adamShimi · 17 Sep 2022 17:03 UTC
18 points · 0 comments · 1 min read · LW link

[Closed] Prize and fast track to alignment research at ALTER

Vanessa Kosoy · 17 Sep 2022 16:58 UTC
63 points · 6 comments · 3 min read · LW link

Remote Login For Turnkey Devices?

jefftk · 17 Sep 2022 15:40 UTC
9 points · 2 comments · 2 min read · LW link
(www.jefftk.com)

Many therapy schools work with inner multiplicity (not just IFS)

17 Sep 2022 10:27 UTC
51 points · 15 comments · 18 min read · LW link

Should AI learn human values, human norms or something else?

Q Home · 17 Sep 2022 6:19 UTC
5 points · 1 comment · 4 min read · LW link

Takeaways from our robust injury classifier project [Redwood Research]

dmz · 17 Sep 2022 3:55 UTC
143 points · 12 comments · 6 min read · LW link · 1 review

[Question] Why doesn’t China (or didn’t anyone) encourage/mandate elastomeric respirators to control COVID?

Wei Dai · 17 Sep 2022 3:07 UTC
34 points · 15 comments · 1 min read · LW link

Emergency Residential Solar Jury-Rigging

jefftk · 17 Sep 2022 2:30 UTC
34 points · 0 comments · 3 min read · LW link
(www.jefftk.com)

A Bite Sized Introduction to ELK

Luk27182 · 17 Sep 2022 0:28 UTC
5 points · 0 comments · 6 min read · LW link

D&D.Sci September 2022: The Allocation Helm

abstractapplic · 16 Sep 2022 23:10 UTC
32 points · 33 comments · 1 min read · LW link

Towards a philosophy of safety

jasoncrawford · 16 Sep 2022 21:10 UTC
12 points · 2 comments · 8 min read · LW link
(rootsofprogress.org)

Refine Blogpost Day #3: The shortforms I did write

Alexander Gietelink Oldenziel · 16 Sep 2022 21:03 UTC
23 points · 0 comments · 1 min read · LW link

[Question] Why are we sure that AI will “want” something?

shminux · 16 Sep 2022 20:35 UTC
31 points · 57 comments · 1 min read · LW link

Katja Grace on Slowing Down AI, AI Expert Surveys And Estimating AI Risk

Michaël Trazzi · 16 Sep 2022 17:45 UTC
40 points · 2 comments · 3 min read · LW link
(theinsideview.ai)

Levels of goals and alignment

zeshen · 16 Sep 2022 16:44 UTC
27 points · 4 comments · 6 min read · LW link

ordering capability thresholds

Tamsin Leake · 16 Sep 2022 16:36 UTC
27 points · 0 comments · 4 min read · LW link
(carado.moe)

Representational Tethers: Tying AI Latents To Human Ones

Paul Bricman · 16 Sep 2022 14:45 UTC
30 points · 0 comments · 16 min read · LW link

I wrote a fantasy novel to promote EA: More Chapters

Timothy Underwood · 16 Sep 2022 9:47 UTC
18 points · 0 comments · 47 min read · LW link

Guidelines for Mad Entrepreneurs

David Udell · 16 Sep 2022 6:33 UTC
26 points · 0 comments · 11 min read · LW link

Affordable Housing Investment Fund

jefftk · 16 Sep 2022 2:30 UTC
18 points · 2 comments · 1 min read · LW link
(www.jefftk.com)

In a world without AI, we need gene-editing to protect Nature. (Not how you think)

Erlja Jkdf. · 16 Sep 2022 1:24 UTC
−11 points · 2 comments · 1 min read · LW link

AstralCodexTen and Rationality Meetup Organisers’ Retreat — Europe, Middle East, and Africa 2023

Sam F. Brown · 15 Sep 2022 22:38 UTC
25 points · 2 comments · 2 min read · LW link
(www.rationalitymeetups.org)

A market is a neural network

David Hugh-Jones · 15 Sep 2022 21:53 UTC
6 points · 4 comments · 8 min read · LW link

Understanding Conjecture: Notes from Connor Leahy interview

Akash · 15 Sep 2022 18:37 UTC
107 points · 23 comments · 15 min read · LW link

How should DeepMind’s Chinchilla revise our AI forecasts?

Cleo Nardo · 15 Sep 2022 17:54 UTC
35 points · 12 comments · 13 min read · LW link

Rational Animations’ Script Writing Contest

Writer · 15 Sep 2022 16:56 UTC
23 points · 1 comment · 3 min read · LW link

Covid 9/15/22: Permanent Normal

Zvi · 15 Sep 2022 16:00 UTC
32 points · 9 comments · 20 min read · LW link
(thezvi.wordpress.com)

[Question] Are Human Brains Universal?

DragonGod · 15 Sep 2022 15:15 UTC
16 points · 28 comments · 5 min read · LW link

Intelligence failures and a theory of change for forecasting

NathanBarnard · 15 Sep 2022 15:02 UTC
5 points · 0 comments · 10 min read · LW link

Why deceptive alignment matters for AGI safety

Marius Hobbhahn · 15 Sep 2022 13:38 UTC
57 points · 13 comments · 13 min read · LW link

FDT defects in a realistic Twin Prisoners’ Dilemma

Sylvester Kollin · 15 Sep 2022 8:55 UTC
37 points · 1 comment · 26 min read · LW link

[Question] What’s the longest a sentient observer could survive in the Dark Era?

Raemon · 15 Sep 2022 8:43 UTC
33 points · 15 comments · 1 min read · LW link

The Value of Not Being an Imposter

sudo · 15 Sep 2022 8:32 UTC
5 points · 0 comments · 1 min read · LW link

Capability and Agency as Cornerstones of AI risk — My current model

wilm · 15 Sep 2022 8:25 UTC
10 points · 4 comments · 12 min read · LW link

General advice for transitioning into Theoretical AI Safety

Martín Soto · 15 Sep 2022 5:23 UTC
11 points · 0 comments · 10 min read · LW link

Sequencing Intro II: Adapters

jefftk · 15 Sep 2022 3:30 UTC
12 points · 0 comments · 2 min read · LW link
(www.jefftk.com)

[Question] How do I find tutors for obscure skills/subjects (i.e. fermi estimation tutors)

joraine · 15 Sep 2022 1:15 UTC
11 points · 2 comments · 1 min read · LW link

[Question] Forecasting thread: How does AI risk level vary based on timelines?

elifland · 14 Sep 2022 23:56 UTC
34 points · 7 comments · 1 min read · LW link

Coordinate-Free Interpretability Theory

johnswentworth · 14 Sep 2022 23:33 UTC
50 points · 16 comments · 5 min read · LW link

Progress links and tweets, 2022-09-14

jasoncrawford · 14 Sep 2022 23:21 UTC
9 points · 2 comments · 1 min read · LW link
(rootsofprogress.org)

Effective altruism in the garden of ends

Tyler Alterman · 14 Sep 2022 22:02 UTC
24 points · 1 comment · 27 min read · LW link

The problem with the media presentation of “believing in AI”

Roman Leventov · 14 Sep 2022 21:05 UTC
3 points · 0 comments · 1 min read · LW link

Seeing the Schema

vitaliya · 14 Sep 2022 20:45 UTC
23 points · 6 comments · 1 min read · LW link

Responding to ‘Beyond Hyperanthropomorphism’

ukc10014 · 14 Sep 2022 20:37 UTC
8 points · 0 comments · 16 min read · LW link

When is intent alignment sufficient or necessary to reduce AGI conflict?

14 Sep 2022 19:39 UTC
40 points · 0 comments · 9 min read · LW link

When would AGIs engage in conflict?

14 Sep 2022 19:38 UTC
52 points · 5 comments · 13 min read · LW link