Toy Models of Superposition

evhub · 21 Sep 2022 23:48 UTC
68 points
4 comments · 5 min read · LW link · 1 review
(transformer-circuits.pub)

How to Train Your AGI Dragon

Oren Montano · 21 Sep 2022 22:28 UTC
−1 points
3 comments · 5 min read · LW link

An issue with MacAskill’s Evidentialist’s Wager

Martín Soto · 21 Sep 2022 22:02 UTC
1 point
9 comments · 4 min read · LW link

Announcing AISIC 2022 - the AI Safety Israel Conference, October 19-20

Davidmanheim · 21 Sep 2022 19:32 UTC
13 points
0 comments · 1 min read · LW link

Nearcast-based “deployment problem” analysis

HoldenKarnofsky · 21 Sep 2022 18:52 UTC
85 points
2 comments · 26 min read · LW link

Scraping training data for your mind

Henrik Karlsson · 21 Sep 2022 16:27 UTC
47 points
4 comments · 8 min read · LW link
(escapingflatland.substack.com)

Trends in Training Dataset Sizes

Pablo Villalobos · 21 Sep 2022 15:47 UTC
25 points
2 comments · 5 min read · LW link
(epochai.org)

[Question] Can you define “utility” in utilitarianism without using words for specific human emotions?

SurvivalBias · 21 Sep 2022 3:29 UTC
13 points
46 comments · 1 min read · LW link

“Infohazards” The ML Field’s Greatest Excuse.

Puffy Bird · 21 Sep 2022 3:19 UTC
−3 points
1 comment · 3 min read · LW link

Case Rates to Sequencing Reads

jefftk · 21 Sep 2022 2:00 UTC
15 points
4 comments · 4 min read · LW link
(www.jefftk.com)

Towards deconfusing wireheading and reward maximization

leogao · 21 Sep 2022 0:36 UTC
81 points
7 comments · 4 min read · LW link

[Question] What key nutrients are required for daily energy?

trevor · 20 Sep 2022 23:30 UTC
6 points
4 comments · 1 min read · LW link

Quantified Intuitions: An epistemics training website including a new EA-themed calibration app

20 Sep 2022 22:25 UTC
28 points
2 comments · 2 min read · LW link

The Redaction Machine

Ben · 20 Sep 2022 22:03 UTC
495 points
46 comments · 27 min read · LW link · 1 review

You Are Not Measuring What You Think You Are Measuring

johnswentworth · 20 Sep 2022 20:04 UTC
369 points
44 comments · 8 min read · LW link · 2 reviews

What happened to the idea of progress?

jasoncrawford · 20 Sep 2022 19:56 UTC
8 points
2 comments · 1 min read · LW link
(bigthink.com)

Features and Antifeatures

Davis_Kingsley · 20 Sep 2022 17:54 UTC
23 points
8 comments · 1 min read · LW link

Cryptocurrency Exploits Show the Importance of Proactive Policies for AI X-Risk

eSpencer · 20 Sep 2022 17:53 UTC
1 point
0 comments · 4 min read · LW link

Alignment Org Cheat Sheet

20 Sep 2022 17:36 UTC
69 points
8 comments · 4 min read · LW link

Doing oversight from the very start of training seems hard

peterbarnett · 20 Sep 2022 17:21 UTC
14 points
3 comments · 3 min read · LW link

$13,000 of prizes for changing our mind about who to fund (Clearer Thinking Regrants Forecasting Tournament)

spencerg · 20 Sep 2022 16:06 UTC
14 points
3 comments · 1 min read · LW link
(manifold.markets)

Progress links and tweets, 2022-09-20

jasoncrawford · 20 Sep 2022 14:07 UTC
7 points
1 comment · 1 min read · LW link
(rootsofprogress.org)

[Question] If we have Human-level chatbots, won’t we end up being ruled by possible people?

Erlja Jkdf. · 20 Sep 2022 13:59 UTC
5 points
13 comments · 1 min read · LW link

Twitter Polls: Evidence is Evidence

Zvi · 20 Sep 2022 12:30 UTC
34 points
8 comments · 7 min read · LW link
(thezvi.wordpress.com)

Some of the most important entrepreneurship skills are tacit knowledge

Ruhul · 20 Sep 2022 12:06 UTC
20 points
0 comments · 7 min read · LW link

Character alignment

p.b. · 20 Sep 2022 8:27 UTC
22 points
0 comments · 2 min read · LW link

Losing the root for the tree

Adam Zerner · 20 Sep 2022 4:53 UTC
466 points
30 comments · 9 min read · LW link · 1 review

How to make your CPU as fast as a GPU—Advances in Sparsity w/ Nir Shavit

the gears to ascension · 20 Sep 2022 3:48 UTC
2 points
0 comments · 27 min read · LW link
(www.youtube.com)

Failed Adventures in Delay

jefftk · 20 Sep 2022 2:20 UTC
8 points
0 comments · 2 min read · LW link
(www.jefftk.com)

Gene drives: why the wait?

Metacelsus · 19 Sep 2022 23:37 UTC
121 points
50 comments · 3 min read · LW link
(denovo.substack.com)

Prize idea: Transmit MIRI and Eliezer’s worldviews

elifland · 19 Sep 2022 21:21 UTC
47 points
18 comments · 2 min read · LW link

Rationality Dojo Berlin Handout

UnplannedCauliflower · 19 Sep 2022 20:11 UTC
19 points
0 comments · 7 min read · LW link

A noob goes to the SERI MATS presentations

Lowell Dennings · 19 Sep 2022 17:35 UTC
27 points
0 comments · 5 min read · LW link

Do bamboos set themselves on fire?

Malmesbury · 19 Sep 2022 15:34 UTC
170 points
14 comments · 6 min read · LW link · 1 review

Cambridge LW Meetup: Authentic Relating Games

Tony Wang · 19 Sep 2022 14:51 UTC
1 point
0 comments · 1 min read · LW link

PIBBSS (AI alignment) is hiring for a Project Manager

Nora_Ammann · 19 Sep 2022 13:54 UTC
9 points
0 comments · 1 min read · LW link

Quintin’s alignment papers roundup—week 2

Quintin Pope · 19 Sep 2022 13:41 UTC
67 points
2 comments · 10 min read · LW link

Some notes on solving hard problems

Joe Rocca · 19 Sep 2022 12:58 UTC
50 points
8 comments · 29 min read · LW link

Safety timelines: How long will it take to solve alignment?

19 Sep 2022 12:53 UTC
37 points
7 comments · 6 min read · LW link
(forum.effectivealtruism.org)

Belgrade, Serbia—LW Meetup

игорь тимофеев · 19 Sep 2022 12:47 UTC
3 points
0 comments · 1 min read · LW link

The ELK Framing I’ve Used

sudo · 19 Sep 2022 10:28 UTC
5 points
1 comment · 1 min read · LW link

Quick Book Review: Crucial Conversations

Gordon Seidoh Worley · 19 Sep 2022 6:25 UTC
28 points
2 comments · 2 min read · LW link

How my team at Lightcone sometimes gets stuff done

jacobjacob · 19 Sep 2022 5:47 UTC
191 points
43 comments · 7 min read · LW link · 1 review

EA & LW Forums Weekly Summary (12 − 18 Sep ’22)

Zoe Williams · 19 Sep 2022 5:08 UTC
11 points
0 comments · 13 min read · LW link

Book Swap

Screwtape · 19 Sep 2022 2:33 UTC
11 points
0 comments · 2 min read · LW link

Pretending not to Notice

jefftk · 19 Sep 2022 2:30 UTC
46 points
12 comments · 2 min read · LW link
(www.jefftk.com)

[To Be Revised] Perhaps the Meaning of Life, An Adventure in Pluralistic Morality

NoBadCake · 18 Sep 2022 22:37 UTC
−5 points
3 comments · 4 min read · LW link

Leveraging Legal Informatics to Align AI

John Nay · 18 Sep 2022 20:39 UTC
11 points
0 comments · 3 min read · LW link
(forum.effectivealtruism.org)

The Inter-Agent Facet of AI Alignment

Michael Oesterle · 18 Sep 2022 20:39 UTC
12 points
1 comment · 5 min read · LW link

Biden should be applauded for appointing Renee Wegrzyn for ARPA-H

ChristianKl · 18 Sep 2022 19:57 UTC
34 points
0 comments · 2 min read · LW link