Don’t align agents to evaluations of plans

TurnTrout · 26 Nov 2022 21:16 UTC
48 points
49 comments · 18 min read · LW link

[Question] What videos should Rational Animations make?

Writer · 26 Nov 2022 20:28 UTC
30 points
24 comments · 1 min read · LW link

The First Filter

26 Nov 2022 19:37 UTC
67 points
5 comments · 1 min read · LW link

Respecting your Local Preferences

Scott Garrabrant · 26 Nov 2022 19:04 UTC
74 points
1 comment · 4 min read · LW link

[Question] Opinions on the sleep synaptic homeostasis hypothesis?

Angela Pretorius · 26 Nov 2022 19:01 UTC
3 points
0 comments · 1 min read · LW link

Why square errors?

Aprillion · 26 Nov 2022 13:40 UTC
41 points
11 comments · 2 min read · LW link

[Question] Assuming that at least one religion is true, what would you expect it to be?

risedive · 26 Nov 2022 8:34 UTC
−9 points
9 comments · 1 min read · LW link

Three Alignment Schemas & Their Problems

Shoshannah Tekofsky · 26 Nov 2022 4:25 UTC
19 points
1 comment · 6 min read · LW link

The many types of blog posts

Adam Zerner · 26 Nov 2022 3:57 UTC
10 points
2 comments · 4 min read · LW link

New Frontiers in Mojibake

Adam Scherlis · 26 Nov 2022 2:37 UTC
60 points
7 comments · 6 min read · LW link · 1 review
(adam.scherlis.com)

Semi-conductor/AI Stock Discussion.

sapphire · 25 Nov 2022 23:35 UTC
28 points
25 comments · 1 min read · LW link

NEFFA Should Allow Small Children

jefftk · 25 Nov 2022 23:00 UTC
10 points
2 comments · 2 min read · LW link
(www.jefftk.com)

Podcast: Shoshannah Tekofsky on skilling up in AI safety, visiting Berkeley, and developing novel research ideas

Orpheus16 · 25 Nov 2022 20:47 UTC
37 points
2 comments · 9 min read · LW link

The man and the tool

pedroalvarado · 25 Nov 2022 19:51 UTC
−1 points
0 comments · 4 min read · LW link

[Question] What AI newsletters or substacks about AI do you recommend?

wunan · 25 Nov 2022 19:29 UTC
6 points
1 comment · 1 min read · LW link

Mechanistic anomaly detection and ELK

paulfchristiano · 25 Nov 2022 18:50 UTC
138 points
22 comments · 21 min read · LW link
(ai-alignment.com)

The Least Controversial Application of Geometric Rationality

Scott Garrabrant · 25 Nov 2022 16:50 UTC
60 points
22 comments · 4 min read · LW link

Planes are still decades away from displacing most bird jobs

guzey · 25 Nov 2022 16:49 UTC
169 points
13 comments · 3 min read · LW link

Take part in our giant study of cognitive abilities and get a customized report of your strengths and weaknesses!

spencerg · 25 Nov 2022 16:28 UTC
8 points
1 comment · 1 min read · LW link
(www.guidedtrack.com)

Guardian AI (Misaligned systems are all around us.)

Jessica Rumbelow · 25 Nov 2022 15:55 UTC
15 points
6 comments · 2 min read · LW link

Intuitions by ML researchers may get progressively worse concerning likely candidates for transformative AI

Viktor Rehnberg · 25 Nov 2022 15:49 UTC
7 points
0 comments · 2 min read · LW link

Refining the Sharp Left Turn threat model, part 2: applying alignment techniques

25 Nov 2022 14:36 UTC
39 points
9 comments · 6 min read · LW link
(vkrakovna.wordpress.com)

[Question] Who holds all the USDT?

ChristianKl · 25 Nov 2022 11:58 UTC
17 points
6 comments · 1 min read · LW link

Fair Collective Efficient Altruism

Jobst Heitzig · 25 Nov 2022 9:38 UTC
2 points
1 comment · 5 min read · LW link

[Question] If humanity one day discovers that it is a form of disease that threatens to destroy the universe, should it allow itself to be shut down?

Shmi · 25 Nov 2022 8:27 UTC
4 points
12 comments · 1 min read · LW link

Could a single alien message destroy us?

25 Nov 2022 7:32 UTC
61 points
23 comments · 6 min read · LW link
(youtu.be)

How do I start a programming career in the West?

Lao Mein · 25 Nov 2022 6:37 UTC
38 points
7 comments · 2 min read · LW link

The AI Safety community has four main work groups, Strategy, Governance, Technical and Movement Building

peterslattery · 25 Nov 2022 3:45 UTC
1 point
0 comments · 6 min read · LW link

Less Successful Cider Adventures

jefftk · 25 Nov 2022 1:50 UTC
11 points
1 comment · 1 min read · LW link
(www.jefftk.com)

Gliders in Language Models

Alexandre Variengien · 25 Nov 2022 0:38 UTC
30 points
11 comments · 10 min read · LW link

On Kelly and altruism

philh · 24 Nov 2022 23:40 UTC
17 points
6 comments · 12 min read · LW link
(reasonableapproximation.net)

Open technical problem: A Quinean proof of Löb’s theorem, for an easier cartoon guide

Andrew_Critch · 24 Nov 2022 21:16 UTC
58 points
35 comments · 3 min read · LW link · 1 review

[Question] Historical examples of people gaining unusual cognitive abilities?

Nicholas Kross · 24 Nov 2022 19:01 UTC
8 points
2 comments · 1 min read · LW link

Corrigibility Via Thought-Process Deference

Thane Ruthenis · 24 Nov 2022 17:06 UTC
18 points
5 comments · 9 min read · LW link

Geometric Exploration, Arithmetic Exploitation

Scott Garrabrant · 24 Nov 2022 15:36 UTC
126 points
5 comments · 7 min read · LW link

What I Learned Running Refine

adamShimi · 24 Nov 2022 14:49 UTC
108 points
5 comments · 4 min read · LW link

Covid 11/24/22: Thanks for Good Health

Zvi · 24 Nov 2022 13:00 UTC
26 points
4 comments · 8 min read · LW link
(thezvi.wordpress.com)

Clarifying wireheading terminology

leogao · 24 Nov 2022 4:53 UTC
67 points
6 comments · 1 min read · LW link

LW Beta Feature: Side-Comments

jimrandomh · 24 Nov 2022 1:55 UTC
103 points
47 comments · 1 min read · LW link

Against “Classic Style”

Cleo Nardo · 23 Nov 2022 22:10 UTC
69 points
31 comments · 4 min read · LW link

South Bay ACX/LW Meetup

IS · 23 Nov 2022 22:05 UTC
2 points
0 comments · 1 min read · LW link

Meme Dialects

jefftk · 23 Nov 2022 21:30 UTC
26 points
1 comment · 2 min read · LW link
(www.jefftk.com)

[Question] When do you visualize (or not) while doing math?

Alex_Altair · 23 Nov 2022 20:15 UTC
21 points
9 comments · 1 min read · LW link

When AI solves a game, focus on the game’s mechanics, not its theme.

Cleo Nardo · 23 Nov 2022 19:16 UTC
89 points
7 comments · 2 min read · LW link

The Geometric Expectation

Scott Garrabrant · 23 Nov 2022 18:05 UTC
168 points
22 comments · 4 min read · LW link

“Far Coordination”

DragonGod · 23 Nov 2022 17:14 UTC
6 points
17 comments · 9 min read · LW link

Conjecture Second Hiring Round

23 Nov 2022 17:11 UTC
92 points
0 comments · 1 min read · LW link

Conjecture: a retrospective after 8 months of work

23 Nov 2022 17:10 UTC
180 points
9 comments · 8 min read · LW link

Against a General Factor of Doom

Jeffrey Heninger · 23 Nov 2022 16:50 UTC
63 points
19 comments · 4 min read · LW link · 1 review
(aiimpacts.org)

Injecting some numbers into the AGI debate—by Boaz Barak

Jsevillamol · 23 Nov 2022 16:10 UTC
12 points
0 comments · 3 min read · LW link
(windowsontheory.org)