Understanding “Deep Double Descent”

evhub · Dec 6, 2019, 12:00 AM
151 points
51 comments · 5 min read · LW link · 4 reviews

Coherent decisions imply consistent utilities

Eliezer Yudkowsky · May 12, 2019, 9:33 PM
149 points
83 comments · 26 min read · LW link · 3 reviews

Integrating disagreeing subagents

Kaj_Sotala · May 14, 2019, 2:06 PM
147 points
15 comments · 21 min read · LW link

The unexpected difficulty of comparing AlphaStar to humans

Richard Korzekwa · Sep 18, 2019, 2:20 AM
145 points
36 comments · 26 min read · LW link
(aiimpacts.org)

Subagents, akrasia, and coherence in humans

Kaj_Sotala · Mar 25, 2019, 2:24 PM
141 points
31 comments · 16 min read · LW link

The Curse Of The Counterfactual

pjeby · Nov 1, 2019, 6:34 PM
140 points
35 comments · 19 min read · LW link · 1 review

Unconscious Economics

Bird Concept · Feb 27, 2019, 12:58 PM
139 points
30 comments · 4 min read · LW link · 3 reviews

The Amish, and Strategic Norms around Technology

Raemon · Mar 24, 2019, 10:16 PM
138 points
18 comments · 3 min read · LW link · 2 reviews

The Relationship Between the Village and the Mission

Raemon · May 12, 2019, 9:09 PM
137 points
70 comments · 18 min read · LW link

Honoring Petrov Day on LessWrong, in 2019

Ben Pace · Sep 26, 2019, 9:10 AM
137 points
168 comments · 4 min read · LW link

Everybody Knows

Zvi · Jul 2, 2019, 12:20 PM
137 points
21 comments · 4 min read · LW link · 1 review
(thezvi.wordpress.com)

A mechanistic model of meditation

Kaj_Sotala · Nov 6, 2019, 9:37 PM
136 points
12 comments · 21 min read · LW link

The Real Rules Have No Exceptions

Said Achmiz · Jul 23, 2019, 3:38 AM
136 points
57 comments · 1 min read · LW link · 2 reviews

Propagating Facts into Aesthetics

Raemon · Dec 19, 2019, 4:09 AM
134 points
38 comments · 11 min read · LW link · 1 review

Disentangling arguments for the importance of AI safety

Richard_Ngo · Jan 21, 2019, 12:41 PM
133 points
23 comments · 8 min read · LW link

Blackmail

Zvi · Feb 19, 2019, 3:50 AM
133 points
55 comments · 16 min read · LW link · 2 reviews
(thezvi.wordpress.com)

The Forces of Blandness and the Disagreeable Majority

sarahconstantin · Apr 28, 2019, 7:44 PM
132 points
27 comments · 3 min read · LW link · 2 reviews
(srconstantin.wordpress.com)

Utility ≠ Reward

Vlad Mikulik · Sep 5, 2019, 5:28 PM
131 points
24 comments · 1 min read · LW link · 2 reviews

2019 AI Alignment Literature Review and Charity Comparison

Larks · Dec 19, 2019, 3:00 AM
130 points
18 comments · 62 min read · LW link

Firming Up Not-Lying Around Its Edge-Cases Is Less Broadly Useful Than One Might Initially Think

Zack_M_Davis · Dec 27, 2019, 5:09 AM
128 points
43 comments · 8 min read · LW link · 2 reviews

Thoughts on Human Models

Feb 21, 2019, 9:10 AM
127 points
32 comments · 10 min read · LW link · 1 review

The Main Sources of AI Risk?

Mar 21, 2019, 6:28 PM
126 points
26 comments · 2 min read · LW link

AI Safety “Success Stories”

Wei Dai · Sep 7, 2019, 2:54 AM
126 points
27 comments · 4 min read · LW link · 1 review

Sequence introduction: non-agent and multiagent models of mind

Kaj_Sotala · Jan 7, 2019, 2:12 PM
125 points
16 comments · 7 min read · LW link · 1 review

Writing children’s picture books

jessicata · Jun 25, 2019, 9:43 PM
125 points
22 comments · 5 min read · LW link
(unstableontology.com)

Where to Draw the Boundaries?

Zack_M_Davis · Apr 13, 2019, 9:34 PM
124 points
109 comments · 13 min read · LW link · 3 reviews

What Comes After Epistemic Spot Checks?

Elizabeth · Oct 22, 2019, 5:00 PM
122 points
9 comments · 3 min read · LW link
(acesounderglass.com)

Reframing Superintelligence: Comprehensive AI Services as General Intelligence

Rohin Shah · Jan 8, 2019, 7:12 AM
122 points
77 comments · 5 min read · LW link · 2 reviews
(www.fhi.ox.ac.uk)

Soft takeoff can still lead to decisive strategic advantage

Daniel Kokotajlo · Aug 23, 2019, 4:39 PM
122 points
47 comments · 8 min read · LW link · 4 reviews

Gears vs Behavior

johnswentworth · Sep 19, 2019, 6:50 AM
120 points
14 comments · 7 min read · LW link · 1 review

Deceptive Alignment

Jun 5, 2019, 8:16 PM
118 points
20 comments · 17 min read · LW link

The Hard Work of Translation (Buddhism)

romeostevensit · Apr 7, 2019, 9:04 PM
118 points
139 comments · 5 min read · LW link · 3 reviews

The Tale of Alice Almost: Strategies for Dealing With Pretty Good People

sarahconstantin · Feb 27, 2019, 7:34 PM
117 points
6 comments · 6 min read · LW link · 2 reviews
(srconstantin.wordpress.com)

The AI Timelines Scam

jessicata · Jul 11, 2019, 2:52 AM
117 points
108 comments · 7 min read · LW link · 3 reviews
(unstableontology.com)

Say Wrong Things

Gordon Seidoh Worley · May 24, 2019, 10:11 PM
116 points
13 comments · 4 min read · LW link

No nonsense version of the “racial algorithm bias”

Yuxi_Liu · Jul 13, 2019, 3:39 PM
115 points
20 comments · 2 min read · LW link

Introduction to Introduction to Category Theory

countedblessings · Oct 6, 2019, 2:43 PM
114 points
19 comments · 2 min read · LW link

Quotes from Moral Mazes

Zvi · May 30, 2019, 11:50 AM
114 points
27 comments · 53 min read · LW link
(thezvi.wordpress.com)

Subagents, trauma and rationality

Kaj_Sotala · Aug 14, 2019, 1:14 PM
113 points
4 comments · 19 min read · LW link

AlphaStar: Impressive for RL progress, not for AGI progress

orthonormal · Nov 2, 2019, 1:50 AM
113 points
58 comments · 2 min read · LW link · 1 review

S-Curves for Trend Forecasting

Matt Goldenberg · Jan 23, 2019, 6:17 PM
113 points
23 comments · 7 min read · LW link · 4 reviews

CO2 Stripper Postmortem Thoughts

Diffractor · Nov 30, 2019, 9:20 PM
113 points
37 comments · 8 min read · LW link

Complex Behavior from Simple (Sub)Agents

moridinamael · May 10, 2019, 9:44 PM
113 points
14 comments · 9 min read · LW link · 1 review

[Question] Where are people thinking and talking about global coordination for AI safety?

Wei Dai · May 22, 2019, 6:24 AM
112 points
22 comments · 1 min read · LW link

What is operations?

Swimmer963 (Miranda Dixon-Luinenburg) · Sep 26, 2019, 2:16 PM
112 points
9 comments · 7 min read · LW link

What I’ll be doing at MIRI

evhub · Nov 12, 2019, 11:19 PM
112 points
6 comments · 1 min read · LW link

System 2 as working-memory augmented System 1 reasoning

Kaj_Sotala · Sep 25, 2019, 8:39 AM
110 points
23 comments · 16 min read · LW link

Subagents, introspective awareness, and blending

Kaj_Sotala · Mar 2, 2019, 12:53 PM
110 points
19 comments · 9 min read · LW link

Announcing the Center for Applied Postrationality

Pee Doom · Apr 2, 2019, 1:17 AM
110 points
14 comments · 1 min read · LW link

Uber Self-Driving Crash

jefftk · Nov 7, 2019, 3:00 PM
109 points
1 comment · 2 min read · LW link
(www.jefftk.com)