Best of LessWrong
Tag
Last edit: 9 Feb 2023 2:01 UTC by Raemon
Why did everything take so long?
KatjaGrace
29 Dec 2017 1:00 UTC
33
points
17
comments
1
min read
LW
link
(meteuphoric.wordpress.com)
The Loudest Alarm Is Probably False
orthonormal
2 Jan 2018 16:38 UTC
171
points
28
comments
2
min read
LW
link
1
review
Babble
alkjash
10 Jan 2018 21:56 UTC
195
points
32
comments
5
min read
LW
link
2
reviews
(radimentary.wordpress.com)
Prune
alkjash
12 Jan 2018 22:50 UTC
68
points
10
comments
4
min read
LW
link
(radimentary.wordpress.com)
An Untrollable Mathematician
abramdemski
23 Jan 2018 18:46 UTC
23
points
1
comment
3
min read
LW
link
Robustness to Scale
Scott Garrabrant
21 Feb 2018 22:55 UTC
128
points
23
comments
2
min read
LW
link
1
review
The Intelligent Social Web
Valentine
22 Feb 2018 18:55 UTC
224
points
112
comments
12
min read
LW
link
2
reviews
Arguments about fast takeoff
paulfchristiano
25 Feb 2018 4:53 UTC
89
points
65
comments
2
min read
LW
link
1
review
(sideways-view.com)
My attempt to explain Looking, insight meditation, and enlightenment in non-mysterious terms
Kaj_Sotala
8 Mar 2018 7:37 UTC
223
points
131
comments
17
min read
LW
link
2
reviews
On the Loss and Preservation of Knowledge
Samo Burja
8 Mar 2018 18:40 UTC
66
points
20
comments
10
min read
LW
link
(medium.com)
The Costly Coordination Mechanism of Common Knowledge
Ben Pace
15 Mar 2018 20:20 UTC
194
points
31
comments
19
min read
LW
link
2
reviews
Naming the Nameless
sarahconstantin
22 Mar 2018 0:35 UTC
119
points
43
comments
13
min read
LW
link
3
reviews
A Sketch of Good Communication
Ben Pace
31 Mar 2018 22:48 UTC
185
points
35
comments
3
min read
LW
link
1
review
Specification gaming examples in AI
Vika
3 Apr 2018 12:30 UTC
45
points
9
comments
1
min read
LW
link
2
reviews
Local Validity as a Key to Sanity and Civilization
Eliezer Yudkowsky
7 Apr 2018 4:25 UTC
193
points
67
comments
13
min read
LW
link
5
reviews
A voting theory primer for rationalists
Jameson Quinn
12 Apr 2018 15:15 UTC
229
points
98
comments
17
min read
LW
link
2
reviews
Noticing the Taste of Lotus
Valentine
27 Apr 2018 20:05 UTC
203
points
81
comments
3
min read
LW
link
3
reviews
Research: Rescuers during the Holocaust
Martin Sustrik
30 Apr 2018 6:15 UTC
88
points
10
comments
9
min read
LW
link
1
review
Open question: are minimal circuits daemon-free?
paulfchristiano
5 May 2018 22:40 UTC
83
points
70
comments
2
min read
LW
link
1
review
Varieties Of Argumentative Experience
Scott Alexander
8 May 2018 8:20 UTC
93
points
11
comments
18
min read
LW
link
2
reviews
(slatestarcodex.com)
Challenges to Christiano’s capability amplification proposal
Eliezer Yudkowsky
19 May 2018 18:18 UTC
124
points
54
comments
23
min read
LW
link
1
review
Inadequate Equilibria vs. Governance of the Commons
Martin Sustrik
25 May 2018 13:17 UTC
182
points
17
comments
14
min read
LW
link
2
reviews
Meta-Honesty: Firming Up Honesty Around Its Edge-Cases
Eliezer Yudkowsky
29 May 2018 0:59 UTC
134
points
152
comments
27
min read
LW
link
4
reviews
Toolbox-thinking and Law-thinking
Eliezer Yudkowsky
31 May 2018 21:28 UTC
160
points
49
comments
12
min read
LW
link
Beyond Astronomical Waste
Wei Dai
7 Jun 2018 21:04 UTC
125
points
41
comments
3
min read
LW
link
Paul’s research agenda FAQ
zhukeepa
1 Jul 2018 6:25 UTC
126
points
74
comments
19
min read
LW
link
1
review
Prediction Markets: When Do They Work?
Zvi
26 Jul 2018 12:30 UTC
162
points
17
comments
10
min read
LW
link
(thezvi.wordpress.com)
Historical mathematicians exhibit a birth order effect too
Eli Tyre
21 Aug 2018 1:52 UTC
141
points
19
comments
6
min read
LW
link
2
reviews
Birth order effect found in Nobel Laureates in Physics
Bucky
4 Sep 2018 12:17 UTC
61
points
25
comments
5
min read
LW
link
1
review
Towards a New Impact Measure
TurnTrout
18 Sep 2018 17:21 UTC
100
points
159
comments
33
min read
LW
link
2
reviews
The Tails Coming Apart As Metaphor For Life
Scott Alexander
25 Sep 2018 19:10 UTC
155
points
38
comments
7
min read
LW
link
4
reviews
(slatestarcodex.com)
Anti-social Punishment
Martin Sustrik
27 Sep 2018 7:08 UTC
296
points
66
comments
6
min read
LW
link
3
reviews
The Rocket Alignment Problem
Eliezer Yudkowsky
4 Oct 2018 0:38 UTC
216
points
41
comments
15
min read
LW
link
2
reviews
Being a Robust Agent
Raemon
18 Oct 2018 7:00 UTC
145
points
32
comments
7
min read
LW
link
2
reviews
Embedded Agents
abramdemski
and
Scott Garrabrant
29 Oct 2018 19:53 UTC
221
points
41
comments
1
min read
LW
link
2
reviews
Is Clickbait Destroying Our General Intelligence?
Eliezer Yudkowsky
16 Nov 2018 23:06 UTC
189
points
61
comments
5
min read
LW
link
2
reviews
Is Science Slowing Down?
Scott Alexander
27 Nov 2018 3:30 UTC
125
points
77
comments
9
min read
LW
link
1
review
(slatestarcodex.com)
Coherence arguments do not entail goal-directed behavior
Rohin Shah
3 Dec 2018 3:26 UTC
123
points
69
comments
7
min read
LW
link
3
reviews
The Pavlov Strategy
sarahconstantin
20 Dec 2018 16:20 UTC
247
points
13
comments
4
min read
LW
link
(srconstantin.wordpress.com)
Spaghetti Towers
eukaryote
22 Dec 2018 5:29 UTC
187
points
28
comments
3
min read
LW
link
1
review
(eukaryotewritesblog.com)
[Question]
What makes people intellectually active?
abramdemski
29 Dec 2018 22:29 UTC
116
points
71
comments
1
min read
LW
link
Reframing Superintelligence: Comprehensive AI Services as General Intelligence
Rohin Shah
8 Jan 2019 7:12 UTC
121
points
77
comments
5
min read
LW
link
2
reviews
(www.fhi.ox.ac.uk)
Building up to an Internal Family Systems model
Kaj_Sotala
26 Jan 2019 12:25 UTC
264
points
86
comments
28
min read
LW
link
2
reviews
“Other people are wrong” vs “I am right”
Buck
22 Feb 2019 20:01 UTC
246
points
20
comments
9
min read
LW
link
2
reviews
Rule Thinkers In, Not Out
Scott Alexander
27 Feb 2019 2:40 UTC
221
points
67
comments
4
min read
LW
link
4
reviews
(slatestarcodex.com)
Unconscious Economics
jacobjacob
27 Feb 2019 12:58 UTC
136
points
30
comments
4
min read
LW
link
3
reviews
In My Culture
[DEACTIVATED] Duncan Sabien
7 Mar 2019 7:22 UTC
66
points
59
comments
1
min read
LW
link
2
reviews
(medium.com)
Alignment Research Field Guide
abramdemski
8 Mar 2019 19:57 UTC
264
points
9
comments
17
min read
LW
link
2
reviews
You Get About Five Words
Raemon
12 Mar 2019 20:30 UTC
199
points
76
comments
1
min read
LW
link
6
reviews
Literature Review: Distributed Teams
Elizabeth
16 Apr 2019 1:19 UTC
106
points
37
comments
6
min read
LW
link
1
review
Asymmetric Justice
Zvi
25 Apr 2019 16:00 UTC
230
points
101
comments
5
min read
LW
link
2
reviews
(thezvi.wordpress.com)
Coherent decisions imply consistent utilities
Eliezer Yudkowsky
12 May 2019 21:33 UTC
148
points
81
comments
26
min read
LW
link
3
reviews
Yes Requires the Possibility of No
Scott Garrabrant
17 May 2019 22:39 UTC
261
points
55
comments
2
min read
LW
link
2
reviews
Book Review: The Secret Of Our Success
Scott Alexander
5 Jun 2019 6:50 UTC
158
points
19
comments
25
min read
LW
link
2
reviews
(slatestarcodex.com)
Steelmanning Divination
Vaniver
5 Jun 2019 22:53 UTC
191
points
48
comments
6
min read
LW
link
2
reviews
The Schelling Choice is “Rabbit”, not “Stag”
Raemon
8 Jun 2019 0:24 UTC
157
points
52
comments
12
min read
LW
link
3
reviews
Mistakes with Conservation of Expected Evidence
abramdemski
8 Jun 2019 23:07 UTC
212
points
25
comments
12
min read
LW
link
1
review
Reason isn’t magic
Benquo
18 Jun 2019 4:04 UTC
152
points
19
comments
2
min read
LW
link
3
reviews
(benjaminrosshoffman.com)
Being the (Pareto) Best in the World
johnswentworth
24 Jun 2019 18:36 UTC
402
points
57
comments
3
min read
LW
link
3
reviews
Do you fear the rock or the hard place?
Ruby
20 Jul 2019 22:01 UTC
72
points
10
comments
5
min read
LW
link
3
reviews
Forum participation as a research strategy
Wei Dai
30 Jul 2019 18:09 UTC
151
points
45
comments
3
min read
LW
link
1
review
How to Ignore Your Emotions (while also thinking you’re awesome at emotions)
Hazard
31 Jul 2019 13:34 UTC
351
points
74
comments
4
min read
LW
link
4
reviews
Power Buys You Distance From The Crime
Elizabeth
2 Aug 2019 20:50 UTC
189
points
75
comments
7
min read
LW
link
1
review
(acesounderglass.com)
Gears vs Behavior
johnswentworth
19 Sep 2019 6:50 UTC
107
points
13
comments
7
min read
LW
link
1
review
Heads I Win, Tails?—Never Heard of Her; Or, Selective Reporting and the Tragedy of the Green Rationalists
Zack_M_Davis
24 Sep 2019 4:12 UTC
299
points
40
comments
8
min read
LW
link
2
reviews
Noticing Frame Differences
Raemon
30 Sep 2019 1:24 UTC
208
points
39
comments
9
min read
LW
link
2
reviews
Book summary: Unlocking the Emotional Brain
Kaj_Sotala
8 Oct 2019 19:11 UTC
316
points
48
comments
21
min read
LW
link
3
reviews
Chris Olah’s views on AGI safety
evhub
1 Nov 2019 20:13 UTC
206
points
38
comments
12
min read
LW
link
2
reviews
Book Review: Design Principles of Biological Circuits
johnswentworth
5 Nov 2019 6:49 UTC
209
points
24
comments
12
min read
LW
link
1
review
The Credit Assignment Problem
abramdemski
8 Nov 2019 2:50 UTC
98
points
40
comments
17
min read
LW
link
1
review
Evolution of Modularity
johnswentworth
14 Nov 2019 6:49 UTC
174
points
12
comments
2
min read
LW
link
1
review
Gears-Level Models are Capital Investments
johnswentworth
22 Nov 2019 22:41 UTC
170
points
28
comments
7
min read
LW
link
1
review
Mental Mountains
Scott Alexander
27 Nov 2019 5:30 UTC
144
points
14
comments
15
min read
LW
link
1
review
(slatestarcodex.com)
Paper-Reading for Gears
johnswentworth
4 Dec 2019 21:02 UTC
159
points
6
comments
4
min read
LW
link
1
review
Seeking Power is Often Convergently Instrumental in MDPs
TurnTrout
and
Logan Riggs
5 Dec 2019 2:33 UTC
162
points
39
comments
17
min read
LW
link
2
reviews
(arxiv.org)
Understanding “Deep Double Descent”
evhub
6 Dec 2019 0:00 UTC
148
points
51
comments
5
min read
LW
link
4
reviews
Propagating Facts into Aesthetics
Raemon
19 Dec 2019 4:09 UTC
109
points
35
comments
11
min read
LW
link
1
review
What cognitive biases feel like from the inside
chaosmage
3 Jan 2020 14:24 UTC
249
points
32
comments
4
min read
LW
link
CFAR Participant Handbook now available to all
[DEACTIVATED] Duncan Sabien
3 Jan 2020 15:43 UTC
248
points
40
comments
1
min read
LW
link
2
reviews
Reality-Revealing and Reality-Masking Puzzles
AnnaSalamon
16 Jan 2020 16:15 UTC
258
points
57
comments
13
min read
LW
link
1
review
The Road to Mazedom
Zvi
18 Jan 2020 14:10 UTC
94
points
25
comments
7
min read
LW
link
2
reviews
(thezvi.wordpress.com)
Coordination as a Scarce Resource
johnswentworth
25 Jan 2020 23:32 UTC
231
points
22
comments
4
min read
LW
link
2
reviews
What Money Cannot Buy
johnswentworth
1 Feb 2020 20:11 UTC
318
points
49
comments
4
min read
LW
link
1
review
Seeing the Smoke
Jacob Falkovich
28 Feb 2020 18:26 UTC
198
points
29
comments
5
min read
LW
link
1
review
Cortés, Pizarro, and Afonso as Precedents for Takeover
Daniel Kokotajlo
1 Mar 2020 3:49 UTC
168
points
78
comments
11
min read
LW
link
1
review
Interfaces as a Scarce Resource
johnswentworth
5 Mar 2020 18:20 UTC
187
points
15
comments
12
min read
LW
link
1
review
Credibility of the CDC on SARS-CoV-2
Elizabeth
and
jimrandomh
7 Mar 2020 2:00 UTC
226
points
119
comments
6
min read
LW
link
1
review
Can crimes be discussed literally?
Benquo
22 Mar 2020 20:17 UTC
102
points
38
comments
2
min read
LW
link
3
reviews
(benjaminrosshoffman.com)
Transportation as a Constraint
johnswentworth
6 Apr 2020 4:58 UTC
176
points
32
comments
6
min read
LW
link
1
review
Choosing the Zero Point
orthonormal
6 Apr 2020 23:44 UTC
170
points
24
comments
3
min read
LW
link
2
reviews
An Orthodox Case Against Utility Functions
abramdemski
7 Apr 2020 19:18 UTC
152
points
65
comments
8
min read
LW
link
2
reviews
Discontinuous progress in history: an update
KatjaGrace
14 Apr 2020 0:00 UTC
186
points
25
comments
31
min read
LW
link
1
review
(aiimpacts.org)
How uniform is the neocortex?
zhukeepa
4 May 2020 2:16 UTC
79
points
23
comments
11
min read
LW
link
1
review
A non-mystical explanation of “no-self” (three characteristics series)
Kaj_Sotala
8 May 2020 10:37 UTC
105
points
65
comments
20
min read
LW
link
1
review
Studies On Slack
Scott Alexander
13 May 2020 5:00 UTC
151
points
34
comments
24
min read
LW
link
1
review
(slatestarcodex.com)
An overview of 11 proposals for building safe advanced AI
evhub
29 May 2020 20:38 UTC
205
points
36
comments
38
min read
LW
link
2
reviews
Covid-19: My Current Model
Zvi
31 May 2020 17:40 UTC
188
points
74
comments
19
min read
LW
link
1
review
(thezvi.wordpress.com)
Inaccessible information
paulfchristiano
3 Jun 2020 5:10 UTC
83
points
17
comments
14
min read
LW
link
2
reviews
(ai-alignment.com)
Simulacra Levels and their Interactions
Zvi
15 Jun 2020 13:10 UTC
197
points
50
comments
17
min read
LW
link
1
review
(thezvi.wordpress.com)
The ground of optimization
Alex Flint
20 Jun 2020 0:38 UTC
245
points
80
comments
27
min read
LW
link
1
review
Swiss Political System: More than You ever Wanted to Know (I.)
Martin Sustrik
19 Jul 2020 1:11 UTC
171
points
39
comments
24
min read
LW
link
2
reviews
“Can you keep this confidential? How do you know?”
Raemon
21 Jul 2020 0:33 UTC
159
points
41
comments
3
min read
LW
link
2
reviews
Inner Alignment: Explain like I’m 12 Edition
Rafael Harth
1 Aug 2020 15:24 UTC
179
points
46
comments
13
min read
LW
link
2
reviews
Alignment By Default
johnswentworth
12 Aug 2020 18:54 UTC
173
points
94
comments
11
min read
LW
link
2
reviews
Search versus design
Alex Flint
16 Aug 2020 16:53 UTC
100
points
40
comments
36
min read
LW
link
1
review
Why haven’t we celebrated any major achievements lately?
jasoncrawford
17 Aug 2020 20:34 UTC
194
points
69
comments
12
min read
LW
link
2
reviews
(rootsofprogress.org)
Radical Probabilism
abramdemski
18 Aug 2020 21:14 UTC
176
points
47
comments
35
min read
LW
link
1
review
Introduction To The Infra-Bayesianism Sequence
Diffractor
and
Vanessa Kosoy
26 Aug 2020 20:31 UTC
108
points
62
comments
14
min read
LW
link
2
reviews
microCOVID.org: A tool to estimate COVID risk from common activities
catherio
29 Aug 2020 23:01 UTC
169
points
36
comments
1
min read
LW
link
1
review
(microcovid.org)
My computational framework for the brain
Steven Byrnes
14 Sep 2020 14:19 UTC
150
points
26
comments
13
min read
LW
link
1
review
Most Prisoner’s Dilemmas are Stag Hunts; Most Stag Hunts are Schelling Problems
abramdemski
14 Sep 2020 22:13 UTC
177
points
36
comments
10
min read
LW
link
3
reviews
Draft report on AI timelines
Ajeya Cotra
18 Sep 2020 23:47 UTC
214
points
56
comments
1
min read
LW
link
1
review
AGI safety from first principles: Introduction
Richard_Ngo
28 Sep 2020 19:53 UTC
121
points
18
comments
2
min read
LW
link
1
review
The Felt Sense: What, Why and How
Kaj_Sotala
5 Oct 2020 15:57 UTC
149
points
23
comments
14
min read
LW
link
1
review
The Alignment Problem: Machine Learning and Human Values
Rohin Shah
6 Oct 2020 17:41 UTC
120
points
7
comments
6
min read
LW
link
1
review
(www.amazon.com)
The Treacherous Path to Rationality
Jacob Falkovich
9 Oct 2020 15:34 UTC
204
points
115
comments
11
min read
LW
link
1
review
The Solomonoff Prior is Malign
Mark Xu
14 Oct 2020 1:33 UTC
168
points
52
comments
16
min read
LW
link
3
reviews
The date of AI Takeover is not the day the AI takes over
Daniel Kokotajlo
22 Oct 2020 10:41 UTC
145
points
32
comments
2
min read
LW
link
1
review
Introduction to Cartesian Frames
Scott Garrabrant
22 Oct 2020 13:00 UTC
153
points
32
comments
22
min read
LW
link
1
review
Is Success the Enemy of Freedom? (Full)
alkjash
26 Oct 2020 20:25 UTC
291
points
68
comments
9
min read
LW
link
1
review
(radimentary.wordpress.com)
When Money Is Abundant, Knowledge Is The Real Wealth
johnswentworth
3 Nov 2020 17:34 UTC
317
points
61
comments
5
min read
LW
link
3
reviews
Nuclear war is unlikely to cause human extinction
Jeffrey Ladish
7 Nov 2020 5:42 UTC
124
points
47
comments
11
min read
LW
link
3
reviews
The Pointers Problem: Human Values Are A Function Of Humans’ Latent Variables
johnswentworth
18 Nov 2020 17:47 UTC
123
points
49
comments
11
min read
LW
link
2
reviews
Some AI research areas and their relevance to existential safety
Andrew_Critch
19 Nov 2020 3:18 UTC
204
points
37
comments
50
min read
LW
link
2
reviews
Pain is not the unit of Effort
alkjash
24 Nov 2020 20:00 UTC
517
points
89
comments
5
min read
LW
link
2
reviews
(radimentary.wordpress.com)
To listen well, get curious
benkuhn
13 Dec 2020 0:20 UTC
351
points
37
comments
4
min read
LW
link
1
review
(www.benkuhn.net)
Motive Ambiguity
Zvi
15 Dec 2020 18:10 UTC
172
points
58
comments
4
min read
LW
link
2
reviews
(thezvi.wordpress.com)
The First Sample Gives the Most Information
Mark Xu
24 Dec 2020 20:39 UTC
132
points
16
comments
1
min read
LW
link
1
review
(markxu.com)
Why Neural Networks Generalise, and Why They Are (Kind of) Bayesian
Joar Skalse
29 Dec 2020 13:33 UTC
74
points
58
comments
1
min read
LW
link
1
review
Against GDP as a metric for timelines and takeoff speeds
Daniel Kokotajlo
29 Dec 2020 17:42 UTC
140
points
19
comments
14
min read
LW
link
1
review
Anti-Aging: State of the Art
JackH
31 Dec 2020 19:07 UTC
371
points
176
comments
11
min read
LW
link
1
review
Cryonics signup guide #1: Overview
mingyuan
6 Jan 2021 0:25 UTC
150
points
33
comments
6
min read
LW
link
1
review
Science in a High-Dimensional World
johnswentworth
8 Jan 2021 17:52 UTC
285
points
53
comments
7
min read
LW
link
1
review
Leaky Delegation: You are not a Commodity
Darmani
25 Jan 2021 2:04 UTC
297
points
34
comments
15
min read
LW
link
1
review
Simulacrum 3 As Stag-Hunt Strategy
johnswentworth
26 Jan 2021 19:40 UTC
179
points
37
comments
4
min read
LW
link
3
reviews
Catching the Spark
LoganStrohl
30 Jan 2021 23:23 UTC
111
points
21
comments
36
min read
LW
link
1
review
Elephant seal 2
KatjaGrace
2 Feb 2021 9:40 UTC
57
points
5
comments
1
min read
LW
link
2
reviews
(worldspiritsockpuppet.com)
Making Vaccine
johnswentworth
3 Feb 2021 20:24 UTC
574
points
249
comments
6
min read
LW
link
3
reviews
Your Cheerful Price
Eliezer Yudkowsky
13 Feb 2021 5:41 UTC
262
points
82
comments
17
min read
LW
link
6
reviews
“PR” is corrosive; “reputation” is not.
AnnaSalamon
14 Feb 2021 3:32 UTC
307
points
93
comments
2
min read
LW
link
3
reviews
Utility Maximization = Description Length Minimization
johnswentworth
18 Feb 2021 18:04 UTC
208
points
44
comments
5
min read
LW
link
Fun with +12 OOMs of Compute
Daniel Kokotajlo
1 Mar 2021 13:30 UTC
224
points
86
comments
12
min read
LW
link
2
reviews
Seven Years of Spaced Repetition Software in the Classroom
tanagrabeast
4 Mar 2021 2:42 UTC
265
points
38
comments
34
min read
LW
link
1
review
Trapped Priors As A Basic Problem Of Rationality
Scott Alexander
12 Mar 2021 20:02 UTC
141
points
32
comments
14
min read
LW
link
3
reviews
Strong Evidence is Common
Mark Xu
13 Mar 2021 22:04 UTC
244
points
49
comments
1
min read
LW
link
4
reviews
(markxu.com)
Jean Monnet: The Guerilla Bureaucrat
Martin Sustrik
20 Mar 2021 10:37 UTC
175
points
25
comments
18
min read
LW
link
1
review
My research methodology
paulfchristiano
22 Mar 2021 21:20 UTC
159
points
38
comments
16
min read
LW
link
1
review
(ai-alignment.com)
Rationalism before the Sequences
Eric Raymond
30 Mar 2021 14:04 UTC
581
points
81
comments
11
min read
LW
link
2
reviews
What Multipolar Failure Looks Like, and Robust Agent-Agnostic Processes (RAAPs)
Andrew_Critch
31 Mar 2021 23:50 UTC
272
points
64
comments
22
min read
LW
link
1
review
Notes from “Don’t Shoot the Dog”
juliawise
2 Apr 2021 16:34 UTC
244
points
11
comments
12
min read
LW
link
1
review
Another (outer) alignment failure story
paulfchristiano
7 Apr 2021 20:12 UTC
241
points
38
comments
12
min read
LW
link
1
review
Highlights from The Autobiography of Andrew Carnegie
jasoncrawford
8 Apr 2021 22:03 UTC
92
points
9
comments
19
min read
LW
link
1
review
(rootsofprogress.org)
Specializing in Problems We Don’t Understand
johnswentworth
10 Apr 2021 22:40 UTC
159
points
29
comments
8
min read
LW
link
1
review
There’s no such thing as a tree (phylogenetically)
eukaryote
3 May 2021 3:47 UTC
333
points
58
comments
8
min read
LW
link
2
reviews
(eukaryotewritesblog.com)
Saving Time
Scott Garrabrant
18 May 2021 20:11 UTC
156
points
20
comments
4
min read
LW
link
1
review
Finite Factored Sets
Scott Garrabrant
23 May 2021 20:52 UTC
146
points
95
comments
24
min read
LW
link
1
review
Taboo “Outside View”
Daniel Kokotajlo
17 Jun 2021 9:36 UTC
348
points
33
comments
8
min read
LW
link
3
reviews
The Point of Trade
Eliezer Yudkowsky
22 Jun 2021 17:56 UTC
171
points
76
comments
4
min read
LW
link
1
review
Frequent arguments about alignment
John Schulman
23 Jun 2021 0:46 UTC
99
points
17
comments
5
min read
LW
link
Slack Has Positive Externalities For Groups
johnswentworth
29 Jul 2021 15:03 UTC
90
points
11
comments
5
min read
LW
link
2
reviews
What 2026 looks like
Daniel Kokotajlo
6 Aug 2021 16:14 UTC
473
points
150
comments
16
min read
LW
link
1
review
The Death of Behavioral Economics
habryka
22 Aug 2021 22:39 UTC
153
points
24
comments
1
min read
LW
link
2
reviews
(www.thebehavioralscientist.com)
How To Write Quickly While Maintaining Epistemic Rigor
johnswentworth
28 Aug 2021 17:52 UTC
428
points
38
comments
4
min read
LW
link
3
reviews
Grokking the Intentional Stance
jbkjr
31 Aug 2021 15:49 UTC
43
points
22
comments
20
min read
LW
link
1
review
All Possible Views About Humanity’s Future Are Wild
HoldenKarnofsky
3 Sep 2021 20:19 UTC
146
points
37
comments
8
min read
LW
link
1
review
How factories were made safe
jasoncrawford
12 Sep 2021 19:58 UTC
181
points
46
comments
18
min read
LW
link
1
review
(rootsofprogress.org)
This Can’t Go On
HoldenKarnofsky
18 Sep 2021 23:50 UTC
73
points
55
comments
7
min read
LW
link
2
reviews
Selection Theorems: A Program For Understanding Agents
johnswentworth
28 Sep 2021 5:03 UTC
123
points
28
comments
6
min read
LW
link
2
reviews
What Do GDP Growth Curves Really Mean?
johnswentworth
7 Oct 2021 21:58 UTC
219
points
64
comments
8
min read
LW
link
2
reviews
Shoulder Advisors 101
[DEACTIVATED] Duncan Sabien
9 Oct 2021 5:30 UTC
193
points
124
comments
14
min read
LW
link
2
reviews
Cup-Stacking Skills (or, Reflexive Involuntary Mental Motions)
[DEACTIVATED] Duncan Sabien
11 Oct 2021 7:16 UTC
117
points
36
comments
7
min read
LW
link
2
reviews
Lies, Damn Lies, and Fabricated Options
[DEACTIVATED] Duncan Sabien
17 Oct 2021 2:47 UTC
288
points
132
comments
14
min read
LW
link
2
reviews
Self-Integrity and the Drowning Child
Eliezer Yudkowsky
24 Oct 2021 20:57 UTC
329
points
85
comments
5
min read
LW
link
1
review
Ruling Out Everything Else
[DEACTIVATED] Duncan Sabien
27 Oct 2021 21:50 UTC
190
points
51
comments
21
min read
LW
link
2
reviews
Feature Selection
Zack_M_Davis
1 Nov 2021 0:22 UTC
315
points
24
comments
16
min read
LW
link
1
review
Comments on Carlsmith’s “Is power-seeking AI an existential risk?”
So8res
13 Nov 2021 4:29 UTC
138
points
14
comments
40
min read
LW
link
1
review
You are probably underestimating how good self-love can be
Charlie Rogers-Smith
14 Nov 2021 0:41 UTC
145
points
19
comments
12
min read
LW
link
1
review
Ngo and Yudkowsky on alignment difficulty
Eliezer Yudkowsky
and
Richard_Ngo
15 Nov 2021 20:31 UTC
250
points
148
comments
99
min read
LW
link
1
review
Split and Commit
[DEACTIVATED] Duncan Sabien
21 Nov 2021 6:27 UTC
178
points
33
comments
7
min read
LW
link
1
review
EfficientZero: How It Works
1a3orn
26 Nov 2021 15:17 UTC
292
points
50
comments
29
min read
LW
link
1
review
Frame Control
Aella
27 Nov 2021 22:59 UTC
314
points
282
comments
23
min read
LW
link
2
reviews
The Rationalists of the 1950s (and before) also called themselves “Rationalists”
Owain_Evans
28 Nov 2021 20:17 UTC
187
points
30
comments
3
min read
LW
link
1
review
Lars Doucet’s Georgism series on Astral Codex Ten
Sune
4 Dec 2021 19:43 UTC
13
points
2
comments
1
min read
LW
link
1
review
(astralcodexten.substack.com)
The Plan
johnswentworth
10 Dec 2021 23:41 UTC
254
points
78
comments
14
min read
LW
link
1
review
ARC’s first technical report: Eliciting Latent Knowledge
paulfchristiano
,
Mark Xu
and
Ajeya Cotra
14 Dec 2021 20:09 UTC
225
points
90
comments
1
min read
LW
link
3
reviews
(docs.google.com)
Worst-case thinking in AI alignment
Buck
23 Dec 2021 1:29 UTC
162
points
18
comments
6
min read
LW
link
2
reviews