AGI Ruin: A List of Lethalities · Eliezer Yudkowsky · 5 Jun 2022 22:05 UTC · 945 points · 711 comments · 30 min read · LW link · 3 reviews
Where I agree and disagree with Eliezer · paulfchristiano · 19 Jun 2022 19:15 UTC · 901 points · 224 comments · 18 min read · LW link · 2 reviews
What an actually pessimistic containment strategy looks like · lc · 5 Apr 2022 0:19 UTC · 680 points · 138 comments · 6 min read · LW link · 2 reviews
Simulators · janus · 2 Sep 2022 12:45 UTC · 642 points · 168 comments · 41 min read · LW link · 8 reviews · (generative.ink)
Let’s think about slowing down AI · KatjaGrace · 22 Dec 2022 17:40 UTC · 551 points · 182 comments · 38 min read · LW link · 3 reviews · (aiimpacts.org)
The Redaction Machine · Ben · 20 Sep 2022 22:03 UTC · 507 points · 48 comments · 27 min read · LW link · 1 review
Luck based medicine: my resentful story of becoming a medical miracle · Elizabeth · 16 Oct 2022 17:40 UTC · 490 points · 121 comments · 12 min read · LW link · 3 reviews · (acesounderglass.com)
Losing the root for the tree · Adam Zerner · 20 Sep 2022 4:53 UTC · 482 points · 31 comments · 9 min read · LW link · 1 review
Counter-theses on Sleep · Natália · 21 Mar 2022 23:21 UTC · 447 points · 135 comments · 15 min read · LW link · 1 review
It’s Probably Not Lithium · Natália · 28 Jun 2022 21:24 UTC · 442 points · 187 comments · 28 min read · LW link · 1 review
chinchilla’s wild implications · nostalgebraist · 31 Jul 2022 1:18 UTC · 424 points · 128 comments · 10 min read · LW link · 1 review
(My understanding of) What Everyone in Technical Alignment is Doing and Why · Thomas Larsen and elifland · 29 Aug 2022 1:23 UTC · 413 points · 90 comments · 37 min read · LW link · 1 review
You Are Not Measuring What You Think You Are Measuring · johnswentworth · 20 Sep 2022 20:04 UTC · 413 points · 45 comments · 8 min read · LW link · 2 reviews
It Looks Like You’re Trying To Take Over The World · gwern · 9 Mar 2022 16:35 UTC · 408 points · 120 comments · 1 min read · LW link · 1 review · (www.gwern.net)
DeepMind alignment team opinions on AGI ruin arguments · Vika · 12 Aug 2022 21:06 UTC · 396 points · 37 comments · 14 min read · LW link · 1 review
Reflections on six months of fatherhood · jasoncrawford · 31 Jan 2022 5:28 UTC · 388 points · 24 comments · 4 min read · LW link · 1 review · (jasoncrawford.org)
Lies Told To Children · Eliezer Yudkowsky · 14 Apr 2022 11:25 UTC · 381 points · 94 comments · 7 min read · LW link · 1 review
Reward is not the optimization target · TurnTrout · 25 Jul 2022 0:03 UTC · 376 points · 123 comments · 10 min read · LW link · 3 reviews
A Mechanistic Interpretability Analysis of Grokking · Neel Nanda and Tom Lieberum · 15 Aug 2022 2:41 UTC · 373 points · 48 comments · 36 min read · LW link · 1 review · (colab.research.google.com)
Counterarguments to the basic AI x-risk case · KatjaGrace · 14 Oct 2022 13:00 UTC · 371 points · 124 comments · 34 min read · LW link · 1 review · (aiimpacts.org)
Without specific countermeasures, the easiest path to transformative AI likely leads to AI takeover · Ajeya Cotra · 18 Jul 2022 19:06 UTC · 368 points · 95 comments · 75 min read · LW link · 1 review
Accounting For College Costs · johnswentworth · 1 Apr 2022 17:28 UTC · 367 points · 41 comments · 7 min read · LW link
MIRI announces new “Death With Dignity” strategy · Eliezer Yudkowsky · 2 Apr 2022 0:43 UTC · 363 points · 546 comments · 18 min read · LW link · 1 review
Security Mindset: Lessons from 20+ years of Software Security Failures Relevant to AGI Alignment · elspood · 21 Jun 2022 23:55 UTC · 362 points · 42 comments · 7 min read · LW link · 1 review
Staring into the abyss as a core life skill · benkuhn · 22 Dec 2022 15:30 UTC · 357 points · 22 comments · 12 min read · LW link · 1 review · (www.benkuhn.net)
What DALL-E 2 can and cannot do · Swimmer963 (Miranda Dixon-Luinenburg) · 1 May 2022 23:51 UTC · 353 points · 303 comments · 9 min read · LW link
Beware boasting about non-existent forecasting track records · Jotto999 · 20 May 2022 19:20 UTC · 339 points · 112 comments · 5 min read · LW link · 1 review
What should you change in response to an “emergency”? And AI risk · AnnaSalamon · 18 Jul 2022 1:11 UTC · 339 points · 60 comments · 6 min read · LW link · 1 review
Optimality is the tiger, and agents are its teeth · Veedrac · 2 Apr 2022 0:46 UTC · 337 points · 46 comments · 16 min read · LW link · 1 review
Why I think strong general AI is coming soon · porby · 28 Sep 2022 5:40 UTC · 337 points · 141 comments · 34 min read · LW link · 1 review
Looking back on my alignment PhD · TurnTrout · 1 Jul 2022 3:19 UTC · 334 points · 66 comments · 11 min read · LW link
Models Don’t “Get Reward” · Sam Ringer · 30 Dec 2022 10:37 UTC · 317 points · 62 comments · 5 min read · LW link · 1 review
On how various plans miss the hard bits of the alignment challenge · So8res · 12 Jul 2022 2:49 UTC · 314 points · 89 comments · 29 min read · LW link · 3 reviews
Six Dimensions of Operational Adequacy in AGI Projects · Eliezer Yudkowsky · 30 May 2022 17:00 UTC · 310 points · 66 comments · 13 min read · LW link · 1 review
Epistemic Legibility · Elizabeth · 9 Feb 2022 18:10 UTC · 310 points · 30 comments · 20 min read · LW link · 1 review · (acesounderglass.com)
Why Agent Foundations? An Overly Abstract Explanation · johnswentworth · 25 Mar 2022 23:17 UTC · 309 points · 58 comments · 8 min read · LW link · 1 review
A challenge for AGI organizations, and a challenge for readers · Rob Bensinger and Eliezer Yudkowsky · 1 Dec 2022 23:11 UTC · 302 points · 33 comments · 2 min read · LW link
Two-year update on my personal AI timelines · Ajeya Cotra · 2 Aug 2022 23:07 UTC · 293 points · 60 comments · 16 min read · LW link
What Are You Tracking In Your Head? · johnswentworth · 28 Jun 2022 19:30 UTC · 289 points · 83 comments · 4 min read · LW link · 1 review
Sazen · Duncan Sabien (Inactive) · 21 Dec 2022 7:54 UTC · 287 points · 83 comments · 12 min read · LW link · 2 reviews
Mysteries of mode collapse · janus · 8 Nov 2022 10:37 UTC · 284 points · 57 comments · 14 min read · LW link · 1 review
We Choose To Align AI · johnswentworth · 1 Jan 2022 20:06 UTC · 283 points · 16 comments · 3 min read · LW link · 1 review
Don’t die with dignity; instead play to your outs · Jeffrey Ladish · 6 Apr 2022 7:53 UTC · 281 points · 60 comments · 5 min read · LW link
Is AI Progress Impossible To Predict? · alyssavance · 15 May 2022 18:30 UTC · 277 points · 39 comments · 2 min read · LW link
Toni Kurz and the Insanity of Climbing Mountains · GeneSmith · 3 Jul 2022 20:51 UTC · 273 points · 67 comments · 11 min read · LW link · 2 reviews
A central AI alignment problem: capabilities generalization, and the sharp left turn · So8res · 15 Jun 2022 13:10 UTC · 273 points · 55 comments · 10 min read · LW link · 1 review
Humans are very reliable agents · alyssavance · 16 Jun 2022 22:02 UTC · 269 points · 35 comments · 3 min read · LW link
12 interesting things I learned studying the discovery of nature’s laws · Ben Pace · 19 Feb 2022 23:39 UTC · 268 points · 40 comments · 9 min read · LW link · 1 review
Comment reply: my low-quality thoughts on why CFAR didn’t get farther with a “real/efficacious art of rationality” · AnnaSalamon · 9 Jun 2022 2:12 UTC · 263 points · 63 comments · 17 min read · LW link · 1 review
Changing the world through slack & hobbies · Steven Byrnes · 21 Jul 2022 18:11 UTC · 262 points · 13 comments · 10 min read · LW link