AGI Ruin: A List of Lethalities
Eliezer Yudkowsky · 5 Jun 2022 22:05 UTC · 886 points · 690 comments · 30 min read · 3 reviews
Where I agree and disagree with Eliezer
paulfchristiano · 19 Jun 2022 19:15 UTC · 870 points · 219 comments · 18 min read · 2 reviews
What an actually pessimistic containment strategy looks like
lc · 5 Apr 2022 0:19 UTC · 667 points · 138 comments · 6 min read · 2 reviews
Simulators
janus · 2 Sep 2022 12:45 UTC · 594 points · 161 comments · 41 min read · 8 reviews · (generative.ink)
Let’s think about slowing down AI
KatjaGrace · 22 Dec 2022 17:40 UTC · 543 points · 183 comments · 38 min read · 3 reviews · (aiimpacts.org)
The Redaction Machine
Ben · 20 Sep 2022 22:03 UTC · 494 points · 46 comments · 27 min read · 1 review
Luck based medicine: my resentful story of becoming a medical miracle
Elizabeth · 16 Oct 2022 17:40 UTC · 474 points · 119 comments · 12 min read · 3 reviews · (acesounderglass.com)
Losing the root for the tree
Adam Zerner · 20 Sep 2022 4:53 UTC · 465 points · 30 comments · 9 min read · 1 review
It’s Probably Not Lithium
Natália · 28 Jun 2022 21:24 UTC · 441 points · 186 comments · 28 min read · 1 review
Counter-theses on Sleep
Natália · 21 Mar 2022 23:21 UTC · 439 points · 131 comments · 15 min read · 1 review
(My understanding of) What Everyone in Technical Alignment is Doing and Why
Thomas Larsen and elifland · 29 Aug 2022 1:23 UTC · 412 points · 89 comments · 38 min read · 1 review
chinchilla’s wild implications
nostalgebraist · 31 Jul 2022 1:18 UTC · 410 points · 128 comments · 11 min read · 1 review
It Looks Like You’re Trying To Take Over The World
gwern · 9 Mar 2022 16:35 UTC · 402 points · 120 comments · 1 min read · 1 review · (www.gwern.net)
Reflections on six months of fatherhood
jasoncrawford · 31 Jan 2022 5:28 UTC · 385 points · 24 comments · 4 min read · 1 review · (jasoncrawford.org)
DeepMind alignment team opinions on AGI ruin arguments
Vika · 12 Aug 2022 21:06 UTC · 376 points · 37 comments · 14 min read · 1 review
Lies Told To Children
Eliezer Yudkowsky · 14 Apr 2022 11:25 UTC · 370 points · 94 comments · 7 min read · 1 review
Counterarguments to the basic AI x-risk case
KatjaGrace · 14 Oct 2022 13:00 UTC · 369 points · 124 comments · 34 min read · 1 review · (aiimpacts.org)
You Are Not Measuring What You Think You Are Measuring
johnswentworth · 20 Sep 2022 20:04 UTC · 368 points · 44 comments · 8 min read · 2 reviews
A Mechanistic Interpretability Analysis of Grokking
Neel Nanda and Tom Lieberum · 15 Aug 2022 2:41 UTC · 368 points · 47 comments · 36 min read · 1 review · (colab.research.google.com)
Without specific countermeasures, the easiest path to transformative AI likely leads to AI takeover
Ajeya Cotra · 18 Jul 2022 19:06 UTC · 364 points · 94 comments · 75 min read · 1 review
Security Mindset: Lessons from 20+ years of Software Security Failures Relevant to AGI Alignment
elspood · 21 Jun 2022 23:55 UTC · 360 points · 42 comments · 7 min read · 1 review
Accounting For College Costs
johnswentworth · 1 Apr 2022 17:28 UTC · 357 points · 41 comments · 7 min read
What DALL-E 2 can and cannot do
Swimmer963 (Miranda Dixon-Luinenburg) · 1 May 2022 23:51 UTC · 353 points · 303 comments · 9 min read
Reward is not the optimization target
TurnTrout · 25 Jul 2022 0:03 UTC · 347 points · 122 comments · 10 min read · 3 reviews
MIRI announces new “Death With Dignity” strategy
Eliezer Yudkowsky · 2 Apr 2022 0:43 UTC · 339 points · 543 comments · 18 min read · 1 review
Beware boasting about non-existent forecasting track records
Jotto999 · 20 May 2022 19:20 UTC · 331 points · 112 comments · 5 min read · 1 review
What should you change in response to an “emergency”? And AI risk
AnnaSalamon · 18 Jul 2022 1:11 UTC · 328 points · 60 comments · 6 min read · 1 review
Why I think strong general AI is coming soon
porby · 28 Sep 2022 5:40 UTC · 325 points · 139 comments · 34 min read · 1 review
Looking back on my alignment PhD
TurnTrout · 1 Jul 2022 3:19 UTC · 318 points · 63 comments · 11 min read
Staring into the abyss as a core life skill
benkuhn · 22 Dec 2022 15:30 UTC · 318 points · 21 comments · 12 min read · 1 review · (www.benkuhn.net)
Epistemic Legibility
Elizabeth · 9 Feb 2022 18:10 UTC · 305 points · 30 comments · 20 min read · 1 review · (acesounderglass.com)
Models Don’t “Get Reward”
Sam Ringer · 30 Dec 2022 10:37 UTC · 305 points · 61 comments · 5 min read · 1 review
Optimality is the tiger, and agents are its teeth
Veedrac · 2 Apr 2022 0:46 UTC · 301 points · 42 comments · 16 min read · 1 review
A challenge for AGI organizations, and a challenge for readers
Rob Bensinger and Eliezer Yudkowsky · 1 Dec 2022 23:11 UTC · 301 points · 33 comments · 2 min read
Six Dimensions of Operational Adequacy in AGI Projects
Eliezer Yudkowsky · 30 May 2022 17:00 UTC · 299 points · 66 comments · 13 min read · 1 review
On how various plans miss the hard bits of the alignment challenge
So8res · 12 Jul 2022 2:49 UTC · 299 points · 88 comments · 29 min read · 3 reviews
Why Agent Foundations? An Overly Abstract Explanation
johnswentworth · 25 Mar 2022 23:17 UTC · 294 points · 56 comments · 8 min read · 1 review
Two-year update on my personal AI timelines
Ajeya Cotra · 2 Aug 2022 23:07 UTC · 288 points · 60 comments · 16 min read
Mysteries of mode collapse
janus · 8 Nov 2022 10:37 UTC · 281 points · 56 comments · 14 min read · 1 review
Is AI Progress Impossible To Predict?
alyssavance · 15 May 2022 18:30 UTC · 277 points · 39 comments · 2 min read
We Choose To Align AI
johnswentworth · 1 Jan 2022 20:06 UTC · 277 points · 16 comments · 3 min read · 1 review
A central AI alignment problem: capabilities generalization, and the sharp left turn
So8res · 15 Jun 2022 13:10 UTC · 274 points · 52 comments · 10 min read · 1 review
Sazen
[DEACTIVATED] Duncan Sabien · 21 Dec 2022 7:54 UTC · 274 points · 83 comments · 12 min read · 2 reviews
What Are You Tracking In Your Head?
johnswentworth · 28 Jun 2022 19:30 UTC · 271 points · 80 comments · 4 min read · 1 review
Don’t die with dignity; instead play to your outs
Jeffrey Ladish · 6 Apr 2022 7:53 UTC · 270 points · 59 comments · 5 min read
Toni Kurz and the Insanity of Climbing Mountains
GeneSmith · 3 Jul 2022 20:51 UTC · 266 points · 65 comments · 11 min read · 2 reviews
Humans are very reliable agents
alyssavance · 16 Jun 2022 22:02 UTC · 264 points · 35 comments · 3 min read
12 interesting things I learned studying the discovery of nature’s laws
Ben Pace · 19 Feb 2022 23:39 UTC · 261 points · 40 comments · 9 min read · 1 review
Changing the world through slack & hobbies
Steven Byrnes · 21 Jul 2022 18:11 UTC · 258 points · 13 comments · 10 min read
So, geez there’s a lot of AI content these days
Raemon · 6 Oct 2022 21:32 UTC · 255 points · 140 comments · 6 min read