LessWrong Archive: June 2022
AGI Ruin: A List of Lethalities · Eliezer Yudkowsky · Jun 5, 2022, 10:05 PM · 940 points · 708 comments · 30 min read · 3 reviews
Where I agree and disagree with Eliezer · paulfchristiano · Jun 19, 2022, 7:15 PM · 900 points · 223 comments · 18 min read · 2 reviews
It’s Probably Not Lithium · Natália · Jun 28, 2022, 9:24 PM · 442 points · 187 comments · 28 min read · 1 review
Security Mindset: Lessons from 20+ years of Software Security Failures Relevant to AGI Alignment · elspood · Jun 21, 2022, 11:55 PM · 362 points · 42 comments · 7 min read · 1 review
What Are You Tracking In Your Head? · johnswentworth · Jun 28, 2022, 7:30 PM · 289 points · 83 comments · 4 min read · 1 review
A central AI alignment problem: capabilities generalization, and the sharp left turn · So8res · Jun 15, 2022, 1:10 PM · 272 points · 55 comments · 10 min read · 1 review
Humans are very reliable agents · alyssavance · Jun 16, 2022, 10:02 PM · 269 points · 35 comments · 3 min read
Comment reply: my low-quality thoughts on why CFAR didn’t get farther with a “real/efficacious art of rationality” · AnnaSalamon · Jun 9, 2022, 2:12 AM · 263 points · 63 comments · 17 min read · 1 review
Slow motion videos as AI risk intuition pumps · Andrew_Critch · Jun 14, 2022, 7:31 PM · 241 points · 41 comments · 2 min read · 1 review
Contra Hofstadter on GPT-3 Nonsense · rictic · Jun 15, 2022, 9:53 PM · 237 points · 24 comments · 2 min read
AGI Safety FAQ / all-dumb-questions-allowed thread · Aryeh Englander · Jun 7, 2022, 5:47 AM · 227 points · 526 comments · 4 min read
The prototypical catastrophic AI action is getting root access to its datacenter · Buck · Jun 2, 2022, 11:46 PM · 180 points · 13 comments · 2 min read · 1 review
The inordinately slow spread of good AGI conversations in ML · Rob Bensinger · Jun 21, 2022, 4:09 PM · 173 points · 62 comments · 8 min read
Announcing the Inverse Scaling Prize ($250k Prize Pool) · Ethan Perez, Ian McKenzie and Sam Bowman · Jun 27, 2022, 3:58 PM · 171 points · 14 comments · 7 min read
AI Could Defeat All Of Us Combined · HoldenKarnofsky · Jun 9, 2022, 3:50 PM · 170 points · 42 comments · 17 min read · (www.cold-takes.com)
On A List of Lethalities · Zvi · Jun 13, 2022, 12:30 PM · 165 points · 50 comments · 54 min read · 1 review · (thezvi.wordpress.com)
A transparency and interpretability tech tree · evhub · Jun 16, 2022, 11:44 PM · 163 points · 11 comments · 18 min read · 1 review
Deep Learning Systems Are Not Less Interpretable Than Logic/Probability/Etc · johnswentworth · Jun 4, 2022, 5:41 AM · 160 points · 55 comments · 2 min read · 1 review
Godzilla Strategies · johnswentworth · Jun 11, 2022, 3:44 PM · 159 points · 72 comments · 3 min read
Why all the fuss about recursive self-improvement? · So8res · Jun 12, 2022, 8:53 PM · 158 points · 62 comments · 7 min read · 1 review
Limits to Legibility · Jan_Kulveit · Jun 29, 2022, 5:42 PM · 157 points · 11 comments · 5 min read · 1 review
Nonprofit Boards are Weird · HoldenKarnofsky · Jun 23, 2022, 2:40 PM · 156 points · 26 comments · 20 min read · 1 review · (www.cold-takes.com)
LessWrong Has Agree/Disagree Voting On All New Comment Threads · Ben Pace · Jun 24, 2022, 12:43 AM · 154 points · 219 comments · 2 min read · 1 review
Staying Split: Sabatini and Social Justice · Duncan Sabien (Inactive) · Jun 8, 2022, 8:32 AM · 153 points · 28 comments · 21 min read
Steam · abramdemski · Jun 20, 2022, 5:38 PM · 149 points · 13 comments · 5 min read · 1 review
[Question] why assume AGIs will optimize for fixed goals? · nostalgebraist · Jun 10, 2022, 1:28 AM · 147 points · 60 comments · 4 min read · 2 reviews
Public beliefs vs. Private beliefs · Eli Tyre · Jun 1, 2022, 9:33 PM · 146 points · 30 comments · 5 min read
A descriptive, not prescriptive, overview of current AI Alignment Research · Jan, Logan Riggs, jacquesthibs and janus · Jun 6, 2022, 9:59 PM · 139 points · 21 comments · 7 min read
Contra EY: Can AGI destroy us without trial & error? · nsokolsky · Jun 13, 2022, 6:26 PM · 137 points · 72 comments · 15 min read
Announcing the LessWrong Curated Podcast · Ben Pace and Solenoid_Entity · Jun 22, 2022, 10:16 PM · 137 points · 27 comments · 1 min read
AI-Written Critiques Help Humans Notice Flaws · paulfchristiano · Jun 25, 2022, 5:22 PM · 137 points · 5 comments · 3 min read · (openai.com)
Will Capabilities Generalise More? · Ramana Kumar · Jun 29, 2022, 5:12 PM · 133 points · 39 comments · 4 min read
Confused why a “capabilities research is good for alignment progress” position isn’t discussed more · Kaj_Sotala · Jun 2, 2022, 9:41 PM · 130 points · 27 comments · 4 min read
Intergenerational trauma impeding cooperative existential safety efforts · Andrew_Critch · Jun 3, 2022, 8:13 AM · 129 points · 29 comments · 3 min read
“Pivotal Acts” means something specific · Raemon · Jun 7, 2022, 9:56 PM · 127 points · 23 comments · 2 min read
Let’s See You Write That Corrigibility Tag · Eliezer Yudkowsky · Jun 19, 2022, 9:11 PM · 125 points · 70 comments · 1 min read
Scott Aaronson is joining OpenAI to work on AI safety · peterbarnett · Jun 18, 2022, 4:06 AM · 117 points · 31 comments · 1 min read · (scottaaronson.blog)
CFAR Handbook: Introduction · CFAR!Duncan · Jun 28, 2022, 4:53 PM · 116 points · 12 comments · 1 min read
Leaving Google, Joining the Nucleic Acid Observatory · jefftk · Jun 10, 2022, 5:00 PM · 114 points · 4 comments · 3 min read · (www.jefftk.com)
Conversation with Eliezer: What do you want the system to do? · Orpheus16 · Jun 25, 2022, 5:36 PM · 114 points · 38 comments · 2 min read
Who models the models that model models? An exploration of GPT-3’s in-context model fitting ability · Lovre · Jun 7, 2022, 7:37 PM · 112 points · 16 comments · 9 min read
Relationship Advice Repository · Ruby · Jun 20, 2022, 2:39 PM · 109 points · 36 comments · 38 min read
wrapper-minds are the enemy · nostalgebraist · Jun 17, 2022, 1:58 AM · 104 points · 43 comments · 8 min read
Yes, AI research will be substantially curtailed if a lab causes a major disaster · lc · Jun 14, 2022, 10:17 PM · 103 points · 31 comments · 2 min read
The Mountain Troll · lsusr · Jun 11, 2022, 9:14 AM · 103 points · 26 comments · 2 min read
Units of Exchange · CFAR!Duncan · Jun 28, 2022, 4:53 PM · 99 points · 28 comments · 11 min read
Pivotal outcomes and pivotal processes · Andrew_Critch · Jun 17, 2022, 11:43 PM · 97 points · 31 comments · 4 min read
Announcing Epoch: A research organization investigating the road to Transformative AI · Jsevillamol, Pablo Villalobos, Tamay, lennart, Marius Hobbhahn and anson.ho · Jun 27, 2022, 1:55 PM · 97 points · 2 comments · 2 min read · (epochai.org)
My current take on Internal Family Systems “parts” · Kaj_Sotala · Jun 26, 2022, 5:40 PM UTC · 96 points · 11 comments · 3 min read · (kajsotala.fi)
Contest: An Alien Message · DaemonicSigil · Jun 27, 2022, 5:54 AM UTC · 95 points · 100 comments · 1 min read