Archive
ELK Proposal—Make the Reporter care about the Predictor’s beliefs · Adam Jermyn and Nicholas Schiefer · Jun 11, 2022, 10:53 PM · 8 points · 0 comments · 6 min read
[Question] Why has no person / group ever taken over the world? · Aryeh Englander · Jun 11, 2022, 8:51 PM · 25 points · 19 comments · 1 min read
[Question] Are there English-speaking meetups in Frankfurt/Munich/Zurich? · Grant Demaree · Jun 11, 2022, 8:02 PM · 6 points · 2 comments · 1 min read
Beauty and the Beast · Tomás B. · Jun 11, 2022, 6:59 PM · 38 points · 8 comments · 6 min read
Poorly-Aimed Death Rays · Thane Ruthenis · Jun 11, 2022, 6:29 PM · 48 points · 5 comments · 4 min read
AGI Safety Communications Initiative · ines · Jun 11, 2022, 5:34 PM · 7 points · 0 comments · 1 min read
A gaming group for rationality-aware people · dhatas · Jun 11, 2022, 4:04 PM · 7 points · 0 comments · 1 min read
[Question] Why don’t you introduce really impressive people you personally know to AI alignment (more often)? · Verden · Jun 11, 2022, 3:59 PM · 33 points · 14 comments · 1 min read
Godzilla Strategies · johnswentworth · Jun 11, 2022, 3:44 PM · 159 points · 72 comments · 3 min read
Steganography and the CycleGAN—alignment failure case study · Jan Czechowski · Jun 11, 2022, 9:41 AM · 34 points · 0 comments · 4 min read
The Mountain Troll · lsusr · Jun 11, 2022, 9:14 AM · 103 points · 26 comments · 2 min read
Show LW: YodaTimer.com · Adam Zerner · Jun 11, 2022, 8:52 AM · 27 points · 4 comments · 1 min read
How fast can we perform a forward pass? · jsteinhardt · Jun 10, 2022, 11:30 PM · 53 points · 9 comments · 15 min read · (bounded-regret.ghost.io)
Summary of “AGI Ruin: A List of Lethalities” · Stephen McAleese · Jun 10, 2022, 10:35 PM · 45 points · 2 comments · 8 min read
How dangerous is human-level AI? · Alex_Altair · Jun 10, 2022, 5:38 PM · 21 points · 4 comments · 8 min read
Another plausible scenario of AI risk: AI builds military infrastructure while collaborating with humans, defects later. · avturchin · Jun 10, 2022, 5:24 PM · 10 points · 2 comments · 1 min read
Leaving Google, Joining the Nucleic Acid Observatory · jefftk · Jun 10, 2022, 5:00 PM · 114 points · 4 comments · 3 min read · (www.jefftk.com)
On The Spectrum, On The Guest List: (v) The Fleur Room · party girl · Jun 10, 2022, 2:50 PM · 8 points · 1 comment · 14 min read · (onthespectrumontheguestlist.substack.com)
Progress Report 6: get the tool working · Nathan Helm-Burger · Jun 10, 2022, 11:18 AM · 4 points · 0 comments · 2 min read
[Question] Is AI Alignment Impossible? · Heighn · Jun 10, 2022, 10:08 AM · 3 points · 3 comments · 1 min read
I No Longer Believe Intelligence to be “Magical” · DragonGod · Jun 10, 2022, 8:58 AM · 28 points · 34 comments · 6 min read
[linkpost] The final AI benchmark: BIG-bench · RomanS · Jun 10, 2022, 8:53 AM · 25 points · 21 comments · 1 min read
[Question] Could Patent-Trolling delay AI timelines? · Pablo Repetto · Jun 10, 2022, 2:53 AM · 1 point · 3 comments · 1 min read
[Question] Kolmogorov’s AI Forecast · interstice · Jun 10, 2022, 2:36 AM · 9 points · 1 comment · 1 min read
Tao, Kontsevich & others on HLAI in Math · interstice · Jun 10, 2022, 2:25 AM · 41 points · 5 comments · 2 min read · (www.youtube.com)
A plausible story about AI risk. · DeLesley Hutchins · Jun 10, 2022, 2:08 AM · 16 points · 2 comments · 4 min read
Open Problems in AI X-Risk [PAIS #5] · Dan H and TW123 · Jun 10, 2022, 2:08 AM · 61 points · 6 comments · 36 min read
[Question] why assume AGIs will optimize for fixed goals? · nostalgebraist · Jun 10, 2022, 1:28 AM · 147 points · 60 comments · 4 min read · 2 reviews
Bureaucracy of AIs · Logan Zoellner · Jun 9, 2022, 11:03 PM · 17 points · 6 comments · 14 min read
You Only Get One Shot: an Intuition Pump for Embedded Agency · Oliver Sourbut · Jun 9, 2022, 9:38 PM · 24 points · 4 comments · 2 min read
[Question] Forestalling Atmospheric Ignition · Lone Pine · Jun 9, 2022, 8:49 PM · 11 points · 9 comments · 1 min read
How Do Selection Theorems Relate To Interpretability? · johnswentworth · Jun 9, 2022, 7:39 PM · 60 points · 14 comments · 3 min read
Progress links and tweets, 2022-06-08 · jasoncrawford · Jun 9, 2022, 7:13 PM · 11 points · 0 comments · 1 min read · (rootsofprogress.org)
If no near-term alignment strategy, research should aim for the long-term · harsimony · Jun 9, 2022, 7:10 PM · 7 points · 1 comment · 1 min read
Operationalizing two tasks in Gary Marcus’s AGI challenge · Bill Benzon · Jun 9, 2022, 6:31 PM · 12 points · 3 comments · 8 min read
Why it’s bad to kill Grandma · dynomight · Jun 9, 2022, 6:12 PM · 29 points · 14 comments · 8 min read · (dynomight.substack.com)
[Question] Modeling humanity’s robustness to GCRs? · T431 · Jun 9, 2022, 5:34 PM · 2 points · 2 comments · 2 min read
[Question] If there was a millennium equivalent prize for AI alignment, what would the problems be? · Yair Halberstadt · Jun 9, 2022, 4:56 PM · 17 points · 4 comments · 1 min read
Book Review: How the World Became Rich · Davis Kedrosky · Jun 9, 2022, 4:55 PM · 14 points · 0 comments · 10 min read · (daviskedrosky.substack.com)
Covid 6/9/22: Nice · Zvi · Jun 9, 2022, 4:30 PM · 26 points · 2 comments · 12 min read · (thezvi.wordpress.com)
Website For Yoda Timers · Adam Zerner · Jun 9, 2022, 4:28 PM · 16 points · 1 comment · 1 min read
AI Could Defeat All Of Us Combined · HoldenKarnofsky · Jun 9, 2022, 3:50 PM · 170 points · 42 comments · 17 min read · (www.cold-takes.com)
The “mind-body vicious cycle” model of RSI & back pain · Steven Byrnes · Jun 9, 2022, 12:30 PM · 91 points · 32 comments · 12 min read
[Linkpost & Discussion] AI Trained on 4Chan Becomes ‘Hate Speech Machine’ [and outperforms GPT-3 on TruthfulQA Benchmark?!] · Yitz · Jun 9, 2022, 10:59 AM · 16 points · 5 comments · 2 min read · (www.vice.com)
Comment reply: my low-quality thoughts on why CFAR didn’t get farther with a “real/efficacious art of rationality” · AnnaSalamon · Jun 9, 2022, 2:12 AM · 263 points · 63 comments · 17 min read · 1 review
Today in AI Risk History: The Terminator (1984 film) was released. · Impassionata · Jun 9, 2022, 1:32 AM · −3 points · 6 comments · 1 min read
There’s probably a tradeoff between AI capability and safety, and we should act like it · David Johnston · Jun 9, 2022, 12:17 AM · 3 points · 3 comments · 1 min read
[Question] Has anyone actually tried to convince Terry Tao or other top mathematicians to work on alignment? · P. · Jun 8, 2022, 10:26 PM UTC · 64 points · 51 comments · 4 min read
Entitlement as a major amplifier of unhappiness · VipulNaik · Jun 8, 2022, 10:08 PM UTC · 29 points · 6 comments · 7 min read
[Question] Silly Online Rules · Gunnar_Zarncke · Jun 8, 2022, 8:40 PM UTC · 8 points · 12 comments · 1 min read