Adversarial training, importance sampling, and anti-adversarial training for AI whistleblowing
Buck
Jun 2, 2022, 11:48 PM
42
points
0
comments
3
min read
LW
link
Tao, Kontsevich & others on HLAI in Math
interstice
Jun 10, 2022, 2:25 AM
41
points
5
comments
2
min read
LW
link
(www.youtube.com)
Linkpost: Robin Hanson—Why Not Wait On AI Risk?
Yair Halberstadt
Jun 24, 2022, 2:23 PM
41
points
14
comments
1
min read
LW
link
(www.overcomingbias.com)
Blake Richards on Why he is Skeptical of Existential Risk from AI
Michaël Trazzi
Jun 14, 2022, 7:09 PM
41
points
12
comments
4
min read
LW
link
(theinsideview.ai)
Georgism, in theory
Stuart_Armstrong
Jun 15, 2022, 3:20 PM
40
points
22
comments
4
min read
LW
link
Key Papers in Language Model Safety
aog
Jun 20, 2022, 3:00 PM
40
points
1
comment
22
min read
LW
link
D&D.Sci June 2022: A Goddess Tried To Reincarnate Me Into A Fantasy World, But I Insisted On Using Data Science To Select An Optimal Combination Of Cheat Skills!
abstractapplic
Jun 4, 2022, 1:28 AM
40
points
22
comments
3
min read
LW
link
A Litany Missing from the Canon
benwr
Jun 17, 2022, 1:39 AM
39
points
3
comments
1
min read
LW
link
(www.benwr.net)
Four reasons I find AI safety emotionally compelling
KatWoods
and
AmberDawn
Jun 28, 2022, 2:10 PM
39
points
3
comments
4
min read
LW
link
Another Calming Example
jefftk
Jun 3, 2022, 2:20 AM
39
points
13
comments
2
min read
LW
link
(www.jefftk.com)
The table of different sampling assumptions in anthropics
avturchin
Jun 29, 2022, 10:41 AM
39
points
5
comments
12
min read
LW
link
[Yann Lecun] A Path Towards Autonomous Machine Intelligence
DragonGod
Jun 27, 2022, 7:24 PM
38
points
14
comments
1
min read
LW
link
(openreview.net)
Grokking “Forecasting TAI with biological anchors”
anson.ho
Jun 6, 2022, 6:58 PM
38
points
0
comments
14
min read
LW
link
Beauty and the Beast
Tomás B.
Jun 11, 2022, 6:59 PM
38
points
8
comments
6
min read
LW
link
Gradient hacking: definitions and examples
Richard_Ngo
Jun 29, 2022, 9:35 PM
38
points
2
comments
5
min read
LW
link
Vael Gates: Risks from Advanced AI (June 2022)
Vael Gates
Jun 14, 2022, 12:54 AM
38
points
2
comments
30
min read
LW
link
[Question]
What’s the “This AI is of moral concern.” fire alarm?
Quintin Pope
Jun 13, 2022, 8:05 AM
37
points
56
comments
2
min read
LW
link
Quick Look: Asymptomatic Herpes Shedding
Elizabeth
Jun 4, 2022, 9:40 PM
37
points
4
comments
2
min read
LW
link
(acesounderglass.com)
Scott Aaronson and Steven Pinker Debate AI Scaling
Liron
Jun 28, 2022, 4:04 PM
37
points
7
comments
1
min read
LW
link
(scottaaronson.blog)
Why agents are powerful
Daniel Kokotajlo
Jun 6, 2022, 1:37 AM
37
points
7
comments
7
min read
LW
link
Announcing the Clearer Thinking Regrants program
spencerg
Jun 17, 2022, 1:14 PM
36
points
1
comment
1
min read
LW
link
[Link] Adversarially trained neural representations may already be as robust as corresponding biological neural representations
Gunnar_Zarncke
Jun 24, 2022, 8:51 PM
35
points
9
comments
1
min read
LW
link
Optimization and Adequacy in Five Bullets
james.lucassen
Jun 6, 2022, 5:48 AM
35
points
2
comments
4
min read
LW
link
(jlucassen.com)
Alignment Risk Doesn’t Require Superintelligence
JustisMills
Jun 15, 2022, 3:12 AM
35
points
4
comments
2
min read
LW
link
D&D.Sci June 2022 Evaluation and Ruleset
abstractapplic
Jun 13, 2022, 10:31 AM
34
points
11
comments
4
min read
LW
link
Steganography and the CycleGAN—alignment failure case study
Jan Czechowski
Jun 11, 2022, 9:41 AM
34
points
0
comments
4
min read
LW
link
[Question]
Are long-form dating profiles productive?
AABoyles
Jun 27, 2022, 5:03 PM
34
points
32
comments
1
min read
LW
link
[Question]
How much does cybersecurity reduce AI risk?
Darmani
Jun 12, 2022, 10:13 PM
34
points
23
comments
1
min read
LW
link
[Question]
Why don’t you introduce really impressive people you personally know to AI alignment (more often)?
Verden
Jun 11, 2022, 3:59 PM
33
points
14
comments
1
min read
LW
link
To what extent have ideas and scientific discoveries gotten harder to find?
lsusr
Jun 18, 2022, 7:15 AM
33
points
10
comments
6
min read
LW
link
Reflection Mechanisms as an Alignment target: A survey
Marius Hobbhahn
,
elandgre
and
Beth Barnes
Jun 22, 2022, 3:05 PM
32
points
1
comment
14
min read
LW
link
Google’s new text-to-image model—Parti, a demonstration of scaling benefits
Kayden
Jun 22, 2022, 8:00 PM
32
points
4
comments
1
min read
LW
link
A claim that Google’s LaMDA is sentient
Ben Livengood
Jun 12, 2022, 4:18 AM
31
points
133
comments
1
min read
LW
link
[Question]
How are compute assets distributed in the world?
Chris van Merwijk
Jun 12, 2022, 10:13 PM
30
points
7
comments
1
min read
LW
link
[Question]
Why don’t we think we’re in the simplest universe with intelligent life?
ADifferentAnonymous
Jun 18, 2022, 3:05 AM
30
points
33
comments
1
min read
LW
link
Assessing AlephAlphas Multimodal Model
p.b.
Jun 28, 2022, 9:28 AM
30
points
5
comments
3
min read
LW
link
Common but neglected risk factors that may let you get Paxlovid
DirectedEvolution
Jun 21, 2022, 7:34 AM
29
points
8
comments
4
min read
LW
link
Covid 6/16/22: Do Not Hand it to Them
Zvi
Jun 16, 2022, 2:40 PM
29
points
5
comments
7
min read
LW
link
(thezvi.wordpress.com)
Entitlement as a major amplifier of unhappiness
VipulNaik
Jun 8, 2022, 10:08 PM
29
points
6
comments
7
min read
LW
link
Forecasting Fusion Power
Daniel Kokotajlo
Jun 18, 2022, 12:04 AM
29
points
8
comments
1
min read
LW
link
(astralcodexten.substack.com)
Juneberry Cake
jefftk
Jun 19, 2022, 1:40 AM
29
points
0
comments
1
min read
LW
link
(www.jefftk.com)
A Butterfly’s View of Probability
Gabriel Wu
Jun 15, 2022, 2:14 AM
29
points
17
comments
11
min read
LW
link
Why it’s bad to kill Grandma
dynomight
Jun 9, 2022, 6:12 PM
29
points
14
comments
8
min read
LW
link
(dynomight.substack.com)
Was the Industrial Revolution The Industrial Revolution?
Davis Kedrosky
Jun 14, 2022, 2:48 PM
29
points
0
comments
12
min read
LW
link
(daviskedrosky.substack.com)
Wielding civilization
dominicq
Jun 1, 2022, 7:11 AM
29
points
2
comments
2
min read
LW
link
[Link-post] On Deference and Yudkowsky’s AI Risk Estimates
bmg
Jun 19, 2022, 5:25 PM
29
points
8
comments
1
min read
LW
link
Investigating causal understanding in LLMs
Marius Hobbhahn
and
Tom Lieberum
Jun 14, 2022, 1:57 PM
28
points
6
comments
13
min read
LW
link
[Question]
Is CIRL a promising agenda?
Chris_Leong
Jun 23, 2022, 5:12 PM
28
points
16
comments
1
min read
LW
link
Intelligence in Commitment Races
David Udell
Jun 24, 2022, 2:30 PM
28
points
8
comments
5
min read
LW
link
Limits of Bodily Autonomy
jefftk
Jun 27, 2022, 7:50 PM
28
points
18
comments
1
min read
LW
link
(www.jefftk.com)