[Question]
What constraints does deep learning place on alignment plans?
Garrett Baker
May 3, 2023, 8:40 PM
9
points
0
comments
1
min read
LW
link
AGI rising: why we are in a new era of acute risk and increasing public awareness, and what to do now
Greg C
May 3, 2023, 8:26 PM
25
points
12
comments
LW
link
Formalizing the “AI x-risk is unlikely because it is ridiculous” argument
Christopher King
May 3, 2023, 6:56 PM
48
points
17
comments
3
min read
LW
link
[Question]
List of notable people who believe in AI X-risk?
vlad.proex
May 3, 2023, 6:46 PM
14
points
4
comments
1
min read
LW
link
[Question]
LessWrong exporting?
axiomAdministrator
May 3, 2023, 6:34 PM
0
points
3
comments
1
min read
LW
link
Progress links and tweets, 2023-05-03
jasoncrawford
May 3, 2023, 4:23 PM
13
points
0
comments
2
min read
LW
link
(rootsofprogress.org)
Personhood is a Religious Belief
jan Sijan
May 3, 2023, 4:16 PM
−41
points
28
comments
6
min read
LW
link
Slowing AI: Crunch time
Zach Stein-Perlman
May 3, 2023, 3:00 PM
11
points
1
comment
2
min read
LW
link
Finding Neurons in a Haystack: Case Studies with Sparse Probing
wesg
and
Neel Nanda
May 3, 2023, 1:30 PM
33
points
6
comments
2
min read
LW
link
1
review
(arxiv.org)
Monthly Roundup #6: May 2023
Zvi
May 3, 2023, 12:50 PM
31
points
12
comments
24
min read
LW
link
(thezvi.wordpress.com)
[Question]
How much do personal biases in risk assessment affect assessment of AI risks?
Gordon Seidoh Worley
May 3, 2023, 6:12 AM
10
points
8
comments
1
min read
LW
link
Communication strategies for autism, with examples
stonefly
May 3, 2023, 5:25 AM
16
points
2
comments
7
min read
LW
link
Understand how other people think: a theory of worldviews.
spencerg
May 3, 2023, 3:57 AM
2
points
8
comments
LW
link
“Copilot” type AI integration could lead to training data needed for AGI
anithite
May 3, 2023, 12:57 AM
8
points
0
comments
2
min read
LW
link
Averting Catastrophe: Decision Theory for COVID-19, Climate Change, and Potential Disasters of All Kinds
JakubK
May 2, 2023, 10:50 PM
10
points
0
comments
LW
link
A Case for the Least Forgiving Take On Alignment
Thane Ruthenis
May 2, 2023, 9:34 PM
100
points
85
comments
22
min read
LW
link
Are Emergent Abilities of Large Language Models a Mirage? [linkpost]
Matthew Barnett
May 2, 2023, 9:01 PM
53
points
19
comments
1
min read
LW
link
(arxiv.org)
Does descaling a kettle help? Theory and practice
philh
May 2, 2023, 8:20 PM
35
points
25
comments
8
min read
LW
link
(reasonableapproximation.net)
Avoiding xrisk from AI doesn’t mean focusing on AI xrisk
Stuart_Armstrong
May 2, 2023, 7:27 PM
67
points
7
comments
3
min read
LW
link
AI Safety Newsletter #4: AI and Cybersecurity, Persuasive AIs, Weaponization, and Geoffrey Hinton talks AI risks
ozhang
,
Dan H
and
Orpheus16
May 2, 2023, 6:41 PM
32
points
0
comments
5
min read
LW
link
(newsletter.safe.ai)
My best system yet: text-based project management
jt
May 2, 2023, 5:44 PM
6
points
8
comments
5
min read
LW
link
[Question]
What’s the state of AI safety in Japan?
ChristianKl
May 2, 2023, 5:06 PM
5
points
1
comment
1
min read
LW
link
Five Worlds of AI (by Scott Aaronson and Boaz Barak)
mishka
May 2, 2023, 1:23 PM
22
points
6
comments
1
min read
LW
link
1
review
(scottaaronson.blog)
Systems that cannot be unsafe cannot be safe
Davidmanheim
May 2, 2023, 8:53 AM
62
points
27
comments
2
min read
LW
link
AGI safety career advice
Richard_Ngo
May 2, 2023, 7:36 AM
132
points
24
comments
13
min read
LW
link
An Impossibility Proof Relevant to the Shutdown Problem and Corrigibility
Audere
May 2, 2023, 6:52 AM
66
points
13
comments
9
min read
LW
link
Some Thoughts on Virtue Ethics for AIs
peligrietzer
May 2, 2023, 5:46 AM
83
points
8
comments
4
min read
LW
link
Technological unemployment as another test for rationalist winning
RomanHauksson
May 2, 2023, 4:16 AM
14
points
5
comments
1
min read
LW
link
The Moral Copernican Principle
Legionnaire
May 2, 2023, 3:25 AM
5
points
7
comments
2
min read
LW
link
Open & Welcome Thread—May 2023
Ruby
May 2, 2023, 2:58 AM
22
points
41
comments
1
min read
LW
link
Summaries of top forum posts (24th − 30th April 2023)
Zoe Williams
May 2, 2023, 2:30 AM
12
points
1
comment
LW
link
AXRP Episode 21 - Interpretability for Engineers with Stephen Casper
DanielFilan
May 2, 2023, 12:50 AM
12
points
1
comment
66
min read
LW
link
Getting Your Eyes On
LoganStrohl
May 2, 2023, 12:33 AM
65
points
11
comments
14
min read
LW
link
What 2025 looks like
Ruby
May 1, 2023, 10:53 PM
75
points
17
comments
15
min read
LW
link
[Question]
Natural Selection vs Gradient Descent
CuriousApe11
May 1, 2023, 10:16 PM
4
points
3
comments
1
min read
LW
link
A[I] Zombie Apocalypse Is Already Upon Us
NickHarris
May 1, 2023, 10:02 PM
−6
points
4
comments
2
min read
LW
link
Geoff Hinton Quits Google
Adam Shai
May 1, 2023, 9:03 PM
98
points
14
comments
1
min read
LW
link
The Apprentice Thread 2
hath
May 1, 2023, 8:09 PM
50
points
19
comments
1
min read
LW
link
Budapest, Hungary – ACX Meetups Everywhere Spring 2023
Richard Horvath
,
Timothy Underwood
and
marta_k
May 1, 2023, 5:36 PM
4
points
0
comments
1
min read
LW
link
In favor of steelmanning
jp
May 1, 2023, 5:12 PM
36
points
6
comments
LW
link
Shah (DeepMind) and Leahy (Conjecture) Discuss Alignment Cruxes
OliviaJ
,
Rohin Shah
,
Connor Leahy
and
Andrea_Miotti
May 1, 2023, 4:47 PM
96
points
10
comments
30
min read
LW
link
Distinguishing misuse is difficult and uncomfortable
lemonhope
May 1, 2023, 4:23 PM
17
points
3
comments
1
min read
LW
link
[Question]
Does agency necessarily imply self-preservation instinct?
Mislav Jurić
May 1, 2023, 4:06 PM
5
points
8
comments
1
min read
LW
link
What Boston Can Teach Us About What a Woman Is
ymeskhout
1 May 2023 15:34 UTC
18
points
45
comments
12
min read
LW
link
The Rocket Alignment Problem, Part 2
Zvi
1 May 2023 14:30 UTC
40
points
20
comments
9
min read
LW
link
(thezvi.wordpress.com)
Socialist Democratic-Republic GAME: 12 Amendments to the Constitutions of the Free World
monkymind
1 May 2023 13:13 UTC
−34
points
0
comments
1
min read
LW
link
[Question]
Where is all this evidence of UFOs?
Logan Zoellner
1 May 2023 12:13 UTC
29
points
42
comments
1
min read
LW
link
LessWrong Community Weekend 2023 [Applications now closed]
Henry Prowbell
1 May 2023 9:31 UTC
43
points
0
comments
6
min read
LW
link
LessWrong Community Weekend 2023 [Applications now closed]
Henry Prowbell
1 May 2023 9:08 UTC
89
points
0
comments
6
min read
LW
link
[Question]
In AI Risk what is the base model of the AI?
jmh
1 May 2023 3:25 UTC
3
points
1
comment
1
min read
LW
link