LessWrong Archive
- Thoughts On (Solving) Deep Deception · Jozdien · Oct 21, 2023, 10:40 PM · 72 points · 6 comments · 6 min read · LW link
- Best effort beliefs · Adam Zerner · Oct 21, 2023, 10:05 PM · 14 points · 9 comments · 4 min read · LW link
- How toy models of ontology changes can be misleading · Stuart_Armstrong · Oct 21, 2023, 9:13 PM · 42 points · 0 comments · 2 min read · LW link
- Soups as Spreads · jefftk · Oct 21, 2023, 8:30 PM · 22 points · 0 comments · 1 min read · LW link · (www.jefftk.com)
- Which COVID booster to get? · Sameerishere · Oct 21, 2023, 7:43 PM · 8 points · 0 comments · 2 min read · LW link
- Alignment Implications of LLM Successes: a Debate in One Act · Zack_M_Davis · Oct 21, 2023, 3:22 PM · 265 points · 56 comments · 13 min read · LW link · 2 reviews
- How to find a good moving service · Ziyue Wang · Oct 21, 2023, 4:59 AM · 8 points · 0 comments · 3 min read · LW link
- Apply for MATS Winter 2023-24! · utilistrutil, Ryan Kidd and LauraVaughan · Oct 21, 2023, 2:27 AM · 104 points · 6 comments · 5 min read · LW link
- [Question] Can we isolate neurons that recognize features vs. those which have some other role? · Joshua Clancy · Oct 21, 2023, 12:30 AM · 4 points · 2 comments · 3 min read · LW link
- Muddling Along Is More Likely Than Dystopia · Jeffrey Heninger · Oct 20, 2023, 9:25 PM · 88 points · 10 comments · 8 min read · LW link
- What’s Hard About The Shutdown Problem · johnswentworth · Oct 20, 2023, 9:13 PM · 98 points · 33 comments · 4 min read · LW link
- Holly Elmore and Rob Miles dialogue on AI Safety Advocacy · Bird Concept, Robert Miles and Holly_Elmore · Oct 20, 2023, 9:04 PM · 162 points · 30 comments · 27 min read · LW link
- TOMORROW: the largest AI Safety protest ever! · Holly_Elmore · Oct 20, 2023, 6:15 PM · 105 points · 26 comments · 2 min read · LW link
- The Overkill Conspiracy Hypothesis · ymeskhout · Oct 20, 2023, 4:51 PM · 26 points · 8 comments · 7 min read · LW link
- I Would Have Solved Alignment, But I Was Worried That Would Advance Timelines · 307th · Oct 20, 2023, 4:37 PM · 122 points · 33 comments · 9 min read · LW link
- Internal Target Information for AI Oversight · Paul Colognese · Oct 20, 2023, 2:53 PM · 15 points · 0 comments · 5 min read · LW link
- On the proper date for solstice celebrations · jchan · Oct 20, 2023, 1:55 PM · 16 points · 0 comments · 4 min read · LW link
- Are (at least some) Large Language Models Holographic Memory Stores? · Bill Benzon · Oct 20, 2023, 1:07 PM · 11 points · 4 comments · 6 min read · LW link
- Mechanistic interpretability of LLM analogy-making · Sergii · Oct 20, 2023, 12:53 PM · 2 points · 0 comments · 4 min read · LW link · (grgv.xyz)
- How To Socialize With Psycho(logist)s · Sable · Oct 20, 2023, 11:33 AM · 37 points · 11 comments · 3 min read · LW link · (affablyevil.substack.com)
- Revealing Intentionality In Language Models Through AdaVAE Guided Sampling · jdp · Oct 20, 2023, 7:32 AM · 119 points · 15 comments · 22 min read · LW link
- Features and Adversaries in MemoryDT · Joseph Bloom and Jay Bailey · Oct 20, 2023, 7:32 AM · 31 points · 6 comments · 25 min read · LW link
- AI Safety Hub Serbia Soft Launch · DusanDNesic · Oct 20, 2023, 7:11 AM · 64 points · 1 comment · 3 min read · LW link · (forum.effectivealtruism.org)
- Announcing new round of “Key Phenomena in AI Risk” Reading Group · DusanDNesic and Nora_Ammann · Oct 20, 2023, 7:11 AM · 15 points · 2 comments · 1 min read · LW link
- Unpacking the dynamics of AGI conflict that suggest the necessity of a premptive pivotal act · Eli Tyre · Oct 20, 2023, 6:48 AM · 63 points · 2 comments · 8 min read · LW link
- Genocide isn’t Decolonization · robotelvis · Oct 20, 2023, 4:14 AM · 33 points · 19 comments · 5 min read · LW link · (messyprogress.substack.com)
- Trying to understand John Wentworth’s research agenda · johnswentworth, habryka and David Lorell · Oct 20, 2023, 12:05 AM · 93 points · 13 comments · 12 min read · LW link
- Boost your productivity, happiness and health with this one weird trick · ajc586 · Oct 19, 2023, 11:30 PM · 9 points · 9 comments · 1 min read · LW link
- A Good Explanation of Differential Gears · Johannes C. Mayer · Oct 19, 2023, 11:07 PM · 48 points · 4 comments · 1 min read · LW link · (youtu.be)
- Evening Wiki(pedia) Workout · mcint · Oct 19, 2023, 9:29 PM · 1 point · 1 comment · 1 min read · LW link
- New roles on my team: come build Open Phil’s technical AI safety program with me! · Ajeya Cotra · Oct 19, 2023, 4:47 PM · 83 points · 6 comments · 4 min read · LW link
- [Question] Infinite tower of meta-probability · fryolysis · Oct 19, 2023, 4:44 PM · 6 points · 5 comments · 3 min read · LW link
- A NotKillEveryoneIsm Argument for Accelerating Deep Learning Research · Logan Zoellner · Oct 19, 2023, 4:28 PM · −6 points · 6 comments · 5 min read · LW link · (midwitalignment.substack.com)
- Knowledge Base 5: Business model · iwis · Oct 19, 2023, 4:06 PM · −4 points · 2 comments · 1 min read · LW link
- AI #34: Chipping Away at Chip Exports · Zvi · Oct 19, 2023, 3:00 PM · 36 points · 19 comments · 59 min read · LW link · (thezvi.wordpress.com)
- Is Yann LeCun strawmanning AI x-risks? · Chris_Leong · Oct 19, 2023, 11:35 AM · 26 points · 4 comments · 1 min read · LW link
- [Video] Too much Empiricism kills you · Johannes C. Mayer · Oct 19, 2023, 5:08 AM · 19 points · 0 comments · 1 min read · LW link · (youtu.be)
- Are humans misaligned with evolution? · TekhneMakre and jacob_cannell · Oct 19, 2023, 3:14 AM · 42 points · 13 comments · 18 min read · LW link
- Brains, Planes, Blimps, and Algorithms · ai dan · Oct 18, 2023, 9:26 PM · 1 point · 0 comments · 6 min read · LW link
- The (partial) fallacy of dumb superintelligence · Seth Herd · Oct 18, 2023, 9:25 PM · 38 points · 5 comments · 4 min read · LW link
- [Question] Does AI governance needs a “Federalist papers” debate? · azsantosk · Oct 18, 2023, 9:08 PM · 40 points · 4 comments · 1 min read · LW link
- Metaculus Launches Conditional Cup to Explore Linked Forecasts · ChristianWilliams · Oct 18, 2023, 8:41 PM · 9 points · 0 comments · LW link · (www.metaculus.com)
- AI Safety 101: Reward Misspecification · markov · Oct 18, 2023, 8:39 PM · 32 points · 4 comments · 31 min read · LW link
- 2023 East Coast Rationalist Megameetup · Screwtape · 18 Oct 2023 20:33 UTC · 8 points · 0 comments · 1 min read · LW link
- Superforecasting the premises in “Is power-seeking AI an existential risk?” · Joe Carlsmith · 18 Oct 2023 20:23 UTC · 31 points · 3 comments · 5 min read · LW link
- The Real Fanfic Is The Friends We Made Along The Way · Eneasz · 18 Oct 2023 19:21 UTC · 92 points · 1 comment · 27 min read · LW link · 1 review · (deathisbad.substack.com)
- AISN #24: Kissinger Urges US-China Cooperation on AI, China’s New AI Law, US Export Controls, International Institutions, and Open Source AI · Dan H and Corin Katzke · 18 Oct 2023 17:06 UTC · 14 points · 0 comments · 6 min read · LW link · (newsletter.safe.ai)
- Back to the Past to the Future · Prometheus · 18 Oct 2023 16:51 UTC · 5 points · 0 comments · 1 min read · LW link
- How to Eradicate Global Extreme Poverty [RA video with fundraiser!] · aggliu and Writer · 18 Oct 2023 15:51 UTC · 50 points · 5 comments · 9 min read · LW link · (youtu.be)
- On Interpretability’s Robustness · WCargo · 18 Oct 2023 13:18 UTC · 11 points · 0 comments · 4 min read · LW link