Controlling AGI Risk
TeaSea | Mar 15, 2024, 4:56 AM | 6 points | 8 comments | 4 min read

Ulm, Germany—ACX Spring Meetups Everywhere 2024
Benjamin R | Mar 15, 2024, 1:32 AM | 2 points | 1 comment | 1 min read

Newport News/ Virginia ACX Meetup
Daniel | Mar 14, 2024, 11:46 PM | 1 point | 0 comments | 1 min read

Constructive Cauchy sequences vs. Dedekind cuts
jessicata | Mar 14, 2024, 11:04 PM | 47 points | 23 comments | 4 min read | (unstableontology.com)

A Nail in the Coffin of Exceptionalism
Yeshua God | Mar 14, 2024, 10:41 PM | −17 points | 0 comments | 3 min read

Toward a Broader Conception of Adverse Selection
Ricki Heicklen | Mar 14, 2024, 10:40 PM | 177 points | 61 comments | 13 min read | (bayesshammai.substack.com)

More people getting into AI safety should do a PhD
AdamGleave | Mar 14, 2024, 10:14 PM | 61 points | 24 comments | 12 min read | (gleave.me)

Collection (Part 6 of “The Sense Of Physical Necessity”)
LoganStrohl | Mar 14, 2024, 9:37 PM | 28 points | 0 comments | 8 min read

Fixed point or oscillate or noise
lemonhope | Mar 14, 2024, 6:37 PM | 3 points | 10 comments | 1 min read

How useful is “AI Control” as a framing on AI X-Risk?
habryka and ryan_greenblatt | Mar 14, 2024, 6:06 PM | 70 points | 4 comments | 34 min read

Sparse autoencoders find composed features in small toy models
Evan Anders, Clement Neo, Jason Hoelscher-Obermaier and Jessica N. Howard | Mar 14, 2024, 6:00 PM | 33 points | 12 comments | 15 min read

AI #55: Keep Clauding Along
Zvi | Mar 14, 2024, 3:40 PM | 62 points | 16 comments | 70 min read | (thezvi.wordpress.com)

To the average human, controlled AI is just as lethal as ‘misaligned’ AI
YonatanK | Mar 14, 2024, 2:52 PM | 6 points | 20 comments | 5 min read

Claude vs GPT
Maxwell Tabarrok | Mar 14, 2024, 12:41 PM | 12 points | 2 comments | 2 min read | (www.maximum-progress.com)

A brief review of China’s AI industry and regulations
Elliot Mckernon | Mar 14, 2024, 12:19 PM | 24 points | 0 comments | 16 min read

[Question] Can any LLM be represented as an Equation?
Valentin Baltadzhiev | Mar 14, 2024, 9:51 AM | 1 point | 2 comments | 1 min read

‘Empiricism!’ as Anti-Epistemology
Eliezer Yudkowsky | Mar 14, 2024, 2:02 AM | 171 points | 92 comments | 25 min read

How I turned doing therapy into object-level AI safety research
Chipmonk | Mar 14, 2024, 1:54 AM | 15 points | 5 comments | 4 min read

Opportunistic Time-Management
Richard Henage | Mar 13, 2024, 9:38 PM | 13 points | 2 comments | 1 min read

AI governance and strategy: a list of research agendas and work that could be done.
NathanBarnard and Erin Robertson | Mar 13, 2024, 9:23 PM | 7 points | 1 comment | 17 min read

Highlights from Lex Fridman’s interview of Yann LeCun
Joel Burget | Mar 13, 2024, 8:58 PM | 48 points | 15 comments | 41 min read

On the Latest TikTok Bill
Zvi | Mar 13, 2024, 6:50 PM | 58 points | 7 comments | 29 min read | (thezvi.wordpress.com)

[Question] Recommended book for a balanced take and lessons learned from covid pandemic response
Martin Hare Robertson | Mar 13, 2024, 6:14 PM | 4 points | 0 comments | 1 min read

ACX/LW Seattle spring meetup 2024
nsokolsky | Mar 13, 2024, 5:24 PM | 12 points | 3 comments | 1 min read

Laying the Foundations for Vision and Multimodal Mechanistic Interpretability & Open Problems
Sonia Joseph and Neel Nanda | Mar 13, 2024, 5:09 PM | 44 points | 13 comments | 14 min read

I was raised by devout Mormons, AMA [&|] Soliciting Advice
ErioirE | Mar 13, 2024, 4:52 PM | 31 points | 41 comments | 2 min read

Relational Agency: Consistently Reaching Out
Jonathan Moregård | Mar 13, 2024, 2:34 PM | 16 points | 0 comments | 5 min read | (open.substack.com)

[Question] What could a policy banning AGI look like?
TsviBT | Mar 13, 2024, 2:19 PM | 78 points | 23 comments | 3 min read

Clickbait Soapboxing
DaystarEld | Mar 13, 2024, 2:09 PM | 24 points | 16 comments | 3 min read | (daystareld.com)

Virtual AI Safety Unconference 2024
Orpheus, Linda Linsefors, Joe Rogero, Arjun Yadav and Manuela García | Mar 13, 2024, 1:54 PM | 14 points | 0 comments | 1 min read

Jobs, Relationships, and Other Cults
Ruby and Elizabeth | Mar 13, 2024, 5:58 AM | 40 points | 9 comments | 35 min read

How do you improve the quality of your drinking water?
Alex K. Chen (parrot) | Mar 13, 2024, 12:37 AM | 11 points | 2 comments | 1 min read

The Parable Of The Fallen Pendulum—Part 2
johnswentworth | Mar 12, 2024, 9:41 PM | 78 points | 8 comments | 4 min read

Open consultancy: Letting untrusted AIs choose what answer to argue for
Fabien Roger | Mar 12, 2024, 8:38 PM | 35 points | 5 comments | 5 min read

[Question] Is anyone working on formally verified AI toolchains?
metachirality | Mar 12, 2024, 7:36 PM | 17 points | 4 comments | 1 min read

Transformer Debugger
Henk Tillman | Mar 12, 2024, 7:08 PM | 26 points | 0 comments | 1 min read | (github.com)

Superforecasting the Origins of the Covid-19 Pandemic
DanielFilan | Mar 12, 2024, 7:01 PM | 64 points | 0 comments | 1 min read | (goodjudgment.substack.com)

minimum viable action
Sindhu Prasad | Mar 12, 2024, 4:06 PM | 1 point | 0 comments | 3 min read

Hardball questions for the Gemini Congressional Hearing
Michael Thiessen | Mar 12, 2024, 3:27 PM | −11 points | 2 comments | 1 min read

OpenAI: The Board Expands
Zvi | Mar 12, 2024, 2:00 PM | 92 points | 1 comment | 30 min read | (thezvi.wordpress.com)

Update on Developing an Ethics Calculator to Align an AGI to
sweenesm | Mar 12, 2024, 12:33 PM | 4 points | 2 comments | 8 min read

[Question] How do you identify and counteract your biases in decision-making?
warrenjordan | Mar 12, 2024, 5:01 AM | 2 points | 1 comment | 1 min read

How Much Have I Been Playing?
jefftk | Mar 12, 2024, 2:10 AM | 9 points | 0 comments | 1 min read | (www.jefftk.com)

Bias-Augmented Consistency Training Reduces Biased Reasoning in Chain-of-Thought
Miles Turpin | Mar 11, 2024, 11:46 PM | 16 points | 0 comments | 1 min read | (arxiv.org)

AI Safety Action Plan—A report commissioned by the US State Department
agucova | Mar 11, 2024, 10:14 PM | 22 points | 1 comment | (www.gladstone.ai)

A discussion of AI risk and the cost/benefit calculation of stopping or pausing AI development
DuncanFowler | Mar 11, 2024, 9:41 PM | 1 point | 0 comments | 1 min read

Among the A.I. Doomsayers—The New Yorker
agucova | Mar 11, 2024, 9:35 PM | 12 points | 1 comment | (www.newyorker.com)

Be More Katja
Nathan Young | Mar 11, 2024, 9:12 PM | 53 points | 0 comments | 3 min read

AI Incident Reporting: A Regulatory Review
Deric Cheng and Elliot Mckernon | Mar 11, 2024, 9:03 PM | 16 points | 0 comments | 6 min read

Results from an Adversarial Collaboration on AI Risk (FRI)
Josh Rosenberg, AvitalM, Molly and rosehadshar | Mar 11, 2024, 8:00 PM | 61 points | 3 comments | 9 min read | (forecastingresearch.org)