Archive
AlignedCut: Visual Concepts Discovery on Brain-Guided Universal Feature Space
Bogdan Ionut Cirstea · Sep 14, 2024, 11:23 PM · 17 points · 1 comment · 1 min read · LW link (arxiv.org)

How you can help pass important AI legislation with 10 minutes of effort
ThomasW · Sep 14, 2024, 10:10 PM · 59 points · 2 comments · 2 min read · LW link

[Question] Calibration training for ‘percentile rankings’?
david reinstein · Sep 14, 2024, 9:51 PM · 3 points · 0 comments · 2 min read · LW link

OpenAI o1, Llama 4, and AlphaZero of LLMs
Vladimir_Nesov · Sep 14, 2024, 9:27 PM · 83 points · 25 comments · 1 min read · LW link

Forever Leaders
Justice Howard · Sep 14, 2024, 8:55 PM · 6 points · 9 comments · 1 min read · LW link

Emergent Authorship: Creativity à la Communing
gswonk · Sep 14, 2024, 7:02 PM · 1 point · 0 comments · 3 min read · LW link

Compression Moves for Prediction
adamShimi · Sep 14, 2024, 5:51 PM · 20 points · 0 comments · 7 min read · LW link (epistemologicalfascinations.substack.com)

Pay-on-results personal growth: first success
Chipmonk · Sep 14, 2024, 3:39 AM · 63 points · 8 comments · 4 min read · LW link (chrislakin.blog)

Avoiding the Bog of Moral Hazard for AI
Nathan Helm-Burger · Sep 13, 2024, 9:24 PM · 19 points · 13 comments · 2 min read · LW link

[Question] If I ask an LLM to think step by step, how big are the steps?
ryan_b · Sep 13, 2024, 8:30 PM · 7 points · 1 comment · 1 min read · LW link

Estimating Tail Risk in Neural Networks
Mark Xu · Sep 13, 2024, 8:00 PM · 68 points · 9 comments · 23 min read · LW link (www.alignment.org)

If-Then Commitments for AI Risk Reduction [by Holden Karnofsky]
habryka · Sep 13, 2024, 7:38 PM · 28 points · 0 comments · 20 min read · LW link (carnegieendowment.org)

Can startups be impactful in AI safety?
Esben Kran and Archana Vaidheeswaran · Sep 13, 2024, 7:00 PM · 15 points · 0 comments · 6 min read · LW link

I just can’t agree with AI safety. Why am I wrong?
Ya Polkovnik · Sep 13, 2024, 5:48 PM · 0 points · 5 comments · 2 min read · LW link

Keeping it (less than) real: Against ℶ₂ possible people or worlds
quiet_NaN · Sep 13, 2024, 5:29 PM · 17 points · 3 comments · 9 min read · LW link

Why I’m bearish on mechanistic interpretability: the shards are not in the network
tailcalled · Sep 13, 2024, 5:09 PM · 22 points · 40 comments · 1 min read · LW link

Increasing the Span of the Set of Ideas
Jeffrey Heninger · Sep 13, 2024, 3:52 PM · 6 points · 1 comment · 9 min read · LW link

How difficult is AI Alignment?
Sammy Martin · Sep 13, 2024, 3:47 PM · 44 points · 6 comments · 23 min read · LW link

The Great Data Integration Schlep
sarahconstantin · Sep 13, 2024, 3:40 PM · 275 points · 19 comments · 9 min read · LW link (sarahconstantin.substack.com)

“Real AGI”
Seth Herd · Sep 13, 2024, 2:13 PM · 20 points · 20 comments · 3 min read · LW link

AI, centralization, and the One Ring
owencb · Sep 13, 2024, 2:00 PM · 80 points · 12 comments · 8 min read · LW link (strangecities.substack.com)

Evidence against Learned Search in a Chess-Playing Neural Network
p.b. · Sep 13, 2024, 11:59 AM · 57 points · 3 comments · 6 min read · LW link

My career exploration: Tools for building confidence
lynettebye · Sep 13, 2024, 11:37 AM · 20 points · 0 comments · 20 min read · LW link

Contra papers claiming superhuman AI forecasting
nikos, Peter Mühlbacher, Lawrence Phillips and dschwarz · Sep 12, 2024, 6:10 PM · 182 points · 16 comments · 7 min read · LW link

OpenAI o1
Zach Stein-Perlman · Sep 12, 2024, 5:30 PM · 147 points · 41 comments · 1 min read · LW link

How to Give in to Threats (without incentivizing them)
Mikhail Samin · Sep 12, 2024, 3:55 PM · 67 points · 31 comments · 5 min read · LW link

Open Problems in AIXI Agent Foundations
Cole Wyeth · Sep 12, 2024, 3:38 PM · 42 points · 2 comments · 10 min read · LW link

On the destruction of America’s best high school
Chris_Leong · Sep 12, 2024, 3:30 PM · −6 points · 7 comments · 1 min read · LW link (scottaaronson.blog)

Optimising under arbitrarily many constraint equations
dkl9 · Sep 12, 2024, 2:59 PM · 6 points · 0 comments · 3 min read · LW link (dkl9.net)

AI #81: Alpha Proteo
Zvi · Sep 12, 2024, 1:00 PM · 59 points · 3 comments · 35 min read · LW link (thezvi.wordpress.com)

[Question] When can I be numerate?
FinalFormal2 · Sep 12, 2024, 4:05 AM · 25 points · 4 comments · 1 min read · LW link

A Nonconstructive Existence Proof of Aligned Superintelligence
Roko · Sep 12, 2024, 3:20 AM · 0 points · 80 comments · 1 min read · LW link (transhumanaxiology.substack.com)

Collapsing the Belief/Knowledge Distinction
Jeremias · Sep 11, 2024, 9:24 PM · −7 points · 8 comments · 1 min read · LW link

Programming Refusal with Conditional Activation Steering
Bruce W. Lee · Sep 11, 2024, 8:57 PM · 41 points · 0 comments · 11 min read · LW link (brucewlee.com)

Checking public figures on whether they “answered the question”: quick analysis from Harris/Trump debate, and a proposal
david reinstein · Sep 11, 2024, 8:25 PM · 7 points · 4 comments · 1 min read · LW link (open.substack.com)
AI Safety Newsletter #41: The Next Generation of Compute Scale Plus, Ranking Models by Susceptibility to Jailbreaking, and Machine Ethics
Corin Katzke, Julius, andrewz and Dan H · Sep 11, 2024, 7:14 PM · 5 points · 1 comment · 5 min read · LW link (newsletter.safe.ai)
Refactoring cryonics as structural brain preservation
Andy_McKenzie · Sep 11, 2024, 6:36 PM · 101 points · 14 comments · 3 min read · LW link

[Question] Is this a Pivotal Weak Act? Creating bacteria that decompose metal
doomyeser · Sep 11, 2024, 6:07 PM · 9 points · 9 comments · 3 min read · LW link

How to discover the nature of sentience, and ethics
Gustavo Ramires · Sep 11, 2024, 5:22 PM · −2 points · 5 comments · 5 min read · LW link

Seeking Mechanism Designer for Research into Internalizing Catastrophic Externalities
c.trout · Sep 11, 2024, 3:09 PM · 24 points · 2 comments · 3 min read · LW link

Could Things Be Very Different?—How Historical Inertia Might Blind Us To Optimal Solutions
James Stephen Brown · Sep 11, 2024, 9:53 AM · 5 points · 0 comments · 8 min read · LW link (nonzerosum.games)

Reformative Hypocrisy, and Paying Close Enough Attention to Selectively Reward It.
Andrew_Critch · Sep 11, 2024, 4:41 AM · 53 points · 11 comments · 3 min read · LW link

A necessary Membrane formalism feature
ThomasCederborg · Sep 10, 2024, 9:33 PM UTC · 20 points · 6 comments · 11 min read · LW link

Formalizing the Informal (event invite)
abramdemski · Sep 10, 2024, 7:22 PM UTC · 42 points · 0 comments · 1 min read · LW link

AI #80: Never Have I Ever
Zvi · Sep 10, 2024, 5:50 PM UTC · 46 points · 20 comments · 39 min read · LW link (thezvi.wordpress.com)

The Best Lay Argument is not a Simple English Yud Essay
J Bostock · Sep 10, 2024, 5:34 PM UTC · 253 points · 15 comments · 5 min read · LW link

Economics Roundup #3
Zvi · Sep 10, 2024, 1:50 PM UTC · 44 points · 9 comments · 20 min read · LW link (thezvi.wordpress.com)

Amplify is hiring! Work with us to support field-building initiatives through digital marketing
gergogaspar · Sep 10, 2024, 8:56 AM UTC · 0 points · 1 comment · 4 min read · LW link

What bootstraps intelligence?
invertedpassion · Sep 10, 2024, 7:11 AM UTC · 2 points · 2 comments · 1 min read · LW link

Physical Therapy Sucks (but have you tried hiding it in some peanut butter?)
Declan Molony · Sep 10, 2024, 5:54 AM UTC · 16 points · 12 comments · 1 min read · LW link