[Question]
How familiar is the Lesswrong community as a whole with the concept of Reward-modelling?
Oxidize
Apr 9, 2025, 11:33 PM
1
point
8
comments
1
min read
LW
link
What can we learn from expert AGI forecasts?
Benjamin_Todd
Apr 9, 2025, 9:34 PM
5
points
0
comments
5
min read
LW
link
(80000hours.org)
Thoughts on AI 2027
Max Harms
Apr 9, 2025, 9:26 PM
222
points
61
comments
21
min read
LW
link
(intelligence.org)
The case for AGI by 2030
Benjamin_Todd
Apr 9, 2025, 8:35 PM
40
points
6
comments
42
min read
LW
link
(80000hours.org)
Anti-automation policy as a bottleneck to economic growth
mhampton
Apr 9, 2025, 8:12 PM
4
points
0
comments
4
min read
LW
link
Reasoning models don’t always say what they think
Joe Benton, Ethan Perez, Vlad Mikulik and Fabien Roger
Apr 9, 2025, 7:48 PM
28
points
4
comments
1
min read
LW
link
(www.anthropic.com)
Reverse engineering the memory layout of GPU inference
Paul Bricman
Apr 9, 2025, 3:40 PM
5
points
0
comments
6
min read
LW
link
(noemaresearch.com)
How to defeat superintelligence, the Sta-Hi way
kilgoar
Apr 9, 2025, 1:58 PM
−8
points
0
comments
3
min read
LW
link
Llama Does Not Look Good 4 Anything
Zvi
Apr 9, 2025, 1:20 PM
31
points
1
comment
16
min read
LW
link
(thezvi.wordpress.com)
Learned pain as a leading cause of chronic pain
SoerenMind
Apr 9, 2025, 11:57 AM
203
points
38
comments
9
min read
LW
link
Does the universe’s recognition of measurement provide stronger evidence for being in a simulation than universal fine-tuning?
amelia
Apr 9, 2025, 8:20 AM
0
points
2
comments
4
min read
LW
link
Taxonomy of possibility
dkl9
Apr 9, 2025, 4:24 AM
13
points
1
comment
5
min read
LW
link
(dkl9.net)
Short Timelines Don’t Devalue Long Horizon Research
Vladimir_Nesov
Apr 9, 2025, 12:42 AM
167
points
24
comments
1
min read
LW
link
A Platform for Falsifiable Conjectures and Public Refutation — Would This Be Useful?
PetrusNonius
Apr 8, 2025, 9:09 PM
1
point
1
comment
1
min read
LW
link
Quantifying SAE Quality with Feature Steerability Metrics
phenomanon
Apr 8, 2025, 8:55 PM
2
points
0
comments
4
min read
LW
link
MATS is hiring!
Ryan Kidd and VVN
Apr 8, 2025, 8:45 PM
8
points
0
comments
6
min read
LW
link
birds and mammals independently evolved intelligence
bhauth
Apr 8, 2025, 8:00 PM
73
points
23
comments
1
min read
LW
link
(www.quantamagazine.org)
Alignment Faking Revisited: Improved Classifiers and Open Source Extensions
John Hughes, abhayesian, Akbir Khan and Fabien Roger
Apr 8, 2025, 5:32 PM
146
points
20
comments
12
min read
LW
link
London Working Group for Short/Medium Term AI Risks
scronkfinkle
Apr 8, 2025, 5:32 PM
5
points
0
comments
2
min read
LW
link
Thinking Machines
Knight Lee
Apr 8, 2025, 5:27 PM
3
points
0
comments
6
min read
LW
link
Digital Error Correction and Lock-In
alamerton
Apr 8, 2025, 3:46 PM
1
point
0
comments
5
min read
LW
link
(alfielamerton.substack.com)
[Question]
What faithfulness metrics should general claims about CoT faithfulness be based upon?
Rauno Arike
Apr 8, 2025, 3:27 PM
24
points
0
comments
4
min read
LW
link
AI 2027: Responses
Zvi
Apr 8, 2025, 12:50 PM
109
points
3
comments
30
min read
LW
link
(thezvi.wordpress.com)
The first AI war will be in your computer
Viliam
Apr 8, 2025, 9:28 AM
43
points
10
comments
3
min read
LW
link
Who wants to bet me $25k at 1:7 odds that there won’t be an AI market crash in the next year?
Remmelt
Apr 8, 2025, 8:31 AM
32
points
19
comments
1
min read
LW
link
A Pathway to Fully Autonomous Therapists
Declan Molony
Apr 8, 2025, 4:10 AM
5
points
2
comments
6
min read
LW
link
Rethinking Friction: Equity and Motivation Across Domains
eltimbalino
Apr 8, 2025, 3:58 AM
−1
points
0
comments
2
min read
LW
link
(www.lesswrong.com)
On different discussion traditions
Eugene Shcherbinin
Apr 7, 2025, 11:00 PM
1
point
0
comments
2
min read
LW
link
Misinformation is the default, and information is the government telling you your tap water is safe to drink
danielechlin
Apr 7, 2025, 10:28 PM
10
points
2
comments
9
min read
LW
link
Log-linear Scaling is Worth the Cost due to Gains in Long-Horizon Tasks
shash42
Apr 7, 2025, 9:50 PM
16
points
2
comments
1
min read
LW
link
Paper Highlights, March ’25
gasteigerjo
Apr 7, 2025, 8:17 PM
8
points
0
comments
9
min read
LW
link
(aisafetyfrontier.substack.com)
Factory farming intelligent minds
Odd anon
Apr 7, 2025, 8:05 PM
2
points
5
comments
20
min read
LW
link
What alignment-relevant abilities might Terence Tao lack?
Towards_Keeperhood
Apr 7, 2025, 7:44 PM
12
points
2
comments
3
min read
LW
link
[Question]
Are there any (semi-)detailed future scenarios where we win?
Jan Betley
Apr 7, 2025, 7:13 PM
15
points
3
comments
1
min read
LW
link
Austin Chen on Winning, Risk-Taking, and FTX
Elizabeth
Apr 7, 2025, 7:00 PM
35
points
3
comments
1
min read
LW
link
(acesounderglass.com)
An Unbiased Evaluation of My Debate with Thane Ruthenis—Run It Yourself
funnyfranco
Apr 7, 2025, 6:56 PM
−24
points
14
comments
2
min read
LW
link
American College Admissions Doesn’t Need to Be So Competitive
Arjun Panickssery
Apr 7, 2025, 5:35 PM
48
points
20
comments
6
min read
LW
link
(arjunpanickssery.substack.com)
Coupling for Decouplers
Jacob Falkovich
Apr 7, 2025, 3:40 PM
15
points
3
comments
8
min read
LW
link
Moonlight Reflected
Jacob Falkovich
Apr 7, 2025, 3:35 PM
11
points
0
comments
9
min read
LW
link
Navigation by Moonlight
Jacob Falkovich
Apr 7, 2025, 3:32 PM
24
points
39
comments
8
min read
LW
link
You Are Not a Thought Experiment
Jacob Falkovich
Apr 7, 2025, 3:27 PM
5
points
0
comments
9
min read
LW
link
Love is Love, Science is Fake
Jacob Falkovich
Apr 7, 2025, 3:19 PM
17
points
2
comments
10
min read
LW
link
Coupling for Decouplers — Intro
Jacob Falkovich
Apr 7, 2025, 3:12 PM
9
points
0
comments
1
min read
LW
link
The world according to ChatGPT
Richard_Kennaway
7 Apr 2025 13:44 UTC
11
points
0
comments
2
min read
LW
link
AI 2027: Dwarkesh’s Podcast with Daniel Kokotajlo and Scott Alexander
Zvi
7 Apr 2025 13:40 UTC
67
points
2
comments
26
min read
LW
link
(thezvi.wordpress.com)
Arguing all sides with ChatGPT 4.5
Richard_Kennaway
7 Apr 2025 13:10 UTC
6
points
0
comments
8
min read
LW
link
The Same Heaven
Lukas Petersson
7 Apr 2025 12:57 UTC
3
points
1
comment
5
min read
LW
link
(lukaspetersson.com)
Breaking down the MEAT of Alignment
JasonBrown
7 Apr 2025 8:47 UTC
7
points
2
comments
11
min read
LW
link
Well-foundedness as an organizing principle of healthy minds and societies
Richard_Ngo
7 Apr 2025 0:31 UTC
35
points
7
comments
6
min read
LW
link
(www.mindthefuture.info)
Arusha Perpetual Chicken—an unlikely iterated game
James Stephen Brown
6 Apr 2025 22:56 UTC
15
points
1
comment
5
min read
LW
link
(nonzerosum.games)