It’s Okay to Feel Bad for a Bit
moridinamael
May 10, 2025, 11:24 PM
133
points
26
comments
3
min read
LW
link
G.D. as Capitalist Evolution, and the claim for humanity’s (temporary) upper hand
Martin Vlach
May 10, 2025, 9:18 PM
8
points
3
comments
1
min read
LW
link
Book Review: “Encounters with Einstein” by Heisenberg
Baram Sosis
May 10, 2025, 8:55 PM
31
points
6
comments
7
min read
LW
link
Where is the YIMBY movement for healthcare?
jasoncrawford
May 10, 2025, 8:36 PM
20
points
10
comments
2
min read
LW
link
(newsletter.rootsofprogress.org)
Become a Superintelligence Yourself
Yaroslav Granowski
May 10, 2025, 8:20 PM
1
point
0
comments
5
min read
LW
link
A Look Inside a Frequentist
Eggs
May 10, 2025, 3:18 PM
5
points
10
comments
3
min read
LW
link
Open-source weaponry
samuelshadrach
May 10, 2025, 1:11 PM
3
points
0
comments
3
min read
LW
link
(samuelshadrach.com)
Glass box learners want to be black box
Cole Wyeth
May 10, 2025, 11:05 AM
46
points
10
comments
4
min read
LW
link
Takes and loose predictions on AI progress and some key problems
zef
May 10, 2025, 10:11 AM
5
points
0
comments
5
min read
LW
link
(halcyoncyborg.substack.com)
Corbent – A Master Plan for Next‑Generation Direct Air Capture
Rudaiba
May 10, 2025, 4:09 AM
11
points
15
comments
19
min read
LW
link
What if we just…didn’t build AGI? An Argument Against Inevitability
Nate Sharpe
May 10, 2025, 3:37 AM
7
points
7
comments
14
min read
LW
link
(natezsharpe.substack.com)
Mind the Coherence Gap: Lessons from Steering Llama with Goodfire
eitan sprejer
May 9, 2025, 9:29 PM
4
points
1
comment
6
min read
LW
link
My Experience With EMDR
Sable
May 9, 2025, 9:25 PM
22
points
0
comments
11
min read
LW
link
(affablyevil.substack.com)
AI’s Hidden Game: Understanding Strategic Deception in AI and Why It Matters for Our Future
EmilyinAI
May 9, 2025, 8:01 PM
4
points
0
comments
6
min read
LW
link
Muddling Through Some Thoughts on the Nature of Historiography
E.G. Blee-Goldman
May 9, 2025, 7:04 PM
2
points
0
comments
4
min read
LW
link
A Guide to AI 2027
koenrane
May 9, 2025, 5:14 PM
0
points
1
comment
28
min read
LW
link
Let’s stop making “Intelligence scale” graphs with humans and AI
Expertium
May 9, 2025, 4:01 PM
3
points
15
comments
1
min read
LW
link
Slow corporations as an intuition pump for AI R&D automation
ryan_greenblatt
and
elifland
May 9, 2025, 2:49 PM
91
points
23
comments
9
min read
LW
link
Cheaters Gonna Cheat Cheat Cheat Cheat Cheat
Zvi
May 9, 2025, 2:30 PM
52
points
4
comments
22
min read
LW
link
(thezvi.wordpress.com)
Humans vs LLM, memes as theorems
Yaroslav Granowski
May 9, 2025, 1:26 PM
1
point
0
comments
1
min read
LW
link
Moving towards a question-based planning framework, instead of task lists
casualphysicsenjoyer
May 9, 2025, 12:18 PM
4
points
1
comment
8
min read
LW
link
(substack.com)
Jim Babcock’s Mainline Doom Scenario: Human-Level AI Can’t Control Its Successor
Liron
and
jimrandomh
May 9, 2025, 5:20 AM
28
points
4
comments
62
min read
LW
link
(www.youtube.com)
Attend the 2025 Reproductive Frontiers Summit, June 10-12
TsviBT
and
Rachel Reid
May 9, 2025, 5:17 AM
59
points
0
comments
3
min read
LW
link
Interest In Conflict Is Instrumentally Convergent
Screwtape
May 9, 2025, 2:16 AM
65
points
58
comments
10
min read
LW
link
Is ChatGPT actually fixed now?
sjadler
May 8, 2025, 11:34 PM
17
points
0
comments
1
min read
LW
link
(stevenadler.substack.com)
Post EAG London AI x-Safety Co-working Retreat
plex
May 8, 2025, 11:00 PM
10
points
0
comments
1
min read
LW
link
a brief critique of reduction
Vadim Golub
May 8, 2025, 10:43 PM
−17
points
4
comments
2
min read
LW
link
Video & transcript: Challenges for Safe & Beneficial Brain-Like AGI
Steven Byrnes
May 8, 2025, 9:11 PM
24
points
0
comments
18
min read
LW
link
Appendix: Interpretable by Design—Constraint Sets with Disjoint Limit Points
Ronak_Mehta
May 8, 2025, 9:09 PM
2
points
0
comments
2
min read
LW
link
Interpretable by Design—Constraint Sets with Disjoint Limit Points
Ronak_Mehta
May 8, 2025, 9:08 PM
23
points
1
comment
9
min read
LW
link
(ronakrm.github.io)
Is there a Half-Life for the Success Rates of AI Agents?
Matrice Jacobine
May 8, 2025, 8:10 PM
8
points
0
comments
1
min read
LW
link
(www.tobyord.com)
Misalignment and Strategic Underperformance: An Analysis of Sandbagging and Exploration Hacking
Buck
and
Julian Stastny
May 8, 2025, 7:06 PM
75
points
1
comment
15
min read
LW
link
Behold the Pale Child (escaping Moloch’s Mad Maze)
rogersbacon
May 8, 2025, 4:36 PM
8
points
16
comments
11
min read
LW
link
(www.secretorum.life)
An alignment safety case sketch based on debate
Marie_DB
,
Jacob Pfau
,
Benjamin Hilton
and
Geoffrey Irving
May 8, 2025, 3:02 PM
57
points
19
comments
25
min read
LW
link
(arxiv.org)
Mechanistic Interpretability Via Learning Differential Equations: AI Safety Camp Project Intermediate Report.
Valentin2026
,
ayoakin
,
Eduard Kovalets
,
tz3r0n4r
,
Soumyadeep Bose
,
Utkarsh Priyadarshi
,
Varun Piram
and
Axel Ahlqvist
May 8, 2025, 2:45 PM
6
points
0
comments
7
min read
LW
link
AI #115: The Evil Applications Division
Zvi
May 8, 2025, 1:40 PM
32
points
3
comments
62
min read
LW
link
(thezvi.wordpress.com)
The Steganographic Potentials of Language Models
Artyom Karpov
,
Tinuade
and
SCho
May 8, 2025, 11:23 AM
9
points
0
comments
1
min read
LW
link
Our bet on whether the AI market will crash
Remmelt
and
mabramov
May 8, 2025, 9:56 AM
22
points
2
comments
1
min read
LW
link
Concept-anchored representation engineering for alignment
Sandy Fraser
May 8, 2025, 8:59 AM
5
points
0
comments
3
min read
LW
link
Orthogonality Thesis in layman’s terms.
Michael (@lethal_ai)
May 8, 2025, 8:31 AM
1
point
0
comments
2
min read
LW
link
Arkose may be closing, but you can help
Victoria Brook
May 8, 2025, 7:28 AM
8
points
0
comments
2
min read
LW
link
Healing powers of meditation or the role of attention in humoral regulation.
Yaroslav Granowski
8 May 2025 6:48 UTC
7
points
0
comments
1
min read
LW
link
Orienting Toward Wizard Power
johnswentworth
8 May 2025 5:23 UTC
514
points
135
comments
5
min read
LW
link
Relational Alignment: Trust, Repair, and the Emotional Work of AI
Priyanka Bharadwaj
8 May 2025 2:44 UTC
3
points
0
comments
3
min read
LW
link
There’s more low-hanging fruit in interdisciplinary work thanks to LLMs
ChristianKl
7 May 2025 19:48 UTC
26
points
2
comments
1
min read
LW
link
OpenAI Claims Nonprofit Will Retain Nominal Control
Zvi
7 May 2025 19:40 UTC
65
points
4
comments
11
min read
LW
link
(thezvi.wordpress.com)
Social status games might have “compute weight class” in the future
Raemon
7 May 2025 18:56 UTC
31
points
5
comments
2
min read
LW
link
Events of Low Probability: Buridan’s Principle
Nikita Gladkov
7 May 2025 18:46 UTC
12
points
0
comments
10
min read
LW
link
[Question]
Which journalists would you give quotes to? [one journalist per comment, agree vote for trustworthy]
Nathan Young
7 May 2025 18:39 UTC
12
points
26
comments
1
min read
LW
link
Progress = Fewer Bad Moments
Chipmonk
7 May 2025 17:33 UTC
24
points
9
comments
2
min read
LW
link
(chrislakin.blog)