LessWrong Archive
How Your Physiology Affects the Mind’s Projection Fallacy · YanLyutnev · Dec 14, 2024, 9:10 PM · −1 points · 0 comments · 6 min read
Introducing the Evidence Color Wheel · Larry Lee · Dec 14, 2024, 4:08 PM · 6 points · 0 comments · 3 min read
An Illustrated Summary of “Robust Agents Learn Causal World Model” · Dalcy · Dec 14, 2024, 3:02 PM · 67 points · 2 comments · 10 min read
Best-of-N Jailbreaking · John Hughes, saraprice, Aengus Lynch, Rylan Schaeffer, Fazl, Henry Sleight, Ethan Perez, and mrinank_sharma · Dec 14, 2024, 4:58 AM · 78 points · 5 comments · 2 min read · (arxiv.org)
D&D.Sci Dungeonbuilding: the Dungeon Tournament · aphyer · Dec 14, 2024, 4:30 AM · 49 points · 16 comments · 3 min read
Creating Interpretable Latent Spaces with Gradient Routing · Jacob G-W · Dec 14, 2024, 4:00 AM · 26 points · 6 comments · 2 min read · (jacobgw.com)
Probability of death by suicide by a 26 year old · John Wiseman · Dec 14, 2024, 3:33 AM · −25 points · 4 comments · 1 min read
Matryoshka Sparse Autoencoders · Noa Nabeshima · Dec 14, 2024, 2:52 AM · 98 points · 15 comments · 11 min read
[Question] What is MIRI currently doing? · Roko · Dec 14, 2024, 2:39 AM · 32 points · 14 comments · 1 min read
The o1 System Card Is Not About o1 · Zvi · Dec 13, 2024, 8:30 PM · 116 points · 5 comments · 16 min read · (thezvi.wordpress.com)
Arch-anarchy and The Fable of the Dragon-Tyrant · Peter lawless · Dec 13, 2024, 8:15 PM · −10 points · 0 comments · 1 min read
Communications in Hard Mode (My new job at MIRI) · tanagrabeast · Dec 13, 2024, 8:13 PM · 204 points · 25 comments · 5 min read
First Thoughts on Detachmentism · Jacob Peterson · Dec 13, 2024, 1:19 AM · −11 points · 5 comments · 9 min read
How to Build Heaven: A Constrained Boltzmann Brain Generator · High Tides · Dec 13, 2024, 1:04 AM · −8 points · 3 comments · 5 min read
Representing Irrationality in Game Theory · Larry Lee · Dec 13, 2024, 12:50 AM · −1 points · 3 comments · 11 min read
“Charity” as a conflationary alliance term · Jan_Kulveit · Dec 12, 2024, 9:49 PM · 35 points · 2 comments · 5 min read
Just one more exposure bro · Chipmonk · Dec 12, 2024, 9:37 PM · 52 points · 6 comments · 2 min read · (chrislakin.blog)
The Dangers of Mirrored Life · Niko_McCarty and fin · Dec 12, 2024, 8:58 PM · 119 points · 9 comments · 29 min read · (www.asimov.press)
Effective Networking as Sending Hard to Fake Signals · vaishnav92 · Dec 12, 2024, 8:32 PM · 26 points · 2 comments · 7 min read · (www.optimaloutliers.com)
Mini PAPR Review · jefftk · Dec 12, 2024, 7:10 PM · 10 points · 0 comments · 2 min read · (www.jefftk.com)
Biological risk from the mirror world · jasoncrawford · Dec 12, 2024, 7:07 PM · 334 points · 38 comments · 7 min read · (newsletter.rootsofprogress.org)
Naturalistic dualism · Arturo Macias · Dec 12, 2024, 4:19 PM · −4 points · 0 comments · 4 min read
AI #94: Not Now, Google · Zvi · Dec 12, 2024, 3:40 PM · 49 points · 3 comments · 64 min read · (thezvi.wordpress.com)
Consciousness, Intelligence, and AI – Some Quick Notes [call it a mini-ramble] · Bill Benzon · Dec 12, 2024, 3:04 PM · −3 points · 0 comments · 4 min read
The Dissolution of AI Safety · Roko · Dec 12, 2024, 10:34 AM · 8 points · 44 comments · 1 min read · (www.transhumanaxiology.com)
Is Optimization Correct? · Yoshinori Okamoto · Dec 12, 2024, 10:27 AM · −9 points · 0 comments · 2 min read
AXRP Episode 38.3 - Erik Jenner on Learned Look-Ahead · DanielFilan · Dec 12, 2024, 5:40 AM · 20 points · 0 comments · 16 min read
Public computers can make addictive tools safe · dkl9 · Dec 11, 2024, 7:55 PM · 23 points · 0 comments · 1 min read · (dkl9.net)
Solving Newcomb’s Paradox In Real Life · Alice Wanderland · Dec 11, 2024, 7:48 PM · 3 points · 0 comments · 1 min read · (open.substack.com)
The “Think It Faster” Exercise · Raemon · Dec 11, 2024, 7:14 PM · 144 points · 35 comments · 13 min read
Forecast With GiveWell · ChristianWilliams · Dec 11, 2024, 5:52 PM · 11 points · 0 comments · (www.metaculus.com)
A shortcoming of concrete demonstrations as AGI risk advocacy · Steven Byrnes · Dec 11, 2024, 4:48 PM · 105 points · 27 comments · 2 min read
Why Isn’t Tesla Level 3? · jefftk · Dec 11, 2024, 2:50 PM · 22 points · 7 comments · 2 min read · (www.jefftk.com)
Investing in Robust Safety Mechanisms is critical for reducing Systemic Risks · Tom DAVID, Pierre Peigné, Quentin FEUILLADE--MONTIXI, Kay Kozaronek, and Miailhe Nicolas · Dec 11, 2024, 1:37 PM · 8 points · 3 comments · 2 min read
Post-Quantum Investing: Dump Crypto for Index Funds and Real Estate? · G · Dec 11, 2024, 11:59 AM · 8 points · 5 comments · 1 min read
Low-effort review of “AI For Humanity” · Charlie Steiner · Dec 11, 2024, 9:54 AM · 13 points · 0 comments · 4 min read
SAEBench: A Comprehensive Benchmark for Sparse Autoencoders · Can, Adam Karvonen, Johnny Lin, Curt Tigges, Joseph Bloom, chanind, Yeu-Tong Lau, Eoin Farrell, Arthur Conmy, CallumMcDougall, Kola Ayonrinde, Matthew Wearden, Sam Marks, and Neel Nanda · Dec 11, 2024, 6:30 AM · 82 points · 6 comments · 2 min read · (www.neuronpedia.org)
Zombies! Substance Dualist Zombies? · Ape in the coat · Dec 11, 2024, 6:10 AM · 15 points · 10 comments · 6 min read
My thoughts on correlation and causation · Victor Porton · Dec 11, 2024, 5:08 AM · −13 points · 3 comments · 1 min read
Why empiricists should believe in AI risk · Knight Lee · Dec 11, 2024, 3:51 AM · 5 points · 0 comments · 1 min read
[Question] fake alignment solutions???? · KvmanThinking · Dec 11, 2024, 3:31 AM · 1 point · 6 comments · 1 min read
Second-Time Free · jefftk · Dec 11, 2024, 3:30 AM · 24 points · 4 comments · 1 min read · (www.jefftk.com)
Frontier AI systems have surpassed the self-replicating red line · aproteinengine · Dec 11, 2024, 3:06 AM · 9 points · 4 comments · 1 min read · (github.com)
The Technist Reformation: A Discussion with o1 About The Coming Economic Event Horizon · Yuli_Ban · Dec 11, 2024, 2:34 AM · 5 points · 2 comments · 17 min read
LessWrong audio: help us choose the new voice · PeterH and TYPE III AUDIO · Dec 11, 2024, 2:24 AM · 23 points · 1 comment · 1 min read
Apply to attend a Global Challenges Project workshop in 2025! · LiamE · Dec 11, 2024, 12:41 AM · 6 points · 0 comments · 2 min read · (forum.effectivealtruism.org)
The MVO and The MVP · kwang · Dec 10, 2024, 11:17 PM · 0 points · 0 comments · 7 min read · (kevw.substack.com)
What is Confidence—in Game Theory and Life? · James Stephen Brown · Dec 10, 2024, 11:06 PM · 3 points · 0 comments · 8 min read · (nonzerosum.games)
Computational functionalism probably can’t explain phenomenal consciousness · EuanMcLean · Dec 10, 2024, 5:11 PM · 17 points · 36 comments · 12 min read
o1 Turns Pro · Zvi · Dec 10, 2024, 5:00 PM · 59 points · 3 comments · 14 min read · (thezvi.wordpress.com)