Activation Pattern SVD: A proposal for SAE Interpretability
Daniel Tan
Jun 28, 2024, 10:12 PM
15
points
2
comments
2
min read
LW
link
Podcast: Elizabeth & Austin on “What Manifold was allowed to do”
Austin Chen
Jun 28, 2024, 10:10 PM
20
points
0
comments
LW
link
(share.descript.com)
The Incredible Fentanyl-Detecting Machine
sarahconstantin
Jun 28, 2024, 10:10 PM
156
points
26
comments
7
min read
LW
link
(sarahconstantin.substack.com)
Saving Lives Reduces Over-Population—A Counter-Intuitive Non-Zero-Sum Game
James Stephen Brown
Jun 28, 2024, 7:29 PM
6
points
0
comments
5
min read
LW
link
(nonzerosum.games)
Mentorship in AGI Safety: Applications for mentorship are open!
Valentin2026
and
Joe Rogero
Jun 28, 2024, 2:49 PM
5
points
0
comments
1
min read
LW
link
Contra Acemoglu on AI
Maxwell Tabarrok
Jun 28, 2024, 1:13 PM
48
points
0
comments
5
min read
LW
link
(www.maximum-progress.com)
Five toy worlds to think about heritability
David Hugh-Jones
Jun 28, 2024, 1:11 PM
13
points
0
comments
9
min read
LW
link
(wyclif.substack.com)
[Question]
How do natural sciences prove causation?
Kongo Landwalker
Jun 28, 2024, 11:58 AM
1
point
3
comments
1
min read
LW
link
LessWrong/ACX meetup Transilvanya tour—Sibiu
Marius Adrian Nicoară
Jun 28, 2024, 11:41 AM
1
point
1
comment
1
min read
LW
link
Bayes’ Theorem: In Search of Gold (Lesson 1)
bayesyatina
Jun 28, 2024, 8:39 AM
3
points
0
comments
3
min read
LW
link
How a chip is designed
YM
Jun 28, 2024, 8:04 AM
65
points
4
comments
5
min read
LW
link
The Wisdom of Living for 200 Years
Martin Sustrik
Jun 28, 2024, 4:44 AM
25
points
3
comments
4
min read
LW
link
A Generally Intelligent Game
snerx
Jun 28, 2024, 1:31 AM
−1
points
1
comment
4
min read
LW
link
Corrigibility = Tool-ness?
johnswentworth
and
David Lorell
Jun 28, 2024, 1:19 AM
78
points
8
comments
9
min read
LW
link
Situational Awareness
PeterMcCluskey
Jun 28, 2024, 1:08 AM
11
points
0
comments
12
min read
LW
link
(bayesianinvestor.com)
Toward a taxonomy of cognitive benchmarks for agentic AGIs
Ben Smith
Jun 27, 2024, 11:50 PM
15
points
0
comments
5
min read
LW
link
How Big a Deal are MatMul-Free Transformers?
JustisMills
Jun 27, 2024, 10:28 PM
19
points
6
comments
5
min read
LW
link
(justismills.substack.com)
Secondary forces of debt
KatjaGrace
Jun 27, 2024, 9:10 PM
81
points
18
comments
2
min read
LW
link
(worldspiritsockpuppet.com)
Distillation of ‘Do language models plan for future tokens’
TheManxLoiner
Jun 27, 2024, 8:57 PM
26
points
2
comments
6
min read
LW
link
how birds sense magnetic fields
bhauth
Jun 27, 2024, 6:59 PM
51
points
4
comments
5
min read
LW
link
(www.bhauth.com)
Representation Tuning
Christopher Ackerman
Jun 27, 2024, 5:44 PM
35
points
9
comments
13
min read
LW
link
An issue with training schemers with supervised fine-tuning
Fabien Roger
Jun 27, 2024, 3:37 PM
49
points
12
comments
6
min read
LW
link
AI #70: A Beautiful Sonnet
Zvi
Jun 27, 2024, 2:40 PM
38
points
0
comments
44
min read
LW
link
(thezvi.wordpress.com)
Detecting Genetically Engineered Viruses With Metagenomic Sequencing
jefftk
Jun 27, 2024, 2:01 PM
87
points
10
comments
LW
link
(naobservatory.org)
Cross Robin
jefftk
Jun 27, 2024, 3:10 AM
11
points
2
comments
1
min read
LW
link
(www.jefftk.com)
Live Theory Part 0: Taking Intelligence Seriously
Sahil
Jun 26, 2024, 9:37 PM
101
points
3
comments
8
min read
LW
link
Instrumental vs Terminal Desiderata
Max Harms
Jun 26, 2024, 8:57 PM
21
points
0
comments
3
min read
LW
link
Imbue (Generally Intelligent) continue to make progress
Nathan Helm-Burger
Jun 26, 2024, 8:41 PM
18
points
0
comments
1
min read
LW
link
(imbue.com)
Tracing the steps
matimissona
Jun 26, 2024, 7:22 PM
−8
points
2
comments
4
min read
LW
link
Countering AI disinformation and deep fakes with digital signatures
Dave Lindbergh
Jun 26, 2024, 6:09 PM
13
points
5
comments
1
min read
LW
link
Progress Conference 2024: Toward Abundant Futures
jasoncrawford
Jun 26, 2024, 3:39 PM
40
points
2
comments
1
min read
LW
link
(rootsofprogress.org)
Schelling points in the AGI policy space
mesaoptimizer
Jun 26, 2024, 1:19 PM
52
points
2
comments
6
min read
LW
link
Bad lessons learned from the debate
bayesyatina
Jun 26, 2024, 11:54 AM
8
points
5
comments
6
min read
LW
link
Childhood and Education Roundup #6: College Edition
Zvi
Jun 26, 2024, 11:40 AM
28
points
8
comments
23
min read
LW
link
(thezvi.wordpress.com)
New fast transformer inference ASIC — Sohu by Etched
lemonhope
Jun 26, 2024, 9:56 AM
8
points
9
comments
1
min read
LW
link
(www.etched.com)
Empirical vs. Mathematical Joints of Nature
Elizabeth
and
Alex_Altair
Jun 26, 2024, 1:55 AM
35
points
1
comment
5
min read
LW
link
My Current Claims and Cruxes on LLM Forecasting & Epistemics
ozziegooen
Jun 26, 2024, 12:40 AM
11
points
0
comments
LW
link
In favour of exploring nagging doubts about x-risk
owencb
Jun 25, 2024, 11:52 PM
105
points
2
comments
LW
link
What is a Tool?
johnswentworth
and
David Lorell
Jun 25, 2024, 11:40 PM
62
points
4
comments
6
min read
LW
link
[Question]
When do alignment researchers retire?
Jordan Taylor
Jun 25, 2024, 11:30 PM
4
points
2
comments
1
min read
LW
link
Compute Governance Literature Review
sijarvis
Jun 25, 2024, 10:41 PM
11
points
0
comments
13
min read
LW
link
Computational Complexity as an Intuition Pump for LLM Generality
aribrill
Jun 25, 2024, 8:25 PM
18
points
6
comments
3
min read
LW
link
Failure Modes of Teaching AI Safety
Eleni Angelou
Jun 25, 2024, 7:07 PM
20
points
0
comments
1
min read
LW
link
Kingfisher Summer Tour 2024
jefftk
Jun 25, 2024, 6:50 PM
9
points
0
comments
1
min read
LW
link
(www.jefftk.com)
Incentive Learning vs Dead Sea Salt Experiment
Steven Byrnes
Jun 25, 2024, 5:49 PM
30
points
1
comment
28
min read
LW
link
An Intuitive Explanation of Sparse Autoencoders for Mechanistic Interpretability of LLMs
Adam Karvonen
Jun 25, 2024, 3:57 PM
27
points
0
comments
9
min read
LW
link
(adamkarvonen.github.io)
Formal verification, heuristic explanations and surprise accounting
Jacob_Hilton
Jun 25, 2024, 3:40 PM
156
points
11
comments
9
min read
LW
link
(www.alignment.org)
Metastrategy get-started guide
Tahp
Jun 25, 2024, 3:04 PM
6
points
1
comment
8
min read
LW
link
Labor Participation is an Alignment Risk
alex
Jun 25, 2024, 2:15 PM
−5
points
2
comments
17
min read
LW
link
Monthly Roundup #19: June 2024
Zvi
Jun 25, 2024, 12:00 PM
28
points
9
comments
54
min read
LW
link
(thezvi.wordpress.com)