Live Theory Part 0: Taking Intelligence Seriously
Sahil
Jun 26, 2024, 9:37 PM
101
points
3
comments
8
min read
LW
link
Instrumental vs Terminal Desiderata
Max Harms
Jun 26, 2024, 8:57 PM
21
points
0
comments
3
min read
LW
link
Imbue (Generally Intelligent) continue to make progress
Nathan Helm-Burger
Jun 26, 2024, 8:41 PM
18
points
0
comments
1
min read
LW
link
(imbue.com)
Tracing the steps
matimissona
Jun 26, 2024, 7:22 PM
−8
points
2
comments
4
min read
LW
link
Countering AI disinformation and deep fakes with digital signatures
Dave Lindbergh
Jun 26, 2024, 6:09 PM
13
points
5
comments
1
min read
LW
link
Progress Conference 2024: Toward Abundant Futures
jasoncrawford
Jun 26, 2024, 3:39 PM
40
points
2
comments
1
min read
LW
link
(rootsofprogress.org)
Schelling points in the AGI policy space
mesaoptimizer
Jun 26, 2024, 1:19 PM
52
points
2
comments
6
min read
LW
link
Bad lessons learned from the debate
bayesyatina
Jun 26, 2024, 11:54 AM
8
points
5
comments
6
min read
LW
link
Childhood and Education Roundup #6: College Edition
Zvi
Jun 26, 2024, 11:40 AM
28
points
8
comments
23
min read
LW
link
(thezvi.wordpress.com)
New fast transformer inference ASIC — Sohu by Etched
lemonhope
Jun 26, 2024, 9:56 AM
8
points
9
comments
1
min read
LW
link
(www.etched.com)
Empirical vs. Mathematical Joints of Nature
Elizabeth
and
Alex_Altair
Jun 26, 2024, 1:55 AM
35
points
1
comment
5
min read
LW
link
My Current Claims and Cruxes on LLM Forecasting & Epistemics
ozziegooen
Jun 26, 2024, 12:40 AM
11
points
0
comments
LW
link
In favour of exploring nagging doubts about x-risk
owencb
Jun 25, 2024, 11:52 PM
105
points
2
comments
LW
link
What is a Tool?
johnswentworth
and
David Lorell
Jun 25, 2024, 11:40 PM
62
points
4
comments
6
min read
LW
link
[Question]
When do alignment researchers retire?
Jordan Taylor
Jun 25, 2024, 11:30 PM
4
points
2
comments
1
min read
LW
link
Compute Governance Literature Review
sijarvis
Jun 25, 2024, 10:41 PM
11
points
0
comments
13
min read
LW
link
Computational Complexity as an Intuition Pump for LLM Generality
aribrill
Jun 25, 2024, 8:25 PM
18
points
6
comments
3
min read
LW
link
Failure Modes of Teaching AI Safety
Eleni Angelou
Jun 25, 2024, 7:07 PM
20
points
0
comments
1
min read
LW
link
Kingfisher Summer Tour 2024
jefftk
Jun 25, 2024, 6:50 PM
9
points
0
comments
1
min read
LW
link
(www.jefftk.com)
Incentive Learning vs Dead Sea Salt Experiment
Steven Byrnes
Jun 25, 2024, 5:49 PM
30
points
1
comment
28
min read
LW
link
An Intuitive Explanation of Sparse Autoencoders for Mechanistic Interpretability of LLMs
Adam Karvonen
Jun 25, 2024, 3:57 PM
27
points
0
comments
9
min read
LW
link
(adamkarvonen.github.io)
Formal verification, heuristic explanations and surprise accounting
Jacob_Hilton
Jun 25, 2024, 3:40 PM
156
points
11
comments
9
min read
LW
link
(www.alignment.org)
Metastrategy get-started guide
Tahp
Jun 25, 2024, 3:04 PM
6
points
1
comment
8
min read
LW
link
Labor Participation is an Alignment Risk
alex
Jun 25, 2024, 2:15 PM
−5
points
2
comments
17
min read
LW
link
Monthly Roundup #19: June 2024
Zvi
Jun 25, 2024, 12:00 PM
28
points
9
comments
54
min read
LW
link
(thezvi.wordpress.com)
Regularly meta-optimization
Crazy philosopher
Jun 25, 2024, 6:12 AM
−4
points
6
comments
1
min read
LW
link
Memetics as an analogy and its implicit connotations
Rachel Shu
Jun 25, 2024, 5:13 AM
3
points
0
comments
3
min read
LW
link
Mistakes people make when thinking about units
Isaac King
Jun 25, 2024, 3:39 AM
74
points
14
comments
7
min read
LW
link
Higher-effort summer solstice: What if we used AI (i.e., Angel Island)?
Rachel Shu
Jun 25, 2024, 1:35 AM
46
points
9
comments
3
min read
LW
link
I’m a bit skeptical of AlphaFold 3
Oleg Trott
Jun 25, 2024, 12:04 AM
87
points
14
comments
2
min read
LW
link
Being hella lost as rationality practice
Rachel Shu
Jun 24, 2024, 11:50 PM
14
points
0
comments
2
min read
LW
link
A Basic Economics-Style Model of AI Existential Risk
Rubi J. Hudson
Jun 24, 2024, 8:26 PM
24
points
3
comments
7
min read
LW
link
The Minority Coalition
Richard_Ngo
Jun 24, 2024, 8:01 PM
103
points
9
comments
5
min read
LW
link
(www.narrativeark.xyz)
Compact Proofs of Model Performance via Mechanistic Interpretability
LawrenceC
,
rajashree
,
Adrià Garriga-alonso
and
Jason Gross
Jun 24, 2024, 7:27 PM
97
points
4
comments
8
min read
LW
link
(arxiv.org)
Contrapositive Natural Abstraction—Project Intro
Elliot Callender
Jun 24, 2024, 6:37 PM
4
points
5
comments
2
min read
LW
link
Sparse Features Through Time
Rogan Inglis
Jun 24, 2024, 6:06 PM
12
points
1
comment
1
min read
LW
link
(roganinglis.io)
PSA: Consider alternatives to AUROC when reporting classifier metrics for alignment
rpglover64
24 Jun 2024 17:53 UTC
18
points
1
comment
3
min read
LW
link
Paying Russians to not invade Ukraine
djColliderBias
24 Jun 2024 17:46 UTC
9
points
7
comments
3
min read
LW
link
SAE feature geometry is outside the superposition hypothesis
jake_mendel
24 Jun 2024 16:07 UTC
228
points
17
comments
11
min read
LW
link
So you want to work on technical AI safety
gw
24 Jun 2024 14:29 UTC
51
points
3
comments
14
min read
LW
link
The Future of Work: How Can Policymakers Prepare for AI’s Impact on Labor Markets?
davidconrad
,
Arturs
and
Tillman Schenk
24 Jun 2024 14:18 UTC
5
points
0
comments
3
min read
LW
link
LLM Generality is a Timeline Crux
eggsyntax
24 Jun 2024 12:52 UTC
218
points
119
comments
7
min read
LW
link
On Claude 3.5 Sonnet
Zvi
24 Jun 2024 12:00 UTC
95
points
14
comments
13
min read
LW
link
(thezvi.wordpress.com)
Book Review: Righteous Victims—A History of the Zionist-Arab Conflict
Yair Halberstadt
24 Jun 2024 11:02 UTC
53
points
8
comments
34
min read
LW
link
The Living Planet Index: A Case Study in Statistical Pitfalls
Jan_Kulveit
24 Jun 2024 10:05 UTC
24
points
0
comments
4
min read
LW
link
(www.nature.com)
Sci-Fi books micro-reviews
Yair Halberstadt
24 Jun 2024 9:49 UTC
44
points
27
comments
4
min read
LW
link
A Step Against Land Value Tax
Blog Alt
24 Jun 2024 5:13 UTC
9
points
23
comments
6
min read
LW
link
(antematters.substack.com)
Different senses in which two AIs can be “the same”
Vivek Hebbar
and
Buck
24 Jun 2024 3:16 UTC
69
points
2
comments
4
min read
LW
link
Talk: AI safety fieldbuilding at MATS
Ryan Kidd
23 Jun 2024 23:06 UTC
26
points
2
comments
10
min read
LW
link
AI Labs Wouldn’t be Convicted of Treason or Sedition
Matthew Khoriaty
23 Jun 2024 21:34 UTC
9
points
2
comments
3
min read
LW
link