Announcing the AI Forecasting Benchmark Series | July 8, $120k in Prizes
ChristianWilliams
Jul 2, 2024, 10:33 PM
15
points
0
comments
LW
link
(www.metaculus.com)
Open Sourcing Metaculus
ChristianWilliams
Jul 2, 2024, 10:30 PM
44
points
0
comments
LW
link
(www.metaculus.com)
[Question]
Why Can’t Sub-AGI Solve AI Alignment? Or: Why Would Sub-AGI AI Not be Aligned?
MrThink
Jul 2, 2024, 8:13 PM
4
points
23
comments
1
min read
LW
link
[Question]
Why haven’t there been assassination attempts against high profile AI accelerationists like sam altman yet?
louisTrem
Jul 2, 2024, 6:16 PM
−13
points
4
comments
2
min read
LW
link
How ARENA course material gets made
CallumMcDougall
Jul 2, 2024, 6:04 PM
41
points
2
comments
7
min read
LW
link
An AI Race With China Can Be Better Than Not Racing
niplav
Jul 2, 2024, 5:57 PM
69
points
34
comments
11
min read
LW
link
List of Collective Intelligence Projects
Chipmonk
Jul 2, 2024, 2:10 PM
42
points
9
comments
2
min read
LW
link
(chrislakin.blog)
Decomposing the QK circuit with Bilinear Sparse Dictionary Learning
keith_wynroe
and
Lee Sharkey
Jul 2, 2024, 1:17 PM
86
points
7
comments
12
min read
LW
link
Economics Roundup #2
Zvi
Jul 2, 2024, 12:40 PM
35
points
5
comments
23
min read
LW
link
(thezvi.wordpress.com)
How Congressional Offices Process Constituent Communication
Tristan Williams
Jul 2, 2024, 12:38 PM
24
points
0
comments
LW
link
OthelloGPT learned a bag of heuristics
jylin04
,
JackS
,
Adam Karvonen
and
Can
Jul 2, 2024, 9:12 AM
111
points
10
comments
9
min read
LW
link
Blueprint for a Brighter Future
Alex Beyman
Jul 2, 2024, 6:15 AM
−1
points
0
comments
5
min read
LW
link
Covert Malicious Finetuning
Tony Wang
and
dannyhalawi
Jul 2, 2024, 2:41 AM
89
points
4
comments
3
min read
LW
link
Interpreting Preference Models w/ Sparse Autoencoders
Logan Riggs
and
Jannik Brinkmann
Jul 1, 2024, 9:35 PM
74
points
12
comments
9
min read
LW
link
Honest science is spirituality
pchvykov
Jul 1, 2024, 8:33 PM
−1
points
10
comments
4
min read
LW
link
New Executive Team & Board — PIBBSS
Nora_Ammann
Jul 1, 2024, 7:30 PM
43
points
1
comment
1
min read
LW
link
Uncursing Civilization
Lorec
Jul 1, 2024, 6:44 PM
−5
points
2
comments
5
min read
LW
link
[Question]
Self-censoring on AI x-risk discussions?
Decaeneus
Jul 1, 2024, 6:24 PM
17
points
2
comments
1
min read
LW
link
Rationalists As People Who Build Piles Of Rocks
Sable
Jul 1, 2024, 10:32 AM
9
points
0
comments
5
min read
LW
link
(affablyevil.substack.com)
How good are LLMs at doing ML on an unknown dataset?
Håvard Tveit Ihle
Jul 1, 2024, 9:04 AM
33
points
4
comments
13
min read
LW
link
Whirlwind Tour of Chain of Thought Literature Relevant to Automating Alignment Research.
sevdeawesome
Jul 1, 2024, 5:50 AM
25
points
0
comments
17
min read
LW
link
Probabilistic Logic ⇔ Oracles?
Yudhister Kumar
Jul 1, 2024, 5:36 AM
15
points
0
comments
4
min read
LW
link
Important open problems in voting
Closed Limelike Curves
Jul 1, 2024, 2:53 AM
33
points
1
comment
1
min read
LW
link
Anti-Circumcision Essay 3 of 3: Now That I Think About It, Is There Actually a Space Between “Info” and “Hazard”? Isn’t It Just One Word?
Harry Stevenage
Jul 1, 2024, 2:21 AM
12
points
0
comments
7
min read
LW
link
In Defense of Lawyers Playing Their Part
Isaac King
Jul 1, 2024, 1:32 AM
32
points
9
comments
9
min read
LW
link
Anti-circumcision Essay 2 of 3: Physical and Psychological Realities
Harry Stevenage
Jun 30, 2024, 10:13 PM
12
points
5
comments
9
min read
LW
link
Review of METR’s public evaluation protocol
nahoj
and
JaimeRV
Jun 30, 2024, 10:03 PM
10
points
0
comments
5
min read
LW
link
Superposition, Self-Modeling, and the Path to AGI: A New Perspective
Peterpiper
Jun 30, 2024, 5:20 PM
−13
points
0
comments
2
min read
LW
link
Anti-Circumcision Essay 1 of 3: According To Their Critics, Intactivists Are The Best-Behaved Protest Movement In History
Harry Stevenage
Jun 30, 2024, 5:17 PM
12
points
6
comments
5
min read
LW
link
The Xerox Parc/ARPA version of the intellectual Turing test: Class 1 vs Class 2 disagreement
hamishtodd1
Jun 30, 2024, 3:34 PM
6
points
3
comments
1
min read
LW
link
LLMs Universally Learn a Feature Representing Token Frequency / Rarity
Sean Osier
Jun 30, 2024, 2:48 AM
12
points
5
comments
6
min read
LW
link
(github.com)
My 5-step program for losing weight
nsokolsky
Jun 30, 2024, 1:05 AM
22
points
20
comments
5
min read
LW
link
(nsokolsky.substack.com)
Datasets that change the odds you exist
dynomight
Jun 29, 2024, 6:45 PM
56
points
4
comments
6
min read
LW
link
(dynomight.net)
A “Scaling Monosemanticity” Explainer
latterframe
and
Luoencz
Jun 29, 2024, 5:50 PM
10
points
0
comments
3
min read
LW
link
Analysis of key AI analogies
Kevin Kohler
Jun 29, 2024, 10:55 AM
10
points
2
comments
15
min read
LW
link
Georgism Crash Course
Zero Contradictions
Jun 29, 2024, 6:18 AM
9
points
5
comments
1
min read
LW
link
(zerocontradictions.net)
Activation Pattern SVD: A proposal for SAE Interpretability
Daniel Tan
Jun 28, 2024, 10:12 PM
15
points
2
comments
2
min read
LW
link
Podcast: Elizabeth & Austin on “What Manifold was allowed to do”
Austin Chen
Jun 28, 2024, 10:10 PM
20
points
0
comments
LW
link
(share.descript.com)
The Incredible Fentanyl-Detecting Machine
sarahconstantin
Jun 28, 2024, 10:10 PM
156
points
26
comments
7
min read
LW
link
(sarahconstantin.substack.com)
Saving Lives Reduces Over-Population—A Counter-Intuitive Non-Zero-Sum Game
James Stephen Brown
Jun 28, 2024, 7:29 PM
6
points
0
comments
5
min read
LW
link
(nonzerosum.games)
Mentorship in AGI Safety: Applications for mentorship are open!
Valentin2026
and
Joe Rogero
Jun 28, 2024, 2:49 PM
5
points
0
comments
1
min read
LW
link
Contra Acemoglu on AI
Maxwell Tabarrok
Jun 28, 2024, 1:13 PM
48
points
0
comments
5
min read
LW
link
(www.maximum-progress.com)
Five toy worlds to think about heritability
David Hugh-Jones
Jun 28, 2024, 1:11 PM
13
points
0
comments
9
min read
LW
link
(wyclif.substack.com)
[Question]
How do natural sciences prove causation?
Kongo Landwalker
Jun 28, 2024, 11:58 AM
1
point
3
comments
1
min read
LW
link
LessWrong/ACX meetup Transilvanya tour—Sibiu
Marius Adrian Nicoară
Jun 28, 2024, 11:41 AM
1
point
1
comment
1
min read
LW
link
Bayes’ Theorem: In Search of Gold (Lesson 1)
bayesyatina
Jun 28, 2024, 8:39 AM
3
points
0
comments
3
min read
LW
link
How a chip is designed
YM
Jun 28, 2024, 8:04 AM
65
points
4
comments
5
min read
LW
link
The Wisdom of Living for 200 Years
Martin Sustrik
Jun 28, 2024, 4:44 AM
25
points
3
comments
4
min read
LW
link
A Generally Intelligent Game
snerx
Jun 28, 2024, 1:31 AM
−1
points
1
comment
4
min read
LW
link
Corrigibility = Tool-ness?
johnswentworth
and
David Lorell
Jun 28, 2024, 1:19 AM
78
points
8
comments
9
min read
LW
link