Motivation gaps: Why so much EA criticism is hostile and lazy
titotal
Apr 22, 2024, 11:49 AM
70
points
5
comments
LW
link
(titotal.substack.com)
The Inner Ring by C. S. Lewis
Saul Munn
Apr 24, 2024, 10:48 PM
69
points
6
comments
13
min read
LW
link
(www.lewissociety.org)
Duct Tape security
Isaac King
Apr 26, 2024, 6:57 PM
69
points
11
comments
5
min read
LW
link
AXRP Episode 27 - AI Control with Buck Shlegeris and Ryan Greenblatt
DanielFilan
Apr 11, 2024, 9:30 PM
69
points
10
comments
107
min read
LW
link
Text Posts from the Kids Group: 2020
jefftk
Apr 13, 2024, 10:30 PM
69
points
3
comments
19
min read
LW
link
(www.jefftk.com)
The 2nd Demographic Transition
Maxwell Tabarrok
Apr 6, 2024, 2:10 PM
68
points
17
comments
4
min read
LW
link
(www.maximum-progress.com)
“Fractal Strategy” workshop report
Raemon
Apr 6, 2024, 9:26 PM
68
points
23
comments
10
min read
LW
link
Ophiology (or, how the Mamba architecture works)
Danielle Ensign, SrGonao and Adrià Garriga-alonso
Apr 9, 2024, 7:31 PM
67
points
8
comments
10
min read
LW
link
Superposition is not “just” neuron polysemanticity
LawrenceC
Apr 26, 2024, 11:22 PM
66
points
4
comments
13
min read
LW
link
Improving Dictionary Learning with Gated Sparse Autoencoders
Senthooran Rajamanoharan, Arthur Conmy, lewis smith, Tom Lieberum, Vikrant Varma, János Kramár, Rohin Shah and Neel Nanda
Apr 25, 2024, 6:43 PM
63
points
38
comments
1
min read
LW
link
(arxiv.org)
On Llama-3 and Dwarkesh Patel’s Podcast with Zuckerberg
Zvi
Apr 22, 2024, 1:10 PM
63
points
4
comments
47
min read
LW
link
(thezvi.wordpress.com)
Moving on from community living
Vika
Apr 17, 2024, 5:02 PM
63
points
7
comments
3
min read
LW
link
(vkrakovna.wordpress.com)
[Question] What’s with all the bans recently?
Gerald Monroe
Apr 4, 2024, 6:16 AM
62
points
83
comments
4
min read
LW
link
Transfer Learning in Humans
niplav
Apr 21, 2024, 8:49 PM
61
points
1
comment
13
min read
LW
link
This is Water by David Foster Wallace
Nathan Young
Apr 24, 2024, 9:21 PM
60
points
16
comments
13
min read
LW
link
(fs.blog)
LessOnline Festival Updates Thread
Ben Pace
Apr 18, 2024, 9:55 PM
59
points
26
comments
1
min read
LW
link
“Why I Write” by George Orwell (1946)
Arjun Panickssery
Apr 25, 2024, 4:02 PM
59
points
2
comments
9
min read
LW
link
(www.orwellfoundation.com)
Gradient Descent on the Human Brain
Jozdien and gaspode
Apr 1, 2024, 10:39 PM
59
points
5
comments
2
min read
LW
link
So What’s Up With PUFAs Chemically?
J Bostock
Apr 27, 2024, 1:32 PM
57
points
23
comments
6
min read
LW
link
Let’s Design A School, Part 1
Sable
Apr 23, 2024, 9:50 PM
56
points
5
comments
11
min read
LW
link
(affablyevil.substack.com)
Experiment on repeating choices
KatjaGrace
Apr 19, 2024, 4:20 AM
56
points
1
comment
3
min read
LW
link
(worldspiritsockpuppet.com)
A D&D.Sci Dodecalogue
abstractapplic
Apr 12, 2024, 1:10 AM
56
points
0
comments
3
min read
LW
link
Towards a formalization of the agent structure problem
Alex_Altair
Apr 29, 2024, 8:28 PM
55
points
6
comments
14
min read
LW
link
Spatial attention as a “tell” for empathetic simulation?
Steven Byrnes
Apr 26, 2024, 3:10 PM
55
points
12
comments
8
min read
LW
link
Math-to-English Cheat Sheet
nahoj
Apr 8, 2024, 9:19 AM
54
points
5
comments
6
min read
LW
link
[Closed] PIBBSS is hiring in a variety of roles (alignment research and incubation program)
Nora_Ammann, Lucas Teixeira and DusanDNesic
Apr 9, 2024, 8:12 AM
54
points
0
comments
3
min read
LW
link
We are headed into an extreme compute overhang
devrandom
Apr 26, 2024, 9:38 PM
54
points
34
comments
2
min read
LW
link
Monthly Roundup #17: April 2024
Zvi
Apr 15, 2024, 12:10 PM
54
points
4
comments
76
min read
LW
link
(thezvi.wordpress.com)
LLMs seem (relatively) safe
JustisMills
Apr 25, 2024, 10:13 PM
53
points
24
comments
7
min read
LW
link
(justismills.substack.com)
So You Created a Sociopath—New Book Announcement!
Garrett Baker
Apr 1, 2024, 6:02 PM
52
points
3
comments
1
min read
LW
link
On Complexity Science
Garrett Baker
Apr 5, 2024, 2:24 AM
51
points
19
comments
4
min read
LW
link
on the dollar-yen exchange rate
bhauth
Apr 7, 2024, 4:49 AM
50
points
21
comments
10
min read
LW
link
(www.bhauth.com)
Changes in College Admissions
Zvi
Apr 24, 2024, 1:50 PM
50
points
11
comments
39
min read
LW
link
(thezvi.wordpress.com)
Koan: divining alien datastructures from RAM activations
TsviBT
Apr 5, 2024, 6:04 PM
49
points
10
comments
21
min read
LW
link
My intellectual journey to (dis)solve the hard problem of consciousness
Charbel-Raphaël
Apr 6, 2024, 9:32 AM
49
points
44
comments
30
min read
LW
link
AI #58: Stargate AGI
Zvi
Apr 4, 2024, 1:10 PM
49
points
9
comments
60
min read
LW
link
(thezvi.wordpress.com)
Run evals on base models too!
orthonormal
Apr 4, 2024, 6:43 PM
49
points
6
comments
1
min read
LW
link
D&D.Sci: The Mad Tyrant’s Pet Turtles [Evaluation and Ruleset]
abstractapplic
Apr 9, 2024, 2:01 PM
48
points
6
comments
3
min read
LW
link
The Mom Test: Summary and Thoughts
Adam Zerner
Apr 18, 2024, 3:34 AM
48
points
3
comments
10
min read
LW
link
An Introduction to AI Sandbagging
Teun van der Weij, Felix Hofstätter and Francis Rhys Ward
Apr 26, 2024, 1:40 PM
47
points
13
comments
8
min read
LW
link
I’m open for projects (sort of)
cousin_it
Apr 18, 2024, 6:05 PM
46
points
13
comments
1
min read
LW
link
LLM Evaluators Recognize and Favor Their Own Generations
Arjun Panickssery, Sam Bowman and Shi Feng
Apr 17, 2024, 9:09 PM
46
points
1
comment
3
min read
LW
link
(tiny.cc)
Apply to LASR Labs: a London-based technical AI safety research programme
Erin Robertson, charlie_griffin and joehardie
Apr 9, 2024, 5:34 PM
45
points
1
comment
3
min read
LW
link
Announcing Atlas Computing
miyazono
Apr 11, 2024, 3:56 PM
45
points
4
comments
4
min read
LW
link
Book review: Deep Utopia
PeterMcCluskey
Apr 23, 2024, 7:55 PM
45
points
14
comments
4
min read
LW
link
(bayesianinvestor.com)
Things Solenoid Narrates
Solenoid_Entity
Apr 12, 2024, 11:57 PM
45
points
2
comments
2
min read
LW
link
D&D.Sci Long War: Defender of Data-mocracy
aphyer
Apr 26, 2024, 10:30 PM
44
points
20
comments
4
min read
LW
link
ProLU: A Nonlinearity for Sparse Autoencoders
Glen Taggart
Apr 23, 2024, 2:09 PM
44
points
4
comments
9
min read
LW
link
AI #60: Oh the Humanity
Zvi
Apr 18, 2024, 2:10 PM
44
points
7
comments
62
min read
LW
link
(thezvi.wordpress.com)
[Aspiration-based designs] 1. Informal introduction
B Jacobs, Jobst Heitzig, Simon Fischer and Simon Dima
28 Apr 2024 13:00 UTC
44
points
4
comments
8
min read
LW
link