[Question] Are there high-quality surveys available detailing the rates of polyamory among Americans age 18-45 in metropolitan areas in the United States? | Evan_Gaensbauer | Jan 18, 2024, 11:50 PM | 23 points | 0 comments | 1 min read
Manifund: 2023 in Review (manifund.substack.com) | Austin Chen | Jan 18, 2024, 11:50 PM | 32 points | 0 comments
The Underreaction to OpenAI | Sherrinford | Jan 18, 2024, 10:08 PM | 21 points | 0 comments | 6 min read
Against Nonlinear (Thing Of Things) (thingofthings.substack.com) | tailcalled | Jan 18, 2024, 9:40 PM | 58 points | 18 comments | 1 min read
Toward A Mathematical Framework for Computation in Superposition | Dmitry Vaintrob, jake_mendel and Kaarel | Jan 18, 2024, 9:06 PM | 205 points | 18 comments | 63 min read
The True Story of How GPT-2 Became Maximally Lewd (youtu.be) | Writer and Jai | Jan 18, 2024, 9:03 PM | 70 points | 7 comments | 6 min read
Gaia Network: An Illustrated Primer | Rafael Kaufmann Nedal and Roman Leventov | Jan 18, 2024, 6:23 PM | 3 points | 2 comments | 15 min read
On the abolition of man | Joe Carlsmith | Jan 18, 2024, 6:17 PM | 90 points | 18 comments | 41 min read
More Usable Recipes (www.jefftk.com) | jefftk | Jan 18, 2024, 5:40 PM | 14 points | 1 comment | 1 min read
Good job opportunities for helping with the most important century (www.cold-takes.com) | HoldenKarnofsky | Jan 18, 2024, 5:30 PM | 36 points | 0 comments | 4 min read
Flexibility and the Singularity (honestliving.substack.com) | Jonathan Moregård | Jan 18, 2024, 3:29 PM | 8 points | 0 comments | 3 min read
AI #48: Exponentials in Geometry (thezvi.wordpress.com) | Zvi | Jan 18, 2024, 2:20 PM | 59 points | 9 comments | 54 min read
Worrisome misunderstanding of the core issues with AI transition | Roman Leventov | Jan 18, 2024, 10:05 AM | 5 points | 2 comments | 4 min read
[Question] What evidence is there for (or against) theories about the extent to which effective altruist interests motivated the ouster of Sam Altman last year? | Evan_Gaensbauer | Jan 18, 2024, 5:14 AM | 10 points | 0 comments
Does literacy remove your ability to be a bard as good as Homer? | Adrià Garriga-alonso | Jan 18, 2024, 3:43 AM | 51 points | 19 comments | 3 min read
D&D.Sci Hypersphere Analysis Part 4: Fine-tuning and Wrapup | aphyer | Jan 18, 2024, 3:06 AM | 25 points | 5 comments | 7 min read
Some heuristics I use for deciding how much I trust scientific results | NathanBarnard | Jan 18, 2024, 2:48 AM | 13 points | 2 comments | 5 min read
Newport News VA Meetup—Living Museum | Daniel | Jan 18, 2024, 2:05 AM | 1 point | 0 comments | 1 min read
In Strategic Time, Open-Source Games Are Loopy | StrivingForLegibility | Jan 18, 2024, 12:08 AM | 21 points | 2 comments | 6 min read
Four visions of Transformative AI success | Steven Byrnes | Jan 17, 2024, 8:45 PM | 112 points | 22 comments | 15 min read
AI Disclosure Ballot Initiative (and voting method) | Aaron Hamlin | Jan 17, 2024, 8:02 PM | −8 points | 3 comments | 1 min read
Hatching the Cosmic Egg (Hymn to Dionysus) (www.secretorum.life) | rogersbacon | Jan 17, 2024, 6:34 PM | 7 points | 0 comments | 9 min read
[Question] What do people colloquially mean by deep breathing? Slow, large, or diaphragmatic? | VipulNaik | Jan 17, 2024, 6:01 PM | 13 points | 8 comments | 2 min read
AlphaGeometry: An Olympiad-level AI system for geometry (deepmind.google) | alyssavance | Jan 17, 2024, 5:17 PM | 45 points | 9 comments | 1 min read
On Anthropic’s Sleeper Agents Paper (thezvi.wordpress.com) | Zvi | Jan 17, 2024, 4:10 PM | 54 points | 5 comments | 36 min read
A Pedagogical Guide to Corrigibility | A.H. | Jan 17, 2024, 11:45 AM | 6 points | 3 comments | 16 min read
An Introduction To The Mandelbrot Set That Doesn’t Mention Complex Numbers | Yitz | Jan 17, 2024, 9:48 AM | 82 points | 11 comments | 9 min read
Vote in the LessWrong review! (LW 2022 Review voting phase) | habryka | Jan 17, 2024, 7:22 AM | 26 points | 9 comments | 2 min read
Coalescer Models | DaemonicSigil and bhauth | Jan 17, 2024, 6:39 AM | 16 points | 2 comments | 10 min read
Maybe talking isn’t the best way to communicate with LLMs (mrmr.io) | mnvr | Jan 17, 2024, 6:24 AM | 3 points | 1 comment | 1 min read
D&D.Sci Hypersphere Analysis Part 3: Beat it with Linear Algebra | aphyer | Jan 16, 2024, 10:44 PM | 26 points | 1 comment | 5 min read
The weak-to-strong generalization (WTSG) paper in 60 seconds (arxiv.org) | sudo | Jan 16, 2024, 10:44 PM | 12 points | 1 comment | 1 min read
Social media alignment test (naiveskepticblog.wordpress.com) | amayhew | Jan 16, 2024, 8:56 PM | 1 point | 0 comments | 1 min read
Medical Roundup #1 (thezvi.wordpress.com) | Zvi | Jan 16, 2024, 8:30 PM | 57 points | 9 comments | 29 min read
Being nicer than Clippy | Joe Carlsmith | Jan 16, 2024, 7:44 PM | 109 points | 32 comments | 27 min read
How polysemantic can one neuron be? Investigating features in TinyStories. (evanhanders.blog) | Evan Anders | Jan 16, 2024, 7:10 PM | 14 points | 0 comments | 8 min read
Applying AI Safety concepts to astronomy | Faris | Jan 16, 2024, 6:29 PM | 1 point | 0 comments | 12 min read
Managing catastrophic misuse without robust AIs | ryan_greenblatt and Buck | Jan 16, 2024, 5:27 PM | 63 points | 17 comments | 11 min read
[Question] What are the most common social insecurities? | Chipmonk | Jan 16, 2024, 5:24 PM | 9 points | 6 comments | 1 min read
Why wasn’t preservation with the goal of potential future revival started earlier in history? | Andy_McKenzie | Jan 16, 2024, 4:15 PM | 31 points | 1 comment | 6 min read
[Question] Why are people unkeen to immortality that would come from technological advancements and/or AI? | Gabi QUENE | Jan 16, 2024, 2:23 PM | 12 points | 41 comments | 1 min read
Dealing with Awkwardness (honestliving.substack.com) | Jonathan Moregård | Jan 16, 2024, 12:32 PM | 13 points | 0 comments | 4 min read
The impossible problem of due process | mingyuan | 16 Jan 2024 5:18 UTC | 197 points | 64 comments | 14 min read
[Retracted] Newton’s law of cooling from first principles | Nisan | 16 Jan 2024 4:21 UTC | 9 points | 15 comments | 2 min read
Sparse Autoencoders Work on Attention Layer Outputs | Connor Kissane, robertzk, Arthur Conmy and Neel Nanda | 16 Jan 2024 0:26 UTC | 84 points | 9 comments | 18 min read
Goals selected from learned knowledge: an alternative to RL alignment | Seth Herd | 15 Jan 2024 21:52 UTC | 42 points | 18 comments | 7 min read
Introducing REBUS: A Robust Evaluation Benchmark of Understanding Symbols | Arjun Panickssery and agg | 15 Jan 2024 21:21 UTC | 33 points | 0 comments | 1 min read
Live Sound: Big-O Improvements (www.jefftk.com) | jefftk | 15 Jan 2024 19:50 UTC | 8 points | 0 comments | 1 min read
Investigating Bias Representations in LLMs via Activation Steering | DawnLu | 15 Jan 2024 19:39 UTC | 29 points | 4 comments | 5 min read
Sparse MLP Distillation | slavachalnev | 15 Jan 2024 19:39 UTC | 30 points | 3 comments | 6 min read