Worrisome misunderstanding of the core issues with AI transition
Roman Leventov
Jan 18, 2024, 10:05 AM
5
points
2
comments
4
min read
LW
link
[Question]
What evidence is there for (or against) theories about the extent to which effective altruist interests motivated the ouster of Sam Altman last year?
Evan_Gaensbauer
Jan 18, 2024, 5:14 AM
10
points
0
comments
3
min read
LW
link
Does literacy remove your ability to be a bard as good as Homer?
Adrià Garriga-alonso
Jan 18, 2024, 3:43 AM
51
points
19
comments
3
min read
LW
link
D&D.Sci Hypersphere Analysis Part 4: Fine-tuning and Wrapup
aphyer
Jan 18, 2024, 3:06 AM
25
points
5
comments
7
min read
LW
link
Some heuristics I use for deciding how much I trust scientific results
NathanBarnard
Jan 18, 2024, 2:48 AM
13
points
2
comments
5
min read
LW
link
Newport News VA Meetup—Living Museum
Daniel
Jan 18, 2024, 2:05 AM
1
point
0
comments
1
min read
LW
link
In Strategic Time, Open-Source Games Are Loopy
StrivingForLegibility
Jan 18, 2024, 12:08 AM
21
points
2
comments
6
min read
LW
link
Four visions of Transformative AI success
Steven Byrnes
Jan 17, 2024, 8:45 PM
112
points
22
comments
15
min read
LW
link
AI Disclosure Ballot Initiative (and voting method)
Aaron Hamlin
Jan 17, 2024, 8:02 PM
−8
points
3
comments
1
min read
LW
link
Hatching the Cosmic Egg (Hymn to Dionysus)
rogersbacon
Jan 17, 2024, 6:34 PM
7
points
0
comments
9
min read
LW
link
(www.secretorum.life)
[Question]
What do people colloquially mean by deep breathing? Slow, large, or diaphragmatic?
VipulNaik
Jan 17, 2024, 6:01 PM
13
points
8
comments
2
min read
LW
link
AlphaGeometry: An Olympiad-level AI system for geometry
alyssavance
Jan 17, 2024, 5:17 PM
45
points
9
comments
1
min read
LW
link
(deepmind.google)
On Anthropic’s Sleeper Agents Paper
Zvi
Jan 17, 2024, 4:10 PM
54
points
5
comments
36
min read
LW
link
(thezvi.wordpress.com)
A Pedagogical Guide to Corrigibility
A.H.
Jan 17, 2024, 11:45 AM
6
points
3
comments
16
min read
LW
link
An Introduction To The Mandelbrot Set That Doesn’t Mention Complex Numbers
Yitz
Jan 17, 2024, 9:48 AM
82
points
11
comments
9
min read
LW
link
Vote in the LessWrong review! (LW 2022 Review voting phase)
habryka
Jan 17, 2024, 7:22 AM
26
points
9
comments
2
min read
LW
link
Coalescer Models
DaemonicSigil
and
bhauth
Jan 17, 2024, 6:39 AM
16
points
2
comments
10
min read
LW
link
Maybe talking isn’t the best way to communicate with LLMs
mnvr
Jan 17, 2024, 6:24 AM
3
points
1
comment
1
min read
LW
link
(mrmr.io)
D&D.Sci Hypersphere Analysis Part 3: Beat it with Linear Algebra
aphyer
Jan 16, 2024, 10:44 PM
26
points
1
comment
5
min read
LW
link
The weak-to-strong generalization (WTSG) paper in 60 seconds
sudo
Jan 16, 2024, 10:44 PM
12
points
1
comment
1
min read
LW
link
(arxiv.org)
Social media alignment test
amayhew
Jan 16, 2024, 8:56 PM
1
point
0
comments
1
min read
LW
link
(naiveskepticblog.wordpress.com)
Medical Roundup #1
Zvi
Jan 16, 2024, 8:30 PM
57
points
9
comments
29
min read
LW
link
(thezvi.wordpress.com)
Being nicer than Clippy
Joe Carlsmith
Jan 16, 2024, 7:44 PM
109
points
32
comments
27
min read
LW
link
How polysemantic can one neuron be? Investigating features in TinyStories.
Evan Anders
Jan 16, 2024, 7:10 PM
14
points
0
comments
8
min read
LW
link
(evanhanders.blog)
Applying AI Safety concepts to astronomy
Faris
Jan 16, 2024, 6:29 PM
1
point
0
comments
12
min read
LW
link
Managing catastrophic misuse without robust AIs
ryan_greenblatt
and
Buck
Jan 16, 2024, 5:27 PM
63
points
17
comments
11
min read
LW
link
[Question]
What are the most common social insecurities?
Chris Lakin
Jan 16, 2024, 5:24 PM
9
points
6
comments
1
min read
LW
link
Why wasn’t preservation with the goal of potential future revival started earlier in history?
Andy_McKenzie
Jan 16, 2024, 4:15 PM
31
points
1
comment
6
min read
LW
link
[Question]
Why are people unkeen to immortality that would come from technological advancements and/or AI?
Gabi QUENE
Jan 16, 2024, 2:23 PM
12
points
42
comments
1
min read
LW
link
Dealing with Awkwardness
Jonathan Moregård
Jan 16, 2024, 12:32 PM
13
points
0
comments
4
min read
LW
link
(honestliving.substack.com)
The impossible problem of due process
mingyuan
Jan 16, 2024, 5:18 AM
197
points
64
comments
14
min read
LW
link
[Retracted] Newton’s law of cooling from first principles
Nisan
Jan 16, 2024, 4:21 AM
9
points
15
comments
2
min read
LW
link
Sparse Autoencoders Work on Attention Layer Outputs
Connor Kissane
,
robertzk
,
Arthur Conmy
and
Neel Nanda
Jan 16, 2024, 12:26 AM
85
points
9
comments
18
min read
LW
link
Goals selected from learned knowledge: an alternative to RL alignment
Seth Herd
Jan 15, 2024, 9:52 PM
42
points
18
comments
7
min read
LW
link
Introducing REBUS: A Robust Evaluation Benchmark of Understanding Symbols
Arjun Panickssery
and
agg
Jan 15, 2024, 9:21 PM
33
points
0
comments
1
min read
LW
link
Live Sound: Big-O Improvements
jefftk
Jan 15, 2024, 7:50 PM
8
points
0
comments
1
min read
LW
link
(www.jefftk.com)
Investigating Bias Representations in LLMs via Activation Steering
DawnLu
Jan 15, 2024, 7:39 PM
29
points
4
comments
5
min read
LW
link
Sparse MLP Distillation
slavachalnev
Jan 15, 2024, 7:39 PM
30
points
3
comments
6
min read
LW
link
Review of Alignment Plan Critiques- December AI-Plans Critique-a-Thon Results
Iknownothing
Jan 15, 2024, 7:37 PM
24
points
0
comments
25
min read
LW
link
(aiplans.substack.com)
[Question]
What does it look like for AI to significantly improve human coordination, before superintelligence?
Bird Concept
Jan 15, 2024, 7:22 PM
22
points
2
comments
1
min read
LW
link
Now Accepting Player Applications for Band of Blades
Joe Rogero
Jan 15, 2024, 5:58 PM
2
points
0
comments
3
min read
LW
link
Three Types of Constraints in the Space of Agents
Nora_Ammann
and
Mateusz Bagiński
Jan 15, 2024, 5:27 PM
26
points
3
comments
17
min read
LW
link
The case for training frontier AIs on Sumerian-only corpus
Alexandre Variengien
,
Charbel-Raphaël
and
Jonathan Claybrough
Jan 15, 2024, 4:40 PM
130
points
16
comments
3
min read
LW
link
How to Promote More Productive Dialogue Outside of LessWrong
sweenesm
Jan 15, 2024, 2:16 PM
18
points
4
comments
2
min read
LW
link
[Question]
Come and daydream with me about science reform
TeaTieAndHat
Jan 15, 2024, 11:09 AM
9
points
1
comment
1
min read
LW
link
AI doing philosophy = AI generating hands?
Wei Dai
Jan 15, 2024, 9:04 AM
46
points
23
comments
3
min read
LW
link
Even if we lose, we win
Morphism
Jan 15, 2024, 2:15 AM
24
points
17
comments
4
min read
LW
link
Detachment vs attachment [AI risk and mental health]
Neil
Jan 15, 2024, 12:41 AM
15
points
4
comments
3
min read
LW
link
Making up statistics to establish priority on Land Value Tax vs Earned Income Tax Credit vs Social Media Dynamic Regulation
Canucklug
Jan 14, 2024, 11:57 PM
−5
points
2
comments
7
min read
LW
link
Is the universe all there is? ‘Evidence’ for objects outside the universe...
JonathanHall
Jan 14, 2024, 11:56 PM
−4
points
27
comments
11
min read
LW
link