Article Review: Discovering Latent Knowledge (Burns, Ye, et al)
Robert_AIZI
Dec 22, 2022, 6:16 PM
13
points
4
comments
6
min read
LW
link
(aizi.substack.com)
Let’s think about slowing down AI
KatjaGrace
Dec 22, 2022, 5:40 PM
551
points
182
comments
38
min read
LW
link
3
reviews
(aiimpacts.org)
Some Notes on the mathematics of Toy Autoencoding Problems
carboniferous_umbraculum
Dec 22, 2022, 5:21 PM
18
points
1
comment
12
min read
LW
link
December 2022 updates and fundraising
AI Impacts
Dec 22, 2022, 5:20 PM
39
points
1
comment
3
min read
LW
link
(aiimpacts.org)
Covid 12/22/22: Reevaluating Past Options
Zvi
Dec 22, 2022, 4:50 PM
30
points
2
comments
9
min read
LW
link
(thezvi.wordpress.com)
China Covid #4
Zvi
Dec 22, 2022, 4:30 PM
50
points
2
comments
11
min read
LW
link
(thezvi.wordpress.com)
Racing through a minefield: the AI deployment problem
HoldenKarnofsky
Dec 22, 2022, 4:10 PM
38
points
2
comments
13
min read
LW
link
(www.cold-takes.com)
Lead in Chocolate?
jefftk
Dec 22, 2022, 4:10 PM
41
points
6
comments
2
min read
LW
link
(www.jefftk.com)
Response to Holden’s alignment plan
Alex Flint
Dec 22, 2022, 4:08 PM
36
points
4
comments
6
min read
LW
link
Staring into the abyss as a core life skill
benkuhn
Dec 22, 2022, 3:30 PM
357
points
22
comments
12
min read
LW
link
1
review
(www.benkuhn.net)
Secular Solstice for children
juliawise
and
denkenberger
Dec 22, 2022, 2:33 PM
31
points
1
comment
3
min read
LW
link
Mental acceptance and reflection
remember
and
Gabriel Alfour
Dec 22, 2022, 2:32 PM
34
points
1
comment
2
min read
LW
link
Against Diversification
Jack Malde
Dec 22, 2022, 1:29 PM
4
points
0
comments
3
min read
LW
link
(ethicaleconomist.substack.com)
Notes on Meta’s Diplomacy-Playing AI
Erich_Grunewald
Dec 22, 2022, 11:34 AM
15
points
2
comments
14
min read
LW
link
(www.erichgrunewald.com)
Take 13: RLHF bad, conditioning good.
Charlie Steiner
Dec 22, 2022, 10:44 AM
54
points
4
comments
2
min read
LW
link
Applied Linear Algebra Lecture Series
johnswentworth
Dec 22, 2022, 6:57 AM
103
points
8
comments
1
min read
LW
link
Naive Set Theory, Halmos
David Udell
Dec 22, 2022, 2:34 AM
11
points
1
comment
8
min read
LW
link
Not Getting Hacked
jefftk
Dec 21, 2022, 9:40 PM
40
points
14
comments
7
min read
LW
link
(www.jefftk.com)
Metaphor.systems
the gears to ascension
Dec 21, 2022, 9:31 PM
25
points
9
comments
1
min read
LW
link
(metaphor.systems)
[Question]
How much is DQC (Dynamic Quantum Clustering) currently looked into in AI Capabilities Research?
macmillan
Dec 21, 2022, 8:46 PM
1
point
0
comments
1
min read
LW
link
Think wider about the root causes of progress
jasoncrawford
Dec 21, 2022, 8:05 PM
49
points
11
comments
4
min read
LW
link
(rootsofprogress.org)
[Question]
What readings did you consider best for the happy parts of the secular solstice?
ChristianKl
Dec 21, 2022, 3:45 PM
17
points
0
comments
1
min read
LW
link
Recreating logic in type theory
Thomas Kehrenberg
Dec 21, 2022, 3:19 PM
18
points
0
comments
13
min read
LW
link
You become the UI you use
Viliam
Dec 21, 2022, 3:04 PM
21
points
7
comments
2
min read
LW
link
Price’s equation for neural networks
tailcalled
Dec 21, 2022, 1:09 PM
29
points
4
comments
2
min read
LW
link
Decisions: Ontologically Shifting to Determinism
Chris_Leong
Dec 21, 2022, 12:41 PM
8
points
11
comments
6
min read
LW
link
A Comprehensive Mechanistic Interpretability Explainer & Glossary
Neel Nanda
Dec 21, 2022, 12:35 PM
91
points
6
comments
2
min read
LW
link
(neelnanda.io)
Google Search loses to ChatGPT fair and square
Shmi
Dec 21, 2022, 8:11 AM
14
points
17
comments
1
min read
LW
link
(www.surgehq.ai)
Sazen
Duncan Sabien (Inactive)
Dec 21, 2022, 7:54 AM
285
points
83
comments
12
min read
LW
link
2
reviews
Podcast: What’s Wrong With LessWrong
Alfred
Dec 21, 2022, 7:06 AM
−32
points
11
comments
1
min read
LW
link
(youtu.be)
New AI risk intro from Vox [link post]
JakubK
Dec 21, 2022, 6:00 AM
5
points
1
comment
2
min read
LW
link
(www.vox.com)
Local Memes Against Geometric Rationality
Scott Garrabrant
Dec 21, 2022, 3:53 AM
90
points
3
comments
6
min read
LW
link
Logging Shell History in Zsh
jefftk
Dec 21, 2022, 3:30 AM
19
points
2
comments
1
min read
LW
link
(www.jefftk.com)
CIRL Corrigibility is Fragile
Rachel Freedman
and
AdamGleave
Dec 21, 2022, 1:40 AM
58
points
8
comments
12
min read
LW
link
[Question]
[DISC] Are Values Robust?
DragonGod
Dec 21, 2022, 1:00 AM
12
points
9
comments
2
min read
LW
link
Performing an SVD on a time-series matrix of gradient updates on an MNIST network produces 92.5 singular values
Garrett Baker
Dec 21, 2022, 12:44 AM
9
points
10
comments
5
min read
LW
link
Progress links and tweets, 2022-12-20
jasoncrawford
Dec 21, 2022, 12:35 AM
12
points
0
comments
2
min read
LW
link
(rootsofprogress.org)
K-complexity is silly; use cross-entropy instead
So8res
Dec 20, 2022, 11:06 PM
147
points
54
comments
14
min read
LW
link
2
reviews
Podcast: Tamera Lanham on AI risk, threat models, alignment proposals, externalized reasoning oversight, and working at Anthropic
Orpheus16
Dec 20, 2022, 9:39 PM
18
points
2
comments
11
min read
LW
link
Discovering Language Model Behaviors with Model-Written Evaluations
evhub
and
Ethan Perez
Dec 20, 2022, 8:08 PM
100
points
34
comments
1
min read
LW
link
(www.anthropic.com)
Reflections: Bureaucratic Hell
Haris Rashid
Dec 20, 2022, 7:22 PM
−5
points
1
comment
1
min read
LW
link
(www.harisrab.com)
Proliferating Education
Haris Rashid
20 Dec 2022 19:22 UTC
−1
points
2
comments
5
min read
LW
link
(www.harisrab.com)
AGI is here, but nobody wants it. Why should we even care?
MGow
20 Dec 2022 19:14 UTC
−22
points
0
comments
17
min read
LW
link
Properties of current AIs and some predictions of the evolution of AI from the perspective of scale-free theories of agency and regulative development
Roman Leventov
20 Dec 2022 17:13 UTC
33
points
3
comments
36
min read
LW
link
I believe some AI doomers are overconfident
FTPickle
20 Dec 2022 17:09 UTC
8
points
15
comments
2
min read
LW
link
Note on algorithms with multiple trained components
Steven Byrnes
20 Dec 2022 17:08 UTC
23
points
4
comments
2
min read
LW
link
Marvel Snap: Phase 2
Zvi
20 Dec 2022 14:50 UTC
11
points
1
comment
13
min read
LW
link
(thezvi.wordpress.com)
(Extremely) Naive Gradient Hacking Doesn’t Work
ojorgensen
20 Dec 2022 14:35 UTC
17
points
0
comments
6
min read
LW
link
An Open Agency Architecture for Safe Transformative AI
davidad
20 Dec 2022 13:04 UTC
80
points
22
comments
4
min read
LW
link
Under-Appreciated Ways to Use Flashcards—Part I
Florence Hinder
20 Dec 2022 12:43 UTC
22
points
5
comments
5
min read
LW
link
(thoughtsaver.ghost.io)