Archive
AI Forecasting: Two Years In | jsteinhardt | Aug 19, 2023, 11:40 PM | 72 points | 15 comments | 11 min read | LW link (bounded-regret.ghost.io)
Four management/leadership book summaries | Nikola Jurkovic | Aug 19, 2023, 11:38 PM | 25 points | 2 comments | 7 min read | LW link
Interpreting a dimensionality reduction of a collection of matrices as two positive semidefinite block diagonal matrices | Joseph Van Name | Aug 19, 2023, 7:52 PM | 16 points | 2 comments | 5 min read | LW link
Will AI kill everyone? Here’s what the godfathers of AI have to say [RA video] | Writer | Aug 19, 2023, 5:29 PM | 58 points | 8 comments | LW link (youtu.be)
Ten variations on red-pill-blue-pill | Richard_Kennaway | Aug 19, 2023, 4:34 PM | 23 points | 34 comments | 3 min read | LW link
Are we running out of new music/movies/art from a metaphysical perspective? (updated) | stephen_s | Aug 19, 2023, 4:24 PM | 4 points | 23 comments | 1 min read | LW link
[Question] Any ideas for a prediction market observable that quantifies “culture-warisation”? | Ppau | Aug 19, 2023, 3:11 PM | 6 points | 1 comment | 1 min read | LW link
[Question] Clarifying how misalignment can arise from scaling LLMs | Util | Aug 19, 2023, 2:16 PM | 3 points | 1 comment | 1 min read | LW link
Chess as a case study in hidden capabilities in ChatGPT | AdamYedidia | Aug 19, 2023, 6:35 AM | 47 points | 32 comments | 6 min read | LW link
We can do better than DoWhatIMean (inextricably kind AI) | lemonhope | Aug 19, 2023, 5:41 AM | 25 points | 8 comments | 2 min read | LW link
Supervised Program for Alignment Research (SPAR) at UC Berkeley: Spring 2023 summary | mic, dx26, adamk and Carolyn Qian | Aug 19, 2023, 2:27 AM | 23 points | 2 comments | 6 min read | LW link
Could fabs own AI? | lemonhope | Aug 19, 2023, 12:16 AM | 15 points | 0 comments | 3 min read | LW link
Is Chinese total factor productivity lower today than it was in 1956? | Ege Erdil | Aug 18, 2023, 10:33 PM | 43 points | 0 comments | 26 min read | LW link
Rationality-ish Meetups Showcase: 2019-2021 | jenn | Aug 18, 2023, 10:22 PM | 10 points | 0 comments | 5 min read | LW link
The U.S. is becoming less stable | lc | Aug 18, 2023, 9:13 PM | 149 points | 68 comments | 2 min read | LW link
Meetup Tip: Board Games | Screwtape | Aug 18, 2023, 6:11 PM | 10 points | 4 comments | 7 min read | LW link
[Question] AI labs’ requests for input | Zach Stein-Perlman | Aug 18, 2023, 5:00 PM | 29 points | 0 comments | 1 min read | LW link
6 non-obvious mental health issues specific to AI safety | Igor Ivanov | Aug 18, 2023, 3:46 PM | 147 points | 24 comments | 4 min read | LW link
When discussing AI doom barriers propose specific plausible scenarios | anithite | Aug 18, 2023, 4:06 AM | 5 points | 0 comments | 3 min read | LW link
Risks from AI Overview: Summary | Dan H, Mantas Mazeika and TW123 | Aug 18, 2023, 1:21 AM | 25 points | 1 comment | 13 min read | LW link (www.safe.ai)
Managing risks of our own work | Beth Barnes | Aug 18, 2023, 12:41 AM | 66 points | 0 comments | 2 min read | LW link
ACI#5: From Human-AI Co-evolution to the Evolution of Value Systems | Akira Pyinya | Aug 18, 2023, 12:38 AM | 0 points | 0 comments | 9 min read | LW link
Memetic Judo #1: On Doomsday Prophets v.3 | Max TK | Aug 18, 2023, 12:14 AM | 25 points | 17 comments | 3 min read | LW link
Looking for judges for critiques of Alignment Plans | Iknownothing | Aug 17, 2023, 10:35 PM | 6 points | 0 comments | 1 min read | LW link
How is ChatGPT’s behavior changing over time? | worse | Aug 17, 2023, 8:54 PM | 3 points | 0 comments | 1 min read | LW link (arxiv.org)
Progress links digest, 2023-08-17: Cloud seeding, robotic sculptors, and rogue planets | jasoncrawford | Aug 17, 2023, 8:29 PM | 15 points | 1 comment | 4 min read | LW link (rootsofprogress.org)
Model of psychosis, take 2 | Steven Byrnes | Aug 17, 2023, 7:11 PM | 34 points | 13 comments | 4 min read | LW link
[Linkpost] Robustified ANNs Reveal Wormholes Between Human Category Percepts | Bogdan Ionut Cirstea | Aug 17, 2023, 7:10 PM | 6 points | 2 comments | 1 min read | LW link
Against Almost Every Theory of Impact of Interpretability | Charbel-Raphaël | Aug 17, 2023, 6:44 PM | 329 points | 91 comments | 26 min read | LW link | 2 reviews
Goldilocks and the Three Optimisers | dkl9 | Aug 17, 2023, 6:15 PM | −10 points | 0 comments | 5 min read | LW link (dkl9.net)
Announcing Foresight Institute’s AI Safety Grants Program | Allison Duettmann | Aug 17, 2023, 5:34 PM | 35 points | 2 comments | 1 min read | LW link
The Negentropy Cliff | mephistopheles | Aug 17, 2023, 5:08 PM | 6 points | 10 comments | 1 min read | LW link
“AI Wellbeing” and the Ongoing Debate on Phenomenal Consciousness | FlorianH | Aug 17, 2023, 3:47 PM | 10 points | 6 comments | 7 min read | LW link
AI #25: Inflection Point | Zvi | Aug 17, 2023, 2:40 PM | 59 points | 9 comments | 36 min read | LW link (thezvi.wordpress.com)
[Question] Why might General Intelligences have long term goals? | yrimon | Aug 17, 2023, 2:10 PM | 3 points | 17 comments | 1 min read | LW link
Understanding Counterbalanced Subtractions for Better Activation Additions | ojorgensen | Aug 17, 2023, 1:53 PM | 21 points | 0 comments | 14 min read | LW link
Reflections on “Making the Atomic Bomb” | boazbarak | Aug 17, 2023, 2:48 AM | 51 points | 7 comments | 8 min read | LW link
Autonomous replication and adaptation: an attempt at a concrete danger threshold | Hjalmar_Wijk | Aug 17, 2023, 1:31 AM | 45 points | 0 comments | 13 min read | LW link
[Question] (Thought experiment) If you had to choose, which would you prefer? | kuira | Aug 17, 2023, 12:57 AM | 9 points | 2 comments | 1 min read | LW link
Some rules for life (v.0,0) | Neil | Aug 17, 2023, 12:43 AM | 43 points | 13 comments | 12 min read | LW link (neilwarren.substack.com)
When AI critique works even with misaligned models | Fabien Roger | Aug 17, 2023, 12:12 AM | 23 points | 0 comments | 2 min read | LW link
Book Launch: “The Carving of Reality,” Best of LessWrong vol. III | Raemon | Aug 16, 2023, 11:52 PM | 131 points | 22 comments | 5 min read | LW link
One example of how LLM propaganda attacks can hack the brain | trevor | Aug 16, 2023, 9:41 PM | 27 points | 8 comments | 4 min read | LW link
If we had known the atmosphere would ignite | Jeffs | Aug 16, 2023, 8:28 PM | 59 points | 63 comments | 2 min read | LW link
Stampy’s AI Safety Info—New Distillations #4 [July 2023] | markov | Aug 16, 2023, 7:03 PM | 22 points | 10 comments | 1 min read | LW link (aisafety.info)
A Proof of Löb’s Theorem using Computability Theory | jessicata | Aug 16, 2023, 6:57 PM | 76 points | 0 comments | 17 min read | LW link (unstableontology.com)
Summary of and Thoughts on the Hotz/Yudkowsky Debate | Zvi | Aug 16, 2023, 4:50 PM | 106 points | 47 comments | 9 min read | LW link (thezvi.wordpress.com)
Red Pill vs Blue Pill, Bayes style | ErickBall | Aug 16, 2023, 3:23 PM | 28 points | 33 comments | 1 min read | LW link
What does it mean to “trust science”? | jasoncrawford | Aug 16, 2023, 2:56 PM | 34 points | 9 comments | 1 min read | LW link (rootsofprogress.org)
Jason Crawford / The Roots of Progress in Bangalore, August 21 to September 8 | jasoncrawford | Aug 16, 2023, 1:36 PM | 13 points | 1 comment | 1 min read | LW link (rootsofprogress.org)