3 levels of threat obfuscation
HoldenKarnofsky
Aug 2, 2023, 2:58 PM
69
points
14
comments
7
min read
LW
link
LLMs are (mostly) not helped by filler tokens
Kshitij Sachan
Aug 10, 2023, 12:48 AM
66
points
35
comments
6
min read
LW
link
Steven Wolfram on AI Alignment
Bill Benzon
Aug 20, 2023, 7:49 PM
66
points
15
comments
4
min read
LW
link
Managing risks of our own work
Beth Barnes
Aug 18, 2023, 12:41 AM
66
points
0
comments
2
min read
LW
link
“Dirty concepts” in AI alignment discourses, and some guesses for how to deal with them
Nora_Ammann
and
peckzy
Aug 20, 2023, 9:13 AM
66
points
4
comments
3
min read
LW
link
State of Generally Available Self-Driving
jefftk
Aug 22, 2023, 6:50 PM
66
points
6
comments
2
min read
LW
link
(www.jefftk.com)
AI Regulation May Be More Important Than AI Alignment For Existential Safety
otto.barten
Aug 24, 2023, 11:41 AM
65
points
39
comments
5
min read
LW
link
A short calculation about a Twitter poll
Ege Erdil
Aug 14, 2023, 7:48 PM
64
points
64
comments
11
min read
LW
link
Ideas for improving epistemics in AI safety outreach
mic
Aug 21, 2023, 7:55 PM
64
points
6
comments
3
min read
LW
link
What Does a Marginal Grant at LTFF Look Like? Funding Priorities and Grantmaking Thresholds at the Long-Term Future Fund
Linch
,
calebp99
and
Daniel_Eth
Aug 11, 2023, 3:59 AM
64
points
0
comments
1
min read
LW
link
(forum.effectivealtruism.org)
“Is There Anything That’s Worth More”
Zack_M_Davis
Aug 2, 2023, 3:28 AM
64
points
6
comments
1
min read
LW
link
DIY Deliberate Practice
lynettebye
Aug 21, 2023, 12:22 PM
63
points
4
comments
5
min read
LW
link
(lynettebye.com)
Barriers to Mechanistic Interpretability for AGI Safety
Connor Leahy
Aug 29, 2023, 10:56 AM
63
points
13
comments
1
min read
LW
link
(www.youtube.com)
Private notes on LW?
Raemon
Aug 4, 2023, 5:35 PM
61
points
33
comments
1
min read
LW
link
‘We’re changing the clouds.’ An unforeseen test of geoengineering is fueling record ocean warmth
Annapurna
Aug 6, 2023, 8:58 PM
60
points
6
comments
1
min read
LW
link
(www.science.org)
AI #25: Inflection Point
Zvi
Aug 17, 2023, 2:40 PM
59
points
9
comments
36
min read
LW
link
(thezvi.wordpress.com)
If we had known the atmosphere would ignite
Jeffs
Aug 16, 2023, 8:28 PM
59
points
63
comments
2
min read
LW
link
AI #23: Fundamental Problems with RLHF
Zvi
Aug 3, 2023, 12:50 PM
59
points
9
comments
41
min read
LW
link
(thezvi.wordpress.com)
Will AI kill everyone? Here’s what the godfathers of AI have to say [RA video]
Writer
Aug 19, 2023, 5:29 PM
58
points
8
comments
LW
link
(youtu.be)
Stomach Ulcers and Dental Cavities
Metacelsus
Aug 5, 2023, 2:08 PM
57
points
7
comments
1
min read
LW
link
(denovo.substack.com)
Open Call for Research Assistants in Developmental Interpretability
Jesse Hoogland
,
Daniel Murfet
,
Alexander Gietelink Oldenziel
and
Stan van Wingerden
Aug 30, 2023, 9:02 AM
56
points
11
comments
4
min read
LW
link
Diet Experiment Preregistration: Long-term water fasting + seed oil removal
lc
Aug 23, 2023, 10:08 PM
56
points
18
comments
1
min read
LW
link
AI Deception: A Survey of Examples, Risks, and Potential Solutions
Simon Goldstein
and
Peter S. Park
Aug 29, 2023, 1:29 AM
54
points
3
comments
10
min read
LW
link
The lost millennium
Ege Erdil
Aug 24, 2023, 3:48 AM
54
points
14
comments
3
min read
LW
link
Why Is No One Trying To Align Profit Incentives With Alignment Research?
Prometheus
Aug 23, 2023, 1:16 PM
51
points
11
comments
4
min read
LW
link
Efficiency and resource use scaling parity
Ege Erdil
Aug 21, 2023, 12:18 AM
51
points
1
comment
4
min read
LW
link
1
review
Reflections on “Making the Atomic Bomb”
boazbarak
Aug 17, 2023, 2:48 AM
51
points
7
comments
8
min read
LW
link
Announcing Squiggle Hub
ozziegooen
and
Slava Matyukhin
Aug 5, 2023, 1:00 AM
49
points
4
comments
5
min read
LW
link
(forum.effectivealtruism.org)
AI #26: Fine Tuning Time
Zvi
Aug 24, 2023, 3:30 PM
49
points
6
comments
33
min read
LW
link
(thezvi.wordpress.com)
AI #24: Week of the Podcast
Zvi
Aug 10, 2023, 3:00 PM
49
points
5
comments
44
min read
LW
link
(thezvi.wordpress.com)
Barbieheimer: Across the Dead Reckoning
Zvi
Aug 1, 2023, 1:00 PM
49
points
17
comments
41
min read
LW
link
(thezvi.wordpress.com)
how 2 tell if ur input is out of distribution given only model weights
dkirmani
Aug 5, 2023, 10:45 PM
48
points
10
comments
1
min read
LW
link
Assessment of intelligence agency functionality is difficult yet important
trevor
Aug 24, 2023, 1:42 AM
48
points
5
comments
9
min read
LW
link
Perpetually Declining Population?
jefftk
Aug 8, 2023, 1:30 AM
48
points
29
comments
3
min read
LW
link
(www.jefftk.com)
Chess as a case study in hidden capabilities in ChatGPT
AdamYedidia
Aug 19, 2023, 6:35 AM
47
points
32
comments
6
min read
LW
link
Understanding and visualizing sycophancy datasets
Nina Panickssery
Aug 16, 2023, 5:34 AM
46
points
0
comments
6
min read
LW
link
Autonomous replication and adaptation: an attempt at a concrete danger threshold
Hjalmar_Wijk
Aug 17, 2023, 1:31 AM
45
points
0
comments
13
min read
LW
link
A Model-based Approach to AI Existential Risk
Sammy Martin
,
Lonnie Chrisman
and
Aryeh Englander
Aug 25, 2023, 10:32 AM
45
points
9
comments
32
min read
LW
link
Manifund: What we’re funding (weeks 2-4)
Austin Chen
Aug 4, 2023, 4:00 PM
44
points
2
comments
LW
link
(manifund.substack.com)
The Sinews of Sudan’s Latest War
Tim Liptrot
Aug 4, 2023, 6:17 PM
43
points
12
comments
12
min read
LW
link
Is Chinese total factor productivity lower today than it was in 1956?
Ege Erdil
Aug 18, 2023, 10:33 PM
43
points
0
comments
26
min read
LW
link
Monthly Roundup #9: August 2023
Zvi
Aug 7, 2023, 1:20 PM
42
points
25
comments
57
min read
LW
link
(thezvi.wordpress.com)
[Linkpost] Personal and Psychological Dimensions of AI Researchers Confronting AI Catastrophic Risks
Bogdan Ionut Cirstea
12 Aug 2023 22:02 UTC
42
points
0
comments
1
min read
LW
link
Some rules for life (v.0,0)
Neil
17 Aug 2023 0:43 UTC
42
points
13
comments
12
min read
LW
link
(neilwarren.substack.com)
[Question]
Which possible AI systems are relatively safe?
Zach Stein-Perlman
21 Aug 2023 17:00 UTC
42
points
20
comments
1
min read
LW
link
Walk while you talk: don’t balk at “no chalk”
dkl9
22 Aug 2023 21:27 UTC
41
points
10
comments
2
min read
LW
link
(dkl9.net)
AGI is easier than robotaxis
Daniel Kokotajlo
13 Aug 2023 17:00 UTC
41
points
30
comments
4
min read
LW
link
marine cloud brightening
bhauth
9 Aug 2023 2:50 UTC
40
points
14
comments
3
min read
LW
link
(www.bhauth.com)
Seth Explains Consciousness
Jacob Falkovich
22 Aug 2023 18:06 UTC
39
points
130
comments
14
min read
LW
link
1
review
(putanumonit.com)
Implications of evidential cooperation in large worlds
Lukas Finnveden
23 Aug 2023 0:43 UTC
39
points
4
comments
17
min read
LW
link
(lukasfinnveden.substack.com)