Are short timelines actually bad?
joshc
Feb 5, 2023, 9:21 PM
61
points
7
comments
3
min read
LW
link
Stanzas On Power Calculation
DirectedEvolution
Feb 5, 2023, 7:15 PM
9
points
0
comments
1
min read
LW
link
A List of things I might do with a Proof Oracle
Logan Zoellner
Feb 5, 2023, 6:14 PM
−14
points
13
comments
3
min read
LW
link
Teaching Simple Boundaries
jefftk
Feb 5, 2023, 5:30 PM
23
points
0
comments
2
min read
LW
link
(www.jefftk.com)
Control
TsviBT
Feb 5, 2023, 4:16 PM
21
points
14
comments
9
min read
LW
link
Have an idea? Come to Oxford to discuss and write (20 – 24 March)
RP
,
Flourish Journal
and
Jemima
Feb 5, 2023, 3:05 PM
20
points
0
comments
1
min read
LW
link
H5N1 - thread for information sharing, planning, and action
MathiasKB
Feb 5, 2023, 12:44 PM
31
points
8
comments
LW
link
Second call: CFP for Rebellion and Disobedience in AI workshop
Ram Rachum
Feb 5, 2023, 12:18 PM
2
points
0
comments
2
min read
LW
link
Research Direction: Be the AGI you want to see in the world
scottviteri
,
sudo
and
Lauro Langosco
Feb 5, 2023, 7:15 AM
44
points
0
comments
7
min read
LW
link
Sex is Good, Actually
Gordon Seidoh Worley
Feb 5, 2023, 6:33 AM
41
points
8
comments
4
min read
LW
link
Questions about AI that bother me
Eleni Angelou
Feb 5, 2023, 5:04 AM
13
points
6
comments
2
min read
LW
link
Evaluations (of new AI Safety researchers) can be noisy
LawrenceC
Feb 5, 2023, 4:15 AM
132
points
11
comments
16
min read
LW
link
1
review
Pandemic Prediction Checklist: H5N1 (6/14)
DirectedEvolution
Feb 5, 2023, 3:26 AM
50
points
10
comments
7
min read
LW
link
Podcast with Oli Habryka on LessWrong / Lightcone Infrastructure
DanielFilan
Feb 5, 2023, 2:52 AM
88
points
20
comments
1
min read
LW
link
(thefilancabinet.com)
Misleading Fast Charging Specs
jefftk
Feb 5, 2023, 2:50 AM
9
points
3
comments
1
min read
LW
link
(www.jefftk.com)
I hired 5 people to sit behind me and make me productive for a month
Simon Berens
Feb 5, 2023, 1:19 AM
252
points
83
comments
10
min read
LW
link
(www.simonberens.com)
Modal Fixpoint Cooperation without Löb’s Theorem
Andrew_Critch
Feb 5, 2023, 12:58 AM
134
points
34
comments
3
min read
LW
link
1
review
Who invented knitting? The plot thickens
eukaryote
Feb 5, 2023, 12:24 AM
60
points
9
comments
19
min read
LW
link
(eukaryotewritesblog.com)
Some miscellaneous thoughts on ChatGPT, stories, and mechanical interpretability
Bill Benzon
Feb 4, 2023, 7:35 PM
2
points
0
comments
3
min read
LW
link
O(“AGI Safety”)>O(“Stop Tyrants”)
AnthonyRepetto
Feb 4, 2023, 6:38 PM
−4
points
11
comments
1
min read
LW
link
Monthly Doom Argument Threads? Doom Argument Wiki?
LVSN
Feb 4, 2023, 4:59 PM
3
points
0
comments
1
min read
LW
link
The Future of Structured Self Improvement
Evenflair
Feb 4, 2023, 4:02 PM
27
points
4
comments
1
min read
LW
link
(guildoftherose.org)
Empathy as a natural consequence of learnt reward models
beren
Feb 4, 2023, 3:35 PM
48
points
27
comments
13
min read
LW
link
Mech Interp Project Advising Call: Memorisation in GPT-2 Small
Neel Nanda
Feb 4, 2023, 2:17 PM
7
points
0
comments
1
min read
LW
link
Do IQ tests measure intelligence? - A prediction market on my future beliefs about the topic
tailcalled
Feb 4, 2023, 11:19 AM
1
point
10
comments
1
min read
LW
link
(manifold.markets)
AXRP Episode 19 - Mechanistic Interpretability with Neel Nanda
DanielFilan
Feb 4, 2023, 3:00 AM
45
points
0
comments
117
min read
LW
link
The 2/3 rule for multi-factor authentication
RomanHauksson
Feb 4, 2023, 2:57 AM
4
points
0
comments
1
min read
LW
link
(roman.computer)
Path-Dependence in ChatGPT’s Political Outputs
lsusr
Feb 4, 2023, 2:02 AM
28
points
4
comments
4
min read
LW
link
Fucking Goddamn Basics of Rationalist Discourse
LoganStrohl
Feb 4, 2023, 1:47 AM
356
points
103
comments
1
min read
LW
link
3
reviews
Small Talk is Good, Actually
Gordon Seidoh Worley
Feb 4, 2023, 12:38 AM
52
points
9
comments
3
min read
LW
link
Update on Book Review Dominant Assurance Contract
Arjun Panickssery
Feb 3, 2023, 11:16 PM
9
points
0
comments
LW
link
[Question]
2+2=π√2+n
Logan Zoellner
Feb 3, 2023, 10:27 PM
16
points
15
comments
1
min read
LW
link
[Question]
If I encounter a capabilities paper that kinda spooks me, what should I do with it?
the gears to ascension
Feb 3, 2023, 9:37 PM
28
points
8
comments
1
min read
LW
link
[Question]
What Are The Preconditions/Prerequisites for Asymptotic Analysis?
DragonGod
Feb 3, 2023, 9:26 PM
8
points
2
comments
1
min read
LW
link
[Linkpost] Google invested $300M in Anthropic in late 2022
Orpheus16
Feb 3, 2023, 7:13 PM
73
points
14
comments
1
min read
LW
link
(www.ft.com)
Many AI governance proposals have a tradeoff between usefulness and feasibility
Orpheus16
and
Carson Ezell
Feb 3, 2023, 6:49 PM
22
points
2
comments
2
min read
LW
link
Reply to Duncan Sabien on Strawmanning
Zack_M_Davis
Feb 3, 2023, 5:57 PM
43
points
11
comments
4
min read
LW
link
Semi-rare plain language words that are great to remember
LVSN
Feb 3, 2023, 4:33 PM
4
points
7
comments
1
min read
LW
link
[Question]
What qualities does an AGI need to have to realize the risk of false vacuum, without hardcoding physics theories into it?
RationalSieve
Feb 3, 2023, 4:00 PM
1
point
4
comments
1
min read
LW
link
Housing and Transit Roundup #3
Zvi
Feb 3, 2023, 3:10 PM
21
points
6
comments
16
min read
LW
link
(thezvi.wordpress.com)
Taboo P(doom)
NathanBarnard
Feb 3, 2023, 10:37 AM
14
points
10
comments
1
min read
LW
link
ChatGPT: Tantalizing afterthoughts in search of story trajectories [induction heads]
Bill Benzon
Feb 3, 2023, 10:35 AM
4
points
0
comments
20
min read
LW
link
Jordan Peterson: Guru/Villain
Bryan Frances
3 Feb 2023 9:02 UTC
−14
points
6
comments
9
min read
LW
link
[Question]
What is the risk of asking a counterfactual oracle a question that already had its answer erased?
Chris_Leong
3 Feb 2023 3:13 UTC
7
points
0
comments
1
min read
LW
link
I don’t think MIRI “gave up”
Raemon
3 Feb 2023 0:26 UTC
106
points
64
comments
4
min read
LW
link
What fact that you know is true but most people aren’t ready to accept it?
lorepieri
3 Feb 2023 0:06 UTC
47
points
211
comments
1
min read
LW
link
[Question]
Monotonous Work
Gideon Bauer
2 Feb 2023 21:35 UTC
1
point
0
comments
1
min read
LW
link
Is AI risk assessment too anthropocentric?
Craig Mattson
2 Feb 2023 21:34 UTC
3
points
6
comments
1
min read
LW
link
Halifax Monthly Meetup: Introduction to Effective Altruism
Ideopunk
2 Feb 2023 21:10 UTC
10
points
0
comments
1
min read
LW
link
Conditioning Predictive Models: Outer alignment via careful conditioning
evhub
,
Adam Jermyn
,
Johannes Treutlein
,
Rubi J. Hudson
and
kcwoolverton
2 Feb 2023 20:28 UTC
72
points
15
comments
57
min read
LW
link