Some miscellaneous thoughts on ChatGPT, stories, and mechanical interpretability
Bill Benzon
Feb 4, 2023, 7:35 PM
2
points
0
comments
3
min read
LW
link
O(“AGI Safety”)>O(“Stop Tyrants”)
AnthonyRepetto
Feb 4, 2023, 6:38 PM
−4
points
11
comments
1
min read
LW
link
Monthly Doom Argument Threads? Doom Argument Wiki?
LVSN
Feb 4, 2023, 4:59 PM
3
points
0
comments
1
min read
LW
link
The Future of Structured Self Improvement
Evenflair
Feb 4, 2023, 4:02 PM
27
points
4
comments
1
min read
LW
link
(guildoftherose.org)
Empathy as a natural consequence of learnt reward models
beren
Feb 4, 2023, 3:35 PM
48
points
27
comments
13
min read
LW
link
Mech Interp Project Advising Call: Memorisation in GPT-2 Small
Neel Nanda
Feb 4, 2023, 2:17 PM
7
points
0
comments
1
min read
LW
link
Do IQ tests measure intelligence? - A prediction market on my future beliefs about the topic
tailcalled
Feb 4, 2023, 11:19 AM
1
point
10
comments
1
min read
LW
link
(manifold.markets)
AXRP Episode 19 - Mechanistic Interpretability with Neel Nanda
DanielFilan
Feb 4, 2023, 3:00 AM
45
points
0
comments
117
min read
LW
link
The 2/3 rule for multi-factor authentication
RomanHauksson
Feb 4, 2023, 2:57 AM
4
points
0
comments
1
min read
LW
link
(roman.computer)
Path-Dependence in ChatGPT’s Political Outputs
lsusr
Feb 4, 2023, 2:02 AM
28
points
4
comments
4
min read
LW
link
Fucking Goddamn Basics of Rationalist Discourse
LoganStrohl
Feb 4, 2023, 1:47 AM
351
points
103
comments
1
min read
LW
link
3
reviews
Small Talk is Good, Actually
Gordon Seidoh Worley
Feb 4, 2023, 12:38 AM
52
points
9
comments
3
min read
LW
link
Update on Book Review Dominant Assurance Contract
Arjun Panickssery
Feb 3, 2023, 11:16 PM
9
points
0
comments
LW
link
[Question]
2+2=π√2+n
Logan Zoellner
Feb 3, 2023, 10:27 PM
16
points
15
comments
1
min read
LW
link
[Question]
If I encounter a capabilities paper that kinda spooks me, what should I do with it?
the gears to ascension
Feb 3, 2023, 9:37 PM
28
points
8
comments
1
min read
LW
link
[Question]
What Are The Preconditions/Prerequisites for Asymptotic Analysis?
DragonGod
Feb 3, 2023, 9:26 PM
8
points
2
comments
1
min read
LW
link
[Linkpost] Google invested $300M in Anthropic in late 2022
Orpheus16
Feb 3, 2023, 7:13 PM
73
points
14
comments
1
min read
LW
link
(www.ft.com)
Many AI governance proposals have a tradeoff between usefulness and feasibility
Orpheus16 and Carson Ezell
Feb 3, 2023, 6:49 PM
22
points
2
comments
2
min read
LW
link
Reply to Duncan Sabien on Strawmanning
Zack_M_Davis
Feb 3, 2023, 5:57 PM
42
points
11
comments
4
min read
LW
link
Semi-rare plain language words that are great to remember
LVSN
Feb 3, 2023, 4:33 PM
4
points
7
comments
1
min read
LW
link
[Question]
What qualities does an AGI need to have to realize the risk of false vacuum, without hardcoding physics theories into it?
RationalSieve
Feb 3, 2023, 4:00 PM
1
point
4
comments
1
min read
LW
link
Housing and Transit Roundup #3
Zvi
Feb 3, 2023, 3:10 PM
21
points
6
comments
16
min read
LW
link
(thezvi.wordpress.com)
Taboo P(doom)
NathanBarnard
Feb 3, 2023, 10:37 AM
14
points
10
comments
1
min read
LW
link
ChatGPT: Tantalizing afterthoughts in search of story trajectories [induction heads]
Bill Benzon
Feb 3, 2023, 10:35 AM
4
points
0
comments
20
min read
LW
link
Jordan Peterson: Guru/Villain
Bryan Frances
Feb 3, 2023, 9:02 AM
−14
points
6
comments
9
min read
LW
link
[Question]
What is the risk of asking a counterfactual oracle a question that already had its answer erased?
Chris_Leong
Feb 3, 2023, 3:13 AM
7
points
0
comments
1
min read
LW
link
I don’t think MIRI “gave up”
Raemon
Feb 3, 2023, 12:26 AM
106
points
64
comments
4
min read
LW
link
What fact that you know is true but most people aren’t ready to accept it?
lorepieri
Feb 3, 2023, 12:06 AM
47
points
211
comments
1
min read
LW
link
[Question]
Monotonous Work
Gideon Bauer
Feb 2, 2023, 9:35 PM
1
point
0
comments
1
min read
LW
link
Is AI risk assessment too anthropocentric?
Craig Mattson
Feb 2, 2023, 9:34 PM
3
points
6
comments
1
min read
LW
link
Halifax Monthly Meetup: Introduction to Effective Altruism
Ideopunk
Feb 2, 2023, 9:10 PM
10
points
0
comments
1
min read
LW
link
Conditioning Predictive Models: Outer alignment via careful conditioning
evhub, Adam Jermyn, Johannes Treutlein, Rubi J. Hudson and kcwoolverton
Feb 2, 2023, 8:28 PM
72
points
15
comments
57
min read
LW
link
Conditioning Predictive Models: Large language models as predictors
evhub, Adam Jermyn, Johannes Treutlein, Rubi J. Hudson and kcwoolverton
Feb 2, 2023, 8:28 PM
88
points
4
comments
13
min read
LW
link
Normative vs Descriptive Models of Agency
mattmacdermott
Feb 2, 2023, 8:28 PM
26
points
5
comments
4
min read
LW
link
Andrew Huberman on How to Optimize Sleep
Leon Lang
Feb 2, 2023, 8:17 PM
37
points
6
comments
6
min read
LW
link
[Question]
How can I help inflammation-based nerve damage be temporary?
Optimization Process
Feb 2, 2023, 7:20 PM
17
points
4
comments
1
min read
LW
link
More findings on maximal data dimension
Marius Hobbhahn
Feb 2, 2023, 6:33 PM
27
points
1
comment
11
min read
LW
link
Heritability, Behaviorism, and Within-Lifetime RL
Steven Byrnes
Feb 2, 2023, 4:34 PM
39
points
3
comments
4
min read
LW
link
Covid 2/2/23: The Emergency Ends on 5/11
Zvi
Feb 2, 2023, 2:00 PM
22
points
6
comments
7
min read
LW
link
(thezvi.wordpress.com)
You are probably not a good alignment researcher, and other blatant lies
junk heap homotopy
Feb 2, 2023, 1:55 PM
83
points
16
comments
2
min read
LW
link
Don’t Judge a Tool by its Average Output
silentbob
Feb 2, 2023, 1:42 PM
12
points
2
comments
4
min read
LW
link
Epoch Impact Report 2022
Jsevillamol
Feb 2, 2023, 1:09 PM
16
points
0
comments
LW
link
You Don’t Exist, Duncan
Duncan Sabien (Inactive)
2 Feb 2023 8:37 UTC
252
points
107
comments
9
min read
LW
link
Temporally Layered Architecture for Adaptive, Distributed and Continuous Control
Roman Leventov
2 Feb 2023 6:29 UTC
6
points
4
comments
1
min read
LW
link
(arxiv.org)
Research agenda: Formalizing abstractions of computations
Erik Jenner
2 Feb 2023 4:29 UTC
93
points
10
comments
31
min read
LW
link
Progress links and tweets, 2023-02-01
jasoncrawford
2 Feb 2023 2:25 UTC
10
points
0
comments
1
min read
LW
link
(rootsofprogress.org)
Retrospective on the AI Safety Field Building Hub
Vael Gates
2 Feb 2023 2:06 UTC
30
points
0
comments
LW
link
How to export Android Chrome tabs to an HTML file in Linux (as of February 2023)
Adam Scherlis
2 Feb 2023 2:03 UTC
7
points
3
comments
2
min read
LW
link
(adam.scherlis.com)
Hacked Account Spam
jefftk
2 Feb 2023 1:50 UTC
13
points
5
comments
1
min read
LW
link
(www.jefftk.com)
A simple technique to reduce negative rumination
cranberry_bear
2 Feb 2023 1:33 UTC
9
points
0
comments
1
min read
LW
link