Luck based medicine: inositol for anxiety and brain fog · Elizabeth · Sep 22, 2023, 8:10 PM · 40 points · 5 comments · 3 min read · LW link (acesounderglass.com)
If influence functions are not approximating leave-one-out, how are they supposed to help? · Fabien Roger · Sep 22, 2023, 2:23 PM · 66 points · 5 comments · 3 min read · LW link
Modeling p(doom) with TrojanGDP · K. Liam Smith · Sep 22, 2023, 2:19 PM · −2 points · 2 comments · 13 min read · LW link
Let’s talk about Impostor syndrome in AI safety · Igor Ivanov · Sep 22, 2023, 1:51 PM · 30 points · 4 comments · 3 min read · LW link
Fund Transit With Development · jefftk · Sep 22, 2023, 11:10 AM · 47 points · 22 comments · 3 min read · LW link (www.jefftk.com)
Atoms to Agents Proto-Lectures · johnswentworth · Sep 22, 2023, 6:22 AM · 96 points · 14 comments · 2 min read · LW link (www.youtube.com)
Would You Work Harder In The Least Convenient Possible World? · Firinn · Sep 22, 2023, 5:17 AM · 99 points · 100 comments · 9 min read · LW link · 2 reviews
Contra Kevin Dorst’s Rational Polarization · azsantosk · Sep 22, 2023, 4:28 AM · 8 points · 2 comments · 9 min read · LW link
ACX Boston—Petrov Day 2023 · duck_master · Sep 22, 2023, 1:13 AM · 2 points · 0 comments · 1 min read · LW link
What social science research do you want to see reanalyzed? · Michael Wiebe · Sep 22, 2023, 12:03 AM · 14 points · 9 comments · 1 min read · LW link
Immortality or death by AGI · ImmortalityOrDeathByAGI · Sep 21, 2023, 11:59 PM · 47 points · 30 comments · 4 min read · LW link (forum.effectivealtruism.org)
Neel Nanda on the Mechanistic Interpretability Researcher Mindset · Michaël Trazzi · Sep 21, 2023, 7:47 PM · 37 points · 1 comment · 3 min read · LW link (theinsideview.ai)
Require AGI to be Explainable · PeterMcCluskey · Sep 21, 2023, 4:11 PM · 5 points · 0 comments · 6 min read · LW link (bayesianinvestor.com)
Update to “Dominant Assurance Contract Platform” · moyamo · Sep 21, 2023, 4:09 PM · 32 points · 1 comment · 1 min read · LW link
Sparse Autoencoders: Future Work · Logan Riggs and Aidan Ewart · Sep 21, 2023, 3:30 PM · 35 points · 5 comments · 6 min read · LW link
Sparse Autoencoders Find Highly Interpretable Directions in Language Models · Logan Riggs, Hoagy, Aidan Ewart and Robert_AIZI · Sep 21, 2023, 3:30 PM · 159 points · 8 comments · 5 min read · LW link
There should be more AI safety orgs · Marius Hobbhahn · Sep 21, 2023, 2:53 PM · 181 points · 25 comments · 17 min read · LW link
Ward 5: Jack Perenick and Naima Sait · jefftk · Sep 21, 2023, 1:00 PM · 24 points · 0 comments · 1 min read · LW link (www.jefftk.com)
AI #30: Dalle-3 and GPT-3.5-Instruct-Turbo · Zvi · Sep 21, 2023, 12:00 PM · 75 points · 8 comments · 47 min read · LW link (thezvi.wordpress.com)
[Question] How are rationalists or orgs blocked, that you can see? · Nathan Young · Sep 21, 2023, 2:37 AM · 7 points · 2 comments · 1 min read · LW link
Vision Weekend US Edition · Allison Duettmann · Sep 20, 2023, 9:28 PM · 4 points · 0 comments · 1 min read · LW link
Foresight Vision Weekend Europe Edition · Allison Duettmann · Sep 20, 2023, 9:25 PM · 3 points · 0 comments · 1 min read · LW link
Notes on ChatGPT’s “memory” for strings and for events · Bill Benzon · Sep 20, 2023, 6:12 PM · 3 points · 0 comments · 10 min read · LW link
Belief and the Truth · Sam I am · Sep 20, 2023, 5:38 PM · 2 points · 14 comments · 5 min read · LW link (open.substack.com)
Image Hijacks: Adversarial Images can Control Generative Models at Runtime · Scott Emmons, Luke Bailey and Euan Ong · Sep 20, 2023, 3:23 PM · 58 points · 9 comments · 1 min read · LW link (arxiv.org)
Interpretability Externalities Case Study—Hungry Hungry Hippos · Magdalena Wache · Sep 20, 2023, 2:42 PM · 64 points · 22 comments · 2 min read · LW link
An Elementary Introduction to Infra-Bayesianism · CharlesRW · Sep 20, 2023, 2:29 PM · 16 points · 0 comments · 1 min read · LW link
Weekly Incidence Including Delay · jefftk · Sep 20, 2023, 2:00 PM · 11 points · 0 comments · 2 min read · LW link (www.jefftk.com)
[Question] The stereotype of male classical music lovers being gay · BB6 · Sep 20, 2023, 1:23 PM · 11 points · 6 comments · 1 min read · LW link
Housing Roundup #6 · Zvi · Sep 20, 2023, 1:10 PM · 27 points · 8 comments · 14 min read · LW link (thezvi.wordpress.com)
Careless talk on US-China AI competition? (and criticism of CAIS coverage) · Oliver Sourbut · Sep 20, 2023, 12:46 PM · 16 points · 3 comments · 10 min read · LW link · 3 reviews (www.oliversourbut.net)
A New Bayesian Decision Theory · Pareto Optimal · Sep 20, 2023, 9:36 AM · −6 points · 0 comments · 1 min read · LW link (paretooptimal.substack.com)
Protest against Meta’s irreversible proliferation (Sept 29, San Francisco) · Holly_Elmore · Sep 19, 2023, 11:40 PM · 54 points · 33 comments · LW link
The AI Explosion Might Never Happen · snewman · Sep 19, 2023, 11:20 PM · 22 points · 31 comments · 9 min read · LW link
Science of Deep Learning more tractably addresses the Sharp Left Turn than Agent Foundations · NickGabs · Sep 19, 2023, 10:06 PM · 20 points · 2 comments · 6 min read · LW link
Formalizing «Boundaries» with Markov blankets · Chris Lakin · Sep 19, 2023, 9:01 PM · 21 points · 20 comments · 3 min read · LW link
Precision of Sets of Forecasts · niplav · Sep 19, 2023, 6:19 PM · 20 points · 5 comments · 10 min read · LW link
The Proxy Political Party · antidefault · Sep 19, 2023, 5:47 PM · −3 points · 4 comments · 1 min read · LW link (antidefault.net)
The Limits of the Existence Proof Argument for General Intelligence · Amadeus Pagel · Sep 19, 2023, 5:45 PM · −21 points · 3 comments · 1 min read · LW link (amadeuspagel.com)
[Question] Is there a publicly available list of examples of frontier model capabilities? · Max Kearney · Sep 19, 2023, 5:45 PM · 1 point · 0 comments · 1 min read · LW link
Tallinn, Estonia – ACX Meetups Everywhere Autumn 2023 · Andrew · Sep 19, 2023, 4:24 PM · 1 point · 0 comments · 1 min read · LW link
Anthropic’s Responsible Scaling Policy & Long-Term Benefit Trust · Zac Hatfield-Dodds · Sep 19, 2023, 3:09 PM · 85 points · 26 comments · 3 min read · LW link · 1 review (www.anthropic.com)
AISN #22: The Landscape of US AI Legislation - Hearings, Frameworks, Bills, and Laws · Dan H · Sep 19, 2023, 2:44 PM · 20 points · 0 comments · 5 min read · LW link (newsletter.safe.ai)
Compilation of Profit for Good Redteaming and Responses · Brad West · Sep 19, 2023, 1:34 PM · 1 point · 0 comments · 9 min read · LW link
[Link post] Michael Nielsen’s “Notes on Existential Risk from Artificial Superintelligence” · Joel Becker · Sep 19, 2023, 1:31 PM · 67 points · 12 comments · LW link (michaelnotebook.com)
[Question] Do LLMs Implement NLP Algorithms for Better Next Token Predictions? · simeon_c · Sep 19, 2023, 12:28 PM · 5 points · 1 comment · 1 min read · LW link
On martingales · Joey Marcellino · Sep 19, 2023, 11:39 AM · 8 points · 4 comments · 4 min read · LW link
Luck based medicine: angry eldritch sugar gods edition · Elizabeth · Sep 19, 2023, 4:40 AM · 75 points · 14 comments · 9 min read · LW link (acesounderglass.com)
Don’t Think About the Thing Behind the Curtain. · keltan · Sep 19, 2023, 2:07 AM · 4 points · 0 comments · 5 min read · LW link
Panel with Israeli Prime Minister on existential risk from AI · Michaël Trazzi · Sep 18, 2023, 11:16 PM · 22 points · 2 comments · 1 min read · LW link (x.com)