The slingshot helps with learning
Wilson Wu · Oct 31, 2024, 11:18 PM · 33 points · 0 comments · 8 min read · LW link

Toward Safety Case Inspired Basic Research
Lucas Teixeira, Lauren Greenspan, Dmitry Vaintrob and Eric Winsor · Oct 31, 2024, 11:06 PM · 55 points · 3 comments · 13 min read · LW link

Spooky Recommendation System Scaling
phdead · Oct 31, 2024, 10:00 PM · 11 points · 0 comments · 4 min read · LW link

‘Meta’, ‘mesa’, and mountains
Lorec · Oct 31, 2024, 5:25 PM · 1 point · 0 comments · 3 min read · LW link

Toward Safety Cases For AI Scheming
Mikita Balesni and Marius Hobbhahn · Oct 31, 2024, 5:20 PM · 60 points · 1 comment · 2 min read · LW link

AI #88: Thanks for the Memos
Zvi · Oct 31, 2024, 3:00 PM · 46 points · 5 comments · 77 min read · LW link (thezvi.wordpress.com)

The Compendium, A full argument about extinction risk from AGI
adamShimi, Gabriel Alfour, Connor Leahy, Chris Scammell and Andrea_Miotti · Oct 31, 2024, 12:01 PM · 195 points · 52 comments · 2 min read · LW link (www.thecompendium.ai)

Some Preliminary Notes on the Promise of a Wisdom Explosion
Chris_Leong · Oct 31, 2024, 9:21 AM · 2 points · 0 comments · 1 min read · LW link (aiimpacts.org)

What TMS is like
Sable · Oct 31, 2024, 12:44 AM · 208 points · 23 comments · 6 min read · LW link (affablyevil.substack.com)

AI Safety at the Frontier: Paper Highlights, October ’24
gasteigerjo · Oct 31, 2024, 12:09 AM · 3 points · 0 comments · 9 min read · LW link (aisafetyfrontier.substack.com)

Standard SAEs Might Be Incoherent: A Choosing Problem & A “Concise” Solution
Kola Ayonrinde · Oct 30, 2024, 10:50 PM · 27 points · 0 comments · 12 min read · LW link

Generic advice caveats
Saul Munn · Oct 30, 2024, 9:03 PM · 27 points · 1 comment · 3 min read · LW link (www.brasstacks.blog)

I turned decision theory problems into memes about trolleys
Tapatakt · Oct 30, 2024, 8:13 PM · 104 points · 23 comments · 1 min read · LW link

AI as a powerful meme, via CGP Grey
TheManxLoiner · Oct 30, 2024, 6:31 PM · 46 points · 8 comments · 4 min read · LW link

[Question] How might language influence how an AI “thinks”?
bodry · Oct 30, 2024, 5:41 PM · 3 points · 0 comments · 1 min read · LW link

Motivation control
Joe Carlsmith · Oct 30, 2024, 5:15 PM · 45 points · 7 comments · 52 min read · LW link

Updating the NAO Simulator
jefftk · Oct 30, 2024, 1:50 PM · 11 points · 0 comments · 2 min read · LW link (www.jefftk.com)

Occupational Licensing Roundup #1
Zvi · Oct 30, 2024, 11:00 AM · 65 points · 11 comments · 11 min read · LW link (thezvi.wordpress.com)

Three Notions of “Power”
johnswentworth · Oct 30, 2024, 6:10 AM · 92 points · 44 comments · 4 min read · LW link

Introduction to Choice set Misspecification in Reward Inference
Rahul Chand · Oct 29, 2024, 10:57 PM · 1 point · 0 comments · 8 min read · LW link

Gothenburg LW/ACX meetup
Stefan · Oct 29, 2024, 8:40 PM · 2 points · 0 comments · 1 min read · LW link

The Alignment Trap: AI Safety as Path to Power
crispweed · Oct 29, 2024, 3:21 PM · 57 points · 17 comments · 5 min read · LW link (upcoder.com)

Housing Roundup #10
Zvi · Oct 29, 2024, 1:50 PM · 32 points · 2 comments · 32 min read · LW link (thezvi.wordpress.com)

[Intuitive self-models] 7. Hearing Voices, and Other Hallucinations
Steven Byrnes · Oct 29, 2024, 1:36 PM · 51 points · 2 comments · 16 min read · LW link

Review: “The Case Against Reality”
David Gross · Oct 29, 2024, 1:13 PM · 20 points · 9 comments · 5 min read · LW link

A Poem Is All You Need: Jailbreaking ChatGPT, Meta & More
Sharat Jacob Jacob · Oct 29, 2024, 12:41 PM · 12 points · 0 comments · 9 min read · LW link

Searching for phenomenal consciousness in LLMs: Perceptual reality monitoring and introspective confidence
EuanMcLean · Oct 29, 2024, 12:16 PM · 45 points · 9 comments · 26 min read · LW link

AI #87: Staying in Character
Zvi · Oct 29, 2024, 7:10 AM · 57 points · 3 comments · 33 min read · LW link (thezvi.wordpress.com)

A path to human autonomy
Nathan Helm-Burger · Oct 29, 2024, 3:02 AM · 53 points · 16 comments · 20 min read · LW link

D&D.Sci Coliseum: Arena of Data Evaluation and Ruleset
aphyer · Oct 29, 2024, 1:21 AM · 47 points · 13 comments · 6 min read · LW link

Gwern: Why So Few Matt Levines?
kave · Oct 29, 2024, 1:07 AM · 78 points · 10 comments · 1 min read · LW link (gwern.net)

October 2024 Progress in Guaranteed Safe AI
Quinn · Oct 28, 2024, 11:34 PM · 7 points · 0 comments · 1 min read · LW link (gsai.substack.com)

5 homegrown EA projects, seeking small donors
Austin Chen · Oct 28, 2024, 11:24 PM · 85 points · 4 comments · LW link

How might we solve the alignment problem? (Part 1: Intro, summary, ontology)
Joe Carlsmith · Oct 28, 2024, 9:57 PM · 54 points · 5 comments · 32 min read · LW link

Enhancing Mathematical Modeling with LLMs: Goals, Challenges, and Evaluations
ozziegooen · Oct 28, 2024, 9:44 PM · 7 points · 0 comments · LW link

AI & wisdom 3: AI effects on amortised optimisation
L Rudolf L · Oct 28, 2024, 9:08 PM · 18 points · 0 comments · 14 min read · LW link (rudolf.website)

AI & wisdom 2: growth and amortised optimisation
L Rudolf L · Oct 28, 2024, 9:07 PM · 18 points · 0 comments · 8 min read · LW link (rudolf.website)

AI & wisdom 1: wisdom, amortised optimisation, and AI
L Rudolf L · Oct 28, 2024, 9:02 PM · 29 points · 0 comments · 15 min read · LW link (rudolf.website)

Finishing The SB-1047 Documentary In 6 Weeks
Michaël Trazzi · Oct 28, 2024, 8:17 PM · 94 points · 7 comments · 4 min read · LW link (manifund.org)

Towards the Operationalization of Philosophy & Wisdom
Thane Ruthenis · Oct 28, 2024, 7:45 PM · 20 points · 2 comments · 33 min read · LW link (aiimpacts.org)

Quantitative Trading Bootcamp [Nov 6-10]
Ricki Heicklen · Oct 28, 2024, 6:39 PM · 7 points · 0 comments · 1 min read · LW link

Winners of the Essay competition on the Automation of Wisdom and Philosophy
owencb and AI Impacts · Oct 28, 2024, 5:10 PM · 40 points · 3 comments · 30 min read · LW link (blog.aiimpacts.org)

Miles Brundage: Finding Ways to Credibly Signal the Benignness of AI Development and Deployment is an Urgent Priority
Zach Stein-Perlman · Oct 28, 2024, 5:00 PM · 22 points · 4 comments · 3 min read · LW link (milesbrundage.substack.com)

[Question] somebody explain the word “epistemic” to me
KvmanThinking · Oct 28, 2024, 4:40 PM · 7 points · 8 comments · 1 min read · LW link

~80 Interesting Questions about Foundation Model Agent Safety
RohanS and Govind Pimpale · Oct 28, 2024, 4:37 PM · 46 points · 4 comments · 15 min read · LW link

AI Safety Newsletter #43: White House Issues First National Security Memo on AI Plus, AI and Job Displacement, and AI Takes Over the Nobels
Corin Katzke, Alexa Pan and Dan H · Oct 28, 2024, 4:03 PM · 6 points · 0 comments · 6 min read · LW link (newsletter.safe.ai)

Death notes − 7 thoughts on death
Nathan Young · Oct 28, 2024, 3:01 PM · 26 points · 1 comment · 5 min read · LW link (nathanpmyoung.substack.com)

SAEs you can See: Applying Sparse Autoencoders to Clustering
Robert_AIZI · Oct 28, 2024, 2:48 PM · 27 points · 0 comments · 10 min read · LW link

Bridging the VLM and mech interp communities for multimodal interpretability
Sonia Joseph · Oct 28, 2024, 2:41 PM UTC · 19 points · 5 comments · 15 min read · LW link

How Likely Are Various Precursors of Existential Risk?
NunoSempere · Oct 28, 2024, 1:27 PM UTC · 55 points · 4 comments · 15 min read · LW link (blog.sentinel-team.org)