Archive
Notes from the Qatar Center for Global Banking and Finance 3rd Annual Conference · PixelatedPenguin · Jul 7, 2023, 11:48 PM · 2 points · 0 comments · 1 min read · LW link
Introducing bayescalc.io · Adele Lopez · Jul 7, 2023, 4:11 PM · 115 points · 29 comments · 1 min read · LW link · (bayescalc.io)
Meetup Tip: Ask Attendees To Explain It · Screwtape · Jul 7, 2023, 4:08 PM · 10 points · 0 comments · 4 min read · LW link
Interpreting Modular Addition in MLPs · Bart Bussmann · Jul 7, 2023, 9:22 AM · 20 points · 0 comments · 6 min read · LW link
Internal independent review for language model agent alignment · Seth Herd · Jul 7, 2023, 6:54 AM · 55 points · 30 comments · 11 min read · LW link
[Question] Can LessWrong provide me with something I find obviously highly useful to my own practical life? · agrippa · Jul 7, 2023, 3:08 AM · 32 points · 4 comments · 1 min read · LW link
ask me about technology · bhauth · Jul 7, 2023, 2:03 AM · 23 points · 42 comments · 1 min read · LW link
Apparently, of the 195 Million the DoD allocated in University Research Funding Awards in 2022, more than half of them concerned AI or compute hardware research · mako yass · Jul 7, 2023, 1:20 AM · 41 points · 5 comments · 2 min read · LW link · (www.defense.gov)
What are the best non-LW places to read on alignment progress? · Raemon · Jul 7, 2023, 12:57 AM · 50 points · 14 comments · 1 min read · LW link
Two paths to win the AGI transition · Nathan Helm-Burger · Jul 6, 2023, 9:59 PM · 11 points · 8 comments · 4 min read · LW link
Empirical Evidence Against “The Longest Training Run” · NickGabs · Jul 6, 2023, 6:32 PM · 31 points · 0 comments · 14 min read · LW link
Progress Studies Fellowship looking for members · jay ram · Jul 6, 2023, 5:41 PM · 3 points · 0 comments · 1 min read · LW link
BOUNTY AVAILABLE: AI ethicists, what are your object-level arguments against AI notkilleveryoneism? · Peter Berggren · Jul 6, 2023, 5:32 PM · 18 points · 6 comments · 2 min read · LW link
Layering and Technical Debt in the Global Wayfinding Model · herschel · Jul 6, 2023, 5:30 PM · 14 points · 0 comments · 3 min read · LW link
Localizing goal misgeneralization in a maze-solving policy network · Jan Betley · Jul 6, 2023, 4:21 PM · 37 points · 2 comments · 7 min read · LW link
Jesse Hoogland on Developmental Interpretability and Singular Learning Theory · Michaël Trazzi · Jul 6, 2023, 3:46 PM · 42 points · 2 comments · 4 min read · LW link · (theinsideview.ai)
Progress links and tweets, 2023-07-06: Terraformer Mark One, Israeli water management, & more · jasoncrawford · Jul 6, 2023, 3:35 PM · 18 points · 4 comments · 2 min read · LW link · (rootsofprogress.org)
Towards Non-Panopticon AI Alignment · Logan Zoellner · Jul 6, 2023, 3:29 PM · 7 points · 0 comments · 3 min read · LW link
A Defense of Work on Mathematical AI Safety · Davidmanheim · Jul 6, 2023, 2:15 PM · 28 points · 13 comments · 3 min read · LW link · (forum.effectivealtruism.org)
Understanding the two most common mental health problems in the world · spencerg · Jul 6, 2023, 2:06 PM · 19 points · 0 comments · LW link
Announcing the EA Archive · Aaron Bergman · Jul 6, 2023, 1:49 PM · 13 points · 2 comments · LW link
Agency begets agency · Richard_Ngo · Jul 6, 2023, 1:08 PM · 60 points · 1 comment · 4 min read · LW link
AI #19: Hofstadter, Sutskever, Leike · Zvi · Jul 6, 2023, 12:50 PM · 60 points · 16 comments · 40 min read · LW link · (thezvi.wordpress.com)
Do you feel that AGI Alignment could be achieved in a Type 0 civilization? · Super AGI · Jul 6, 2023, 4:52 AM · −2 points · 1 comment · 1 min read · LW link
Open Thread—July 2023 · Ruby · Jul 6, 2023, 4:50 AM · 11 points · 35 comments · 1 min read · LW link
AI Intermediation · jefftk · Jul 6, 2023, 1:50 AM · 12 points · 0 comments · 1 min read · LW link · (www.jefftk.com)
Announcing Manifund Regrants · Austin Chen · Jul 5, 2023, 7:42 PM · 74 points · 8 comments · LW link
Infra-Bayesian Logic · harfe and Yegreg · Jul 5, 2023, 7:16 PM · 15 points · 2 comments · 1 min read · LW link
[Linkpost] Introducing Superalignment · beren · Jul 5, 2023, 6:23 PM · 175 points · 69 comments · 1 min read · LW link · (openai.com)
If you wish to make an apple pie, you must first become dictator of the universe · jasoncrawford · Jul 5, 2023, 6:14 PM · 27 points · 9 comments · 13 min read · LW link · (rootsofprogress.org)
An AGI kill switch with defined security properties · Peterpiper · Jul 5, 2023, 5:40 PM · −5 points · 6 comments · 1 min read · LW link
The risk-reward tradeoff of interpretability research · JustinShovelain and Elliot Mckernon · Jul 5, 2023, 5:05 PM · 15 points · 1 comment · 6 min read · LW link
(tentatively) Found 600+ Monosemantic Features in a Small LM Using Sparse Autoencoders · Logan Riggs · Jul 5, 2023, 4:49 PM · 60 points · 1 comment · 7 min read · LW link
[Question] What did AI Safety’s specific funding of AGI R&D labs lead to? · Remmelt · Jul 5, 2023, 3:51 PM · 3 points · 0 comments · LW link
AISN #13: An interdisciplinary perspective on AI proxy failures, new competitors to ChatGPT, and prompting language models to misbehave · Dan H · Jul 5, 2023, 3:33 PM · 13 points · 0 comments · LW link
Exploring Functional Decision Theory (FDT) and a modified version (ModFDT) · MiguelDev · Jul 5, 2023, 2:06 PM · 11 points · 11 comments · 15 min read · LW link
Optimized for Something other than Winning or: How Cricket Resists Moloch and Goodhart’s Law · A.H. · Jul 5, 2023, 12:33 PM · 53 points · 26 comments · 4 min read · LW link
Puffer-pope reality check · Neil · Jul 5, 2023, 9:27 AM · 20 points · 2 comments · 1 min read · LW link
Final Lightspeed Grants coworking/office hours before the application deadline · habryka · Jul 5, 2023, 6:03 AM · 13 points · 2 comments · 1 min read · LW link
MXR Talkbox Cap? · jefftk · Jul 5, 2023, 1:50 AM · 9 points · 0 comments · 1 min read · LW link · (www.jefftk.com)
“Reification” · herschel · Jul 5, 2023, 12:53 AM · 11 points · 4 comments · 2 min read · LW link
Dominant Assurance Contract Experiment #2: Berkeley House Dinners · Arjun Panickssery · Jul 5, 2023, 12:13 AM · 51 points · 8 comments · 1 min read · LW link · (arjunpanickssery.substack.com)
Three camps in AI x-risk discussions: My personal very oversimplified overview · Aryeh Englander · Jul 4, 2023, 8:42 PM · 21 points · 0 comments · LW link
Six (and a half) intuitions for SVD · CallumMcDougall · 4 Jul 2023 19:23 UTC · 71 points · 1 comment · 1 min read · LW link
Animal Weapons: Lessons for Humans in the Age of X-Risk · Damin Curtis · 4 Jul 2023 18:14 UTC · 4 points · 0 comments · 10 min read · LW link
Apocalypse Prepping—Concise SHTF guide to prepare for AGI doomsday · prepper · 4 Jul 2023 17:41 UTC · −7 points · 9 comments · 1 min read · LW link · (prepper.i2phides.me)
Ways I Expect AI Regulation To Increase Extinction Risk · 1a3orn · 4 Jul 2023 17:32 UTC · 226 points · 32 comments · 7 min read · LW link
AI labs’ statements on governance · Zach Stein-Perlman · 4 Jul 2023 16:30 UTC · 30 points · 0 comments · 36 min read · LW link
AIs teams will probably be more superintelligent than individual AIs · Robert_AIZI · 4 Jul 2023 14:06 UTC · 3 points · 1 comment · 2 min read · LW link · (aizi.substack.com)
What I Think About When I Think About History · Jacob G-W · 4 Jul 2023 14:02 UTC · 3 points · 4 comments · 3 min read · LW link · (g-w1.github.io)