Best-Responding Is Not Always the Best Response
StrivingForLegibility
Jan 4, 2024, 11:30 PM
10
points
0
comments
3
min read
LW
link
Safety Data Sheets for Optimization Processes
StrivingForLegibility
Jan 4, 2024, 11:30 PM
15
points
1
comment
4
min read
LW
link
The Gears of Argmax
StrivingForLegibility
Jan 4, 2024, 11:30 PM
11
points
0
comments
3
min read
LW
link
Cellular reprogramming, pneumatic launch systems, and terraforming Mars: Some things I learned about at Foresight Vision Weekend
jasoncrawford
Jan 4, 2024, 7:33 PM
28
points
0
comments
8
min read
LW
link
(rootsofprogress.org)
Deep atheism and AI risk
Joe Carlsmith
Jan 4, 2024, 6:58 PM
153
points
22
comments
27
min read
LW
link
Some Vacation Photos
johnswentworth
Jan 4, 2024, 5:15 PM
83
points
0
comments
1
min read
LW
link
AISN #29: Progress on the EU AI Act Plus, the NY Times sues OpenAI for Copyright Infringement, and Congressional Questions about Research Standards in AI Safety
Dan H and Corin Katzke
Jan 4, 2024, 4:09 PM
8
points
0
comments
6
min read
LW
link
(newsletter.safe.ai)
EAG Bay Area Satellite event: AI Institution Design Hackathon 2024
beatrice@foresight.org
Jan 4, 2024, 3:02 PM
1
point
0
comments
1
min read
LW
link
AI #45: To Be Determined
Zvi
Jan 4, 2024, 3:00 PM
52
points
4
comments
31
min read
LW
link
(thezvi.wordpress.com)
Screen-supported Portable Monitor
jefftk
Jan 4, 2024, 1:50 PM
16
points
10
comments
1
min read
LW
link
(www.jefftk.com)
[Question]
Which investments for aligned-AI outcomes?
tailcalled
Jan 4, 2024, 1:28 PM
8
points
9
comments
2
min read
LW
link
Non-alignment project ideas for making transformative AI go well
Lukas Finnveden
Jan 4, 2024, 7:23 AM
44
points
1
comment
LW
link
(www.forethought.org)
Fact Checking and Retaliation Against Sources
jefftk
Jan 4, 2024, 12:41 AM
7
points
2
comments
4
min read
LW
link
(www.jefftk.com)
Investigating Alternative Futures: Human and Superintelligence Interaction Scenarios
Hiroshi Yamakawa
Jan 3, 2024, 11:46 PM
1
point
0
comments
17
min read
LW
link
“Attitudes Toward Artificial General Intelligence: Results from American Adults 2021 and 2023”—call for reviewers (Seeds of Science)
rogersbacon
Jan 3, 2024, 8:11 PM
4
points
0
comments
1
min read
LW
link
What’s up with LLMs representing XORs of arbitrary features?
Sam Marks
Jan 3, 2024, 7:44 PM
158
points
63
comments
16
min read
LW
link
Spirit Airlines Merger Play
sapphire
Jan 3, 2024, 7:25 PM
5
points
12
comments
1
min read
LW
link
$300 for the best sci-fi prompt: the results
RomanS
Jan 3, 2024, 7:10 PM
16
points
19
comments
7
min read
LW
link
Agent membranes/boundaries and formalizing “safety”
Chipmonk
Jan 3, 2024, 5:55 PM
26
points
46
comments
3
min read
LW
link
Safety First: safety before full alignment. The deontic sufficiency hypothesis.
Chipmonk
Jan 3, 2024, 5:55 PM
48
points
3
comments
3
min read
LW
link
Practically A Book Review: Appendix to “Nonlinear’s Evidence: Debunking False and Misleading Claims” (ThingOfThings)
tailcalled
Jan 3, 2024, 5:07 PM
111
points
25
comments
2
min read
LW
link
(thingofthings.substack.com)
Trivial Mathematics as a Path Forward
ACrackedPot
Jan 3, 2024, 4:41 PM
−4
points
2
comments
2
min read
LW
link
Copyright Confrontation #1
Zvi
Jan 3, 2024, 3:50 PM
34
points
7
comments
18
min read
LW
link
(thezvi.wordpress.com)
[Question]
Theoretically, could we balance the budget painlessly?
Logan Zoellner
Jan 3, 2024, 2:46 PM
4
points
12
comments
1
min read
LW
link
Johannes’ Biography
Johannes C. Mayer
Jan 3, 2024, 1:27 PM
24
points
0
comments
10
min read
LW
link
What Helped Me—Kale, Blood, CPAP, X-tiamine, Methylphenidate
Johannes C. Mayer
Jan 3, 2024, 1:22 PM
35
points
12
comments
2
min read
LW
link
[Question]
Does LessWrong make a difference when it comes to AI alignment?
PhilosophicalSoul
Jan 3, 2024, 12:21 PM
18
points
13
comments
1
min read
LW
link
[Question]
Terminology: <something>-ware for ML?
Oliver Sourbut
Jan 3, 2024, 11:42 AM
17
points
27
comments
1
min read
LW
link
Trading off Lives
jefftk
Jan 3, 2024, 3:40 AM
53
points
12
comments
2
min read
LW
link
(www.jefftk.com)
MonoPoly Restricted Trust
ymeskhout
Jan 2, 2024, 11:02 PM
42
points
37
comments
9
min read
LW
link
Agent membranes and causal distance
Chipmonk
Jan 2, 2024, 10:43 PM
20
points
3
comments
3
min read
LW
link
Focusing on Mal-Alignment
John Fisher
Jan 2, 2024, 7:51 PM
1
point
0
comments
1
min read
LW
link
Gentleness and the artificial Other
Joe Carlsmith
Jan 2, 2024, 6:21 PM
313
points
33
comments
11
min read
LW
link
Otherness and control in the age of AGI
Joe Carlsmith
Jan 2, 2024, 6:15 PM
43
points
0
comments
7
min read
LW
link
Apologizing is a Core Rationalist Skill
johnswentworth
Jan 2, 2024, 5:47 PM
156
points
42
comments
5
min read
LW
link
Cortés, AI Risk, and the Dynamics of Competing Conquerors
James_Miller
Jan 2, 2024, 4:37 PM
14
points
2
comments
3
min read
LW
link
OpenAI’s Preparedness Framework: Praise & Recommendations
Orpheus16
Jan 2, 2024, 4:20 PM
66
points
1
comment
7
min read
LW
link
Dating Roundup #2: If At First You Don’t Succeed
Zvi
Jan 2, 2024, 4:00 PM
54
points
29
comments
47
min read
LW
link
(thezvi.wordpress.com)
Looking for Reading Recommendations: Content Moderation, Power & Censorship
Joerg Weiss
Jan 2, 2024, 11:37 AM
2
points
7
comments
1
min read
LW
link
AI Is Not Software
Davidmanheim
Jan 2, 2024, 7:58 AM
58
points
29
comments
5
min read
LW
link
Are Metaculus AI Timelines Inconsistent?
Chris_Leong
Jan 2, 2024, 6:47 AM
17
points
7
comments
2
min read
LW
link
Boston Solstice 2023 Retrospective
jefftk
Jan 2, 2024, 3:10 AM
33
points
0
comments
6
min read
LW
link
(www.jefftk.com)
Steering Llama-2 with contrastive activation additions
Nina Panickssery, Wuschel Schulz, NickGabs, Meg, evhub and TurnTrout
2 Jan 2024 0:47 UTC
125
points
29
comments
8
min read
LW
link
(arxiv.org)
Twin Cities ACX Meetup—January 2024
Timothy M.
1 Jan 2024 21:13 UTC
1
point
2
comments
1
min read
LW
link
San Francisco ACX Meetup “First Saturday”
guenael
1 Jan 2024 20:58 UTC
1
point
1
comment
1
min read
LW
link
Mech Interp Challenge: January—Deciphering the Caesar Cipher Model
CallumMcDougall
1 Jan 2024 18:03 UTC
17
points
0
comments
3
min read
LW
link
Aldix and the Book of Life
ville
1 Jan 2024 17:23 UTC
1
point
0
comments
4
min read
LW
link
(medium.com)
Metaculus Hosts ACX 2024 Prediction Contest
ChristianWilliams
1 Jan 2024 16:38 UTC
4
points
0
comments
LW
link
(www.metaculus.com)
The Act Itself: Exceptionless Moral Norms
SebastianG
1 Jan 2024 16:06 UTC
5
points
11
comments
6
min read
LW
link
Deception Chess
Chris Land
1 Jan 2024 15:40 UTC
7
points
2
comments
4
min read
LW
link