[Question]
A Coordination Cookbook?
azergante
Nov 10, 2024, 11:20 PM
2
points
0
comments
1
min read
LW
link
Towards a Clever Hans Test: Unmasking Sentience Biases in Chatbot Interactions
glykokalyx
Nov 10, 2024, 10:34 PM
4
points
0
comments
1
min read
LW
link
Urbit New England Meetup
Conquerer Cohen
Nov 10, 2024, 5:56 PM
−4
points
0
comments
1
min read
LW
link
Personal AI Planning
jefftk
Nov 10, 2024, 2:00 PM
68
points
11
comments
2
min read
LW
link
(www.jefftk.com)
AI alignment via civilizational cognitive updates
AtillaYasar
Nov 10, 2024, 9:33 AM
1
point
10
comments
6
min read
LW
link
[Question]
How should vegans think about Methionine needs?
ChristianKl
Nov 10, 2024, 9:28 AM
32
points
3
comments
1
min read
LW
link
Is P(Doom) Meaningful? Bayesian vs. Popperian Epistemology Debate
Liron
Nov 9, 2024, 11:39 PM
5
points
0
comments
124
min read
LW
link
(www.youtube.com)
Bellevue Library Meetup—Nov 23
Cedar
Nov 9, 2024, 11:05 PM
5
points
3
comments
1
min read
LW
link
LifeKeeper Diaries: Exploring Misaligned AI Through Interactive Fiction
Tristan Tran, stijn and Mose Wintner
Nov 9, 2024, 8:58 PM
15
points
5
comments
2
min read
LW
link
[Question]
Poll: what’s your impression of altruism?
David Gross
Nov 9, 2024, 8:28 PM
2
points
4
comments
1
min read
LW
link
Chaos Theory in Ecology
Elizabeth
Nov 9, 2024, 5:50 PM
15
points
4
comments
20
min read
LW
link
(acesounderglass.com)
Some Comments on Recent AI Safety Developments
testingthewaters
Nov 9, 2024, 4:44 PM
4
points
0
comments
8
min read
LW
link
Formalize the Hashiness Model of AGI Uncontainability
Remmelt
Nov 9, 2024, 4:10 PM
3
points
0
comments
LW
link
(docs.google.com)
Agenda Manipulation
Pazzaz
Nov 9, 2024, 2:13 PM
2
points
0
comments
3
min read
LW
link
Force Sequential Output with SCP?
jefftk
Nov 9, 2024, 12:40 PM
9
points
4
comments
1
min read
LW
link
(www.jefftk.com)
Anthropic teams up with Palantir and AWS to sell AI to defense customers
Matrice Jacobine
Nov 9, 2024, 11:50 AM
9
points
0
comments
2
min read
LW
link
(techcrunch.com)
GPT-4o Can In Some Cases Solve Moderately Complicated Captchas
dirk
Nov 9, 2024, 4:04 AM
12
points
2
comments
1
min read
LW
link
Stone Age Herbalist’s notes on ant warfare and slavery
trevor
Nov 9, 2024, 2:40 AM
32
points
0
comments
3
min read
LW
link
(x.com)
LLMs Look Increasingly Like General Reasoners
eggsyntax
Nov 8, 2024, 11:47 PM
94
points
45
comments
3
min read
LW
link
overengineered air filter shelving
bhauth
Nov 8, 2024, 10:04 PM
26
points
2
comments
5
min read
LW
link
(bhauth.com)
Bigger Livers?
sarahconstantin
Nov 8, 2024, 9:50 PM
98
points
17
comments
6
min read
LW
link
(sarahconstantin.substack.com)
New UChicago Rationality Group
Noah Birnbaum
Nov 8, 2024, 9:20 PM
9
points
0
comments
1
min read
LW
link
Active Recall and Spaced Repetition are Different Things
Saul Munn
Nov 8, 2024, 8:14 PM
49
points
2
comments
3
min read
LW
link
(www.brasstacks.blog)
The King and the Golem—The Animation
Writer
Nov 8, 2024, 6:23 PM
70
points
0
comments
1
min read
LW
link
Boring & straightforward trauma explanation
lemonhope
Nov 8, 2024, 9:45 AM
24
points
7
comments
2
min read
LW
link
Curriculum of Ascension
andrew sauer
Nov 7, 2024, 11:54 PM
13
points
0
comments
18
min read
LW
link
Analyzing how SAE features evolve across a forward pass
bensenberner, danibalcells, Michael Oesterle, Ediz Ucar and StefanHex
Nov 7, 2024, 10:07 PM
47
points
0
comments
1
min read
LW
link
(arxiv.org)
Markets Are Information—Beating the Sportsbooks at Their Own Game
JJXW
Nov 7, 2024, 8:58 PM
9
points
1
comment
2
min read
LW
link
(thehobbyist.substack.com)
Signaling with Small Orange Diamonds
jefftk
Nov 7, 2024, 8:20 PM
40
points
1
comment
1
min read
LW
link
(www.jefftk.com)
Fundamental Uncertainty: Chapter 9 - How do we live with uncertainty?
Gordon Seidoh Worley
Nov 7, 2024, 6:15 PM
11
points
2
comments
15
min read
LW
link
AI #89: Trump Card
Zvi
Nov 7, 2024, 4:30 PM
42
points
12
comments
42
min read
LW
link
(thezvi.wordpress.com)
Quantum Immortality: A Perspective if AI Doomers are Probably Right
avturchin and James_Miller
Nov 7, 2024, 4:06 PM
12
points
55
comments
14
min read
LW
link
On Targeted Manipulation and Deception when Optimizing LLMs for User Feedback
Marcus Williams, micahcarroll, Adhyyan Narang, Constantin Weisser and Brendan Murphy
Nov 7, 2024, 3:39 PM
51
points
7
comments
11
min read
LW
link
In the Name of All That Needs Saving
pleiotroth
Nov 7, 2024, 3:26 PM
18
points
3
comments
22
min read
LW
link
Agency overhang as a proxy for Sharp left turn
Eris and Iuliia Levin
Nov 7, 2024, 12:14 PM
6
points
0
comments
5
min read
LW
link
The Case Against Moral Realism
Zero Contradictions
Nov 7, 2024, 10:14 AM
−5
points
10
comments
1
min read
LW
link
(thewaywardaxolotl.blogspot.com)
[Question]
What are the primary drivers that caused selection pressure for intelligence in humans?
Towards_Keeperhood
Nov 7, 2024, 9:40 AM
8
points
15
comments
1
min read
LW
link
The Logistics of Distribution of Meaning: Against Epistemic Bureaucratization
Sahil
Nov 7, 2024, 5:27 AM
27
points
7
comments
12
min read
LW
link
SAEs are highly dataset dependent: a case study on the refusal direction
Connor Kissane, robertzk, Neel Nanda and Arthur Conmy
Nov 7, 2024, 5:22 AM
66
points
4
comments
14
min read
LW
link
Should CA, TX, OK, and LA merge into a giant swing state, just for elections?
Thomas Kwa
Nov 6, 2024, 11:01 PM
115
points
35
comments
1
min read
LW
link
New Funding Category Open in Foresight’s AI Safety Grants
Allison Duettmann
Nov 6, 2024, 10:59 PM
15
points
0
comments
1
min read
LW
link
Scattered thoughts on what it means for an LLM to believe
TheManxLoiner
Nov 6, 2024, 10:10 PM
5
points
4
comments
5
min read
LW
link
The Bayesian Conspiracy Live Recording
Eneasz
Nov 6, 2024, 4:25 PM
9
points
0
comments
1
min read
LW
link
Anthropic: Three Sketches of ASL-4 Safety Case Components
Zach Stein-Perlman
Nov 6, 2024, 4:00 PM
95
points
33
comments
1
min read
LW
link
(alignment.anthropic.com)
Meme Talking Points
ymeskhout
Nov 6, 2024, 3:27 PM
34
points
0
comments
3
min read
LW
link
Advisors for Smaller Major Donors?
jefftk
Nov 6, 2024, 2:30 PM
18
points
2
comments
3
min read
LW
link
(www.jefftk.com)
Scissors Statements for President?
AnnaSalamon
Nov 6, 2024, 10:38 AM
118
points
32
comments
1
min read
LW
link
[Question]
How to cite LessWrong as an academic source?
PhilosophicalSoul
Nov 6, 2024, 8:28 AM
6
points
6
comments
1
min read
LW
link
How to put California and Texas on the campaign trail!
Yair Halberstadt
Nov 6, 2024, 6:08 AM
25
points
4
comments
1
min read
LW
link
LDT (and everything else) can be irrational
Christopher King
Nov 6, 2024, 4:05 AM
10
points
15
comments
2
min read
LW
link