Page 2

Running the Numbers on a Heat Pump (jefftk, Feb 9, 2024, 3:00 AM) · 30 points · 12 comments · 4 min read · LW link (www.jefftk.com)
[Question] How do high-trust societies form? (Shankar Sivarajan, Feb 9, 2024, 1:11 AM) · 23 points · 17 comments · 1 min read · LW link
[Question] How do health systems work in adequate worlds? (mukashi, Feb 9, 2024, 12:54 AM) · 10 points · 2 comments · 1 min read · LW link
Twin Cities ACX Meetup—February 2024 (Timothy M., Feb 8, 2024, 11:26 PM) · 1 point · 2 comments · 1 min read · LW link
A review of “Don’t forget the boundary problem...” (jessicata, Feb 8, 2024, 11:19 PM) · 12 points · 1 comment · 12 min read · LW link (unstablerontology.substack.com)
aintelope project update (Gunnar_Zarncke, Feb 8, 2024, 6:32 PM) · 24 points · 2 comments · 3 min read · LW link
Updatelessness doesn’t solve most problems (Martín Soto, Feb 8, 2024, 5:30 PM) · 135 points · 45 comments · 12 min read · LW link
Predicting Alignment Award Winners Using ChatGPT 4 (Shoshannah Tekofsky, Feb 8, 2024, 2:38 PM) · 16 points · 2 comments · 11 min read · LW link
AI #50: The Most Dangerous Thing (Zvi, Feb 8, 2024, 2:30 PM) · 53 points · 4 comments · 24 min read · LW link (thezvi.wordpress.com)
How to develop a photographic memory 3/3 (PhilosophicalSoul, Feb 8, 2024, 9:22 AM) · 6 points · 2 comments · 18 min read · LW link
Believing In (AnnaSalamon, Feb 8, 2024, 7:06 AM) · 241 points · 51 comments · 13 min read · LW link
Measuring pre-peer-review epistemic status (Jakub Smékal, Feb 8, 2024, 5:09 AM) · 1 point · 0 comments · 2 min read · LW link
A Chess-GPT Linear Emergent World Representation (Adam Karvonen, Feb 8, 2024, 4:25 AM) · 105 points · 14 comments · 7 min read · LW link (adamkarvonen.github.io)
Domestic Production vs International Wealth Creation (100YearPants, Feb 8, 2024, 4:25 AM) · 1 point · 0 comments · 1 min read · LW link
Conditional prediction markets are evidential, not causal (philh, Feb 7, 2024, 9:52 PM) · 55 points · 10 comments · 2 min read · LW link
A Back-Of-The-Envelope Calculation On How Unlikely The Circumstantial Evidence Around Covid-19 Is (Roko, Feb 7, 2024, 9:49 PM) · −1 points · 36 comments · 5 min read · LW link
Nitric oxide for covid and other viral infections (Elizabeth, Feb 7, 2024, 9:30 PM) · 39 points · 6 comments · 6 min read · LW link (acesounderglass.com)
Debating with More Persuasive LLMs Leads to More Truthful Answers (Akbir Khan, John Hughes, Dan Valentine, Sam Bowman and Ethan Perez, Feb 7, 2024, 9:28 PM) · 89 points · 14 comments · 9 min read · LW link (arxiv.org)
[Question] Choosing a book on causality (martinkunev, Feb 7, 2024, 9:16 PM) · 4 points · 3 comments · 1 min read · LW link
More Hyphenation (Arjun Panickssery, Feb 7, 2024, 7:43 PM) · 88 points · 19 comments · 1 min read · LW link (arjunpanickssery.substack.com)
Reading writing advice doesn’t make writing easier (Henry Sleight, Feb 7, 2024, 7:14 PM) · 17 points · 0 comments · 5 min read · LW link (open.substack.com)
[Question] What’s this 3rd secret directive of evolution called? (survive & spread & ___) (lemonhope, Feb 7, 2024, 2:11 PM) · 10 points · 11 comments · 1 min read · LW link
Training of superintelligence is secretly adversarial (quetzal_rainbow, Feb 7, 2024, 1:38 PM) · 15 points · 2 comments · 5 min read · LW link
The Math of Suspicious Coincidences (Roko, Feb 7, 2024, 1:32 PM) · 25 points · 3 comments · 4 min read · LW link
[Question] How to deal with the sense of demotivation that comes from thinking about determinism? (SpectrumDT, Feb 7, 2024, 10:53 AM) · 13 points · 71 comments · 1 min read · LW link
Quantum Darwinism, social constructs, and the scientific method (pchvykov, Feb 7, 2024, 7:04 AM) · 6 points · 12 comments · 9 min read · LW link
Why I think it’s net harmful to do technical safety research at AGI labs (Remmelt, Feb 7, 2024, 4:17 AM) · 26 points · 24 comments · 1 min read · LW link
story-based decision-making (bhauth, Feb 7, 2024, 2:35 AM) · 90 points · 11 comments · 4 min read · LW link
Full Driving Engagement Optional (jefftk, Feb 7, 2024, 2:30 AM) · 14 points · 0 comments · 1 min read · LW link (www.jefftk.com)
How to train your own “Sleeper Agents” (evhub, Feb 7, 2024, 12:31 AM) · 92 points · 11 comments · 2 min read · LW link
My guess at Conjecture’s vision: triggering a narrative bifurcation (Alexandre Variengien, Feb 6, 2024, 7:10 PM) · 75 points · 12 comments · 16 min read · LW link
Arrogance and People Pleasing (Jonathan Moregård, Feb 6, 2024, 6:43 PM) · 26 points · 7 comments · 4 min read · LW link (honestliving.substack.com)
What does davidad want from «boundaries»? (Chipmonk and davidad, Feb 6, 2024, 5:45 PM) · 47 points · 1 comment · 5 min read · LW link
[Question] How can I efficiently read all the Dath Ilan worldbuilding? (mike_hawke, Feb 6, 2024, 4:52 PM) · 10 points · 1 comment · 1 min read · LW link
Preventing model exfiltration with upload limits (ryan_greenblatt, Feb 6, 2024, 4:29 PM) · 71 points · 22 comments · 14 min read · LW link
Evolution is an observation, not a process (Neil, Feb 6, 2024, 2:49 PM) · 8 points · 11 comments · 5 min read · LW link
[Question] Why do we need an understanding of the real world to predict the next tokens in a body of text? (Valentin Baltadzhiev, Feb 6, 2024, 2:43 PM) · 2 points · 12 comments · 1 min read · LW link
On the Debate Between Jezos and Leahy (Zvi, Feb 6, 2024, 2:40 PM) · 64 points · 6 comments · 63 min read · LW link (thezvi.wordpress.com)
Why Two Valid Answers Approach is not Enough for Sleeping Beauty (Ape in the coat, Feb 6, 2024, 2:21 PM) · 6 points · 12 comments · 6 min read · LW link
Are most personality disorders really trust disorders? (chaosmage, Feb 6, 2024, 12:37 PM) · 20 points · 4 comments · 1 min read · LW link
From Conceptual Spaces to Quantum Concepts: Formalising and Learning Structured Conceptual Models (Roman Leventov, Feb 6, 2024, 10:18 AM) · 8 points · 1 comment · 4 min read · LW link (arxiv.org)
Fluent dreaming for language models (AI interpretability method) (tbenthompson, mikes and Zygi Straznickas, Feb 6, 2024, 6:02 AM) · 46 points · 5 comments · 1 min read · LW link (arxiv.org)
Selfish AI Inevitable (Davey Morse, Feb 6, 2024, 4:29 AM UTC) · 1 point · 0 comments · 1 min read · LW link
Toy models of AI control for concentrated catastrophe prevention (Fabien Roger and Buck, Feb 6, 2024, 1:38 AM UTC) · 51 points · 2 comments · 7 min read · LW link
Things You’re Allowed to Do: University Edition (Saul Munn, Feb 6, 2024, 12:36 AM UTC) · 97 points · 13 comments · 5 min read · LW link (www.brasstacks.blog)
Value learning in the absence of ground truth (Joel_Saarinen, Feb 5, 2024, 6:56 PM UTC) · 47 points · 8 comments · 45 min read · LW link
Implementing activation steering (Annah, Feb 5, 2024, 5:51 PM UTC) · 75 points · 8 comments · 7 min read · LW link
AI alignment as a translation problem (Roman Leventov, Feb 5, 2024, 2:14 PM UTC) · 22 points · 2 comments · 3 min read · LW link
Safe Stasis Fallacy (Davidmanheim, Feb 5, 2024, 10:54 AM UTC) · 54 points · 2 comments · LW link
[Question] How has internalising a post-AGI world affected your current choices? (yanni kyriacos, Feb 5, 2024, 5:43 AM UTC) · 10 points · 8 comments · 1 min read · LW link