Near-mode thinking on AI
Olli Järviniemi
Aug 4, 2024, 8:47 PM
128
points
9
comments
5
min read
LW
link
Watermarks: Signing, Branding, and Boobytrapping
Shankar Sivarajan
Aug 4, 2024, 8:41 PM
4
points
0
comments
1
min read
LW
link
Modelling Social Exchange: A Systematised Method to Judge Friendship Quality
Wynn Walker
Aug 4, 2024, 6:49 PM
6
points
0
comments
5
min read
LW
link
We’re not as 3-Dimensional as We Think
silentbob
Aug 4, 2024, 2:39 PM
46
points
17
comments
5
min read
LW
link
You don’t know how bad most things are nor precisely how they’re bad.
Solenoid_Entity
Aug 4, 2024, 2:12 PM
329
points
49
comments
5
min read
LW
link
Can We Predict Persuasiveness Better Than Anthropic?
Lennart Finke
Aug 4, 2024, 2:05 PM
22
points
5
comments
4
min read
LW
link
[Question]
What should we do about COVID in 2024?
ChristianKl
Aug 4, 2024, 10:57 AM
20
points
2
comments
1
min read
LW
link
Tokenized SAEs: Infusing per-token biases.
tdooms
and
danwil
Aug 4, 2024, 9:17 AM
20
points
20
comments
15
min read
LW
link
Thoughts On Democracy
Zero Contradictions
Aug 4, 2024, 6:02 AM
2
points
0
comments
1
min read
LW
link
(zerocontradictions.net)
AI Alignment through Comparative Advantage
artemiocobb
Aug 4, 2024, 12:32 AM
−2
points
4
comments
3
min read
LW
link
Labelling, Variables, and In-Context Learning in Llama2
Joshua Penman
Aug 3, 2024, 7:36 PM
6
points
0
comments
1
min read
LW
link
(colab.research.google.com)
[Question]
Dan Hendrycks and EA
jeffreycaruso
Aug 3, 2024, 1:33 PM
−4
points
4
comments
1
min read
LW
link
[Question]
Why do Minimal Bayes Nets often correspond to Causal Models of Reality?
Dalcy
Aug 3, 2024, 12:39 PM
27
points
1
comment
1
min read
LW
link
Why did ChatGPT say that? Prompt engineering and more, with PIZZA.
Jessica Rumbelow
Aug 3, 2024, 12:07 PM
41
points
2
comments
4
min read
LW
link
Cooperation and Alignment in Delegation Games: You Need Both!
Oliver Sourbut
,
Lewis Hammond
and
HarrietW
Aug 3, 2024, 10:16 AM
8
points
0
comments
14
min read
LW
link
(www.oliversourbut.net)
SRE’s review of Democracy
Martin Sustrik
Aug 3, 2024, 7:20 AM
48
points
2
comments
3
min read
LW
link
(250bpm.substack.com)
The Case Against Libertarianism
Zero Contradictions
Aug 3, 2024, 5:05 AM
−4
points
1
comment
1
min read
LW
link
(zerocontradictions.net)
We Don’t Just Let People Die—So What Next?
James Stephen Brown
Aug 3, 2024, 1:04 AM
11
points
8
comments
10
min read
LW
link
The EA case for Trump
Judd Rosenblatt
Aug 3, 2024, 1:00 AM
14
points
1
comment
1
min read
LW
link
(www.secondbest.ca)
I didn’t think I’d take the time to build this calibration training game, but with websim it took roughly 30 seconds, so here it is!
mako yass
Aug 2, 2024, 10:35 PM
24
points
2
comments
5
min read
LW
link
Evaluating Sparse Autoencoders with Board Game Models
Adam Karvonen
,
Sam Marks
,
Can
,
Benjamin Wright
,
Jannik Brinkmann
,
Logan Riggs
and
Rico Angell
Aug 2, 2024, 7:50 PM
38
points
1
comment
9
min read
LW
link
The Bitter Lesson for AI Safety Research
adamk
,
Richard Ren
,
Dan H
and
Gabe M
Aug 2, 2024, 6:39 PM
57
points
5
comments
3
min read
LW
link
Ethical Deception: Should AI Ever Lie?
Jason Reid
Aug 2, 2024, 5:53 PM
5
points
2
comments
7
min read
LW
link
[Question]
Request for AI risk quotes, especially around speed, large impacts and black boxes
Nathan Young
Aug 2, 2024, 5:49 PM
6
points
0
comments
1
min read
LW
link
A Simple Toy Coherence Theorem
johnswentworth
and
David Lorell
Aug 2, 2024, 5:47 PM
74
points
22
comments
7
min read
LW
link
All the Following are Distinct
Gianluca Calcagni
Aug 2, 2024, 4:35 PM
16
points
3
comments
9
min read
LW
link
The ‘strong’ feature hypothesis could be wrong
lewis smith
Aug 2, 2024, 2:33 PM
231
points
19
comments
17
min read
LW
link
An information-theoretic study of lying in LLMs
Annah
and
Guillaume Corlouer
Aug 2, 2024, 10:06 AM
17
points
0
comments
4
min read
LW
link
How I Wrought a Lesser Scribing Artifact (You Can, Too!)
Lorxus
Aug 2, 2024, 3:35 AM
12
points
0
comments
5
min read
LW
link
The Rise and Stagnation of Modernity
Zero Contradictions
Aug 2, 2024, 3:31 AM
1
point
0
comments
1
min read
LW
link
(thewaywardaxolotl.blogspot.com)
Lessons from the FDA for AI
Remmelt
Aug 2, 2024, 12:52 AM
1
point
4
comments
LW
link
(ainowinstitute.org)
AI Rights for Human Safety
Simon Goldstein
Aug 1, 2024, 11:01 PM
53
points
6
comments
1
min read
LW
link
(papers.ssrn.com)
Case Study: Interpreting, Manipulating, and Controlling CLIP With Sparse Autoencoders
Gytis Daujotas
Aug 1, 2024, 9:08 PM
45
points
7
comments
7
min read
LW
link
Optimizing Repeated Correlations
SatvikBeri
Aug 1, 2024, 5:33 PM
26
points
1
comment
1
min read
LW
link
The need for multi-agent experiments
Martín Soto
Aug 1, 2024, 5:14 PM
43
points
3
comments
9
min read
LW
link
Dragon Agnosticism
jefftk
Aug 1, 2024, 5:00 PM
95
points
75
comments
2
min read
LW
link
(www.jefftk.com)
Morristown ACX Meetup
mbrooks
Aug 1, 2024, 4:29 PM
2
points
1
comment
1
min read
LW
link
Some comments on intelligence
Viliam
Aug 1, 2024, 3:17 PM
30
points
5
comments
3
min read
LW
link
[Question]
[Thought Experiment] Given a button to terminate all humanity, would you press it?
lorepieri
Aug 1, 2024, 3:10 PM
−2
points
9
comments
1
min read
LW
link
Are unpaid UN internships a good idea?
Cipolla
Aug 1, 2024, 3:06 PM
1
point
7
comments
4
min read
LW
link
AI #75: Math is Easier
Zvi
Aug 1, 2024, 1:40 PM
46
points
25
comments
72
min read
LW
link
(thezvi.wordpress.com)
Temporary Cognitive Hyperparameter Alteration
Jonathan Moregård
Aug 1, 2024, 10:27 AM
9
points
0
comments
3
min read
LW
link
(honestliving.substack.com)
Technology and Progress
Zero Contradictions
Aug 1, 2024, 4:49 AM
1
point
0
comments
1
min read
LW
link
(thewaywardaxolotl.blogspot.com)
Do Prediction Markets Work?
Benjamin_Sturisky
Aug 1, 2024, 2:31 AM
7
points
0
comments
4
min read
LW
link
2/3 Aussie & NZ AI Safety folk often or sometimes feel lonely or disconnected (and 16 other barriers to impact)
yanni kyriacos
Aug 1, 2024, 1:15 AM
13
points
0
comments
8
min read
LW
link
[Question]
Can UBI overcome inflation and rent seeking?
Gordon Seidoh Worley
Aug 1, 2024, 12:13 AM
5
points
34
comments
1
min read
LW
link
Recommendation: reports on the search for missing hiker Bill Ewasko
eukaryote
Jul 31, 2024, 10:15 PM
169
points
28
comments
14
min read
LW
link
(eukaryotewritesblog.com)
Economics101 predicted the failure of special card payments for refugees, 3 months later whole of Germany wants to adopt it
Yanling Guo
Jul 31, 2024, 9:09 PM
3
points
3
comments
2
min read
LW
link
Ambiguity in Prediction Market Resolution is Still Harmful
aphyer
Jul 31, 2024, 8:32 PM
43
points
17
comments
3
min read
LW
link
AI labs can boost external safety research
Zach Stein-Perlman
31 Jul 2024 19:30 UTC
31
points
1
comment
1
min read
LW
link