Labelling, Variables, and In-Context Learning in Llama2 · Joshua Penman · Aug 3, 2024, 7:36 PM · 6 points · 0 comments · 1 min read · LW link · (colab.research.google.com)
[Question] Dan Hendrycks and EA · jeffreycaruso · Aug 3, 2024, 1:33 PM · −4 points · 4 comments · 1 min read · LW link
[Question] Why do Minimal Bayes Nets often correspond to Causal Models of Reality? · Dalcy · Aug 3, 2024, 12:39 PM · 27 points · 1 comment · 1 min read · LW link
Why did ChatGPT say that? Prompt engineering and more, with PIZZA. · Jessica Rumbelow · Aug 3, 2024, 12:07 PM · 41 points · 2 comments · 4 min read · LW link
Cooperation and Alignment in Delegation Games: You Need Both! · Oliver Sourbut, Lewis Hammond and HarrietW · Aug 3, 2024, 10:16 AM · 8 points · 0 comments · 14 min read · LW link · (www.oliversourbut.net)
SRE’s review of Democracy · Martin Sustrik · Aug 3, 2024, 7:20 AM · 48 points · 2 comments · 3 min read · LW link · (250bpm.substack.com)
The Case Against Libertarianism · Zero Contradictions · Aug 3, 2024, 5:05 AM · −4 points · 1 comment · 1 min read · LW link · (zerocontradictions.net)
We Don’t Just Let People Die—So What Next? · James Stephen Brown · Aug 3, 2024, 1:04 AM · 11 points · 8 comments · 10 min read · LW link
The EA case for Trump · Judd Rosenblatt · Aug 3, 2024, 1:00 AM · 14 points · 1 comment · 1 min read · LW link · (www.secondbest.ca)
I didn’t think I’d take the time to build this calibration training game, but with websim it took roughly 30 seconds, so here it is! · mako yass · Aug 2, 2024, 10:35 PM · 24 points · 2 comments · 5 min read · LW link
Evaluating Sparse Autoencoders with Board Game Models · Adam Karvonen, Sam Marks, Can, Benjamin Wright, Jannik Brinkmann, Logan Riggs and Rico Angell · Aug 2, 2024, 7:50 PM · 38 points · 1 comment · 9 min read · LW link
The Bitter Lesson for AI Safety Research · adamk, Richard Ren, Dan H and Gabe M · Aug 2, 2024, 6:39 PM · 57 points · 5 comments · 3 min read · LW link
Ethical Deception: Should AI Ever Lie? · Jason Reid · Aug 2, 2024, 5:53 PM · 5 points · 2 comments · 7 min read · LW link
[Question] Request for AI risk quotes, especially around speed, large impacts and black boxes · Nathan Young · Aug 2, 2024, 5:49 PM · 6 points · 0 comments · 1 min read · LW link
A Simple Toy Coherence Theorem · johnswentworth and David Lorell · Aug 2, 2024, 5:47 PM · 74 points · 22 comments · 7 min read · LW link
All the Following are Distinct · Gianluca Calcagni · Aug 2, 2024, 4:35 PM · 16 points · 3 comments · 9 min read · LW link
The ‘strong’ feature hypothesis could be wrong · lewis smith · Aug 2, 2024, 2:33 PM · 231 points · 19 comments · 17 min read · LW link
An information-theoretic study of lying in LLMs · Annah and Guillaume Corlouer · Aug 2, 2024, 10:06 AM · 17 points · 0 comments · 4 min read · LW link
How I Wrought a Lesser Scribing Artifact (You Can, Too!) · Lorxus · Aug 2, 2024, 3:35 AM · 12 points · 0 comments · 5 min read · LW link
The Rise and Stagnation of Modernity · Zero Contradictions · Aug 2, 2024, 3:31 AM · 1 point · 0 comments · 1 min read · LW link · (thewaywardaxolotl.blogspot.com)
Lessons from the FDA for AI · Remmelt · Aug 2, 2024, 12:52 AM · 1 point · 4 comments · LW link · (ainowinstitute.org)
AI Rights for Human Safety · Simon Goldstein · Aug 1, 2024, 11:01 PM · 53 points · 6 comments · 1 min read · LW link · (papers.ssrn.com)
Case Study: Interpreting, Manipulating, and Controlling CLIP With Sparse Autoencoders · Gytis Daujotas · Aug 1, 2024, 9:08 PM · 45 points · 7 comments · 7 min read · LW link
Optimizing Repeated Correlations · SatvikBeri · Aug 1, 2024, 5:33 PM · 26 points · 1 comment · 1 min read · LW link
The need for multi-agent experiments · Martín Soto · Aug 1, 2024, 5:14 PM · 43 points · 3 comments · 9 min read · LW link
Dragon Agnosticism · jefftk · Aug 1, 2024, 5:00 PM · 95 points · 75 comments · 2 min read · LW link · (www.jefftk.com)
Morristown ACX Meetup · mbrooks · Aug 1, 2024, 4:29 PM · 2 points · 1 comment · 1 min read · LW link
Some comments on intelligence · Viliam · Aug 1, 2024, 3:17 PM · 30 points · 5 comments · 3 min read · LW link
[Question] [Thought Experiment] Given a button to terminate all humanity, would you press it? · lorepieri · Aug 1, 2024, 3:10 PM · −2 points · 9 comments · 1 min read · LW link
Are unpaid UN internships a good idea? · Cipolla · Aug 1, 2024, 3:06 PM · 1 point · 7 comments · 4 min read · LW link
AI #75: Math is Easier · Zvi · Aug 1, 2024, 1:40 PM · 46 points · 25 comments · 72 min read · LW link · (thezvi.wordpress.com)
Temporary Cognitive Hyperparameter Alteration · Jonathan Moregård · Aug 1, 2024, 10:27 AM · 9 points · 0 comments · 3 min read · LW link · (honestliving.substack.com)
Technology and Progress · Zero Contradictions · Aug 1, 2024, 4:49 AM · 1 point · 0 comments · 1 min read · LW link · (thewaywardaxolotl.blogspot.com)
Do Prediction Markets Work? · Benjamin_Sturisky · Aug 1, 2024, 2:31 AM · 7 points · 0 comments · 4 min read · LW link
2/3 Aussie & NZ AI Safety folk often or sometimes feel lonely or disconnected (and 16 other barriers to impact) · yanni kyriacos · Aug 1, 2024, 1:15 AM · 13 points · 0 comments · 8 min read · LW link
[Question] Can UBI overcome inflation and rent seeking? · Gordon Seidoh Worley · Aug 1, 2024, 12:13 AM · 5 points · 34 comments · 1 min read · LW link
Recommendation: reports on the search for missing hiker Bill Ewasko · eukaryote · Jul 31, 2024, 10:15 PM · 169 points · 28 comments · 14 min read · LW link · (eukaryotewritesblog.com)
Economics101 predicted the failure of special card payments for refugees, 3 months later whole of Germany wants to adopt it · Yanling Guo · Jul 31, 2024, 9:09 PM · 3 points · 3 comments · 2 min read · LW link
Ambiguity in Prediction Market Resolution is Still Harmful · aphyer · Jul 31, 2024, 8:32 PM · 43 points · 17 comments · 3 min read · LW link
AI labs can boost external safety research · Zach Stein-Perlman · Jul 31, 2024, 7:30 PM · 31 points · 1 comment · 1 min read · LW link
Women in AI Safety London Meetup · njg · Jul 31, 2024, 6:13 PM · 1 point · 0 comments · 1 min read · LW link
Constructing Neural Network Parameters with Downstream Trainability · ch271828n · Jul 31, 2024, 6:13 PM · 1 point · 0 comments · 1 min read · LW link · (github.com)
Want to work on US emerging tech policy? Consider the Horizon Fellowship. · Elika · Jul 31, 2024, 6:12 PM · 4 points · 0 comments · 1 min read · LW link
[Question] What are your cruxes for imprecise probabilities / decision rules? · Anthony DiGiovanni · Jul 31, 2024, 3:42 PM · 36 points · 33 comments · 1 min read · LW link
The new UK government’s stance on AI safety · Elliot Mckernon · Jul 31, 2024, 3:23 PM · 17 points · 0 comments · 4 min read · LW link
Cat Sustenance Fortification · jefftk · Jul 31, 2024, 2:30 UTC · 14 points · 7 comments · 1 min read · LW link · (www.jefftk.com)
Twitter thread on open-source AI · Richard_Ngo · Jul 31, 2024, 0:26 UTC · 33 points · 6 comments · 2 min read · LW link · (x.com)
Twitter thread on AI takeover scenarios · Richard_Ngo · Jul 31, 2024, 0:24 UTC · 37 points · 0 comments · 2 min read · LW link · (x.com)
Twitter thread on AI safety evals · Richard_Ngo · Jul 31, 2024, 0:18 UTC · 63 points · 3 comments · 2 min read · LW link · (x.com)
Twitter thread on politics of AI safety · Richard_Ngo · Jul 31, 2024, 0:00 UTC · 35 points · 2 comments · 1 min read · LW link · (x.com)