Page 2
AI demands unprecedented reliability
Jono | Jan 9, 2024, 4:30 PM | 22 points | 5 comments | 2 min read
Uncertainty in all its flavours
Cleo Nardo | Jan 9, 2024, 4:21 PM | 34 points | 6 comments | 35 min read
Compensating for Life Biases
Jonathan Moregård | Jan 9, 2024, 2:39 PM | 24 points | 6 comments | 3 min read | (honestliving.substack.com)
Can Morality Be Quantified?
Julius | Jan 9, 2024, 6:35 AM | 3 points | 0 comments | 5 min read
Learning Math in Time for Alignment
Nicholas / Heather Kross | Jan 9, 2024, 1:02 AM | 32 points | 5 comments | 3 min read
Brief Thoughts on Justifications for Paternalism
Srdjan Miletic | Jan 9, 2024, 12:36 AM | 4 points | 0 comments | 4 min read | (dissent.blog)
Hiring decisions are not suitable for prediction markets
SimonM | Jan 8, 2024, 9:11 PM | 12 points | 6 comments | 1 min read
Better Anomia
jefftk | Jan 8, 2024, 6:40 PM | 8 points | 0 comments | 1 min read | (www.jefftk.com)
A starter guide for evals
Marius Hobbhahn, Jérémy Scheurer, Mikita Balesni, rusheb and AlexMeinke | Jan 8, 2024, 6:24 PM | 54 points | 2 comments | 12 min read | (www.apolloresearch.ai)
Is it justifiable for non-experts to have strong opinions about Gaza?
Yair Halberstadt and Adam Zerner | Jan 8, 2024, 5:31 PM | 23 points | 12 comments | 30 min read
Project ideas: Backup plans & Cooperative AI
Lukas Finnveden | Jan 8, 2024, 5:19 PM | 18 points | 0 comments | (www.forethought.org)
Hackathon and Staying Up-to-Date in AI
jacobhaimes | Jan 8, 2024, 5:10 PM | 11 points | 0 comments | 1 min read | (into-ai-safety.github.io)
When “yang” goes wrong
Joe Carlsmith | Jan 8, 2024, 4:35 PM | 73 points | 6 comments | 13 min read
Task vectors & analogy making in LLMs
Sergii | Jan 8, 2024, 3:17 PM | 9 points | 1 comment | 4 min read | (grgv.xyz)
[Question] How to find translations of a book?
Viliam | Jan 8, 2024, 2:57 PM | 9 points | 8 comments | 1 min read
[Question] Why aren’t Yudkowsky & Bostrom getting more attention now?
JoshuaFox | Jan 8, 2024, 2:42 PM | 14 points | 8 comments | 1 min read
2023 Prediction Evaluations
Zvi | Jan 8, 2024, 2:40 PM | 47 points | 0 comments | 28 min read | (thezvi.wordpress.com)
There is no sharp boundary between deontology and consequentialism
quetzal_rainbow | Jan 8, 2024, 11:01 AM | 8 points | 2 comments | 1 min read
Reflections on my first year of AI safety research
Jay Bailey | Jan 8, 2024, 7:49 AM | 53 points | 3 comments
Why There Is Hope For An Alignment Solution
Darklight | Jan 8, 2024, 6:58 AM | 10 points | 0 comments | 12 min read
Sledding Among Hazards
jefftk | Jan 8, 2024, 3:30 AM | 19 points | 5 comments | 1 min read | (www.jefftk.com)
Utility is relative
CrimsonChin | Jan 8, 2024, 2:31 AM | 2 points | 4 comments | 2 min read
A model of research skill
L Rudolf L | Jan 8, 2024, 12:13 AM | 60 points | 6 comments | 12 min read | (www.strataoftheworld.com)
We shouldn’t fear superintelligence because it already exists
Spencer Chubb | Jan 7, 2024, 5:59 PM | −22 points | 14 comments | 1 min read
(Partial) failure in replicating deceptive alignment experiment
claudia.biancotti | Jan 7, 2024, 5:56 PM | 1 point | 0 comments | 1 min read
Project ideas: Sentience and rights of digital minds
Lukas Finnveden | Jan 7, 2024, 5:34 PM | 20 points | 0 comments | (www.forethought.org)
Deceptive AI ≠ Deceptively-aligned AI
Steven Byrnes | Jan 7, 2024, 4:55 PM | 96 points | 19 comments | 6 min read
Bayesians Commit the Gambler’s Fallacy
Kevin Dorst | Jan 7, 2024, 12:54 PM | 49 points | 30 comments | 8 min read | (kevindorst.substack.com)
Towards AI Safety Infrastructure: Talk & Outline
Paul Bricman | Jan 7, 2024, 9:31 AM | 11 points | 0 comments | 2 min read | (www.youtube.com)
Defending against hypothetical moon life during Apollo 11
eukaryote | Jan 7, 2024, 4:49 AM | 57 points | 9 comments | 32 min read | (eukaryotewritesblog.com)
The Sequences on YouTube
Neil | Jan 7, 2024, 1:44 AM | 26 points | 9 comments | 2 min read
AI Risk and the US Presidential Candidates
Zane | Jan 6, 2024, 8:18 PM | 41 points | 22 comments | 6 min read
A Challenge to Effective Altruism’s Premises
False Name | Jan 6, 2024, 6:46 PM | −26 points | 3 comments | 3 min read
Lack of Spider-Man is evidence against the simulation hypothesis
RamblinDash | Jan 6, 2024, 6:17 PM | 7 points | 23 comments | 1 min read
A Land Tax For Britain
A.H. | Jan 6, 2024, 3:52 PM | 6 points | 9 comments | 4 min read
Book review: Trick or treatment (2008)
Fleece Minutia | Jan 6, 2024, 3:40 PM | 1 point | 0 comments | 2 min read
Are we inside a black hole?
Jay | Jan 6, 2024, 1:30 PM | 2 points | 5 comments | 1 min read
Survey of 2,778 AI authors: six parts in pictures
KatjaGrace | Jan 6, 2024, 4:43 AM | 80 points | 1 comment | 2 min read
Project ideas: Epistemics
Lukas Finnveden | Jan 5, 2024, 11:41 PM | 43 points | 4 comments | (www.forethought.org)
Almost everyone I’ve met would be well-served thinking more about what to focus on
Henrik Karlsson | Jan 5, 2024, 9:01 PM | 96 points | 8 comments | 11 min read | (www.henrikkarlsson.xyz)
The Next ChatGPT Moment: AI Avatars
kolmplex and southpaw | Jan 5, 2024, 8:14 PM | 43 points | 10 comments | 1 min read
AI Impacts 2023 Expert Survey on Progress in AI
habryka | Jan 5, 2024, 7:42 PM | 28 points | 2 comments | 7 min read | (wiki.aiimpacts.org)
Technology path dependence and evaluating expertise
bhauth and Muireall | Jan 5, 2024, 7:21 PM | 25 points | 2 comments | 15 min read
The Hippie Rabbit Hole -Nuggets of Gold in Rivers of Bullshit
Jonathan Moregård | 5 Jan 2024 18:27 UTC | 39 points | 20 comments | 8 min read | (honestliving.substack.com)
[Question] What technical topics could help with boundaries/membranes?
Chipmonk | 5 Jan 2024 18:14 UTC | 15 points | 25 comments | 1 min read
Catching AIs red-handed
ryan_greenblatt and Buck | 5 Jan 2024 17:43 UTC | 111 points | 27 comments | 17 min read
AI Impacts Survey: December 2023 Edition
Zvi | 5 Jan 2024 14:40 UTC | 34 points | 6 comments | 10 min read | (thezvi.wordpress.com)
Forecast your 2024 with Fatebook
Sage Future | 5 Jan 2024 14:07 UTC | 19 points | 0 comments | 1 min read | (fatebook.io)
Predictive model agents are sort of corrigible
Raymond Douglas | 5 Jan 2024 14:05 UTC | 35 points | 6 comments | 3 min read
Striking Implications for Learning Theory, Interpretability — and Safety?
RogerDearnaley | 5 Jan 2024 8:46 UTC | 37 points | 4 comments | 2 min read