Helping your Senator Prepare for the Upcoming Sam Altman Hearing
Tiago de Vassal
May 14, 2023, 10:45 PM
69
points
2
comments
1
min read
LW
link
(aisafetytour.com)
Difficulties in making powerful aligned AI
DanielFilan
May 14, 2023, 8:50 PM
41
points
1
comment
10
min read
LW
link
(danielfilan.com)
How much do markets value Open AI?
Xodarap
May 14, 2023, 7:28 PM
21
points
5
comments
LW
link
Misaligned AGI Death Match
Nate Reinar Windwood
May 14, 2023, 6:00 PM
1
point
0
comments
1
min read
LW
link
[Question]
What new technology, for what institutions?
bhauth
May 14, 2023, 5:33 PM
29
points
6
comments
3
min read
LW
link
A strong mind continues its trajectory of creativity
TsviBT
May 14, 2023, 5:24 PM
22
points
8
comments
6
min read
LW
link
Ontologies Should Be Backwards-Compatible
Thoth Hermes
May 14, 2023, 5:21 PM
3
points
3
comments
4
min read
LW
link
(thothhermes.substack.com)
Jaan Tallinn’s 2022 Philanthropy Overview
jaan
May 14, 2023, 3:35 PM
64
points
2
comments
1
min read
LW
link
(jaan.online)
Effective Altruism and Rationality Groups on Snipd
David Bravo
May 14, 2023, 2:54 PM
2
points
0
comments
2
min read
LW
link
Character alignment II
p.b.
May 14, 2023, 2:17 PM
5
points
0
comments
2
min read
LW
link
Coordination by common knowledge to prevent uncontrollable AI
Karl von Wendt
May 14, 2023, 1:37 PM
10
points
2
comments
9
min read
LW
link
Bayesian Networks Aren’t Necessarily Causal
Zack_M_Davis
May 14, 2023, 1:42 AM
102
points
38
comments
8
min read
LW
link
1
review
Simpler explanations of AGI risk
Seth Herd
May 14, 2023, 1:29 AM
8
points
9
comments
3
min read
LW
link
A Study of AI Science Models
Eleni Angelou
and
machinebiology
May 13, 2023, 11:25 PM
20
points
0
comments
24
min read
LW
link
LLM Guardrails Should Have Better Customer Service Tuning
Jiao Bu
May 13, 2023, 10:54 PM
2
points
0
comments
2
min read
LW
link
PCAST Working Group on Generative AI Invites Public Input
Christopher King
May 13, 2023, 10:49 PM
7
points
0
comments
1
min read
LW
link
(terrytao.wordpress.com)
«Boundaries» for formalizing an MVP morality
Chipmonk
May 13, 2023, 7:10 PM
19
points
7
comments
4
min read
LW
link
Steering GPT-2-XL by adding an activation vector
TurnTrout
,
Monte M
,
David Udell
,
lisathiergart
and
Ulisse Mini
May 13, 2023, 6:42 PM
437
points
98
comments
50
min read
LW
link
1
review
On the possibility of impossibility of AGI Long-Term Safety
Roman Yen
May 13, 2023, 6:38 PM
8
points
3
comments
9
min read
LW
link
Notes on Antelligence
Aurigena
May 13, 2023, 6:38 PM
2
points
0
comments
9
min read
LW
link
Reality and reality-boxes
Jim Pivarski
May 13, 2023, 2:14 PM
37
points
11
comments
21
min read
LW
link
An Analogy for Understanding Transformers
CallumMcDougall
May 13, 2023, 12:20 PM
89
points
6
comments
9
min read
LW
link
ACX Meetup Munich
Erich
May 13, 2023, 7:58 AM
2
points
1
comment
1
min read
LW
link
Machine-Readable Prevalence Estimates
jefftk
May 13, 2023, 12:40 AM
9
points
2
comments
2
min read
LW
link
(www.jefftk.com)
Value drift threat models
Garrett Baker
May 12, 2023, 11:03 PM
27
points
4
comments
5
min read
LW
link
Aggregating Utilities for Corrigible AI [Feedback Draft]
Dan H
and
Simon Goldstein
May 12, 2023, 8:57 PM
28
points
7
comments
22
min read
LW
link
Turning off lights with model editing
Sam Marks
May 12, 2023, 8:25 PM
68
points
5
comments
2
min read
LW
link
(arxiv.org)
Dark Forest Theories
Raemon
May 12, 2023, 8:21 PM
145
points
53
comments
2
min read
LW
link
2
reviews
DELBERTing as an Adversarial Strategy
Matthew_Opitz
May 12, 2023, 8:09 PM
8
points
3
comments
5
min read
LW
link
Microsoft/GitHub Copilot Chat’s confidential system Prompt: “You must refuse to discuss life, existence or sentience.”
Marvin von Hagen
May 12, 2023, 7:46 PM
13
points
2
comments
1
min read
LW
link
(twitter.com)
Retrospective: Lessons from the Failed Alignment Startup AISafety.com
Søren Elverlin
May 12, 2023, 6:07 PM
105
points
9
comments
3
min read
LW
link
The way AGI wins could look very stupid
Christopher King
May 12, 2023, 4:34 PM
54
points
22
comments
1
min read
LW
link
Towards Measures of Optimisation
mattmacdermott
and
Alexander Gietelink Oldenziel
May 12, 2023, 3:29 PM
53
points
37
comments
4
min read
LW
link
The Eden Project
rogersbacon
May 12, 2023, 2:58 PM
−1
points
1
comment
2
min read
LW
link
(www.secretorum.life)
Another formalization attempt: Central Argument That AGI Presents a Global Catastrophic Risk
avturchin
May 12, 2023, 1:22 PM
16
points
4
comments
2
min read
LW
link
Infinite-width MLPs as an “ensemble prior”
Vivek Hebbar
May 12, 2023, 11:45 AM
46
points
0
comments
5
min read
LW
link
Input Swap Graphs: Discovering the role of neural network components at scale
Alexandre Variengien
May 12, 2023, 9:41 AM
92
points
0
comments
33
min read
LW
link
Uploads are Impossible
PashaKamyshev
May 12, 2023, 8:03 AM
−5
points
37
comments
8
min read
LW
link
Formulating the AI Doom Argument for Analytic Philosophers
JonathanErhardt
May 12, 2023, 7:54 AM
13
points
0
comments
2
min read
LW
link
Three Iterative Processes
LoganStrohl
May 12, 2023, 2:50 AM
49
points
0
comments
3
min read
LW
link
Zuzalu LW Sequences Discussion
veronica
May 12, 2023, 12:14 AM
1
point
0
comments
1
min read
LW
link
[Question]
Term/Category for AI with Neutral Impact?
isomic
May 11, 2023, 10:00 PM
6
points
1
comment
1
min read
LW
link
Thoughts on LessWrong norms, the Art of Discourse, and moderator mandate
Ruby
May 11, 2023, 9:20 PM
37
points
20
comments
5
min read
LW
link
Alignment, Goals, and The Gut-Head Gap: A Review of Ngo. et al.
Violet Hour
11 May 2023 18:06 UTC
20
points
2
comments
13
min read
LW
link
Sequence opener: Jordan Harbinger’s 6 minute networking
Severin T. Seehrich
11 May 2023 17:06 UTC
4
points
0
comments
1
min read
LW
link
Advice for newly busy people
Severin T. Seehrich
11 May 2023 16:46 UTC
150
points
3
comments
5
min read
LW
link
AI #11: In Search of a Moat
Zvi
11 May 2023 15:40 UTC
67
points
28
comments
81
min read
LW
link
(thezvi.wordpress.com)
[Question]
Bayesian update from sensationalistic sources
houkime
11 May 2023 15:26 UTC
1
point
0
comments
1
min read
LW
link
I bet $500 on AI winning the IMO gold medal by 2026
azsantosk
11 May 2023 14:46 UTC
37
points
29
comments
1
min read
LW
link
Fatebook for Slack: Track your forecasts, right where your team works
Sage Future
and
Adam B
11 May 2023 14:11 UTC
24
points
3
comments
1
min read
LW
link