Ruining an expected-log-money maximizer
philh
Aug 20, 2023, 9:20 PM
33
points
33
comments
1
min read
LW
link
1
review
(reasonableapproximation.net)
Steven Wolfram on AI Alignment
Bill Benzon
Aug 20, 2023, 7:49 PM
66
points
15
comments
4
min read
LW
link
[Question]
What value does personal prediction tracking have?
fx
Aug 20, 2023, 6:43 PM
7
points
3
comments
1
min read
LW
link
Jan Kulveit’s Corrigibility Thoughts Distilled
brook
Aug 20, 2023, 5:52 PM
22
points
1
comment
5
min read
LW
link
Memetic Judo #3: The Intelligence of Stochastic Parrots v.2
Max TK
Aug 20, 2023, 3:18 PM
8
points
33
comments
6
min read
LW
link
ACX/SSC Boulder meetup- September 23
Josh Sacks
Aug 20, 2023, 2:16 PM
1
point
4
comments
1
min read
LW
link
“Dirty concepts” in AI alignment discourses, and some guesses for how to deal with them
Nora_Ammann and peckzy
Aug 20, 2023, 9:13 AM
66
points
4
comments
3
min read
LW
link
Call for Papers on Global AI Governance from the UN
Chris_Leong
Aug 20, 2023, 8:56 AM
19
points
0
comments
LW
link
(www.linkedin.com)
How do I read things on the internet
Vlad Sitalo
Aug 20, 2023, 5:43 AM
16
points
2
comments
8
min read
LW
link
(vlad.roam.garden)
AI Forecasting: Two Years In
jsteinhardt
Aug 19, 2023, 11:40 PM
72
points
15
comments
11
min read
LW
link
(bounded-regret.ghost.io)
Four management/leadership book summaries
Nikola Jurkovic
Aug 19, 2023, 11:38 PM
25
points
2
comments
7
min read
LW
link
Interpreting a dimensionality reduction of a collection of matrices as two positive semidefinite block diagonal matrices
Joseph Van Name
Aug 19, 2023, 7:52 PM
16
points
2
comments
5
min read
LW
link
Will AI kill everyone? Here’s what the godfathers of AI have to say [RA video]
Writer
Aug 19, 2023, 5:29 PM
58
points
8
comments
LW
link
(youtu.be)
Ten variations on red-pill-blue-pill
Richard_Kennaway
Aug 19, 2023, 4:34 PM
23
points
34
comments
3
min read
LW
link
Are we running out of new music/movies/art from a metaphysical perspective? (updated)
stephen_s
Aug 19, 2023, 4:24 PM
4
points
23
comments
1
min read
LW
link
[Question]
Any ideas for a prediction market observable that quantifies “culture-warisation”?
Ppau
Aug 19, 2023, 3:11 PM
6
points
1
comment
1
min read
LW
link
[Question]
Clarifying how misalignment can arise from scaling LLMs
Util
Aug 19, 2023, 2:16 PM
3
points
1
comment
1
min read
LW
link
Chess as a case study in hidden capabilities in ChatGPT
AdamYedidia
Aug 19, 2023, 6:35 AM
47
points
32
comments
6
min read
LW
link
We can do better than DoWhatIMean (inextricably kind AI)
lemonhope
Aug 19, 2023, 5:41 AM
25
points
8
comments
2
min read
LW
link
Supervised Program for Alignment Research (SPAR) at UC Berkeley: Spring 2023 summary
mic, dx26, adamk and Carolyn Qian
Aug 19, 2023, 2:27 AM
23
points
2
comments
6
min read
LW
link
Could fabs own AI?
lemonhope
Aug 19, 2023, 12:16 AM
15
points
0
comments
3
min read
LW
link
Is Chinese total factor productivity lower today than it was in 1956?
Ege Erdil
Aug 18, 2023, 10:33 PM
43
points
0
comments
26
min read
LW
link
Rationality-ish Meetups Showcase: 2019-2021
jenn
Aug 18, 2023, 10:22 PM
10
points
0
comments
5
min read
LW
link
The U.S. is becoming less stable
lc
Aug 18, 2023, 9:13 PM
149
points
68
comments
2
min read
LW
link
Meetup Tip: Board Games
Screwtape
Aug 18, 2023, 6:11 PM
10
points
4
comments
7
min read
LW
link
[Question]
AI labs’ requests for input
Zach Stein-Perlman
Aug 18, 2023, 5:00 PM
29
points
0
comments
1
min read
LW
link
6 non-obvious mental health issues specific to AI safety
Igor Ivanov
Aug 18, 2023, 3:46 PM
147
points
24
comments
4
min read
LW
link
When discussing AI doom barriers propose specific plausible scenarios
anithite
Aug 18, 2023, 4:06 AM
5
points
0
comments
3
min read
LW
link
Risks from AI Overview: Summary
Dan H, Mantas Mazeika and TW123
Aug 18, 2023, 1:21 AM
25
points
1
comment
13
min read
LW
link
(www.safe.ai)
Managing risks of our own work
Beth Barnes
Aug 18, 2023, 12:41 AM
66
points
0
comments
2
min read
LW
link
ACI#5: From Human-AI Co-evolution to the Evolution of Value Systems
Akira Pyinya
Aug 18, 2023, 12:38 AM
0
points
0
comments
9
min read
LW
link
Memetic Judo #1: On Doomsday Prophets v.3
Max TK
Aug 18, 2023, 12:14 AM
25
points
17
comments
3
min read
LW
link
Looking for judges for critiques of Alignment Plans
Iknownothing
Aug 17, 2023, 10:35 PM
6
points
0
comments
1
min read
LW
link
How is ChatGPT’s behavior changing over time?
worse
Aug 17, 2023, 8:54 PM
3
points
0
comments
1
min read
LW
link
(arxiv.org)
Progress links digest, 2023-08-17: Cloud seeding, robotic sculptors, and rogue planets
jasoncrawford
Aug 17, 2023, 8:29 PM
15
points
1
comment
4
min read
LW
link
(rootsofprogress.org)
Model of psychosis, take 2
Steven Byrnes
Aug 17, 2023, 7:11 PM
34
points
13
comments
4
min read
LW
link
[Linkpost] Robustified ANNs Reveal Wormholes Between Human Category Percepts
Bogdan Ionut Cirstea
Aug 17, 2023, 7:10 PM
6
points
2
comments
1
min read
LW
link
Against Almost Every Theory of Impact of Interpretability
Charbel-Raphaël
Aug 17, 2023, 6:44 PM
329
points
91
comments
26
min read
LW
link
2
reviews
Goldilocks and the Three Optimisers
dkl9
Aug 17, 2023, 6:15 PM
−10
points
0
comments
5
min read
LW
link
(dkl9.net)
Announcing Foresight Institute’s AI Safety Grants Program
Allison Duettmann
Aug 17, 2023, 5:34 PM
35
points
2
comments
1
min read
LW
link
The Negentropy Cliff
mephistopheles
Aug 17, 2023, 5:08 PM
6
points
10
comments
1
min read
LW
link
“AI Wellbeing” and the Ongoing Debate on Phenomenal Consciousness
FlorianH
Aug 17, 2023, 3:47 PM
10
points
6
comments
7
min read
LW
link
AI #25: Inflection Point
Zvi
Aug 17, 2023, 2:40 PM
59
points
9
comments
36
min read
LW
link
(thezvi.wordpress.com)
[Question]
Why might General Intelligences have long term goals?
yrimon
Aug 17, 2023, 2:10 PM
3
points
17
comments
1
min read
LW
link
Understanding Counterbalanced Subtractions for Better Activation Additions
ojorgensen
Aug 17, 2023, 1:53 PM
21
points
0
comments
14
min read
LW
link
Reflections on “Making the Atomic Bomb”
boazbarak
Aug 17, 2023, 2:48 AM
51
points
7
comments
8
min read
LW
link
Autonomous replication and adaptation: an attempt at a concrete danger threshold
Hjalmar_Wijk
Aug 17, 2023, 1:31 AM
45
points
0
comments
13
min read
LW
link
[Question]
(Thought experiment) If you had to choose, which would you prefer?
kuira
Aug 17, 2023, 12:57 AM
9
points
2
comments
1
min read
LW
link
Some rules for life (v.0,0)
Neil
Aug 17, 2023, 12:43 AM
43
points
13
comments
12
min read
LW
link
(neilwarren.substack.com)
When AI critique works even with misaligned models
Fabien Roger
Aug 17, 2023, 12:12 AM
23
points
0
comments
2
min read
LW
link