Archive
Grokking, memorization, and generalization — a discussion · Kaarel and Dmitry Vaintrob · Oct 29, 2023, 11:17 PM · 75 points · 11 comments · 23 min read
Comp Sci in 2027 (Short story by Eliezer Yudkowsky) · sudo · Oct 29, 2023, 11:09 PM · 202 points · 24 comments · 10 min read · 1 review · (nitter.net)
Mathematically-Defined Optimization Captures A Lot of Useful Information · J Bostock · Oct 29, 2023, 5:17 PM · 19 points · 0 comments · 2 min read
Clarifying the free energy principle (with quotes) · Ryo · Oct 29, 2023, 4:03 PM · 8 points · 0 comments · 9 min read
A new intro to Quantum Physics, with the math fixed · titotal · Oct 29, 2023, 3:11 PM · 113 points · 24 comments · 17 min read · (titotal.substack.com)
My idea of sacredness, divinity, and religion · Kaj_Sotala · Oct 29, 2023, 12:50 PM · 40 points · 10 comments · 4 min read · (kajsotala.fi)
The AI Boom Mainly Benefits Big Firms, but long-term, markets will concentrate · Hauke Hillebrandt · Oct 29, 2023, 8:38 AM · −1 points · 0 comments
What’s up with “Responsible Scaling Policies”? · habryka and ryan_greenblatt · Oct 29, 2023, 4:17 AM · 99 points · 9 comments · 20 min read · 1 review
Experiments as a Third Alternative · Adam Zerner · Oct 29, 2023, 12:39 AM · 48 points · 21 comments · 5 min read
Comparing representation vectors between llama 2 base and chat · Nina Panickssery · Oct 28, 2023, 10:54 PM · 36 points · 5 comments · 2 min read
Vaniver’s thoughts on Anthropic’s RSP · Vaniver · Oct 28, 2023, 9:06 PM · 46 points · 4 comments · 3 min read
Book Review: Orality and Literacy: The Technologizing of the Word · Fergus Fettes · Oct 28, 2023, 8:12 PM · 13 points · 0 comments · 16 min read
Regrant up to $600,000 to AI safety projects with GiveWiki · Dawn Drescher · Oct 28, 2023, 7:56 PM · 33 points · 1 comment
Shane Legg interview on alignment · Seth Herd · Oct 28, 2023, 7:28 PM · 66 points · 20 comments · 2 min read · (www.youtube.com)
AI Existential Safety Fellowships · mmfli · Oct 28, 2023, 6:07 PM · 5 points · 0 comments · 1 min read
AI Safety Hub Serbia Official Opening · DusanDNesic and Tanja T · Oct 28, 2023, 5:03 PM · 55 points · 0 comments · 3 min read · (forum.effectivealtruism.org)
Managing AI Risks in an Era of Rapid Progress · Algon · Oct 28, 2023, 3:48 PM · 36 points · 5 comments · 11 min read · (managing-ai-risks.com)
[Question] ELI5 Why isn’t alignment *easier* as models get stronger? · Logan Zoellner · Oct 28, 2023, 2:34 PM · 3 points · 9 comments · 1 min read
Truthseeking, EA, Simulacra levels, and other stuff · Elizabeth and Vaniver · Oct 27, 2023, 11:56 PM · 45 points · 12 comments · 9 min read
[Question] Do you believe “E=mc^2” is a correct and/or useful equation, and, whether yes or no, precisely what are your reasons for holding this belief (with such a degree of confidence)? · l8c · Oct 27, 2023, 10:46 PM · 10 points · 14 comments · 1 min read
Value systematization: how values become coherent (and misaligned) · Richard_Ngo · Oct 27, 2023, 7:06 PM · 103 points · 49 comments · 13 min read
Techno-humanism is techno-optimism for the 21st century · Richard_Ngo · Oct 27, 2023, 6:37 PM · 88 points · 5 comments · 14 min read · (www.mindthefuture.info)
Sanctuary for Humans · Nikola Jurkovic · Oct 27, 2023, 6:08 PM · 22 points · 9 comments · 1 min read
Wireheading and misalignment by composition on NetHack · pierlucadoro · Oct 27, 2023, 5:43 PM · 34 points · 4 comments · 4 min read
We’re Not Ready: thoughts on “pausing” and responsible scaling policies · HoldenKarnofsky · Oct 27, 2023, 3:19 PM · 200 points · 33 comments · 8 min read
Aspiration-based Q-Learning · Clément Dumas and Jobst Heitzig · Oct 27, 2023, 2:42 PM · 38 points · 5 comments · 11 min read
Linkpost: Rishi Sunak’s Speech on AI (26th October) · bideup · Oct 27, 2023, 11:57 AM · 85 points · 8 comments · 7 min read · (www.gov.uk)
ASPR & WARP: Rationality Camps for Teens in Taiwan and Oxford · Anna Gajdova · Oct 27, 2023, 8:40 AM · 18 points · 0 comments · 1 min read
[Question] To what extent is the UK Government’s recent AI Safety push entirely due to Rishi Sunak? · Stephen Fowler · Oct 27, 2023, 3:29 AM · 23 points · 4 comments · 1 min read
Bayesian Punishment · Rob Lucas · Oct 27, 2023, 3:24 AM · 1 point · 1 comment · 6 min read
Online Dialogues Party — Sunday 5th November · Ben Pace · Oct 27, 2023, 2:41 AM · 28 points · 1 comment · 1 min read
OpenAI’s new Preparedness team is hiring · leopold · Oct 26, 2023, 8:42 PM · 60 points · 2 comments · 1 min read
Fake Deeply · Zack_M_Davis · Oct 26, 2023, 7:55 PM · 33 points · 7 comments · 1 min read · (unremediatedgender.space)
Symbol/Referent Confusions in Language Model Alignment Experiments · johnswentworth · Oct 26, 2023, 7:49 PM · 116 points · 50 comments · 6 min read · 1 review
Unsupervised Methods for Concept Discovery in AlphaZero · aog · Oct 26, 2023, 7:05 PM · 9 points · 0 comments · 1 min read · (arxiv.org)
[Question] Nonlinear limitations of ReLUs · magfrump · Oct 26, 2023, 6:51 PM · 13 points · 1 comment · 1 min read
AI Alignment Problem: Requirement not optional (A Critical Analysis through Mass Effect Trilogy) · TAWSIF AHMED · Oct 26, 2023, 6:02 PM · −9 points · 0 comments · 4 min read
[Thought Experiment] Tomorrow’s Echo—The future of synthetic companionship. · Vimal Naran · Oct 26, 2023, 5:54 PM · −7 points · 2 comments · 2 min read
Disagreements over the prioritization of existential risk from AI · Olivier Coutu · Oct 26, 2023, 5:54 PM · 10 points · 0 comments · 6 min read
[Question] What if AGI had its own universe to maybe wreck? · mseale · Oct 26, 2023, 5:49 PM · −1 points · 2 comments · 1 min read
Changing Contra Dialects · jefftk · Oct 26, 2023, 5:30 PM · 25 points · 2 comments · 1 min read · (www.jefftk.com)
5 psychological reasons for dismissing x-risks from AGI · Igor Ivanov · Oct 26, 2023, 5:21 PM · 24 points · 6 comments · 4 min read
5. Risks from preventing legitimate value change (value collapse) · Nora_Ammann · Oct 26, 2023, 2:38 PM · 13 points · 1 comment · 9 min read
4. Risks from causing illegitimate value change (performative predictors) · Nora_Ammann · Oct 26, 2023, 2:38 PM · 8 points · 3 comments · 5 min read
3. Premise three & Conclusion: AI systems can affect value change trajectories & the Value Change Problem · Nora_Ammann · Oct 26, 2023, 2:38 PM · 28 points · 4 comments · 4 min read
2. Premise two: Some cases of value change are (il)legitimate · Nora_Ammann · Oct 26, 2023, 2:36 PM · 24 points · 7 comments · 6 min read
1. Premise one: Values are malleable · Nora_Ammann · Oct 26, 2023, 2:36 PM · 21 points · 1 comment · 15 min read
0. The Value Change Problem: introduction, overview and motivations · Nora_Ammann · Oct 26, 2023, 2:36 PM · 32 points · 0 comments · 5 min read
EPUBs of MIRI Blog Archives and selected LW Sequences · mesaoptimizer · Oct 26, 2023, 2:17 PM · 44 points · 5 comments · 1 min read · (git.sr.ht)
UK Government publishes “Frontier AI: capabilities and risks” Discussion Paper · A.H. · Oct 26, 2023, 1:55 PM UTC · 5 points · 0 comments · 2 min read · (www.gov.uk)