OpenAI’s new Preparedness team is hiring · leopold · Oct 26, 2023, 8:42 PM · 60 points · 2 comments · 1 min read · LW link
Fake Deeply · Zack_M_Davis · Oct 26, 2023, 7:55 PM · 33 points · 7 comments · 1 min read · LW link · (unremediatedgender.space)
Symbol/Referent Confusions in Language Model Alignment Experiments · johnswentworth · Oct 26, 2023, 7:49 PM · 116 points · 50 comments · 6 min read · LW link · 1 review
Unsupervised Methods for Concept Discovery in AlphaZero · aog · Oct 26, 2023, 7:05 PM · 9 points · 0 comments · 1 min read · LW link · (arxiv.org)
[Question] Nonlinear limitations of ReLUs · magfrump · Oct 26, 2023, 6:51 PM · 13 points · 1 comment · 1 min read · LW link
AI Alignment Problem: Requirement not optional (A Critical Analysis through Mass Effect Trilogy) · TAWSIF AHMED · Oct 26, 2023, 6:02 PM · −9 points · 0 comments · 4 min read · LW link
[Thought Experiment] Tomorrow’s Echo—The future of synthetic companionship. · Vimal Naran · Oct 26, 2023, 5:54 PM · −7 points · 2 comments · 2 min read · LW link
Disagreements over the prioritization of existential risk from AI · Olivier Coutu · Oct 26, 2023, 5:54 PM · 10 points · 0 comments · 6 min read · LW link
[Question] What if AGI had its own universe to maybe wreck? · mseale · Oct 26, 2023, 5:49 PM · −1 points · 2 comments · 1 min read · LW link
Changing Contra Dialects · jefftk · Oct 26, 2023, 5:30 PM · 25 points · 2 comments · 1 min read · LW link · (www.jefftk.com)
5 psychological reasons for dismissing x-risks from AGI · Igor Ivanov · Oct 26, 2023, 5:21 PM · 24 points · 6 comments · 4 min read · LW link
5. Risks from preventing legitimate value change (value collapse) · Nora_Ammann · Oct 26, 2023, 2:38 PM · 13 points · 1 comment · 9 min read · LW link
4. Risks from causing illegitimate value change (performative predictors) · Nora_Ammann · Oct 26, 2023, 2:38 PM · 8 points · 3 comments · 5 min read · LW link
3. Premise three & Conclusion: AI systems can affect value change trajectories & the Value Change Problem · Nora_Ammann · Oct 26, 2023, 2:38 PM · 28 points · 4 comments · 4 min read · LW link
2. Premise two: Some cases of value change are (il)legitimate · Nora_Ammann · Oct 26, 2023, 2:36 PM · 24 points · 7 comments · 6 min read · LW link
1. Premise one: Values are malleable · Nora_Ammann · Oct 26, 2023, 2:36 PM · 21 points · 1 comment · 15 min read · LW link
0. The Value Change Problem: introduction, overview and motivations · Nora_Ammann · Oct 26, 2023, 2:36 PM · 32 points · 0 comments · 5 min read · LW link
EPUBs of MIRI Blog Archives and selected LW Sequences · mesaoptimizer · Oct 26, 2023, 2:17 PM · 44 points · 5 comments · 1 min read · LW link · (git.sr.ht)
UK Government publishes “Frontier AI: capabilities and risks” Discussion Paper · A.H. · Oct 26, 2023, 1:55 PM · 5 points · 0 comments · 2 min read · LW link · (www.gov.uk)
AI #35: Responsible Scaling Policies · Zvi · Oct 26, 2023, 1:30 PM · 66 points · 10 comments · 55 min read · LW link · (thezvi.wordpress.com)
RA Bounty: Looking for feedback on screenplay about AI Risk · Writer · Oct 26, 2023, 1:23 PM · 32 points · 6 comments · 1 min read · LW link
Sensor Exposure can Compromise the Human Brain in the 2020s · trevor · Oct 26, 2023, 3:31 AM · 17 points · 6 comments · 10 min read · LW link
Notes on “How do we become confident in the safety of a machine learning system?” · RohanS · Oct 26, 2023, 3:13 AM · 4 points · 0 comments · 13 min read · LW link
Apply to the Constellation Visiting Researcher Program and Astra Fellowship, in Berkeley this Winter · Nate Thomas · Oct 26, 2023, 3:07 AM · 42 points · 10 comments · 1 min read · LW link
CHAI internship applications are open (due Nov 13) · Erik Jenner · Oct 26, 2023, 12:53 AM · 34 points · 0 comments · 3 min read · LW link
Architects of Our Own Demise: We Should Stop Developing AI Carelessly · Roko · Oct 26, 2023, 12:36 AM · 170 points · 75 comments · 3 min read · LW link
EA Infrastructure Fund: June 2023 grant recommendations · Linch · Oct 26, 2023, 12:35 AM · 21 points · 0 comments · LW link
Responsible Scaling Policies Are Risk Management Done Wrong · simeon_c · Oct 25, 2023, 11:46 PM · 123 points · 35 comments · 22 min read · LW link · 1 review · (www.navigatingrisks.ai)
AI as a science, and three obstacles to alignment strategies · So8res · Oct 25, 2023, 9:00 PM · 193 points · 80 comments · 11 min read · LW link
My hopes for alignment: Singular learning theory and whole brain emulation · Garrett Baker · Oct 25, 2023, 6:31 PM · 61 points · 5 comments · 12 min read · LW link
[Question] Lying to chess players for alignment · Zane · Oct 25, 2023, 5:47 PM · 97 points · 54 comments · 1 min read · LW link
Anthropic, Google, Microsoft & OpenAI announce Executive Director of the Frontier Model Forum & over $10 million for a new AI Safety Fund · Zach Stein-Perlman · Oct 25, 2023, 3:20 PM · 31 points · 8 comments · 4 min read · LW link · (www.frontiermodelforum.org)
“The Economics of Time Travel”—call for reviewers (Seeds of Science) · rogersbacon · Oct 25, 2023, 3:13 PM · 4 points · 2 comments · 1 min read · LW link
Compositional preference models for aligning LMs · Tomek Korbak · Oct 25, 2023, 12:17 PM · 18 points · 2 comments · 5 min read · LW link
[Question] Should the US House of Representatives adopt rank choice voting for leadership positions? · jmh · Oct 25, 2023, 11:16 AM · 16 points · 6 comments · 1 min read · LW link
Researchers believe they have found a way for artists to fight back against AI style capture · vernamcipher · Oct 25, 2023, 10:54 AM · 3 points · 1 comment · 1 min read · LW link · (finance.yahoo.com)
Why We Disagree · zulupineapple · Oct 25, 2023, 10:50 AM · 7 points · 2 comments · 2 min read · LW link
Beyond the Data: Why aid to poor doesn’t work · Lyrongolem · Oct 25, 2023, 5:03 AM · 2 points · 31 comments · 12 min read · LW link
Announcing Epoch’s newly expanded Parameters, Compute and Data Trends in Machine Learning database · Robi Rahman, Jaime Sevilla Molina, Tamay, Ege Erdil, Pablo Villalobos, Ben Cottier and Matthew Barnett · Oct 25, 2023, 2:55 AM · 18 points · 0 comments · 1 min read · LW link · (epochai.org)
What is a Sequencing Read? · jefftk · Oct 25, 2023, 2:10 AM · 17 points · 2 comments · 2 min read · LW link · (www.jefftk.com)
Verifiable private execution of machine learning models with Risc0? · mako yass · Oct 25, 2023, 12:44 AM · 30 points · 2 comments · 2 min read · LW link
[Question] How to Resolve Forecasts With No Central Authority? · Nathan Young · Oct 25, 2023, 12:28 AM · 17 points · 6 comments · 1 min read · LW link
Thoughts on responsible scaling policies and regulation · paulfchristiano · Oct 24, 2023, 10:21 PM · 221 points · 33 comments · 6 min read · LW link
The Screenplay Method · Yeshua God · Oct 24, 2023, 5:41 PM · −15 points · 0 comments · 25 min read · LW link
Blunt Razor · fryolysis · Oct 24, 2023, 5:27 PM · 3 points · 0 comments · 2 min read · LW link
Halloween Problem · Saint Blasphemer · Oct 24, 2023, 4:46 PM · −10 points · 1 comment · 1 min read · LW link
Who is Harry Potter? Some predictions. · Donald Hobson · Oct 24, 2023, 4:14 PM · 23 points · 7 comments · 2 min read · LW link
Book Review: Going Infinite · Zvi · Oct 24, 2023, 3:00 PM · 244 points · 113 comments · 97 min read · LW link · 1 review · (thezvi.wordpress.com)
[Interview w/ Quintin Pope] Evolution, values, and AI Safety · fowlertm · Oct 24, 2023, 1:53 PM · 11 points · 0 comments · 1 min read · LW link
Lying is Cowardice, not Strategy · Connor Leahy and Gabriel Alfour · Oct 24, 2023, 1:24 PM UTC · 29 points · 73 comments · 5 min read · LW link · (cognition.cafe)