AI Risk (Tag)
Last edit: 16 Jul 2020 10:29 UTC by Ben Pace
AI Risk is the analysis of the risks associated with building powerful AI systems.
What failure looks like. paulfchristiano, 17 Mar 2019 20:18 UTC. 240 points, 48 comments, 8 min read. LW link. 2 nominations, 2 reviews.
Superintelligence FAQ. Scott Alexander, 20 Sep 2016 19:00 UTC. 45 points, 7 comments, 27 min read. LW link.
Specification gaming examples in AI. Vika, 3 Apr 2018 12:30 UTC. 39 points, 9 comments, 1 min read. LW link.
Intuitions about goal-directed behavior. rohinmshah, 1 Dec 2018 4:25 UTC. 46 points, 15 comments, 6 min read. LW link.
Epistemological Framing for AI Alignment Research. adamShimi, 8 Mar 2021 22:05 UTC. 50 points, 6 comments, 9 min read. LW link.
What can the principal-agent literature tell us about AI risk? Alexis Carlier, 8 Feb 2020 21:28 UTC. 99 points, 31 comments, 16 min read. LW link.
Developmental Stages of GPTs. orthonormal, 26 Jul 2020 22:03 UTC. 127 points, 73 comments, 7 min read. LW link.
[Question] Will OpenAI’s work unintentionally increase existential risks related to AI? adamShimi, 11 Aug 2020 18:16 UTC. 48 points, 54 comments, 1 min read. LW link.
How good is humanity at coordination? Buck, 21 Jul 2020 20:01 UTC. 72 points, 43 comments, 3 min read. LW link.
Are minimal circuits deceptive? evhub, 7 Sep 2019 18:11 UTC. 51 points, 8 comments, 8 min read. LW link.
Soft takeoff can still lead to decisive strategic advantage. Daniel Kokotajlo, 23 Aug 2019 16:39 UTC. 113 points, 46 comments, 8 min read. LW link. 2 nominations, 4 reviews.
Should we postpone AGI until we reach safety? otto.barten, 18 Nov 2020 15:43 UTC. 23 points, 36 comments, 3 min read. LW link.
Critiquing “What failure looks like”. Grue_Slinky, 27 Dec 2019 23:59 UTC. 35 points, 6 comments, 3 min read. LW link.
The Main Sources of AI Risk? Daniel Kokotajlo and Wei_Dai, 21 Mar 2019 18:28 UTC. 75 points, 22 comments, 2 min read. LW link.
Clarifying some key hypotheses in AI alignment. Ben Cottier and rohinmshah, 15 Aug 2019 21:29 UTC. 75 points, 11 comments, 9 min read. LW link.
“Taking AI Risk Seriously” (thoughts by Critch). Raemon, 29 Jan 2018 9:27 UTC. 109 points, 68 comments, 13 min read. LW link.
Some conceptual highlights from “Disjunctive Scenarios of Catastrophic AI Risk”. Kaj_Sotala, 12 Feb 2018 12:30 UTC. 29 points, 4 comments, 6 min read. LW link (kajsotala.fi).
Non-Adversarial Goodhart and AI Risks. Davidmanheim, 27 Mar 2018 1:39 UTC. 22 points, 9 comments, 6 min read. LW link.
Six AI Risk/Strategy Ideas. Wei_Dai, 27 Aug 2019 0:40 UTC. 62 points, 18 comments, 4 min read. LW link. 2 nominations, 1 review.
[Question] Did AI pioneers not worry much about AI risks? lisperati, 9 Feb 2020 19:58 UTC. 42 points, 9 comments, 1 min read. LW link.
Some disjunctive reasons for urgency on AI risk. Wei_Dai, 15 Feb 2019 20:43 UTC. 36 points, 24 comments, 1 min read. LW link.
Drexler on AI Risk. PeterMcCluskey, 1 Feb 2019 5:11 UTC. 34 points, 10 comments, 9 min read. LW link (www.bayesianinvestor.com).
A shift in arguments for AI risk. Richard_Ngo, 28 May 2019 13:47 UTC. 32 points, 7 comments, 1 min read. LW link (fragile-credences.github.io).
Disentangling arguments for the importance of AI safety. Richard_Ngo, 21 Jan 2019 12:41 UTC. 123 points, 23 comments, 8 min read. LW link.
AI Safety “Success Stories”. Wei_Dai, 7 Sep 2019 2:54 UTC. 105 points, 27 comments, 4 min read. LW link. 2 nominations, 1 review.
Debate on Instrumental Convergence between LeCun, Russell, Bengio, Zador, and More. Ben Pace, 4 Oct 2019 4:08 UTC. 177 points, 54 comments, 15 min read. LW link. 2 nominations, 2 reviews.
[AN #80]: Why AI risk might be solved without additional intervention from longtermists. rohinmshah, 2 Jan 2020 18:20 UTC. 35 points, 93 comments, 10 min read. LW link (mailchi.mp).
The strategy-stealing assumption. paulfchristiano, 16 Sep 2019 15:23 UTC. 68 points, 46 comments, 12 min read. LW link. 2 nominations, 3 reviews.
Thinking soberly about the context and consequences of Friendly AI. Mitchell_Porter, 16 Oct 2012 4:33 UTC. 20 points, 39 comments, 1 min read. LW link.
Announcement: AI alignment prize winners and next round. cousin_it, 15 Jan 2018 14:33 UTC. 80 points, 68 comments, 2 min read. LW link.
What Failure Looks Like: Distilling the Discussion. Ben Pace, 29 Jul 2020 21:49 UTC. 71 points, 11 comments, 7 min read. LW link.
Uber Self-Driving Crash. jefftk, 7 Nov 2019 15:00 UTC. 110 points, 1 comment, 2 min read. LW link (www.jefftk.com).
Reply to Holden on ‘Tool AI’. Eliezer Yudkowsky, 12 Jun 2012 18:00 UTC. 146 points, 357 comments, 17 min read. LW link.
Stanford Encyclopedia of Philosophy on AI ethics and superintelligence. Kaj_Sotala, 2 May 2020 7:35 UTC. 41 points, 19 comments, 7 min read. LW link (plato.stanford.edu).
AGI Safety Literature Review (Everitt, Lea & Hutter 2018). Kaj_Sotala, 4 May 2018 8:56 UTC. 13 points, 1 comment, 1 min read. LW link (arxiv.org).
Response to Oren Etzioni’s “How to know if artificial intelligence is about to destroy civilization”. Daniel Kokotajlo, 27 Feb 2020 18:10 UTC. 27 points, 5 comments, 8 min read. LW link.
Why don’t singularitarians bet on the creation of AGI by buying stocks? John_Maxwell, 11 Mar 2020 16:27 UTC. 36 points, 19 comments, 4 min read. LW link.
The problem/solution matrix: Calculating the probability of AI safety “on the back of an envelope”. John_Maxwell, 20 Oct 2019 8:03 UTC. 22 points, 4 comments, 2 min read. LW link.
Three Stories for How AGI Comes Before FAI. John_Maxwell, 17 Sep 2019 23:26 UTC. 27 points, 8 comments, 6 min read. LW link.
Brainstorming additional AI risk reduction ideas. John_Maxwell, 14 Jun 2012 7:55 UTC. 19 points, 37 comments, 1 min read. LW link.
AI Alignment 2018-19 Review. rohinmshah, 28 Jan 2020 2:19 UTC. 115 points, 6 comments, 35 min read. LW link.
The Fusion Power Generator Scenario. johnswentworth, 8 Aug 2020 18:31 UTC. 104 points, 25 comments, 3 min read. LW link.
A guide to Iterated Amplification & Debate. Rafael Harth, 15 Nov 2020 17:14 UTC. 58 points, 8 comments, 15 min read. LW link.
Work on Security Instead of Friendliness? Wei_Dai, 21 Jul 2012 18:28 UTC. 49 points, 107 comments, 2 min read. LW link.
An unaligned benchmark. paulfchristiano, 17 Nov 2018 15:51 UTC. 27 points, 0 comments, 9 min read. LW link.
Artificial Intelligence: A Modern Approach (4th edition) on the Alignment Problem. Zack_M_Davis, 17 Sep 2020 2:23 UTC. 72 points, 12 comments, 5 min read. LW link (aima.cs.berkeley.edu).
Clarifying “What failure looks like” (part 1). Sam Clarke, 20 Sep 2020 20:40 UTC. 69 points, 13 comments, 17 min read. LW link.
Relaxed adversarial training for inner alignment. evhub, 10 Sep 2019 23:03 UTC. 54 points, 10 comments, 27 min read. LW link.
An overview of 11 proposals for building safe advanced AI. evhub, 29 May 2020 20:38 UTC. 147 points, 30 comments, 38 min read. LW link.
Risks from Learned Optimization: Introduction. evhub, Chris van Merwijk, vlad_m, Joar Skalse and Scott Garrabrant, 31 May 2019 23:44 UTC. 140 points, 40 comments, 12 min read. LW link. 3 nominations, 3 reviews.
Risks from Learned Optimization: Conclusion and Related Work. evhub, Chris van Merwijk, vlad_m, Joar Skalse and Scott Garrabrant, 7 Jun 2019 19:53 UTC. 70 points, 4 comments, 6 min read. LW link.
Deceptive Alignment. evhub, Chris van Merwijk, vlad_m, Joar Skalse and Scott Garrabrant, 5 Jun 2019 20:16 UTC. 69 points, 11 comments, 17 min read. LW link.
The Inner Alignment Problem. evhub, Chris van Merwijk, vlad_m, Joar Skalse and Scott Garrabrant, 4 Jun 2019 1:20 UTC. 76 points, 17 comments, 13 min read. LW link.
Conditions for Mesa-Optimization. evhub, Chris van Merwijk, vlad_m, Joar Skalse and Scott Garrabrant, 1 Jun 2019 20:52 UTC. 62 points, 47 comments, 12 min read. LW link.
AI risk hub in Singapore? Daniel Kokotajlo, 29 Oct 2020 11:45 UTC. 50 points, 18 comments, 4 min read. LW link.
Thoughts on Robin Hanson’s AI Impacts interview. Steven Byrnes, 24 Nov 2019 1:40 UTC. 25 points, 3 comments, 7 min read. LW link.
The AI Safety Game (UPDATED). Daniel Kokotajlo, 5 Dec 2020 10:27 UTC. 38 points, 5 comments, 3 min read. LW link.
[Question] Suggestions of posts on the AF to review. adamShimi, 16 Feb 2021 12:40 UTC. 50 points, 17 comments, 1 min read. LW link.
Google’s Ethical AI team and AI Safety. magfrump, 20 Feb 2021 9:42 UTC. 12 points, 15 comments, 7 min read. LW link.
Behavioral Sufficient Statistics for Goal-Directedness. adamShimi, 11 Mar 2021 15:01 UTC. 21 points, 12 comments, 9 min read. LW link.
Review of “Fun with +12 OOMs of Compute”. adamShimi, Joe_Collman and Gyrodiot, 28 Mar 2021 14:55 UTC. 52 points, 18 comments, 8 min read. LW link.
April drafts. AI Impacts, 1 Apr 2021 18:10 UTC. 49 points, 2 comments, 1 min read. LW link (aiimpacts.org).
Access to AI: a human right? dmtea, 25 Jul 2020 9:38 UTC. 5 points, 3 comments, 2 min read. LW link.
Agentic Language Model Memes. FactorialCode, 1 Aug 2020 18:03 UTC. 16 points, 1 comment, 2 min read. LW link.
Conversation with Paul Christiano. abergal, 11 Sep 2019 23:20 UTC. 44 points, 6 comments, 30 min read. LW link (aiimpacts.org).
Transcription of Eliezer’s January 2010 video Q&A. curiousepic, 14 Nov 2011 17:02 UTC. 109 points, 9 comments, 56 min read. LW link.
Responses to Catastrophic AGI Risk: A Survey. lukeprog, 8 Jul 2013 14:33 UTC. 17 points, 8 comments, 1 min read. LW link.
How can I reduce existential risk from AI? lukeprog, 13 Nov 2012 21:56 UTC. 60 points, 92 comments, 8 min read. LW link.
Thoughts on Ben Garfinkel’s “How sure are we about this AI stuff?” capybaralet, 6 Feb 2019 19:09 UTC. 25 points, 17 comments, 1 min read. LW link.
Reframing misaligned AGI’s: well-intentioned non-neurotypical assistants. zhukeepa, 1 Apr 2018 1:22 UTC. 46 points, 14 comments, 2 min read. LW link.
When is unaligned AI morally valuable? paulfchristiano, 25 May 2018 1:57 UTC. 58 points, 52 comments, 10 min read. LW link.
Introducing the AI Alignment Forum (FAQ). habryka, Ben Pace, Raemon and jimrandomh, 29 Oct 2018 21:07 UTC. 86 points, 8 comments, 6 min read. LW link.
Swimming Upstream: A Case Study in Instrumental Rationality. TurnTrout, 3 Jun 2018 3:16 UTC. 64 points, 7 comments, 8 min read. LW link.
Current AI Safety Roles for Software Engineers. ozziegooen, 9 Nov 2018 20:57 UTC. 69 points, 9 comments, 4 min read. LW link.
[Question] Why is so much discussion happening in private Google Docs? Wei_Dai, 12 Jan 2019 2:19 UTC. 93 points, 21 comments, 1 min read. LW link.
Problems in AI Alignment that philosophers could potentially contribute to. Wei_Dai, 17 Aug 2019 17:38 UTC. 70 points, 14 comments, 2 min read. LW link.
Two Neglected Problems in Human-AI Safety. Wei_Dai, 16 Dec 2018 22:13 UTC. 72 points, 23 comments, 2 min read. LW link.
Announcement: AI alignment prize round 4 winners. cousin_it, 20 Jan 2019 14:46 UTC. 74 points, 41 comments, 1 min read. LW link.
Soon: a weekly AI Safety prerequisites module on LessWrong. toonalfrink, 30 Apr 2018 13:23 UTC. 35 points, 10 comments, 1 min read. LW link.
And the AI would have got away with it too, if... Stuart_Armstrong, 22 May 2019 21:35 UTC. 75 points, 7 comments, 1 min read. LW link.
2017 AI Safety Literature Review and Charity Comparison. Larks, 24 Dec 2017 18:52 UTC. 41 points, 5 comments, 23 min read. LW link.
Should ethicists be inside or outside a profession? Eliezer Yudkowsky, 12 Dec 2018 1:40 UTC. 77 points, 6 comments, 9 min read. LW link.
A Gym Gridworld Environment for the Treacherous Turn. Michaël Trazzi, 28 Jul 2018 21:27 UTC. 66 points, 9 comments, 3 min read. LW link (github.com).
I Vouch For MIRI. Zvi, 17 Dec 2017 17:50 UTC. 34 points, 9 comments, 5 min read. LW link (thezvi.wordpress.com).
Beware of black boxes in AI alignment research. cousin_it, 18 Jan 2018 15:07 UTC. 39 points, 10 comments, 1 min read. LW link.
AI Alignment Prize: Round 2 due March 31, 2018. Zvi, 12 Mar 2018 12:10 UTC. 28 points, 2 comments, 3 min read. LW link (thezvi.wordpress.com).
Three AI Safety Related Ideas. Wei_Dai, 13 Dec 2018 21:32 UTC. 62 points, 38 comments, 2 min read. LW link.
A rant against robots. Lê Nguyên Hoang, 14 Jan 2020 22:03 UTC. 60 points, 7 comments, 5 min read. LW link.
Opportunities for individual donors in AI safety. alexflint, 31 Mar 2018 18:37 UTC. 28 points, 3 comments, 11 min read. LW link.
But exactly how complex and fragile? KatjaGrace, 3 Nov 2019 18:20 UTC. 65 points, 32 comments, 3 min read. LW link (meteuphoric.com). 2 nominations, 1 review.
Course recommendations for Friendliness researchers. Louie, 9 Jan 2013 14:33 UTC. 90 points, 112 comments, 10 min read. LW link.
AI Safety Research Camp—Project Proposal. David_Kristoffersson, 2 Feb 2018 4:25 UTC. 29 points, 11 comments, 8 min read. LW link.
AI Summer Fellows Program. colm, 21 Mar 2018 15:32 UTC. 21 points, 0 comments, 1 min read. LW link.
The genie knows, but doesn’t care. Rob Bensinger, 6 Sep 2013 6:42 UTC. 88 points, 519 comments, 8 min read. LW link.
Alignment Newsletter #13: 07/02/18. rohinmshah, 2 Jul 2018 16:10 UTC. 70 points, 12 comments, 8 min read. LW link (mailchi.mp).
An Increasingly Manipulative Newsfeed. Michaël Trazzi, 1 Jul 2019 15:26 UTC. 57 points, 14 comments, 5 min read. LW link.
The simple picture on AI safety. alexflint, 27 May 2018 19:43 UTC. 25 points, 10 comments, 2 min read. LW link.
Elon Musk donates $10M to the Future of Life Institute to keep AI beneficial. Paul Crowley, 15 Jan 2015 16:33 UTC. 78 points, 52 comments, 1 min read. LW link.
Strategic implications of AIs’ ability to coordinate at low cost, for example by merging. Wei_Dai, 25 Apr 2019 5:08 UTC. 58 points, 45 comments, 2 min read. LW link. 2 nominations, 1 review.
Modeling AGI Safety Frameworks with Causal Influence Diagrams. Ramana Kumar, 21 Jun 2019 12:50 UTC. 43 points, 6 comments, 1 min read. LW link (arxiv.org).
Henry Kissinger: AI Could Mean the End of Human History. ESRogs, 15 May 2018 20:11 UTC. 17 points, 12 comments, 1 min read. LW link (www.theatlantic.com).
Toy model of the AI control problem: animated version. Stuart_Armstrong, 10 Oct 2017 11:06 UTC. 25 points, 8 comments, 1 min read. LW link.
A Visualization of Nick Bostrom’s Superintelligence. [deleted], 23 Jul 2014 0:24 UTC. 59 points, 28 comments, 3 min read. LW link.
AI Alignment Research Overview (by Jacob Steinhardt). Ben Pace, 6 Nov 2019 19:24 UTC. 42 points, 0 comments, 7 min read. LW link (docs.google.com).
A general model of safety-oriented AI development. Wei_Dai, 11 Jun 2018 21:00 UTC. 65 points, 8 comments, 1 min read. LW link.
Counterfactual Oracles = online supervised learning with random selection of training episodes. Wei_Dai, 10 Sep 2019 8:29 UTC. 44 points, 26 comments, 3 min read. LW link.
Siren worlds and the perils of over-optimised search. Stuart_Armstrong, 7 Apr 2014 11:00 UTC. 68 points, 417 comments, 7 min read. LW link.
Top 9+2 myths about AI risk. Stuart_Armstrong, 29 Jun 2015 20:41 UTC. 64 points, 46 comments, 2 min read. LW link.
Rohin Shah on reasons for AI optimism. abergal, 31 Oct 2019 12:10 UTC. 40 points, 58 comments, 1 min read. LW link (aiimpacts.org).
Plausibly, almost every powerful algorithm would be manipulative. Stuart_Armstrong, 6 Feb 2020 11:50 UTC. 38 points, 25 comments, 3 min read. LW link.
The Magnitude of His Own Folly. Eliezer Yudkowsky, 30 Sep 2008 11:31 UTC. 51 points, 128 comments, 6 min read. LW link.
AI alignment landscape. paulfchristiano, 13 Oct 2019 2:10 UTC. 40 points, 3 comments, 1 min read. LW link (ai-alignment.com).
Launched: Friendship is Optimal. iceman, 15 Nov 2012 4:57 UTC. 63 points, 31 comments, 1 min read. LW link.
Friendship is Optimal: A My Little Pony fanfic about an optimization process. iceman, 8 Sep 2012 6:16 UTC. 95 points, 152 comments, 1 min read. LW link.
Do Earths with slower economic growth have a better chance at FAI? Eliezer Yudkowsky, 12 Jun 2013 19:54 UTC. 54 points, 176 comments, 4 min read. LW link.
Idea: Open Access AI Safety Journal. G Gordon Worley III, 23 Mar 2018 18:27 UTC. 28 points, 11 comments, 1 min read. LW link.
G.K. Chesterton On AI Risk. Scott Alexander, 1 Apr 2017 19:00 UTC. 11 points, 0 comments, 7 min read. LW link.
The Hidden Complexity of Wishes. Eliezer Yudkowsky, 24 Nov 2007 0:12 UTC. 105 points, 135 comments, 7 min read. LW link.
The Friendly AI Game. bentarm, 15 Mar 2011 16:45 UTC. 50 points, 178 comments, 1 min read. LW link.
Q&A with Jürgen Schmidhuber on risks from AI. XiXiDu, 15 Jun 2011 15:51 UTC. 54 points, 45 comments, 4 min read. LW link.
[Question] What should an Einstein-like figure in Machine Learning do? Razied, 5 Aug 2020 23:52 UTC. 3 points, 3 comments, 1 min read. LW link.
Takeaways from safety by default interviews. AI Impacts and abergal, 3 Apr 2020 17:20 UTC. 23 points, 2 comments, 13 min read. LW link (aiimpacts.org).
Field-Building and Deep Models. Ben Pace, 13 Jan 2018 21:16 UTC. 21 points, 12 comments, 4 min read. LW link.
Critique my Model: The EV of AGI to Selfish Individuals. ozziegooen, 8 Apr 2018 20:04 UTC. 19 points, 9 comments, 4 min read. LW link.
‘Dumb’ AI observes and manipulates controllers. Stuart_Armstrong, 13 Jan 2015 13:35 UTC. 52 points, 19 comments, 2 min read. LW link.
2019 AI Alignment Literature Review and Charity Comparison. Larks, 19 Dec 2019 3:00 UTC. 129 points, 18 comments, 62 min read. LW link.
Book review: Architects of Intelligence by Martin Ford (2018). ofer, 11 Aug 2020 17:30 UTC. 15 points, 0 comments, 2 min read. LW link.
Qualitative Strategies of Friendliness. Eliezer Yudkowsky, 30 Aug 2008 2:12 UTC. 13 points, 56 comments, 12 min read. LW link.
Dreams of Friendliness. Eliezer Yudkowsky, 31 Aug 2008 1:20 UTC. 24 points, 80 comments, 9 min read. LW link.
Conceptual issues in AI safety: the paradigmatic gap. vedevazz, 24 Jun 2018 15:09 UTC. 33 points, 0 comments, 1 min read. LW link (www.foldl.me).
On unfixably unsafe AGI architectures. Steven Byrnes, 19 Feb 2020 21:16 UTC. 29 points, 8 comments, 5 min read. LW link.
A toy model of the treacherous turn. Stuart_Armstrong, 8 Jan 2016 12:58 UTC. 33 points, 13 comments, 6 min read. LW link.
Allegory On AI Risk, Game Theory, and Mithril. James_Miller, 13 Feb 2017 20:41 UTC. 41 points, 57 comments, 3 min read. LW link.
1hr talk: Intro to AGI safety. Steven Byrnes, 18 Jun 2019 21:41 UTC. 33 points, 4 comments, 24 min read. LW link.
The Evil AI Overlord List. Stuart_Armstrong, 20 Nov 2012 17:02 UTC. 44 points, 80 comments, 1 min read. LW link.
What I would like the SIAI to publish. XiXiDu, 1 Nov 2010 14:07 UTC. 36 points, 225 comments, 3 min read. LW link.
Evaluating the feasibility of SI’s plan. JoshuaFox, 10 Jan 2013 8:17 UTC. 37 points, 188 comments, 4 min read. LW link.
Q&A with experts on risks from AI #1. XiXiDu, 8 Jan 2012 11:46 UTC. 45 points, 67 comments, 9 min read. LW link.
Algo trading is a central example of AI risk. Vanessa Kosoy, 28 Jul 2018 20:31 UTC. 24 points, 5 comments, 1 min read. LW link.
Will the world’s elites navigate the creation of AI just fine? lukeprog, 31 May 2013 18:49 UTC. 36 points, 266 comments, 2 min read. LW link.
Let’s talk about “Convergent Rationality”. capybaralet, 12 Jun 2019 21:53 UTC. 35 points, 33 comments, 6 min read. LW link.
Breaking Oracles: superrationality and acausal trade. Stuart_Armstrong, 25 Nov 2019 10:40 UTC. 25 points, 15 comments, 1 min read. LW link.
Q&A with Stan Franklin on risks from AI. XiXiDu, 11 Jun 2011 15:22 UTC. 36 points, 10 comments, 2 min read. LW link.
Muehlhauser-Goertzel Dialogue, Part 1. lukeprog, 16 Mar 2012 17:12 UTC. 42 points, 161 comments, 33 min read. LW link.
[LINK] NYT Article about Existential Risk from AI. [deleted], 28 Jan 2013 10:37 UTC. 38 points, 23 comments, 1 min read. LW link.
Reframing the Problem of AI Progress. Wei_Dai, 12 Apr 2012 19:31 UTC. 32 points, 47 comments, 1 min read. LW link.
New AI risks research institute at Oxford University. lukeprog, 16 Nov 2011 18:52 UTC. 36 points, 10 comments, 1 min read. LW link.
Thoughts on the Feasibility of Prosaic AGI Alignment? iamthouthouarti, 21 Aug 2020 23:25 UTC. 8 points, 10 comments, 1 min read. LW link.
Memes and Rational Decisions. inferential, 9 Jan 2015 6:42 UTC. 35 points, 17 comments, 10 min read. LW link.
Levels of AI Self-Improvement. avturchin, 29 Apr 2018 11:45 UTC. 9 points, 0 comments, 39 min read. LW link.
Optimising Society to Constrain Risk of War from an Artificial Superintelligence. JohnCDraper, 30 Apr 2020 10:47 UTC. 3 points, 0 comments, 51 min read. LW link.
Some Thoughts on Singularity Strategies. Wei_Dai, 13 Jul 2011 2:41 UTC. 37 points, 29 comments, 3 min read. LW link.
A trick for Safer GPT-N. Razied, 23 Aug 2020 0:39 UTC. 7 points, 1 comment, 2 min read. LW link.
against “AI risk”. Wei_Dai, 11 Apr 2012 22:46 UTC. 35 points, 91 comments, 1 min read. LW link.
“Smarter than us” is out! Stuart_Armstrong, 25 Feb 2014 15:50 UTC. 41 points, 57 comments, 1 min read. LW link.
Analysing: Dangerous messages from future UFAI via Oracles. Stuart_Armstrong, 22 Nov 2019 14:17 UTC. 22 points, 16 comments, 4 min read. LW link.
Q&A with Abram Demski on risks from AI. XiXiDu, 17 Jan 2012 9:43 UTC. 33 points, 71 comments, 9 min read. LW link.
Q&A with experts on risks from AI #2. XiXiDu, 9 Jan 2012 19:40 UTC. 22 points, 29 comments, 7 min read. LW link.
AI Safety Discussion Day. Linda Linsefors, 15 Sep 2020 14:40 UTC. 20 points, 0 comments, 1 min read. LW link.
A long reply to Ben Garfinkel on Scrutinizing Classic AI Risk Arguments. Søren Elverlin, 27 Sep 2020 17:51 UTC. 16 points, 6 comments, 1 min read. LW link.
Online AI Safety Discussion Day. Linda Linsefors, 8 Oct 2020 12:11 UTC. 5 points, 0 comments, 1 min read. LW link.
Military AI as a Convergent Goal of Self-Improving AI. avturchin, 13 Nov 2017 12:17 UTC. 5 points, 3 comments, 1 min read. LW link.
Neural program synthesis is a dangerous technology. syllogism, 12 Jan 2018 16:19 UTC. 9 points, 6 comments, 2 min read. LW link.
New, Brief Popular-Level Introduction to AI Risks and Superintelligence. LyleN, 23 Jan 2015 15:43 UTC. 33 points, 3 comments, 1 min read. LW link.
FAI Research Constraints and AGI Side Effects. JustinShovelain, 3 Jun 2015 19:25 UTC. 26 points, 59 comments, 7 min read. LW link.
European Master’s Programs in Machine Learning, Artificial Intelligence, and related fields. Master Programs ML/AI, 14 Nov 2020 15:51 UTC. 25 points, 8 comments, 1 min read. LW link.
The mind-killer. Paul Crowley, 2 May 2009 16:49 UTC. 29 points, 160 comments, 2 min read. LW link.
[Question] Should I do it? MrLight, 19 Nov 2020 1:08 UTC. −3 points, 16 comments, 2 min read. LW link.
Rationalising humans: another mugging, but not Pascal’s. Stuart_Armstrong, 14 Nov 2017 15:46 UTC. 7 points, 1 comment, 3 min read. LW link.
Machine learning could be fundamentally unexplainable. George, 16 Dec 2020 13:32 UTC. 25 points, 15 comments, 15 min read. LW link (cerebralab.com).
[Question] What do you make of AGI:unaligned::spaceships:not enough food? Ronny, 22 Feb 2020 14:14 UTC. 4 points, 3 comments, 1 min read. LW link.
Risk Map of AI Systems. VojtaKovarik and Jan_Kulveit, 15 Dec 2020 9:16 UTC. 24 points, 3 comments, 8 min read. LW link.
Edge of the Cliff. akaTrickster, 5 Jan 2021 17:21 UTC. 1 point, 0 comments, 5 min read. LW link.
[Question] Does it become easier, or harder, for the world to coordinate around not building AGI as time goes on? Eli Tyre, 29 Jul 2019 22:59 UTC. 86 points, 31 comments, 3 min read. LW link. 2 nominations, 2 reviews.
Grey Goo Requires AI. harsimony, 15 Jan 2021 4:45 UTC. 8 points, 11 comments, 4 min read. LW link (harsimony.wordpress.com).
AISU 2021. Linda Linsefors, 30 Jan 2021 17:40 UTC. 27 points, 2 comments, 1 min read. LW link.
Nonperson Predicates. Eliezer Yudkowsky, 27 Dec 2008 1:47 UTC. 42 points, 176 comments, 6 min read. LW link.
Engaging First Introductions to AI Risk. Rob Bensinger, 19 Aug 2013 6:26 UTC. 31 points, 21 comments, 3 min read. LW link.
Formal Solution to the Inner Alignment Problem. michaelcohen, 18 Feb 2021 14:51 UTC. 46 points, 122 comments, 2 min read. LW link.
[Question] What are the biggest current impacts of AI? Sam Clarke, 7 Mar 2021 21:44 UTC. 15 points, 4 comments, 1 min read. LW link.
[Question] Is a Self-Iterating AGI Vulnerable to Thompson-style Trojans? sxae, 25 Mar 2021 14:46 UTC. 15 points, 7 comments, 3 min read. LW link.
AI oracles on blockchain. Caravaggio, 6 Apr 2021 20:13 UTC. 3 points, 0 comments, 3 min read. LW link.
Another (outer) alignment failure story. paulfchristiano, 7 Apr 2021 20:12 UTC. 120 points, 19 comments, 12 min read. LW link.
What if AGI is near? Wulky Wilkinsen, 14 Apr 2021 0:05 UTC. 12 points, 5 comments, 1 min read. LW link.