Inverse Reinforcement Learning
Model Mis-specification and Inverse Reinforcement Learning
Owain_Evans
and
jsteinhardt
9 Nov 2018 15:33 UTC
31
points
3
comments
16
min read
LW
link
Thoughts on “Human-Compatible”
TurnTrout
10 Oct 2019 5:24 UTC
63
points
35
comments
5
min read
LW
link
Learning biases and rewards simultaneously
Rohin Shah
6 Jul 2019 1:45 UTC
41
points
3
comments
4
min read
LW
link
Our take on CHAI’s research agenda in under 1500 words
Alex Flint
17 Jun 2020 12:24 UTC
101
points
19
comments
5
min read
LW
link
Problems integrating decision theory and inverse reinforcement learning
agilecaveman
8 May 2018 5:11 UTC
7
points
2
comments
3
min read
LW
link
IRL 1/8: Inverse Reinforcement Learning and the problem of degeneracy
RAISE
4 Mar 2019 13:11 UTC
20
points
2
comments
1
min read
LW
link
(app.grasple.com)
Delegative Inverse Reinforcement Learning
Vanessa Kosoy
12 Jul 2017 12:18 UTC
15
points
0
comments
16
min read
LW
link
[Question]
Can coherent extrapolated volition be estimated with Inverse Reinforcement Learning?
Jade Bishop
15 Apr 2019 3:23 UTC
12
points
5
comments
3
min read
LW
link
Cooperative Inverse Reinforcement Learning vs. Irrational Human Preferences
orthonormal
18 Jun 2016 0:55 UTC
13
points
0
comments
3
min read
LW
link
Inverse reinforcement learning on self, pre-ontology-change
Stuart_Armstrong
18 Nov 2015 13:23 UTC
0
points
0
comments
1
min read
LW
link
Biased reward-learning in CIRL
Stuart_Armstrong
5 Jan 2018 18:12 UTC
8
points
3
comments
7
min read
LW
link
CIRL Wireheading
tom4everitt
8 Aug 2017 6:33 UTC
3
points
0
comments
2
min read
LW
link
(C)IRL is not solely a learning process
Stuart_Armstrong
15 Sep 2016 8:35 UTC
1
point
0
comments
3
min read
LW
link
Book Review: Human Compatible
Scott Alexander
31 Jan 2020 5:20 UTC
76
points
6
comments
16
min read
LW
link
(slatestarcodex.com)
Book review: Human Compatible
PeterMcCluskey
19 Jan 2020 3:32 UTC
37
points
2
comments
5
min read
LW
link
(www.bayesianinvestor.com)
AXRP Episode 2 - Learning Human Biases with Rohin Shah
DanielFilan
29 Dec 2020 20:43 UTC
13
points
0
comments
35
min read
LW
link
My take on Michael Littman on “The HCI of HAI”
Alex Flint
2 Apr 2021 19:51 UTC
56
points
4
comments
7
min read
LW
link
AXRP Episode 8 - Assistance Games with Dylan Hadfield-Menell
DanielFilan
8 Jun 2021 23:20 UTC
22
points
1
comment
71
min read
LW
link
[Question]
Is CIRL a promising agenda?
Chris_Leong
23 Jun 2022 17:12 UTC
24
points
12
comments
1
min read
LW
link
RAISE is launching their MVP
null
26 Feb 2019 11:45 UTC
67
points
1
comment
1
min read
LW
link
Human-AI Collaboration
Rohin Shah
22 Oct 2019 6:32 UTC
42
points
7
comments
2
min read
LW
link
(bair.berkeley.edu)
Agents That Learn From Human Behavior Can’t Learn Human Values That Humans Haven’t Learned Yet
steven0461
11 Jul 2018 2:59 UTC
27
points
11
comments
1
min read
LW
link
Humans can be assigned any values whatsoever...
Stuart_Armstrong
13 Oct 2017 11:29 UTC
14
points
6
comments
4
min read
LW
link
Hardcode the AGI to need our approval indefinitely?
MichaelStJules
11 Nov 2021 7:04 UTC
2
points
2
comments
1
min read
LW
link
Machines vs Memes Part 3: Imitation and Memes
ceru23
1 Jun 2022 13:36 UTC
5
points
0
comments
7
min read
LW
link