tom4everitt (Tom Everitt)
Karma: 432
Research Scientist at DeepMind
tomeveritt.se
Reward Hacking from a Causal Perspective
tom4everitt, Francis Rhys Ward, sbenthall, James Fox, mattmacdermott and RyanCarey
21 Jul 2023 18:27 UTC, 29 points, 5 comments, 7 min read
Incentives from a causal perspective
tom4everitt, James Fox, RyanCarey, mattmacdermott, sbenthall and Jonathan Richens
10 Jul 2023 17:16 UTC, 27 points, 0 comments, 6 min read
Agency from a causal perspective
tom4everitt, mattmacdermott, James Fox, Francis Rhys Ward and Jonathan Richens
30 Jun 2023 17:37 UTC, 38 points, 5 comments, 6 min read
Causality: A Brief Introduction
tom4everitt, Lewis Hammond, Jonathan Richens, Francis Rhys Ward, RyanCarey, sbenthall and James Fox
20 Jun 2023 15:01 UTC, 48 points, 18 comments, 6 min read
Introduction to Towards Causal Foundations of Safe AGI
tom4everitt, Lewis Hammond, Francis Rhys Ward, RyanCarey, James Fox, mattmacdermott and sbenthall
12 Jun 2023 17:55 UTC, 67 points, 6 comments, 4 min read
Progress on Causal Influence Diagrams
tom4everitt
30 Jun 2021 15:34 UTC, 73 points, 6 comments, 9 min read
Specification gaming: the flip side of AI ingenuity
Vika, Vlad Mikulik, Matthew Rahtz, tom4everitt, Zac Kenton and janleike
6 May 2020 23:51 UTC, 65 points, 9 comments, 6 min read
CIRL Wireheading
tom4everitt
8 Aug 2017 6:33 UTC, 3 points, 4 comments, 2 min read
Sequential Extensions of Causal and Evidential Decision Theory (arxiv.org)
tom4everitt
15 Oct 2015 23:45 UTC, 2 points, 0 comments, 1 min read