Francis Rhys Ward
Karma: 310
On Agent Incentives to Manipulate Human Feedback in Multi-Agent Reward Learning Scenarios
Francis Rhys Ward, 3 Apr 2022 18:20 UTC, 27 points, 10 comments, 8 min read
For every choice of AGI difficulty, conditioning on gradual take-off implies shorter timelines.
Francis Rhys Ward, 21 Apr 2022 7:44 UTC, 31 points, 13 comments, 3 min read