Hey Steve,
Thanks for linking to Abram’s excellent blog post.
We should have pointed this out in the paper, but there is a simple correspondence between Abram’s terminology and ours:
Easy wireheading problem = reward function tampering
Hard wireheading problem = feedback tampering
Our current-RF optimization corresponds to Abram’s observation-utility agent.
We also discuss the RF-input tampering problem (sometimes called the delusion box problem) and solutions to it, which I don’t think fits into Abram’s distinction.