You don’t mention decision theory in your list of topics, but I guess it doesn’t hurt to try.
I have thought a bit about what one might call the “implementation problem of decision theory”. Let’s say you believe that some theory of rational decision making, e.g., evidential or updateless decision theory, is the right one for an AI to use. How would you design an AI to behave in accordance with such a normative theory? Conversely, if you just go ahead and build a system in some existing framework, how would that AI behave in Newcomb-like problems?
There are two pieces that I uploaded/finished on this topic in November and December. The first is a blog post noting that futarchy-type architectures would, by default, implement evidential decision theory. The second is a draft titled “Approval-directed agency and the decision theory of Newcomb-like problems”.
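To make the futarchy point a bit more concrete: a decision market scores each candidate action by the market’s conditional expectation of welfare given that the action is taken, which is the evidential rather than the causal criterion. Here is a minimal toy sketch in Python; the function names and the Newcomb-style numbers are made up for illustration and are not taken from the blog post.

```python
# Minimal sketch of a futarchy-style chooser, under the assumption that the
# "market" reports conditional expectations E[welfare | action is taken],
# estimated from whatever evidence the traders have. Names are illustrative.

from typing import Callable, Dict, List


def futarchy_choose(actions: List[str], market_estimate: Callable[[str], float]) -> str:
    """Pick the action whose conditional welfare estimate is highest.

    market_estimate(a) is an *evidential* conditional expectation: the price
    of a bet on realized welfare that is called off unless action a is
    actually taken. Maximizing it is the EDT criterion, not the causal one.
    """
    return max(actions, key=market_estimate)


# In a Newcomb-like problem, traders know that one-boxing is strong evidence
# that the opaque box is full, so their conditional estimates look like this:
estimates: Dict[str, float] = {"one-box": 990_000.0, "two-box": 11_000.0}
print(futarchy_choose(list(estimates), estimates.get))  # -> "one-box"
```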
For anyone who’s interested in this topic, here are some other related papers and blog posts:
Another one I wrote: “Doing what has worked well in the past leads to evidential decision theory”; a toy sketch of this point follows the list. (I updated this in December, but it was first written and uploaded in September, so it doesn’t count for the competition.)
Albert and Heiner (2001): “An Indirect-Evolution Approach to Newcomb’s Problem”
Meyer, Feldmaier and Shen (2016): “Reinforcement Learning in Conflicting Environments for Autonomous Vehicles”
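As promised above, here is a toy illustration of the “doing what has worked well in the past” point: a learner that estimates each action’s value from its average historical payoff in repeated Newcomb’s problems ends up one-boxing, because one-boxing has correlated with the full box. The setup and parameters below are made up for illustration and are not from the paper.

```python
import random

# Toy simulation, assuming a "do what has worked well in the past" learner:
# it estimates each action's value by its average observed payoff and then
# (mostly) picks the action with the higher average. All parameters are
# illustrative.

PREDICTOR_ACCURACY = 0.99  # how often the predictor's guess matches the action
EPSILON = 0.1              # exploration rate
ROUNDS = 100_000

totals = {"one-box": 0.0, "two-box": 0.0}
counts = {"one-box": 0, "two-box": 0}


def payoff(action, predicted):
    # Box B holds $1,000,000 iff the predictor predicted one-boxing;
    # box A always holds $1,000 and is taken only by two-boxers.
    box_b = 1_000_000 if predicted == "one-box" else 0
    return box_b if action == "one-box" else box_b + 1_000


for _ in range(ROUNDS):
    # Greedy choice based on average past payoff, with epsilon-exploration.
    if random.random() < EPSILON or 0 in counts.values():
        action = random.choice(["one-box", "two-box"])
    else:
        action = max(counts, key=lambda a: totals[a] / counts[a])
    # The predictor is a highly reliable correlate of the agent's choice.
    if random.random() < PREDICTOR_ACCURACY:
        predicted = action
    else:
        predicted = "two-box" if action == "one-box" else "one-box"
    totals[action] += payoff(action, predicted)
    counts[action] += 1

for a in ("one-box", "two-box"):
    print(a, totals[a] / max(counts[a], 1))
# One-boxing's average payoff comes out near $1,000,000 while two-boxing's
# is around $11,000, so the learner settles on one-boxing, which is the
# evidential (but not the causal) recommendation.
```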
So far, my research and the papers by others I linked have focused on classic Newcomb-like problems. One could also discuss how existing AI paradigms relate to other issues of naturalized agency, in particular self-locating beliefs and naturalized induction, though here it seems more as though existing frameworks just lead to really messy behavior.
Send comments to firstnameDOTlastnameATfoundational-researchDOTorg. (Of course, you can also comment here or send me a LW PM.)
Caspar, thanks for the amazing entry! Acknowledged.