Model-free decisions

paulfchristiano 2 Dec 2014 17:39 UTC

LW: 6 AF: 4

Much concern about AI comes down to the scariness of goal-oriented behavior. A common response to such concerns is “why would we give an AI goals anyway?” I think there are good reasons to expect goal-oriented behavior, and I’ve been on that side of a lot of arguments. But I don’t think the issue is settled, and it might be possible to get better outcomes by directly specifying what actions are good. I flesh out one possible alternative here.

(As an experiment I wrote the post on Medium, so that it is easier to provide sentence-level feedback, especially feedback on writing or low-level comments. Big-picture discussion should probably stay here.)

What links here?

Structural Risk Minimization by abramdemski (7 Jun 2015 3:40 UTC; 3 points)


IAFF-User-34 3 Jul 2015 22:45 UTC
0 points
Despite this essay’s age, I was linked here from Structural Risk Minimization and felt I had to address some points you made. I think you may have dismissed a strawman of consequence-approval direction, and then later used a more robust version in your reasoning while avoiding those terms. See my comments on the essay.
danieldewey 13 Dec 2014 12:25 UTC
0 points
AF
It seems that if it is desired, the overseer could also set their behaviour and intentions so that the approval-directed agent acts as we would want an oracle or tool to act. This is a nice feature.
danieldewey 3 Dec 2014 14:15 UTC
0 points
AF
I think Nick Bostrom and Stuart Armstrong would also be interested in this, and might have good feedback for you.
danieldewey 3 Dec 2014 14:12 UTC
0 points
AF
High-level feedback: this is a really interesting proposal, and looks like a promising direction to me! Most of my inline comments on Medium are more critical, but that doesn’t reflect my overall assessment.