Because the AI is programmed by people who hadn’t thought of this issue, and the other way turned out to be simpler/easier?
I know. The problem is that inconsistency is unstable (which is why we’re using other measures to maintain it, e.g. using a tool AI only). That’s one of the reasons I was interested in stable versions of this kind of unstable motivation: http://lesswrong.com/r/discussion/lw/lws/closest_stable_alternative_preferences/ .
OK, but if this is a narrow AI used for that particular activity rather than an AGI agent, then it seems intuitive to me that designing it to plan over one task at a time would be simpler.
The post you linked doesn’t deal with dynamic inconsistency. It refers to agents that are expected utility maximizers under von Neumann–Morgenstern utility theory, but that theory only deals with one-shot decision making, not decision making over time.
You can reduce the problem of decision making over time to one-shot decision making by combining the instantaneous utilities into a cumulative utility function and then using that as a one-shot utility function.
If you combine the instantaneous utilities by their (exponentially discounted) sum over an infinite time horizon, you obtain a dynamically consistent expected utility maximizer. But if you sum utilities only up to a fixed time horizon, you obtain an agent that at each instant is an expected utility maximizer, yet is not dynamically consistent.
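To make the contrast concrete, here is a minimal sketch (the names, numbers, and discount factor are mine, and I’m reading “fixed time horizon” as a rolling N-step lookahead): the exponentially discounted agent ranks a small-soon reward against a large-late reward the same way no matter how far away both options are, while the fixed-horizon agent’s ranking flips once the delayed reward slips outside its lookahead window.

```python
GAMMA = 0.9    # discount factor (assumed for illustration)
HORIZON = 3    # fixed lookahead window, in steps

def discounted(stream, delay=0):
    # Value of a reward stream beginning `delay` steps from now,
    # under exponentially discounted summation (infinite horizon;
    # truncated here because the streams are finite lists).
    return GAMMA ** delay * sum(GAMMA ** k * r for k, r in enumerate(stream))

def fixed_horizon(stream, delay=0):
    # Value under an undiscounted sum truncated HORIZON steps from now.
    padded = [0] * delay + list(stream)
    return sum(padded[:HORIZON])

small_soon = [5]         # 5 utility immediately
large_late = [0, 0, 12]  # 12 utility two steps later

# Exponential discounting: the ranking is invariant under a common delay,
# since delaying both streams just multiplies both values by GAMMA**delay.
for d in range(5):
    assert (discounted(small_soon, d) > discounted(large_late, d)) == \
           (discounted(small_soon, 0) > discounted(large_late, 0))

# Fixed horizon: delaying both options by one step flips the ranking,
# because the large reward falls outside the lookahead window.
print(fixed_horizon(small_soon, 0) > fixed_horizon(large_late, 0))  # False
print(fixed_horizon(small_soon, 1) > fixed_horizon(large_late, 1))  # True
```

The flip in the last two lines is exactly the dynamic inconsistency: as time passes, the fixed-horizon agent reverses a preference between the very same pair of plans, even though at each instant it is maximizing an expected utility.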
You may argue that dynamic inconsistency is not stable under evolution by random mutation and natural selection, but it is not obvious to me that AIs would face such a scenario. Even an AI that modifies itself or generates successors has no incentive to maximize its evolutionary fitness unless you specifically program it to do so.
Actually, you could use corrigibility to get dynamic inconsistency: https://intelligence.org/2014/10/18/new-report-corrigibility/ .