> The way to fix the quoted argument is to have the utility function be random, grafted on to some otherwise-functioning AI.
>
> A random utility function is maximized by a random state of the universe. And most arrangements of the universe don’t contain humans. If the AI’s utility function doesn’t somehow get maximized by one of the very few states that contain humans, it’s very clearly unfriendly, because it wants to replace humans with something else.
> The way to fix the quoted argument is to have the utility function be random, grafted on to some otherwise-functioning AI.
Not demonstrably doable. This intuition comes from thinking too much about AIs with oracular powers of prediction, which straightforwardly maximize the utility function, rather than about realistic cases on limited hardware. Realistic AIs have limited foresight and must employ instrumental strategies and goals derived from the utility function (and those strategies can alter the utility function unless it is protected; the fact that utility modification scores badly under the current utility function is insufficient protection when the AI acts through strategies with limited foresight).
Furthermore, a utility function can be self-destructive.
> A random utility function is maximized by a random state of the universe.
False. Random code for a function typically crashes or never terminates, and among the codes that do neither, the simplest codes massively predominate. The claim is demonstrably false if you try to generate random utility functions by generating random C code that evaluates the utility of some test environment.
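The statistics claimed here can be illustrated with a toy experiment. The sketch below is only an illustration under assumptions of my own: it uses Python rather than C, and a made-up seven-operation stack mini-language rather than anything from the original discussion. It generates random short programs, runs each one on a test input, and counts how many crash versus how many return a value:

```python
import random

# Hypothetical toy stack language (an assumption for illustration,
# not anything from the original discussion).
OPS = ["push1", "dup", "swap", "add", "mul", "div", "pop"]

def run(program, x):
    """Interpret a program on input x; raises IndexError on stack underflow."""
    stack = [x]
    for op in program:
        if op == "push1":
            stack.append(1.0)
        elif op == "dup":
            stack.append(stack[-1])
        elif op == "swap":
            stack[-1], stack[-2] = stack[-2], stack[-1]
        elif op == "add":
            stack.append(stack.pop() + stack.pop())
        elif op == "mul":
            stack.append(stack.pop() * stack.pop())
        elif op == "div":
            stack.append(stack.pop() / stack.pop())
        elif op == "pop":
            stack.pop()
    if not stack:
        raise IndexError("program left an empty stack")
    return stack[-1]

random.seed(0)
crashed, ok_lens = 0, []
for _ in range(10_000):
    prog = [random.choice(OPS) for _ in range(random.randint(1, 20))]
    try:
        run(prog, 3.0)
        ok_lens.append(len(prog))   # survivor: record its length
    except (IndexError, ZeroDivisionError):
        crashed += 1

print(f"crashed: {crashed} / 10000")
print(f"mean length of surviving programs: {sum(ok_lens) / len(ok_lens):.1f}")
```

With these (arbitrary) operation frequencies, the large majority of random programs die on a stack underflow within a few operations, and the ones that do return a value are disproportionately the very short, trivial ones; that is the sense in which the simplest codes predominate among the non-crashing programs.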
The problem I have with those arguments is that (a) many of the claims are plainly false, and (b) when contradicted, you ‘fix’ things by bolting more and more conjuncts (‘you can graft random utility functions onto well-functioning AIs’) onto your giant scary conjunction instead of updating. That is a definite sign of rationalization. It can also always be done, no matter how much counterargument exists: you can always add something to the scary conjunction to keep it alive. But adding conditions to a conjunction should decrease its probability.
I’d rather be concerned with implementations of functions: Turing machine tapes, C code, x86 instructions, and the like.
In any case the point is rather moot, because the function is human-generated. Hopefully humans can do better than random, though I wouldn’t wager on it: FAI attempts are potentially worrisome because humans are sloppy programmers, and bugged FAIs would follow entirely different statistics. Still, I would expect bugged FAI attempts to be predominantly self-destructive. (I’m just not sure whether the non-self-destructive bugged FAI attempts are predominantly mankind-destroying or not.)
In the days when Sussman was a novice, Minsky once came to him as he sat hacking at the PDP-6.
“What are you doing?”, asked Minsky.
“I am training a randomly wired neural net to play Tic-Tac-Toe,” Sussman replied.
“Why is the net wired randomly?”, asked Minsky.
“I do not want it to have any preconceptions of how to play”, Sussman said.
Minsky then shut his eyes.
“Why do you close your eyes?”, Sussman asked his teacher.
“So that the room will be empty.”
At that moment, Sussman was enlightened.
-- AI Koans