> Obviously, there’s other stuff to do to establish a stable unipolar world order
I was asking about this part. I’m not convinced HSIFAUH allows you to do this in a safe way (e.g., without triggering a war that you can’t necessarily win).
> Given your lead time from having more computing power than the reckless team, one has to analyze how many doubling periods you have time for.
Another complication here is that the people trying to build ~AIXI can probably build an economically useful ~AIXI using less compute than you need for ~HSIFAUH (for jobs that don’t need to model humans), and start doing their own doublings.
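To put toy numbers on both points, here’s a minimal sketch in Python. Every figure in it (the compute thresholds, the growth rate, the doubling period) is a made-up illustrative assumption, and the function names are mine; none of it comes from the discussion itself:

```python
import math

# Toy model of the lead-time question above. Every constant is an
# illustrative assumption, not a figure from the discussion.

def effective_lead(nominal_lead_months: float,
                   hsifauh_compute: float,
                   aixi_compute: float,
                   monthly_compute_growth: float) -> float:
    """Lead time left after the complication above: an economically useful
    ~AIXI needs less compute than ~HSIFAUH, so the reckless team's
    threshold arrives sooner. Assumes the compute available to them grows
    by a constant factor each month."""
    # Months by which the cheaper ~AIXI threshold beats the ~HSIFAUH one.
    head_start = math.log(hsifauh_compute / aixi_compute,
                          monthly_compute_growth)
    return max(0.0, nominal_lead_months - head_start)

def doublings_available(lead_months: float,
                        doubling_period_months: float) -> int:
    """Whole doubling periods the cautious team completes within its lead."""
    return int(lead_months // doubling_period_months)

if __name__ == "__main__":
    lead = effective_lead(
        nominal_lead_months=24,      # assumed raw compute lead
        hsifauh_compute=100.0,       # relative compute to run ~HSIFAUH
        aixi_compute=10.0,           # relative compute for a useful ~AIXI
        monthly_compute_growth=1.2,  # 20%/month growth (assumed)
    )
    print(f"effective lead: {lead:.1f} months")          # -> 11.4 months
    print(f"doublings: {doublings_available(lead, 3)}")  # -> 3
```

On these made-up numbers, the cheaper ~AIXI threshold eats more than half of the nominal two-year lead, which is exactly the kind of effect the doubling-period analysis would have to quantify.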
> But in terms of extinction threat to real-world humans, this starts to look more like the problem of maintaining a power structure over a vast number of humans and less like typical AI alignment difficulties; historically, the former seems to be a solvable problem.
I don’t think we’ve seen a solution that’s very robust though. Plus, having to maintain such a power structure starts to become a human safety problem for the real humans (i.e., potentially causes their values to become corrupted).
> Another complication here is that the people trying to build ~AIXI can probably build an economically useful ~AIXI using less compute than you need for ~HSIFAUH (for jobs that don’t need to model humans), and start doing their own doublings.
Good point.
Regarding the other two points, my intuition was that a few dozen people could work out the details satisfactorily in a year. If you don’t share this intuition, I’ll adjust downward on that. But I don’t feel up to putting in those man-hours myself. It seems like there are lots of people without a technical background who are interested in helping avoid AI-based X-risk. Do you think this is a promising enough line of reasoning to be worth some people’s time?
> Regarding the other two points, my intuition was that a few dozen people could work out the details satisfactorily in a year. If you don’t share this intuition, I’ll adjust downward on that.
I’m pretty skeptical of this, but then I’m pretty skeptical of all current safety/alignment approaches and this doesn’t seem especially bad by comparison, so I think it might be worth including in a portfolio approach. But I’d like to better understand why you think it’s promising. Do you have more specific ideas of how ~HSIFAUH can be used to achieve a Singleton and to keep it safe, or just a general feeling that it should be possible?
My intuitions are mostly that if you can provide significant rewards and punishments basically for free to imitated humans (or more to the point, to memories thereof), and if you can control the flow of information throughout the whole apparatus, and you get total surveillance automatically, this sort of thing is a dictator’s dream. Especially because it usually costs money to make people happy, and in this case it hardly costs anything: just a bit of computation time. In a world with all the technology a dictator could want already in place, and where it’s also pretty cheap to make everyone happy, it strikes me as promising that the system itself could be kept under control.