http://kajsotala.fi/2016/04/simplifying-the-environment-a-new-convergent-instrumental-goal/

Convergent instrumental goals (also called basic AI drives) are goals that are useful for pursuing almost any other goal, and are thus likely to be pursued by any agent that is intelligent enough to understand why they’re useful. They are interesting because they may allow us to roughly predict the behavior of even AI systems that are much more intelligent than we are.

Instrumental goals also provide a strong argument for why sufficiently advanced AI systems that were indifferent to human values could be dangerous to humans, even if they weren’t actively malicious: an AI’s instrumental goals, such as self-preservation or resource acquisition, could come into conflict with human well-being. “The AI does not hate you, nor does it love you, but you are made out of atoms which it can use for something else.”

I’ve thought of a candidate for a new convergent instrumental drive: simplifying the environment to make it more predictable in a way that aligns with your goals.

[link] Simplifying the environment: a new convergent instrumental goal

Kaj_Sotala 22 Apr 2016 6:48 UTC

10 points

gwern 22 Apr 2016 14:49 UTC
8 points

All stable processes we shall predict. All unstable processes we shall control.
- Gunnar_Zarncke 24 Apr 2016 20:34 UTC
  2 points
  Sounds like the Serenity Prayer for AI.
lukeprog 22 Apr 2016 14:43 UTC
4 points
See also: https://scholar.google.com/scholar?cluster=9557614170081724663&hl=en&as_sdt=1,5
- Kaj_Sotala 23 Apr 2016 5:13 UTC
  3 points
  Neat, thanks!