This is really good; however, I would love some additional discussion of the way the current optimization changes the user.
Keep in mind that when Facebook optimizes "clicks" or "scrolls", it does so by altering user behavior, and thus by altering the user's internal S1 model of what is important. This can easily distort a user's picture of reality, their beliefs, and their self-esteem. There have been many articles and studies correlating Facebook usage with mental-health problems, but simply understanding how the optimization works is enough reason to expect that this is happening.
While a lot of these issues get lumped under the same umbrella of "digital addiction," I think Facebook is a much more serious problem than, say, video games. Video games do not, as a rule, act through the very social channels that help reduce mental illness. Facebook does.
Another problem is Facebook's internal culture, which, as of four years ago, was strongly marked by Kool-Aid that promised unbelievable power (one billion users, hooray) without necessarily caring about responsibility (all we want to do is make the world open and connected, why is everyone mad at us?).
This problem is compounded by the fact that Facebook gets a lot of shitty critiques (like the critique that it runs A/B tests at all) and has thus learned to ignore legitimate questions of value learning.
Full disclosure: I used to work at FB.
Maybe this has been said before, but here is a simple idea:
Directly specify a utility function U that you are not sure about, but discount the AI's own power as part of it. The new utility function is U - power(AI), where power is a fast-growing function of a mix of the AI's source-code complexity, intelligence, hardware, and electricity costs. One needs to be careful about how "self" is defined here, since a clever redefinition by the AI would remove the controls.
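A minimal sketch of what such an objective might look like, assuming we have crude, measurable proxies for the AI's "power" (all names and constants below are hypothetical, chosen only for illustration):

```python
import math

def base_utility(state):
    # Placeholder for the uncertain utility U we actually care about.
    return state["task_reward"]

def power_penalty(agent):
    # Fast-growing (here: exponential) in a mix of resource proxies,
    # so acquiring more code, hardware, or energy quickly dominates U.
    resources = (
        agent["source_code_complexity"]
        + agent["hardware_units"]
        + agent["electricity_kwh"]
    )
    return math.exp(resources / 10.0)  # scale constant is arbitrary

def penalized_utility(state, agent):
    # U - power(AI): the agent maximizes this instead of U alone.
    return base_utility(state) - power_penalty(agent)
```

Note that everything hinges on the `agent` argument correctly capturing the "self" boundary; if the AI can route resources through something the penalty does not count, the control is gone.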
One also needs to ensure that any subagents the AI creates inherit the penalized utility, since in a naive implementation, subagents would simply optimize U without the restriction.
This is likely not enough, but it has the advantage that the AI has no a priori drive to become stronger, which is better than boxing an AI that does.