The question here is about goal-oriented systems that have done things their creators did not expect and perhaps did not want, but which nonetheless help achieve the goal as originally defined.
There is a lot of literature on misinterpretation of goals, Goodhart's law, and so on. I wasn't looking at that specifically in this question, though it is part of the overall concern.