This would result in training data for the AI that establishes genuine goodness as the default.
And then the AI would convert everything into genuinely good paperclips.
Or the first manipulative psychopath would convince it that it is good to give him root access.
And then the AI would convert everything into genuinely good paperclips.
Or the first manipulative psychopath would convince it that it is good to give him root access.