We sometimes see rationalists and utilitarian EAs doing something like the same thing we worry about with AI: unaligned optimization that produces outcomes we don’t like. Unfortunately, because humans disagree on norms/ethics/values, it’s kind of hard to know the difference between “going off the rails” and “correcting a massive oversight or collective moral failing”, especially from the inside.
I’m gonna add an even more pessimistic hypothesis: that the disagreements about values are fundamentally irresolvable because there is no truth at the end of the tunnel.
Or, one man’s “going off the rails” is another man’s “correcting a massive oversight or collective moral failing”, and these perspectives can’t be reconciled.