as it may be that mankind is a wrong thing entirely, and should be permitted to kill itself, and then meteorite impacts should be allowed so that the ants get a chance.
I don’t know much about my own extrapolated preferences, but I can reason that, since my preferences are the product of noise in the evolutionary process, reality is unlikely to align with them naturally. It’s possible that my preferences consider “mankind a wrong thing entirely”; but that they would align with whatever the universe happens to produce next on Earth (assuming the rise of another dominant species is even plausible) is incredibly unlikely. Anything that happens without a causal line of descent from human values is unlikely to align with human values.
Anything that happens without a causal line of descent from human values is unlikely to align with human values.
Unlikely to align how exactly? There are also common causes, you know: A and B can be correlated when A causes B, when B causes A, or when C causes both A and B.
It seems to me that you can require an arbitrary degree of alignment to arrive at an arbitrary unlikelihood, but some alignment via a common cause is nonetheless probable.
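To make the common-cause case concrete, here is a minimal illustrative simulation (the variable names A, B, and C simply mirror the letters above, and the Gaussian noise model is an arbitrary assumption): neither A nor B causes the other, yet they come out correlated because both are driven by C.

```python
# Sketch: correlation via a common cause.
# Neither A nor B causes the other; both are driven by the shared cause C.
import numpy as np

rng = np.random.default_rng(0)
n = 100_000

C = rng.normal(size=n)          # shared cause
A = C + rng.normal(size=n)      # A = C plus independent noise
B = C + rng.normal(size=n)      # B = C plus independent noise

print(np.corrcoef(A, B)[0, 1])  # roughly 0.5, despite no direct causal link
```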
Well yes, but I would assume you would want more alignment, not less.
There’s such a thing as over-fitting… if you have some noisy data, the theory that fits the data perfectly is just the table of the data (e.g. heights and falling times); the useful theory doesn’t fit the data exactly in practice. If we make the AI a perfect fit to what mankind does, we could just as well make a brick and proclaim it an omnipotent, omniscient, mankind-friendly AI that will never stop mankind from doing something that mankind wants (including taking extinction risks).
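A minimal sketch of that over-fitting point, using the heights-and-falling-times example (the noise level and the degree-9 polynomial are arbitrary illustrative choices): a model flexible enough to reproduce the noisy measurement table almost exactly typically predicts fresh measurements worse than the simple law t = sqrt(2h/g).

```python
# Sketch: over-fitting noisy fall-time measurements.
# The degree-9 polynomial "memorizes the table"; the simple law does not
# fit the training data exactly, but typically generalizes better.
import numpy as np
from numpy.polynomial import Polynomial

rng = np.random.default_rng(1)
g = 9.81
noise = 0.05  # measurement noise in seconds (arbitrary assumption)

# Ten noisy "training" measurements of fall time vs. drop height.
h_train = np.linspace(1.0, 10.0, 10)
t_train = np.sqrt(2 * h_train / g) + rng.normal(scale=noise, size=h_train.size)

# A degree-9 polynomial through ten points: essentially a lookup table.
table_fit = Polynomial.fit(h_train, t_train, deg=9)

# Fifty fresh noisy measurements at new heights.
h_test = np.linspace(1.5, 9.5, 50)
t_test = np.sqrt(2 * h_test / g) + rng.normal(scale=noise, size=h_test.size)

def mae(pred, actual):
    """Mean absolute error between predictions and measurements."""
    return np.mean(np.abs(pred - actual))

print("train error, table fit :", mae(table_fit(h_train), t_train))      # ~0: fits the table
print("train error, simple law:", mae(np.sqrt(2 * h_train / g), t_train))
print("test error,  table fit :", mae(table_fit(h_test), t_test))        # typically larger...
print("test error,  simple law:", mae(np.sqrt(2 * h_test / g), t_test))  # ...than this
```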