I thought the “C” in CEV stood for “coherent” in the sense that it had been reconciled over all people (or over whatever set of preference-possessing entities you were taking into account). Otherwise wouldn’t it just be “EV”?
I mean I guess, sure, if “CEV” means over-all-people then I just mean “EV” here.
Just “EV” is enough for the “basic challenge” of alignment as described on AGI Ruin.
So are you saying that it would literally have an internal function that represented “how good” it thought every possible state of the world was, and then solve an (approximate) optimization problem directly in terms of maximizing that function?
Or do something which has approximately that effect.
That doesn’t seem to me like a problem you could solve even with a Jupiter brain and perfect software.
I disagree! I think some humans right now (notably people particularly focused on alignment) already do something vaguely EUmax-shaped, and definitely an ASI capable of running on current compute would be able to do something more EUmax-shaped. Very, very far from actual “pure” EUmax, of course; but more than sufficient to defeat all humans, who are even further from pure EUmax. Maybe see also this comment of mine.
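To make “something which has approximately that effect” concrete: here is a minimal Monte-Carlo sketch of approximate expected-utility maximization, where the agent scores each candidate action by sampling possible resulting world states and averaging an internal “how good is this state” function, then picks the best-scoring action. All names (`approx_eu_maximize`, `sample_outcome`, etc.) are hypothetical, for illustration only:

```python
import random

def approx_eu_maximize(actions, sample_outcome, utility, n_samples=1000):
    """Pick the action with the highest Monte-Carlo estimate of expected utility.

    sample_outcome(action) -> one possible resulting world state (stochastic);
    utility(state)         -> how 'good' the agent rates that state.
    This is a toy stand-in for 'solve an (approximate) optimization problem
    directly in terms of maximizing that function'.
    """
    def estimated_eu(action):
        total = sum(utility(sample_outcome(action)) for _ in range(n_samples))
        return total / n_samples
    return max(actions, key=estimated_eu)

# Toy usage: a reliable action vs. a high-variance, lower-mean action.
random.seed(0)
best = approx_eu_maximize(
    actions=["safe", "risky"],
    sample_outcome=lambda a: random.gauss(1.0, 0.1) if a == "safe"
                             else random.gauss(0.0, 2.0),
    utility=lambda state: state,
)
print(best)
```

The point of the sketch is just that nothing here requires a Jupiter brain: a crude sampled approximation of EU maximization is cheap, and the question is only how good the world-model (`sample_outcome`) and the utility function are.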