Making a comparison between the very broad category of AGI and FAI may be helpful, but phrasing it so that the only AIs which count as Friendly are provably CEV-implementing AIs makes the claim not very interesting. AGI research is in its infancy, as are ideas about CEV; the probability that we'll end up with some other notion of Friendliness that is easier to implement is high.
I would say that coming up with a better notion of Friendliness is a necessary condition for implementing a Friendly AI.
I was talking about provably CEV-implementing AI because there seems to be a consensus on LW that this is the correct approach to take.
P(provably CEV-implementing AI, or other FAI, or AGI turns out to be friendly anyway, or safe singularity for any other reason) is quite a lot higher than P(provably CEV-implementing AI).
This almost has to be false. I personally think CEV sounds like the best direction I currently know about, but maybe the process of extrapolation has a hidden ‘gotcha’. Hopefully a decision theory that can model self-modifying agents (like our extrapolated selves, perhaps, as well as the AI) will help us figure out what we should be asking. Settling on one approach before then seems premature, and in fact neither the SI nor Eliezer has done so.
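One arithmetic point about the probability claim quoted above: whatever numbers one assigns, the probability of a disjunction can never be lower than the probability of any single disjunct. A minimal sketch, using made-up probabilities purely for illustration (none of these figures come from the discussion, and the independence assumption is only there to make the arithmetic easy):

```python
# Hypothetical probabilities for four routes to a good outcome,
# assumed independent purely for the sake of arithmetic.
p_cev = 0.05        # provably CEV-implementing AI
p_other_fai = 0.10  # some other notion of Friendly AI
p_lucky_agi = 0.02  # AGI turns out to be friendly anyway
p_other_safe = 0.03 # safe singularity for any other reason

# Under independence, P(at least one) = 1 - product of (1 - p_i).
p_none = 1.0
for p in (p_cev, p_other_fai, p_lucky_agi, p_other_safe):
    p_none *= (1.0 - p)
p_any = 1.0 - p_none

# The disjunction can never be less probable than any single disjunct,
# independence or not (here it is roughly 0.19 versus 0.05).
assert p_any >= p_cev
print(round(p_any, 4))
```

The inequality itself does not depend on independence: for any events A and B, P(A or B) ≥ P(A), so the quoted comparison holds for any assignment of probabilities.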