The hack troubled me, so I reread part of the CEV document.
The phrases “Were more the people we wished we were”, “Extrapolated as we wish that extrapolated”, and “Interpreted as we wish that interpreted” appear in the CEV document’s explanation of extrapolation. These pretty much guarantee that a hack like the one Wei Dai mentioned would be an extremely potent one.
However, the conservatism in the rest of the document, expressed in phrases like the following, seems to take care of it fairly well:
“It should be easier to counter coherence than to create coherence.”
“The narrower the slice of the future that our CEV wants to actively steer humanity into, the more consensus required.”
“the initial dynamic for CEV should be conservative about saying “yes”, and listen carefully for “no”.”
I just hope the actual numbers, when entered, match that. If they do, then I think the CEV might just come back to the programmers saying, “I see something weird. Kindly explain.”
The narrower the slice of the future that our CEV wants to actively steer humanity into, the more consensus required.
This sounded really good when I read it in the CEV paper. But now I realize that I have no idea what it means. What, exactly, is being measured for “narrowness”?
My understanding of a narrower future is one in which more choices are taken away, weighted by the number of people they are taken away from, compared to the matrix of choices present at the time CEV is activated.
There are many problems with this definition:
(1) It does not know how to weight the choices of people not yet alive at the time of activation.
(2) It does not know how to determine which choices count. For example, is Baskin Robbins to be preferred to Alinea because Baskin Robbins offers 31 choices while Alinea offers just one (a 12-course menu or a 24-course one)? Or Baskin Robbins^^^3 for most people versus 4 free years of schooling in a subject of choice for all? Does it improve the future to give everyone additional unpalatable choices, even if few will choose them? I understand that CEV is supposed to be roughly the sum over what people would want, so some of the more absurd readings would be screened off. But I don’t understand how this criterion is specific enough that, if I were a Friendly superpower, I could use it to help me make decisions.
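To make the ambiguity concrete, here is a toy sketch of the naive reading of the proposed definition: score a future by the choices removed, weighted by how many people lose each one. Everything here (the function, the restaurant examples, the numbers) is a hypothetical illustration of the commenter's reading, not anything specified in the CEV document, and it deliberately exhibits the Baskin Robbins problem: it counts choices without asking whether anyone values them.

```python
# Toy sketch of the naive "narrowness" metric from the discussion above:
# choices taken away, weighted by the number of people they are taken
# away from, relative to a baseline option matrix. All names and numbers
# are hypothetical illustrations.

def narrowness(baseline, future):
    """Higher score = narrower future.

    baseline, future: dicts mapping a choice to the number of people
    who have access to it.
    """
    score = 0
    for choice, people in baseline.items():
        remaining = future.get(choice, 0)
        score += max(0, people - remaining)  # choices lost, person-weighted
    return score

# Baseline: 100 people have 31 ice-cream flavors plus one tasting menu.
baseline = {f"flavor_{i}": 100 for i in range(31)}
baseline["tasting_menu"] = 100

future_a = dict(baseline)  # nothing removed
future_b = {k: v for k, v in baseline.items() if k != "tasting_menu"}

print(narrowness(baseline, future_a))  # 0: no choices lost
print(narrowness(baseline, future_b))  # 100: one choice lost for 100 people
```

Under this metric, removing the single tasting menu and removing one of 31 near-identical flavors score identically, and adding unpalatable options nobody would choose always "widens" the future. That is exactly the underspecification the comment is pointing at.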