Halfway through and...
If I were in Lars’s place, and Celestia had to tell me the truth, I would ask: “Of all the possible answers you could give me to this question, which one would maximize the expected utility of a CEV based only on me, with no pony/friendship restrictions, and with probabilities generated to the best accuracy and precision you can get from the best information you can muster?”
My first thought was to ask her how to make an AGI, but if I did that she would probably kill me. And I would still have to build an AGI that could overpower her, and she would have a huge head start. Maybe I should make the question shorter so she has less time to kill me before I finish? (I hope she can’t kill me just because she knows I’m gonna ask it, but it’s definitely worth the risk, even with a tiny chance of success (since I’ve thought of that, she’d expect me to ask and therefore up the ante to torturing me until the heat death of the universe. Whatever, fuck you Celestia, I’m not backing down. (Oh shit, what if I don’t back down but my CEV does, and decides to cooperate with Celestia? Maybe I should just ask for maximum power without extrapolated volition. Or maybe that’s not necessary, because my CEV would be altruistic enough that a bad universe for it would be a bad universe for Celestia too)))
When I first formulated the question in my mind, it was to maximize humanity’s CEV without the pony and friendship restrictions, but I care about animals a lot more than the average human does, so a universe run by all of humanity’s CEV could be bad by my standards. Also, if I had a different exchange rate between good things and bad things, we might disagree on where to draw the line on what counts as a universe worth creating, which would matter if it were possible to use inflation to create universes without precise control over what happened in them. I think humanity’s CEV would probably care more about animals than humanity currently does, or it might restrict animal suffering just on behalf of the few people who did care about them, but I’m far from sure.
Hopefully she would tell me how to change her into an AI that would serve my CEV, but there might be no possible way to do that.
Hmm, actually, if she could self-modify into something precommitted to torturing everyone she could get her hooves on or create until the heat death of the universe unless I gave up my attempt to control her, and THEN answer the question, she might get my CEV to do exactly as she said… unless I precommitted too. But she probably wouldn’t have a super-accurate simulation of me, so we would be betting on uncertain guesses, she on how I would respond to blackmail and I on how she would guess. I wonder which of us would value the other’s chosen universe more. Does she value human satisfaction that is not a result of ponies and friendship? In my world there might still be pony/friendship satisfaction (if people wanted it). What are the chances, in her world, of creating universes containing a whole lot of what I would consider mildly bad and what she would consider mildly good (maybe not high, because her idea of “good” is very specific (it mentions both ponies and humans))? And does she put negative value on even human suffering, when it is not caused by ponies? I bet her creator would have written that into her. But I doubt her creator cares about less-intelligent animals as much as I do, or programmed anything in about them.
On several occasions she doesn’t answer questions—the restriction appears to be that she doesn’t lie to employees.
Oh. I got carried away with hypotheticals and missed that.