Came for the Harry Potter fanfic, a positive vision of what to do with atheism, and the anti-death ethos. Stayed for the good thinking.
I really like math and physics. I can program but am not a programmer: I’d have to learn how to make a website, and I struggle to handle more than a couple hundred LOC. Learning a little Haskell made me enjoy coding again; I’d like to get better at it.
I’m currently very bad at following through on commitments. I hope to fix this by tomorrow. I’ll probably fail, but I shoot for getting it down before my fifty thousandth tomorrow.
I think that if you’re trying to give your AI ethics at the outset, or to do something like writing down a CEV utility function, something’s already gone deeply wrong[1], in the same way that, capabilities-wise, you shouldn’t be hardcoding quantum mechanics into the AI. It’s supposed to be superintelligent; it should learn what you/humans care about. The thing you need to figure out is how to get the AI to care about what humans care about, so that it then learns and optimizes for that.
Under this model, most of the difficulty would also be present in trying to get an AI that cares about what some other agent cares about, e.g. aliens, or possibly chimpanzees and other animals, to the extent that they have goals.
(In fact, I think that somewhere between half and most of the problem is getting an AI that cares about any not-super-easy real-world goal at all, like the classic “maximize the number of diamond atoms”. If you knew how to actually build a literal paperclip maximizer, I would expect that you’ve figured out much of alignment.)
This includes it being the last-resort alternative to other actors doing even dumber stuff. I count corrigibility as an example here; hopefully it’s easier, albeit worse.