Thank you! I saw that comment and responded there. I said it really clarified the argument and that, given that clarification, I largely agree.
My one caveat is that if we screw up alignment, we could easily kill more than our own chance at flourishing. I think it’s pretty easy to get a paperclipper expanding at near-c and snuffing out all civilizations in our light cone before they get their chance to prosper. So raising our odds of flourishing should be weighed against the risk of wrecking a bunch of other civilizations’ chances. One likely non-simulation answer to the Fermi paradox is that we’re early to the game. We shouldn’t risk losing big if losing big keeps others from getting to play.
I hadn’t considered this tradeoff closely because, in my models of the world, survival and flourishing are still closely tied together. If we solve alignment, we probably get near-optimal flourishing. If we don’t, we all die.
I realize there’s a lot of room in between; that model comes down to the way I think goals, alignment, and human beings work. I think intent alignment is more likely, which would put a human (or humans) in charge of the future. I think most humans would agree that flourishing sounds nice if they had long enough to contemplate it. Very few people are so sociopathic/sadistic that they’d want to prevent flourishing in the very long term.
But that’s just one theory! It’s quite hard to guess, and I wouldn’t want to assume it’s correct.
I’ll look in more depth at your ideas about how to play for a big win. I’m sure most of them are compatible with trying our best to survive.
What do you mean by “solve alignment”? What is your optimal world? What you consider “near-optimal flourishing” is likely very different from many other people’s ideas of near-optimal flourishing. I think people working on alignment are just punting on this issue right now while they figure out how to implement intent and value alignment, but I assume there will be a lot of conflict over which values a model will be aligned to, and to whom a model will be aligned, if/when we have the technical ability to align powerful AIs.