With enough wealth to trivially save all monkeys, no monkeys would go extinct, provided we care even a little bit more than precisely not at all.
I’m confused about this point. Perhaps you mean wealth in a broad sense that includes “we don’t need to worry about getting more wood.” But as long as wood is a useful resource that humans could use more of to acquire more wealth and do other things that we value more than saving monkeys, then we will continue to take wood from the monkeys. Likewise, even if an AGI values human welfare somewhat, it will still take our resources as long as it values other things more than human welfare.
I found the monkey example much more compelling than “The AI does not hate you, nor does it love you, but you are made out of atoms which it can use for something else.” Taking resources from humans seems more likely than using humans as resources.
For the monkeys example, I mean that I expect that in practice there will be activists that will actually save the monkeys if they are wealthy enough to succeed in doing so on a whim. There are already expensive rainforest conservation efforts costing hundreds of millions of dollars. Imagine that they instead cost $10 and anyone could pay that cost without needing to coordinate with others. Then, I claim, someone would.
By analogy, the same should happen with humanity instead of monkeys, if AGIs reason in a sufficiently human-like way. I don’t currently find it likely that most AGIs would normatively accept some decision theory that rules it out. It’s obviously possible in principle to construct AGIs that follow some decision theory (or value paperclips), but that’s not the same thing as such properties of AGI behavior being convergent and likely.
I think the default shape of a misaligned AGI is a sufficiently capable simulacrum, a human-like alien thing that faces the same value extrapolation issues as humanity, in a closely analogous way. (That is, if an AGI alignment project doesn’t make something clever instead that becomes much more alien and dangerous as a result.) And a default aligned AGI is the same, but not that alien, more of a generalized human.