In fact I think it’s safe to say that we’d collectively allocate much more than 1/millionth of our resources towards protecting the preferences of whatever weak agents happen to exist in the world (obviously the cows get only a small fraction of that).
Sure, but extrapolating this to unaligned AI is NOT an encouraging sign. We may allocate more than 1/million of our resources to animal rights, but we allocate far more than that to goals that run directly against those animals’ preferences, such as eating meat, cheese, and eggs; we devote MUCH more of our resources to “animal wrongs” than to animal rights, so to speak.
So to show an AI will be “nice” to humans at all, it is not enough to suppose that it might have some 1/million “nice to humans” term; it also requires showing that that term won’t be handily outweighed by the rest of its utility function.