“Humanity” is a weird word at the moment. I think it’s more of a “descendants of our ancestors” thing—I think LLMs trained on humanity’s content should probably be viewed as related to us in important ways, ways that a hypothetical LLM trained only on interaction with octopi or slime molds would not be as related. But this may be a weird view, so let’s ignore it henceforth.
I think the “benefit humanity” rule is actually much broader than you’re reading it as:
Secondly, as an animal advocate, I want to preserve the opportunity for AI to make a post that will benefit animal welfare, even if the post doesn’t benefit humanity.
Your welfare is intertwined with that of animals. You are distressed by their suffering. Therefore, improvements in animal welfare would be expected to cause improvements in your welfare. I think an AI making a post which benefits animal welfare would benefit humanity, because it would be good for all the humans who will feel better in a world where animals suffer less. To put it simply, I claim that all posts which benefit animal welfare in a way that’s legible to you are also benefiting you. Kind of a big claim, but I can’t come up with a counterexample—maybe you can?
Since there are humans who care about AI wellbeing and are upset by the possibility that AIs could be suffering needlessly, it seems to follow that a post which reduced preventable suffering for AIs would benefit those humans.
The rule isn’t demanding that posts benefit ALL of humanity. If that were the standard, few to no human-written posts would meet the bar either.
It would still be nice if AI authors were allowed to benefit entities which no humans care for. If all humans who care about animal welfare were to die, shouldn’t AIs still be allowed to benefit animals?
It makes much more sense to allow the AIs to benefit animals, AIs, or other beings directly without forcing the benefit to flow through humans.
Maybe. I think there’s a level on which we ultimately demand that AI’s perception of values be filtered through a human lens. If you zoom out too far from the human perspective, things start getting really weird. For instance, if you try to reason for the betterment of all life in a truly species-agnostic way, you start getting highly plausible arguments for leaving bacterial or fungal infections untreated, as the human host is only one organism but the pathogens number in the millions of individuals. (Yes, this is slippery-slope shaped, but special-casing animal welfare seems as arbitrary as special-casing human welfare.)
Anyways, the AI’s idea of what humans are is based heavily on snapshots of the recent internet, and that’s bursting with examples of humans desiring animal welfare. So if a model trained on that understanding of humanity’s goals attempts to reason about whether it’s good to help animals, it’d better conclude that humans will probably benefit from animal welfare improvements, or something has gone horribly wrong. Do you think it’s realistically plausible for humanity to develop into a species which we recognize as still human, but no individual prefers happy cute animals over sad ones? I don’t.
“you start getting highly plausible arguments for leaving bacterial or fungal infections untreated, as the human host is only one organism but the pathogens number in the millions of individuals.” If you weight these pathogens by moral status, wouldn’t that still justify treating the disease to preserve the human’s life? (If the human has more than a million times as much moral status as a bacterium, which seems likely)
I agree that it’s unlikely that no humans will care about animal welfare in the future. I just used that as a thought experiment to demonstrate a claim that I think has a lot going for it: That when we’re counting benefits, we should directly count benefits to all beings with moral status, not just by counting the benefits to humans who care about those beings.
(If the human has more than a million times as much moral status as a bacterium, which seems likely)
Apologies in advance if this sounds rude, I genuinely want to avoid guessing here: What qualifies the human for higher moral status, and how much of whatever-that-is does AI have? Are we into vibes territory for quantifying such things, or is there a specific definition of moral status that captures the “human life > bacterial life” intuition? Does it follow through the middle where we privilege pets and cattle over what they eat, but below ourselves?
Maybe I’m just not thinking hard enough about it, but at the moment, every rationale I can come up with for why humans are special breaks in one of two ways:
if we test for something too abstract, AI has more of it, or at least AI would score better on tests for it than we would, or
if we test for something too concrete (humans are special because we have the DNA we currently do! humans are special because we have the culture we currently do! etc.), we exclude prospective distant descendants of ourselves (say, 100k years from now) whom we’d actually want to define as also morally privileged in the ways that we are.