Artificial Utility Monsters as Effective Altruism

Dear effective altruist,

Have you considered artificial utility monsters as a high-leverage form of altruism?

In the traditional sense, a utility monster is a hypothetical being that gains so much subjective wellbeing (SWB) from each marginal unit of resources that any other allocation is inferior on a utilitarian calculus (as illustrated on SMBC).

This has been used to show that utilitarianism is not as egalitarian as it may intuitively appear, since it can prioritize some beings over others rather strictly—humans included.

The traditional utility monster is implausible even in principle—it is hard to imagine a mind constructed such that it does not succumb to diminishing marginal utility from additional resources. There is probably some natural limit on how much SWB a mind can implement, or at least on how much this can be improved by spending more on the mind. This would probably hold even for an algorithmic mind that can be sped up on faster computers, and there are probably limits to how much a digital mind can gain in subjective speed from parallelizing its internal subcomputations.

However, we may broaden the traditional definition somewhat and call any technology utility-monstrous if it implements high SWB with exceptionally good cost-effectiveness and in a scalable form—even if this scalability stems from a larger set of minds running in parallel, rather than one mind feeling much better or living much longer per additional joule/dollar.
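To make the contrast explicit, here is a toy formalization (the functional form and the symbols u, r, c, and r* are my own illustrative assumptions, not part of the original argument): a single mind's SWB plausibly saturates with resources, while a population of parallel minds keeps SWB per dollar roughly constant.

```latex
% Toy contrast (illustrative assumptions): a single mind's SWB saturates
% in resources r, while n minds each run at an efficient operating point
% r* scale linearly in total spending R = n * r*.
u(r) = u_{\max}\bigl(1 - e^{-r/c}\bigr), \qquad u'(r) \to 0 \;\text{as}\; r \to \infty
\qquad\text{vs.}\qquad
U(R) = n\, u(r^{\ast}) = \frac{u(r^{\ast})}{r^{\ast}}\, R
```

The first expression captures why the classical single-mind monster fails (marginal utility decays toward zero); the second captures the broadened, population-level sense in which near-linear scalability can survive.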

Under this definition, it may be entirely possible to create and sustain many artificial minds reliably and cheaply, while they all enjoy very high SWB at or near subsistence-level resources. An important point here is that the possible peak intensities of artificially implemented pleasures could be far higher than those commonly found in evolved minds: our worst pains seem more intense than our best pleasures for evolutionary reasons—but the same does not have to be true for artificial sentience, whose best pleasures could be even more intense than our worst agony, without any need for suffering anywhere near that strong.

If such technologies can be invented—which seems highly plausible in principle, if not yet in practice—then the original conclusion of the utilitarian calculus is retained: it would be highly desirable for utilitarians to facilitate the invention and implementation of such utility-monstrous systems and to allocate marginal resources to subsidize their existence. This makes them a potentially high-value target for effective altruism.

Many tastes, many utility monsters

Human motivation is barely stimulated by abstract intellectual concepts, and “utilitronium” sounds more like “aluminium” than something to desire or empathize with. Consequently, the idea is as sexy as a brick. “Wireheading” evokes associations of having a piece of metal rammed into one’s head, which is understandably unattractive to any evolved primate (unless it’s attached to an iPod, which apparently makes it okay).

Technically, “utility monster” suffers from a similar association problem: the idea sounds dangerous or ethically monstrous. But since the term is so specific and established in ethical philosophy, and since “monster” can at least be given an emotive and amicable—almost endearing—tone, it seems realistic to use it positively. (Suggestions for a better name are, of course, welcome.)

So a central issue for actual implementation and funding is human attraction. It is more important to motivate humans to embrace the existence of utility monsters than to make the monsters optimally resource-efficient—after all, a technology that is never implemented or properly funded gains next to nothing from being efficient.

A compromise between raw efficiency in SWB per joule/dollar and forms that better attract humans is probably best. There is likely a sweet spot—perhaps several, for different target groups—between resource-efficiency and attractiveness. Only die-hard utilitarians will actually want to fund something like hedonium, but the rest of the world may still respond to “The Sims—now with real pleasures!”, likeable VR characters, or a new generation of reward-based Tamagotchis.
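As a toy illustration of this sweet spot (the functional forms and constants below are pure assumptions for the sake of the sketch, not claims from the text): if the funding a design attracts grows with its attractiveness while its SWB-per-dollar efficiency shrinks, the product of the two peaks somewhere strictly in between.

```python
import numpy as np

# Toy model of the efficiency/attractiveness trade-off.
# All functional forms and constants are illustrative assumptions.

a = np.linspace(0.0, 1.0, 1001)   # attractiveness of the design, 0..1

funding = a ** 2                  # assumed: funding grows superlinearly with
                                  # attractiveness (pure hedonium at a=0
                                  # attracts almost no money)
efficiency = 1.0 - 0.8 * a        # assumed: SWB per dollar shrinks as
                                  # resources go into cuteness and narrative

total_swb = funding * efficiency  # total SWB = dollars attracted * SWB/dollar

best = a[np.argmax(total_swb)]
print(f"Sweet spot at attractiveness ~{best:.2f}")  # ~0.83 under these assumptions
```

The point is not the specific numbers but the shape: the maximizer sits strictly between pure hedonium (a = 0) and pure entertainment (a = 1), and it shifts with each target group's funding response.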

Once we step away somewhat from maximum efficiency, the possibilities expand drastically. Implementation forms may be:

  • decorative like gimmicks or screensavers,

  • fashionable like sentient wearables,

  • sophisticated and localized like works of art,

  • cute like pets or children,

  • personalized like computer game avatars retiring into paradise,

  • erotic like virtual lovers who continue to have sex without the user,

  • nostalgic like digital spirits of dead loved ones in artificial serenity,

  • crazy like hyperorgasmic flowers,

  • semi-functional like joyful household robots and software assistants,

  • and of course generally a wide range of human-like and non-human-like simulated characters embedded in all kinds of virtual narratives.

Possible risks and mitigation strategies

Open-source utility monsters could be published as templates, adding an extra layer of control that the implementation of sentience is correct and positive, and making better variations easy to explore. However, this would come with the downside of enabling malicious abuse and reckless harm. Risks of suffering could come from artificial unhappiness desired by users, e.g. in narratives that contain sadism, dramatic violence, or the punishment of evil characters for quasi-moral gratification. Another such risk could come simply from bad local modifications that implement suffering by accident.

Despite these risks, one may hope that most humans who care enough to run artificial sentience are more benevolent and careful than malevolent and careless, so that the net result is more positive SWB than suffering. After all, most people love their pets and do not torture them, and other people look down on those who do (compare this discussion of Norn abuse, which resulted in extremely hostile responses). And there may be laws against causing artificial suffering. Still, this remains an important point of concern.

Closed-source utility monsters may further mitigate some of this risk by not making the sentient phenotypes directly available to the public, but encapsulating their internal implementation behind a well-defined interface—like a physical toy or closed-source software that private users can run and use, but not manipulate internally beyond a well-tested state-space without hacking.
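In software terms, a minimal sketch of the encapsulation idea might look as follows (the class, its names, and the "wellbeing" number are stand-ins invented for illustration; nothing here implements actual sentience): the internal state is private, and users can only reach it through a small, well-tested interface, so they cannot push the system outside its verified state-space.

```python
class EncapsulatedMonster:
    """Closed-source-style wrapper: users interact only through a narrow,
    well-tested interface; the internals stay private.
    (Illustrative stand-in only; no actual sentience is modeled here.)"""

    _SAFE_STIMULI = {"play", "music", "sunlight"}  # the verified state-space

    def __init__(self) -> None:
        self.__wellbeing = 1.0  # name-mangled: not part of the public interface

    def stimulate(self, stimulus: str) -> None:
        # Reject anything outside the tested envelope instead of risking
        # an untested (possibly suffering-inducing) internal state.
        if stimulus not in self._SAFE_STIMULI:
            raise ValueError(f"unsupported stimulus: {stimulus!r}")
        self.__wellbeing = min(10.0, self.__wellbeing + 1.0)

    def wellbeing_report(self) -> float:
        # Read-only telemetry, e.g. for auditors or adoptive donors.
        return self.__wellbeing
```

The same pattern scales up to the institutional variant below: the institution holds the implementation, and the public gets only the reporting interface.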

An extremely cautious approach would be to have utility monsters run by externally controlled, dedicated institutions and to give the public—such as voters or donors—only limited control over them through communication with the institution. For instance, dedicated charities could offer “virtual paradises” whose resident utility monsters donors can “adopt” in certain ways, without letting those donors actually lay hands on the implementation. On the other hand, this would require a high level of trustworthiness from the institutions or charities and their controllers.

Not for the sake of utility monsters alone

Human values are complex, and it has been argued on LessWrong that the resource allocation of any good future should not be spent for the sake of pleasure or happiness alone. As evolved primates, we all hold more than one intuitive value dear—even self-identified intellectual utilitarians, who make up only a tiny fraction of the population.

However, some discussions in the rationalist community touching on related technologies like pleasure wireheading and utilitronium have suffered from implausible or orthogonal assumptions and associations. Since the utilitarian calculus favors SWB maximization above all else, so the fear goes, we run the risk of losing a more complex future because:

a) utilitarianism knows no compromise, and

b) the future will be decided by one winning singleton who takes it all, and

c) we have only one world with only one future to get right.

In addition, wireheads have been ascribed low status, with fake utility or cheating at life framed as low-status behavior. People have been competing for status by associating themselves with the miserable Socrates rather than the happy pig, without actually giving up real option value in their own lives.

On Scott Alexander’s blog, there is a good example of a mostly pessimistic view, both in the OP and in the comments. And in this comment on a critique of effective altruism, Carl Shulman names hedonistic utilitarianism turning into a bad political ideology, similar to that of the communist states, as a plausible failure mode of effective altruism.

So, will we all be killed by a singleton who turns us into utilitronium?

Be not afraid! These fears are plausibly unwarranted because:

a) Utilitarianism is consequentialism, and consequentialists are opportunistic compromisers—even among the conflicting impulses of their own evolved minds. The number of utilitarians who would accept existential risk for the sake of pleasure maximization is small, and practically all of them subscribe to a philosophy of cooperative compromise with orthogonal, non-exclusive values in the political marketplace. Those who don’t are incompetent almost by definition and will never gain much political traction.

b) The future may very well be decided not by one singleton but by a marketplace of competing agents. Building a singleton is hard and requires the strict subjugation or absorption of all competition. Even if it were to succeed, the singleton would probably not implement only one human value, since it will be created by many humans with complex values, or at least it will have to make credible concessions to a critical mass of humans with diverse values who can stop it before it reaches singleton status. And if these mitigating assumptions are all false and a fooming singleton is possible and easy, then too much pleasure should be the least of humanity’s worries—after all, in that case the Taliban, the Chinese government, the US military, or some modern King Joffrey are just as likely to get the singleton as the utilitarians.

c) There are plausibly many Everett branches and many Hubble volumes like ours, implementing more than one future-earth outcome, as summed up by Max Tegmark here. Even if infinitarian multiverse theories should all turn out false against current odds, a very large finite universe would still be far more realistic than a small one, given our physical observations. This makes pre-existing value diversity highly probable, if not inevitable. For instance, if you value pristine nature in addition to SWB, you should accept the high probability of many parallel earth-like planets with pristine nature regardless of what you do, and consider that we may be in an exceptional minority position to improve the measure of values that do not evolve naturally with ease, such as a very high surplus of positive SWB over suffering.

From the present, into the future

If we accept the conclusion that utility-monstrous technology is one high-value vector for effective altruism among others, what can current EAs do as we transition into the future? To the best of my knowledge, we do not yet have the capacity to create artificial utility monsters.

However, foundational research in neuroscience and artificial intelligence/sentience theory is already ongoing today, and it is certainly a necessity if we ever want to implement utility-monstrous systems. In addition, outreach and public discussion of the fundamental concepts is also possible and plausibly high-value (hence this post). Generally, the following steps all seem useful and could use the attention of EAs as we progress into the future:

  1. spread the idea, refine the concepts, and apply constructive criticism to all its weak spots until it either becomes solid or is revealed as irredeemably undesirable

  2. identify possible misunderstandings, fears, biases etc. that may reduce human acceptance and find compromises and attraction factors to mitigate them

  3. fund and do the scientific research that, if successful, could lead to utility-monstrous technologies

  4. fund the implementation of the first actual utility monsters and test them thoroughly, then improve on the design, then test again, etc.

  5. either make the templates public (open-source approach) or make them available for specialized altruistic institutions, such as private charities

  6. perform outreach and fundraising to give existence donations to as many utility monsters as possible

All of this can be done without much self-sacrifice on the part of any individual. And all of this can be done within existing political systems, existing markets, and without violating anyone’s rights.