(notes on) Policy Desiderata for Superintelligent AI: A Vector Field Approach

Meta: I thought I’d spend a little time reading the policy papers that Nick Bostrom has written. I made notes as I went along, so I spent a little while cleaning them up into a summary post. These are my notes on Bostrom, Dafoe and Flynn’s 2016 policy desiderata paper, which received significant edits in 2018. I spent 6-8 hours on this post, not a great deal of time, so I’ve not been maximally careful.

Context and Goals

Overall, this is not a policy proposal. Nor does it commit strongly to a particular moral or political worldview. The goal of this paper is merely to observe which policy challenges are especially important or distinctive in the case of superintelligent AI, challenges that most moral and political worldviews will need to deal with. The paper also makes no positive argument for the importance, likelihood, or timeline of superintelligent AI; it instead assumes that this will occur in the present century, and then explores the policy challenges that would follow.

The Vector Field Approach

Bostrom, Dafoe and Flynn spend a fair amount of time explaining that they’re not going to be engaging in what (I think) Robin Hanson would call standard value talk. They’re not going to endorse a particular moral or political theory, nor are they going to adopt various moral or political theories and show how they propose different policies. They’re going to look at the details of this particular policy landscape and try to talk about the regularities that will need to be addressed by most standard moral and political frameworks, and the direction in which these regularities suggest changing policy.

They call this the ‘vector field’ approach. If you don’t feel like you fully grok the concept, here’s the quote where they lay out the formalism (with light editing for readability).

The vector field approach might then attempt to derive directional policy change conclusions of a form that we might schematically represent as follows:
“However much emphasis X you think that states ought, under present circumstances, to give to the objective of economic equality, there are certain special circumstances Y, which can be expected to hold in the radical AI context we described above, that should make you think that in those circumstances states should instead give emphasis f(X) to the objective of economic equality.”
The idea is that f here is some relatively simple function, defined over a space of possible evaluative standards or ideological positions. For instance, f might simply add a term to X, which would correspond to the claim that the emphasis given economic equality should be increased by a certain amount in the circumstances Y (according to all the ideological positions under consideration).
Or f might require telling a more complicated story, perhaps along the lines of:
“However much emphasis you give to economic equality as a policy objective under present circumstances, under conditions Y you should want to conceive of economic equality differently—certain dimensions of economic inequality are likely to become irrelevant and other dimensions are likely to become more important or policy-relevant than they are today.”

I particularly like this quote:

This vector field approach is only fruitful to the extent that there are some patterns in how the special circumstances impact policy assessments from different evaluative positions. If the prospect of radical AI had entirely different and idiosyncratic implications for every particular ideology or interest platform, then the function f would amount to nothing more than a lookup table.

I read this as saying something like “This paper only makes sense if facts matter, separately from values.” It’s funny to me that this sentence felt necessary to write.
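To make the formalism concrete, here is a minimal sketch (my own illustration, not from the paper): a function f maps each evaluative position’s present policy emphasis to a recommended emphasis under the special circumstances, and the approach is only informative if f has more structure than a per-ideology lookup table. The position names and all the numbers are invented.

```python
# Hypothetical evaluative positions and their present-day emphasis (0-1 scale,
# invented for illustration) on economic equality as a policy objective.
positions = {"egalitarian": 0.9, "libertarian": 0.2, "centrist": 0.5}

def f_additive(emphasis):
    """A 'relatively simple' f: under the special circumstances, every
    position should raise its emphasis by the same fixed amount."""
    return min(1.0, emphasis + 0.3)

# A pattern that applies across ideologies -- the informative case.
shifted = {name: f_additive(x) for name, x in positions.items()}

# The degenerate case the authors warn about: f as a mere lookup table,
# one idiosyncratic answer per ideology, with no transferable pattern.
lookup_table = {"egalitarian": 0.95, "libertarian": 0.1, "centrist": 0.7}
```

The point of the contrast is that `f_additive` tells you something even about positions not yet enumerated, whereas the lookup table tells you nothing beyond its entries.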


A few more quotes on what the paper is trying to do.

A strong proposal for the governance of advanced AI would ideally accommodate each of these desiderata to a high degree. There may exist additional desiderata that we have not identified here; we make no claim that our list is complete. Furthermore, a strong policy proposal should presumably also integrate many other normative, prudential, and practical considerations that are either idiosyncratic to particular evaluative positions or are not distinctive to the context of radical AI.
Using a “vector field” approach to normative analysis, we sought to extract directional policy implications from these special circumstances. We characterized these implications as a set of desiderata—traits of future policies, governance structures, or decision-making contexts that would, by the standards of a wide range of key actors, stakeholders, and ethical views, enhance the prospects of beneficial outcomes in the transition to a machine intelligence era.
By “policy proposals” we refer not only to official government documents but also plans and options developed by private actors who take an interest in long-term AI developments. The desiderata, therefore, are also relevant to some corporations, research funders, academic or non-profit research centers, and various other organizations and individuals.

Next are the actual desiderata. They’re given under four headings (efficiency, allocation, population, and process), each with 2-4 desiderata. Each subheading below corresponds to a policy desideratum in the paper. For each desideratum I have summarised all the arguments and considerations in the text that felt new or non-trivial to me personally (e.g. I spent only one sentence on the arguments for AI safety).

If you want to just read the paper’s summary, jump down to page 23, which has a table and summarises the desiderata in their own words.

Efficiency Desiderata

Expeditious progress

We should make sure to take hold of our cosmic endowment—and the sooner the better.

AI safety

Choose policies that lead us to develop sufficient technical understanding that the AI will do what we expect it to do, and that give these tools to AI builders.

Conditional stabilization

The ability to establish a singleton, or a regime of intensive global surveillance, or the ability to thoroughly suppress the spread of dangerous information, should we need to use this ability in the face of otherwise catastrophic global coordination failures.


Non-turbulence

Technology will change rapidly. We don’t want to have to rush regulations through, or alternatively take too long to adapt such that the environment radically changes again. So try to reduce turbulence.

Allocation Desiderata

Universal benefit

If you force someone to take a risk, it is only fair that they are compensated with a share of any reward gained. Existential risks involve everyone, so everyone should get proportional benefit.


Magnanimity

Many people’s values have diminishing returns to further resources, e.g. income guarantees for all, ensuring all animals have minimally positive lives, aesthetic projects like preserving some artworks, etc. While today they must fight for a cut of the small pie, as long as they are granted a non-zero weighting in the long run, they can be satisfied. 0.00001% of GDP may be more than enough to give all humans a $40k income, for example.

This is especially good in light of normative uncertainty—as long as we give some weighting to various values, they will get satiated in a basic way in the long run.
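The 0.00001%-of-GDP figure only works under astronomical post-transition growth; a quick back-of-the-envelope check (my own arithmetic, assuming a population of 8 billion) shows the GDP that claim implies:

```python
# Assumed figures (mine, for illustration): 8 billion people, $40k/year each.
population = 8_000_000_000
income_per_person = 40_000          # dollars per year
share_of_gdp = 0.00001 / 100        # 0.00001% expressed as a fraction (1e-7)

# GDP needed for that tiny share to cover everyone's income guarantee.
required_gdp = population * income_per_person / share_of_gdp
print(f"required GDP: ${required_gdp:.2e}")   # ~3.2e21, vs roughly 1e14 today
```

So the claim presupposes world output growing by several orders of magnitude, which is precisely the long-run, post-superintelligence regime the paragraph has in mind.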


Continuity

Reasons to expect unusually high concentration and permutation of wealth and power:

  • In the modern world, salary is more evenly distributed than capital. Superintelligent AI is likely to greatly increase the factor share of income accruing to capital, leading to massive increases in inequality and increased concentration of wealth.

  • If a small group decides how the AI works and its high-level decisions, they could gain a decisive strategic advantage and take over the world.

  • If there is radical and unpredictable technological change, then it is likely that wealth distribution will change radically and unpredictably.

  • Automated security and surveillance systems will help a regime stay alive without support from the public or elites—when behaviour is more legible it’s easier to punish or control it. This is likely not only to sustain concentration of wealth and power, but also to increase it.

As such we wish to implement policies that sustain the existing distribution of wealth and power, rather than permitting radical concentration or permutation.

Also of interest is how much (given the high likelihood of redistribution, change in concentration, and general unpredictable turbulence) we seem to face a global, real-life, Rawlsian veil of ignorance. It might be good to set up things like insurance to make sure everyone gets some minimum of power and self-determination in the future. It seems that people have diminishing returns to power: “most people would much rather be certain to have power over one life (their own) than have a 10% chance of having power over the lives of ten people and a 90% chance of having no power.”
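The diminishing-returns intuition in that quote can be checked with a toy expected-utility calculation (my own sketch; the log utility is an assumed stand-in for any concave utility of power):

```python
import math

def u(lives_controlled):
    # log(1 + x): an assumed concave utility, i.e. diminishing returns to power.
    return math.log1p(lives_controlled)

# Certain power over one life (your own)...
certain = u(1)
# ...versus a 10% chance of power over ten lives, 90% chance of none.
gamble = 0.9 * u(0) + 0.1 * u(10)

# Expected number of lives controlled is 1.0 in both cases, but under a
# concave utility the certain option wins -- hence the case for insurance-like
# guarantees of a minimum of power and self-determination.
print(certain > gamble)  # True
```

Any strictly concave utility gives the same ordering here; the specific function only changes the margin.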

Population Desiderata

Mind crime prevention

Four key factors: novelty, invisibility, difference, and magnitude.

  • Novelty and invisibility: Sentient digital entities may be moral patients. They would be a novel type of mind, and would not exhibit many characteristics that inform our moral intuitions—they lack facial expressions, physicality, human speech, and so on, if they are being run invisibly in some microprocessor. This means we should worry about policy makers making unconscionable moral decisions.

  • Difference: It is also the case that these minds may be very different to human or animal minds, again subverting our intuitions about what behaviour is normative toward them, and increasing the complexity of choosing sensible policies here.

  • Magnitude: It may be incredibly cheap to create as many people as currently exist in a country, magnifying the concerns of the previous three factors. “With high computational speed or parallelization, a large amount of suffering could be generated in a small amount of wall clock time.” This may make mind crime prevention a principal desideratum in AI policy.

Population policy

This is a worry about Malthusian scenarios (where average income falls to subsistence levels). Hanson has written about these scenarios.

This can also undermine democracy (“one person, one vote”). If a political faction can invest in creating more people, they can create the biggest voting bloc. This leaves the following trilemma of options:

  • (i) deny equal votes to all persons

  • (ii) impose constraints on creating new persons

  • (iii) accept that voting power becomes proportional to ability and willingness to pay to create voting surrogates, resulting in both economically inefficient spending on such surrogates and the political marginalization of those who lack resources or are unwilling to spend them on buying voting power

Some interesting forms of (i):

  • Make voting rights something you inherit, with a 1-1 mapping.

  • Robin Hanson has suggested ‘speed-weighted voting’, because faster ems are more costly, so you’d actually have to pay a lot for marginal voters. This still looks like richer people getting a stronger vote, but in principle puts a much higher cost on it.

Process Desiderata

First principles thinking, wisdom, and technical understanding

Overall this is an especially different environment from usual policy-making. We will need to reconsider fundamental assumptions using first-principles thinking to a greater extent than before, and to be exceptionally wise (able to get the right answer to the most important questions while surrounded by confusion and misunderstanding).

Technological innovation is the primary driver of this radical new policy landscape, and so an understanding of the technologies is unusually helpful.

Speed and decisiveness

In many possible futures, historic events will be happening faster than global treaties are typically negotiated, ratified, and implemented. We need a capacity for rapid decision-making and decisive global implementation.


Many fundamental principles will need to be re-examined. Some examples: legitimacy, consent, political participation, accountability.

Voluntary consent. Given AIs that are super-persuaders and can convince anyone of anything, consent becomes a much vaguer and fuzzier concept. Perhaps consent only counts if the consenter has an “AI guardian” or “AI advisor” of some sort.

Political participation. This norm is typically justified on three grounds:

  • Epistemic benefit of including information from a maximal diversity of sources.

  • Ensures all interests and preferences are given some weighting in the decision.

  • Intrinsic good.


Under radical AI, each of these grounds may change:

  • The epistemic effect may become negative if the AI making decisions sits at a sufficiently high epistemic vantage point.

  • AI may be able to construct a process / mechanism that accounts for all values without consistent input from humans.

  • The intrinsic good is not changed, though it may not be worth the cost if the above two factors become strongly net negative and wasteful.

The above examples, of consent and political participation, are not at all clear-cut, but go to show that there are many unquestioned assumptions in modern political debate that may need either reformulation, abandonment, or extra vigilance spent on safeguarding their existence into the future.

Changes since 2016

The paper was originally added to Nick Bostrom’s website in 2016, and received an update in late 2018 (original, current).

The main updates as I can see them are:

  • The addition of ‘vector field approach’ to the title and body. It was lightly alluded to in the initial version. (I wonder if this was due to lots of feedback trying to fit the paper into standard value talk, where it did not want to be.)

  • Changing the heading from “Mode” to “Process”, and fleshing out the three desiderata rather than a single one called “Responsibility and wisdom”. If you read the initial paper, this is the main section to re-read to get anything new.

There have definitely been significant re-writes of the opening section, and there may be more, but I did not take the time to compare them section-for-section.

I’ve added some personal reflection/updates in a comment.