Protected From Myself

Followup to: The Magnitude of His Own Folly, Entangled Truths, Contagious Lies

Every now and then, another one comes before me with the brilliant idea: “Let’s lie!”

Lie about what?—oh, various things. The expected time to Singularity, say. Lie and say it’s definitely going to be earlier, because that will get more public attention. Sometimes they say “be optimistic”, sometimes they just say “lie”. Lie about the current degree of uncertainty, because there are other people out there claiming to be certain, and the most unbearable prospect in the world is that someone else pull ahead. Lie about what the project is likely to accomplish—I flinch even to write this, but occasionally someone proposes to go and say to the Christians that the AI will create Christian Heaven forever, or go to the US government and say that the AI will give the US dominance forever.

But at any rate, lie. Lie because it’s more convenient than trying to explain the truth. Lie, because someone else might lie, and so we have to make sure that we lie first. Lie to grab the tempting benefits, hanging just within reach—

Eh? Ethics? Well, now that you mention it, lying is at least a little bad, all else being equal. But with so much at stake, we should just ignore that and lie. You’ve got to follow the expected utility, right? The loss of a lie is much less than the benefit to be gained, right?

Thus do they argue. Except—what’s the flaw in the argument? Wouldn’t it be irrational not to lie, if lying has the greatest expected utility?

When I look back upon my history—well, I screwed up in a lot of ways. But it could have been much worse, if I had reasoned like those who offer such advice, and lied.

Once upon a time, I truly and honestly believed that either a superintelligence would do what was right, or else there was no right thing to do; and I said so. I was uncertain of the nature of morality, and I said that too. I didn’t know if the Singularity would be in five years or fifty, and this also I admitted. My project plans were not guaranteed to deliver results, and I did not promise to deliver them. When I finally said “Oops”, and realized that I needed to go off and do more fundamental research instead of rushing to write code immediately—

—well, I can imagine the mess I would have had on my hands, if I had told the people who trusted me: that the Singularity was surely coming in ten years; that my theory was sure to deliver results; that I had no lingering confusions; and that any superintelligence would surely give them their own private island and a harem of catpersons of the appropriate gender. How exactly would you then explain why you’re now going to step back and look for math-inventors instead of superprogrammers, or why the code now has to be theorem-proved?

When you make an honest mistake, on some subject you were honest about, the recovery technique is straightforward: Just as you told people what you thought in the first place, you now list out the actual reasons that changed your mind. This diff takes you to your current true thoughts, which imply your current desired policy. Then, just as people decided whether to aid you originally, they re-decide in light of the new information.

But what if you were “optimistic” and only presented one side of the story, the better to fulfill that all-important goal of persuading people to your cause? Then you’ll have a much harder time persuading them away from that idea you sold them originally—you’ve nailed their feet to the floor, which makes it difficult for them to follow if you yourself take another step forward.

And what if, for the sake of persuasion, you told them things that you didn’t believe yourself? Then there is no true diff from the story you told before, to the new story now. Will there be any coherent story that explains your change of heart?

Conveying the real truth is an art form. It’s not an easy art form—those darned constraints of honesty prevent you from telling all kinds of convenient lies that would be so much easier than the complicated truth. But, if you tell lots of truth, you get good at what you practice. A lot of those who come to me and advocate lies, talk earnestly about how these matters of transhumanism are so hard to explain, too difficult and technical for the likes of Joe the Plumber. So they’d like to take the easy way out, and lie.

We don’t live in a righteous universe where all sins are punished. Someone who practiced telling lies, and made their mistakes and learned from them, might well become expert at telling lies that allow for sudden changes of policy in the future, and telling more lies to explain the policy changes. If you use the various forbidden arts that create fanatic followers, they will swallow just about anything. The history of the Soviet Union and their sudden changes of policy, as presented to their ardent Western intellectual followers, helped inspire Orwell to write 1984.

So the question, really, is whether you want to practice truthtelling or practice lying, because whichever one you practice is the one you’re going to get good at. Needless to say, those who come to me and offer their unsolicited advice do not appear to be expert liars. For one thing, a majority of them don’t seem to find anything odd about floating their proposals in publicly archived, Google-indexed mailing lists.

But why not become an expert liar, if that’s what maximizes expected utility? Why take the constrained path of truth, when things so much more important are at stake?

Because, when I look over my history, I find that my ethics have, above all, protected me from myself. They weren’t inconveniences. They were safety rails on cliffs I didn’t see.

I made fundamental mistakes, and my ethics didn’t halt that, but they played a critical role in my recovery. When I was stopped by unknown unknowns that I just wasn’t expecting, it was my ethical constraints, and not any conscious planning, that had put me in a recoverable position.

You can’t duplicate this protective effect by trying to be clever and calculate the course of “highest utility”. The expected utility calculation only takes into account the things you know to expect. It really is amazing, looking over my history, the extent to which my ethics put me in a recoverable position from my unanticipated, fundamental mistakes, the things completely outside my plans and beliefs.

Ethics aren’t just there to make your life difficult; they can protect you from Black Swans. A startling assertion, I know, but not one entirely irrelevant to current affairs.

If you’ve been following my story, you’ll recall that the downfall of all my theories began with a tiny note of discord. A tiny note that I wouldn’t ever have followed up, if I had only cared about my own preferences and desires. It was the thought of what someone else might think—someone to whom I felt I owed an ethical consideration—that spurred me to follow up that one note.

And I have watched others fail utterly on the problem of Friendly AI, because they simply try to grab the banana in one form or another—seize the world for their own favorite moralities, without any thought of what others might think—and so they never enter into the complexities and second thoughts that might begin to warn them of the technical problems.

We don’t live in a righteous universe. And so, when I look over my history, the role that my ethics have played is so important that I’ve had to take a step back and ask, “Why is this happening?” The universe isn’t set up to reward virtue—so why did my ethics help so much? Am I only imagining the phenomenon? That’s one possibility. But after some thought, I’ve concluded that, to the extent you believe that my ethics did help me, these are the plausible reasons in order of importance:

1) The honest Way often has a kind of simplicity that transgressions lack. If you tell lies, you have to keep track of the different stories you’ve told different groups, and worry about which facts might reach the wrong people, and then invent new lies to explain any unexpected policy shifts you have to execute on account of your mistakes. This simplicity is powerful enough to explain a great deal of the positive influence that I attribute to my ethics, in a universe that doesn’t reward virtue per se.

2) I was stricter with myself, and held myself to a higher standard, when I was doing various things that I considered myself ethically obligated to do. Thus my recovery from various failures often seems to have begun with an ethical thought of some type—e.g. the whole development where “Friendly AI” led into the concept of AI as a precise art. That might just be a quirk of my own personality; but it seems to help account for the huge role my ethics played in leading me to important thoughts, which I cannot just explain by saying that the universe rewards virtue.

3) The constraints that the wisdom of history suggests, to avoid hurting other people, may also stop you from hurting yourself. When you have some brilliant idea that supposedly benefits the tribe, we still don’t want you to run off and do X, Y, and Z, even if you say “the end justifies the means!” Evolutionarily speaking, one suspects that the “means” have more often benefited the person who executes them than the tribe. But this is not the ancestral environment. In the more complicated modern world, following the ethical constraints can prevent you from making huge networked mistakes that would catch you in their collapse. Robespierre led a shorter life than Washington.

Part of the sequence Ethical Injunctions

Next post: “Ethical Inhibitions”

Previous post: “Ends Don’t Justify Means (Among Humans)”