How dath ilan coordinates around solving alignment

Summary

  • Eliezer Yudkowsky created dath ilan in 2014 as a fictional world in which he is the median person. Since then, he has written hundreds of pages of fiction in this setting, mostly “rational fiction”, meaning that he thinks most of what he writes about dath ilan would actually happen given the premise. This post is a series of quotes from various pieces of glowfic (a form of role-play fiction).

  • Dath ilan “successfully coordinated around maintaining a higher level of ability to solve coordination problems”.

  • When the possibility of AGI was first recognized “a few generations” ago, dath ilan’s world government reoriented its entire civilization around solving the alignment problem, including moving roughly 20% of its researchers into secret safety research and slowing the development of multiple major technologies. AGI safety research has now been ongoing in dath ilan for generations.

  • The possibility of AGI and other x-risks is treated as an infohazard of the highest importance; it was initially kept even from government representatives. Even at a technology level comparable to Earth’s current one, the average dath ilani has not heard of the possibility of AGI; nevertheless, their general education instills intuitions that may later be useful for alignment research.

  • Dath ilan’s prediction markets that include Keepers (who know about AGI) give a 97% chance that cryonically preserved people will eventually be revived. Since revival requires, among other things, that civilization survives AGI, this is a lower bound: Yudkowsky has confirmed[1] that it implies dath ilan has a >97% chance of solving alignment.

  • Yudkowsky thinks Earth will probably fail at alignment. It’s not clear whether the likely success of dath ilan is due to its higher average research ability, or its widespread coordination around civilizational priorities and infohazards, but I strongly suspect it’s the latter. Being fiction, Yudkowsky’s utopia is not the closest world to Earth where alignment is solved, but perhaps it can inspire us nonetheless.

The discovery of x-risk from AGI

This is dath ilan; and a few generations before Thellim’s time, the highest of the Keepers[2] called together the Nine Legislators in a secret meeting. Shortly after, the highest Keeper and the Nine called an emergency closed assembly of the 324 Representatives.

And the highest Keeper said, with the Nine Legislators at the peak of dath ilan standing beside her, that the universe had proven to be a harsher and colder and more dangerous place than had been hoped.

And that all Civilization needed to turn much of its effort away from thriving, and toward surviving. There needed to be controls and slowdowns and halts instituted on multiple major technologies. Which would need to be backed up by much more pervasive electronic surveillance than anybody had ever even considered allowing before. Roughly a fifth of all the present and future smart people in the world ought to publicly appear to burn out or retire, and privately work on a new secret project under maximum-security conditions. [...]

The reasoning behind this policy could, in principle, be laid out to the 324 Representatives. But that would represent a noticeable additional risk, if it happened now, while mechanisms to prevent information propagation hadn’t been set up yet.

How dath ilan deals with the infohazard of AGI

https://www.lesswrong.com/posts/gvA4j8pGYG4xtaTkw/i-m-from-a-parallel-earth-with-much-higher-coordination-ama

Emielle Potgieter: What is parallel earth’s biggest problem, then?

[...]

Eliezer Yudkowsky: I’d assume that Artificial General Intelligence is being seen by the Senior Very Serious People as a big problem, given the degree to which nobody ever talked about it, how relatively slow computing progress was compared to here, and how my general education just happened to prepare me to make a ton of correct inferences about it as soon as anybody mentioned the possibility to me. They claim to you it’s about hypothetical aliens and economic dysfunction scenarios, but boy howdy do you get a lot of Orthogonality and Goodhart’s Curse in the water supply.

https://yudkowsky.tumblr.com/post/81447230971/my-april-fools-day-confession

“I say this to complete the circle which began with my arrival: The world of dath ilan did *not* talk about existential risk. I strongly hypothesize that this is one of those things that the serious people and the shadarak had decided would not get better if everyone was talking about it. Nobody talked about nanotechnology, or superviruses, or advanced machine intelligence, and since I’m *damned* sure that our serious people had imaginations good enough to include that, the silence is conspicuous in retrospect. There was also a surprising amount of publicity about reflective consistency in decision theory and “imagine an agent which can modify itself”; I think, in retrospect, that this was to make sure that the basic theory their AI developers were using was exposed to as many eyes as possible. (That’s how I know about timeless decision theory in the first place, though the tiling agents stuff is being reconstructed from much dimmer recollections.) Our computing technology development stalled around 10 years before I came to Earth, and again, now that I’m on Earth, I’m reasonably certain that dath ilan could have built faster computer chips if that had been deemed wise by the serious people.

When I found myself in Eliezer Yudkowsky’s body, with new memories of all this rather important stuff that was somehow not talked about where I came from, I made my best guess that, if there was any purpose or meaning to my being here, it was handling Earth’s intelligence explosion. So that’s where I focused my efforts, and that’s why I haven’t tried to bring to this world any of the other aspects of dath ilan civilization… though I was rather dismayed, even given Yudkowsky’s memories, on how slow Earth’s support was for the mission I did try to prosecute, and I had to quixotically try to start Earth down the 200-year road to the de’a’na est shadarak before any kind of support developed at all for not having the intelligence explosion go entirely awry. And no, after I arrived I didn’t waste a lot of time on being upset or complaining about impossibility. It *is* impossible and it *was* upsetting, but rapid adaptation to the realities of a situation was a talked-up virtue where I came from.”

https://www.glowfic.com/replies/1780726#reply-1780726

“...that said, yes, the Keepers have bid against things in inscrutable ways, now and then. It wouldn’t be an especially helpful act to compile a public list of all the times they’ve done that, but they’ve done that even in markets I’ve been tracking. To this day I have absolutely no idea why the Keepers fear long-term consequences specified to the rest of us only as ‘people will later vote that was a bad idea’, if Civilization makes a harder push on teaching average kids more ‘computer-science’ once my generation’s kids are slightly smarter. I mean, it’s very credible that ‘computer-science’ reshapes some people’s thoughts in some internally-damaging direction, which the Keepers would rather not point out explicitly for obvious reasons. It doesn’t obviously fit into any plan of corrupt world domination. But… yeah, what the Keepers bid against, largely doesn’t get done, and if they were Hypothetically Corrupted, they could in fact be steering Civilization that way.”

https://www.glowfic.com/replies/1688763#reply-1688763

For the first time it occurs to Keltham to wonder if dath ilan used to have gods, and that’s what the Great Screen is meant to protect, because if you know the info for gods, you might pray to them… it would take a huge effort to keep not just the phenomenon but the physics behind it out of all the textbooks, but that’s the magnitude of effort dath ilan put in to the Great Screen. And if that’s not what’s going on, then there remains the unexplained question of why Keltham does not know any standard speculations about hypothetical superagents, that lots and lots of people could have hypothesized, hypotheses which pose a lot of interesting then-whats once you start looking in that direction

Dath ilan has a >97% chance to solve AI alignment

Note: Yudkowsky confirmed that this quote means he thinks dath ilan would actually solve alignment in at least 97% of worlds.

“About a hundred people every year die for real.”

“Everyone else goes into the cold where time stops, to wait it out, and awaken to whatever the Future brings, when Civilization becomes that powerful. There are far prediction markets that say it’s going to happen eventually with—what I would think would be unreasonably high probability, for something that far out, except that those markets are flagged with Keepers being allowed to trade in them. Whatever secrets the Keepers keep, they would be turned to the purpose of protecting the Preserved, if they were turned to anything at all. So I guess that number reflects what the Keepers would do if they had to, that nobody but them knows they can do.”

“How sure are they?”

“Ninety-seven percent, and without calibration training I expect you have no idea how flaming ridiculous that is for a prediction about the Future, but it’s really superheated ridiculous. Apparently the Keepers think they could make thirty completely different statements like that and be wrong once, and, them being the Keepers, they’ve already thought of every single possible reason worth considering for why that might not be true. And that’s not the probability of the tech working, it’s not the probability of revival being possible in principle, it’s not the probability that Civilization makes it that far, it’s not the probability of the Preserved being kept safe that long, it’s the final probability of the Preserved actually coming back.”
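A quick arithmetic gloss on the quoted calibration claim (my own illustration, not part of the excerpt): being wrong on one statement out of thirty corresponds to an accuracy of

$$1 - \frac{1}{30} = \frac{29}{30} \approx 0.967,$$

which is roughly the 97% figure the prediction markets report.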

  1. ^

    through personal communication

  2. ^

    people whose job it is to think carefully about infohazards and other unwieldy ideas