I’m confused why this is getting downvotes. Maybe because the title was originally “Yudkowsky: dath ilan has a 97% chance to solve alignment” which is misleading to people who don’t know what dath ilan is?
I originally downvoted because I couldn’t figure out the point. The title definitely contributed to that, though not for the reason you suggest: rather, the title made it sound like this fact about a fictional world was the point, in the same way that you would expect to come away from a post titled “Rowling: Wizarding Britain has a GDP of 1 million Galleons” pretty convinced about a fictional GDP but with possibly no new insights on the real world.

I think the new title makes it a bit clearer what one might expect to get out of this: it’s more like “here’s one (admittedly fictional) story of how alignment was handled; could we do something similar in our world?”. I’m curious if that’s the intended point?

If so then all the parts analyzing how well it worked for them (e.g. the stuff about cryonics and 97%) still don’t seem that relevant: “here’s an idea from fiction you might consider in reality” and “here’s an idea from fiction which worked out for them you might consider in reality” provide me almost identical information, since the valuable core worth potentially adopting is the idea, but all discussion of whether it worked in fiction tells me extremely little about whether it will work in reality (especially if, as in this case, very little detail is given on why it worked in fiction).
The “dying with dignity” post fits the pattern of Eliezer using April Fools’ for things which cannot be constructively discussed as fact. With a sufficiently implausible claim, maintaining belief in the other party’s rationality is infeasible, so it might as well be processed as fiction. “I am not saying this” can also be construed as a method of making readers sink more deeply into the fiction in trying to maintain plausibility. It does trigger false positives for using fiction as evidence.
I would not want gathering evidence for UFOs to become impossible because it gets binned as fiction. “I didn’t see anything, but if I had seen something, here is what I would have seen...”. If there were an alien injunction, I would wish for it to be detectable. But in the same vein, people handle very low odds badly enough that “stay out of low-probability considerations” is a prime candidate for the best advice on the matter.
I think the case for why dath ilan is relevant to the real world basically rests on two pieces of context which most people don’t have:

1. Yudkowsky mostly writes rationalfic, so dath ilan is his actual expectations of what would happen if he were the median person, not just some worldbuilding meant to support a compelling narrative.

2. Yudkowsky tries pretty hard to convey useful thoughts about the world through his fiction. HPMOR was intended as a complement to the Sequences. Glowfic is an even weirder format, but I’ve heard Yudkowsky say that he can’t write anything else in large volume anymore, and glowfic is an inferior substitute still meant to edify. Overall, I’d guess that these quotes are roughly 30% random irrelevant worldbuilding, and 70% carefully written vignette to convey what good coordination is actually capable of.
I’m not down or upvoting, but I will say, I hope you’re not taking this exercise too seriously...
Are we really going to analyze one person’s fiction (even if rationalist, it’s still fiction), in an attempt to gain insight into this one person’s attempt to model an entire society and its market predictions – and all of this in order to try and better judge the probability of certain futures under a number of counterfactual assumptions? Could be fun, but I wouldn’t give its results much credence.
Don’t forget Yudkowsky’s own advice about not generalizing from fictional evidence and being wary of anchoring. If I had to guess, some of his use of fiction is just an attempt to provide alternative framings and anchors to those thrust on us by popular media (more mainstream TV shows, movies etc). That doesn’t mean we should hang on his every word though.
Yeah, I think the level of seriousness is basically the same as if someone asked Eliezer “what’s a plausible world where humanity solves alignment?” to which the reply would be something like “none unless my assumptions about alignment are wrong, but here’s an implausible world where alignment is solved despite my assumptions being right!”
The implausible world is sketched out in way too much detail, but lots of usefulness points are lost by its being implausible. The useful kernel remaining is something like “with infinite coordination capacity we could probably solve alignment” plus a bit because Eliezer fiction is substantially better for your epistemics than other fiction. Maybe there’s an argument for taking it even less seriously? That said, I’ve definitely updated down on the usefulness of this given the comments here.
I downvoted for the clickbait title, for making obviously wrong inferences from quoted material, and also for being about some fiction instead of anything relevant to the real world. In these quotes Eliezer is not claiming that his fictional dath ilan has a 97% chance of solving alignment, and even if he were, so what?
Eliezer certainly didn’t say that in anything quoted in the original post, and what he did write in that glowfic does not imply it. He may hold that dath ilan has a 97% chance to solve alignment, but that’s an additional claim about his fictional universe that does not follow from what his character Keltham said.
The combination of both statements also strains my suspension of disbelief in his setting even further than it already is. Either one alone is bad enough, but together they imply a great deal more than I think Eliezer intended.
Seeing the relative lack of pickup in terms of upvotes, I just want to thank you for putting this together. I’ve only read a couple of Dath Ilan posts, and this provided a nice coverage of the AI-in-Dath-Ilan concepts, many of the specifics of which I had not read previously.
Thanks, I made minor edits to clarify the post.
Some of those are valid reasons to downvote, but