We’re not actually done yet, though, because we can invert these probabilities to get my estimate for the classic paperclip scenario, where nothing of value is retained:
The basic point of Value Is Fragile is that if you lose any one of these essential properties, the value of the resulting universe drops to near-0.
Value isn’t just complicated, it’s fragile. There is more than one dimension of human value, where if just that one thing is lost, the Future becomes null. A single blow and all value shatters. Not every single blow will shatter all value—but more than one possible “single blow” will do so.
Personally, I don’t actually buy that. (For starters, I think a universe of trillions of identical beings replaying the same hyperoptimized meaning/bliss state is at least better than nothing. I think I could be convinced that it’s better than the transhuman galactic economy of delightfully strange beings that Eliezer is imagining.)
But, insofar as one buys the “value is fragile” thesis, the probability of the conjunction of all of these dimensions “going the wrong way” isn’t the relevant thing to estimate. You should care about the probability of the disjunction instead.
Given that assumption, and your own estimates, the probability of a valueless or near-valueless AI future is 0.5381056 (i.e., 1 minus the probability of all the dimensions “going right”).
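To make the arithmetic concrete, here’s a minimal sketch. The per-dimension probabilities below are hypothetical stand-ins (chosen so the product happens to match the figure above), not your actual estimates:

```python
from math import prod

# Hypothetical per-dimension probabilities that each essential
# property "goes right" -- placeholders for the parent post's estimates.
p_right = [0.88, 0.9, 0.9, 0.9, 0.9, 0.8]

p_all_right = prod(p_right)      # conjunction: every dimension preserved
p_any_wrong = 1 - p_all_right    # disjunction: at least one dimension lost

print(f"P(all dimensions go right):  {p_all_right:.7f}")  # 0.4618944
print(f"P(at least one goes wrong): {p_any_wrong:.7f}")   # 0.5381056
```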
(I’ll also observe that the simple multiplication assumes these dimensions are all independent, which seems unlikely, but that’s probably fine at the granularity of analysis you’re doing here?)
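For a sense of how much the independence assumption can matter, here’s a toy comparison of the two extreme dependence structures (numbers are hypothetical): if the dimensions all stand or fall together, the disjunction probability collapses to a single dimension’s failure probability.

```python
# Illustrative only: the independence assumption's effect on the disjunction.
q = 0.1  # hypothetical per-dimension failure probability
n = 6    # hypothetical number of value dimensions

p_independent = 1 - (1 - q) ** n  # failures compound across dimensions: ~0.47
p_perfectly_correlated = q        # all dimensions fail together: 0.10

print(f"independent:          {p_independent:.2f}")
print(f"perfectly correlated: {p_perfectly_correlated:.2f}")
```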