Cool! Would it be easy for you to repeat this replacing the normal distribution with an exponential distribution? I think that’s a more natural way to model “waiting for something”.
You’re right, the probability should drop off in a kind of exponential curve, since an AI only “gets created at year X” if it hasn’t been made before X. I did some thinking, and I think I can do one better. For the most part, we can model the creation of the AI as a succession of “technological breakthroughs”, i.e. insights about algorithms or processors or whatever that are unpredictable in advance and that allow the project to “proceed to the next step”.
Each step can have an exponential distribution for when it will be completed, all (for simplicity) with the same average, set so that the average time for the final distribution will be 50 years (from the year 2000). The final distribution is then P(totaltime = x) = P(sum{time for each step} = x) which is just the repeated convolution of the distribution for each step. The density distribution turns out to be fairly funky:
$$P(\text{totaltime} = x) = \frac{a^n x^{n-1} e^{-ax}}{(n-1)!},$$
where n is the number of steps involved and a is the parameter to the exponential distributions, which for our purposes is n/50 so that the mean of the pdf is 50 years. For n=1 this is of course just the exponential distribution. For n=5 we get a distribution something like this. The cumulative distribution function is actually a bit nicer in a way, just:
$$P(\text{totaltime} \le x) = \frac{\gamma(n, ax)}{(n-1)!}$$
where γ(s, x) is the lower incomplete gamma function, or Γ(s) - Γ(s, x), which is normalized by (n-1)!. Ok, let’s plug in some numbers. The relevant Haskell expression is let ai n x = (let a = (n/50); p = incompleteGamma n (a*x) in (1.0 - p)/(1.0/0.9 - p))¹ where x is years since 2000 without AI. Then for the simple exponential case:
P(AI|not by 2011, n=1) = 0.88
P(AI|not by 2030, n=1) = 0.83
P(AI|not by 2050, n=1) = 0.77
P(AI|not by 2070, n=1) = 0.69
P(AI|not by 2080, n=1) = 0.65
P(AI|not by 2099, n=1) = 0.55
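As a check on where the (1.0 - p)/(1.0/0.9 - p) form comes from, here is a short worked derivation. It assumes the 0.9 in the expression is the prior probability that AI is possible at all; that reading is inferred from the expression itself rather than stated here. With p the probability that the total time is at most x, Bayes' rule gives:

$$
P(\text{AI} \mid \text{no AI by } x) = \frac{0.9\,(1-p)}{0.9\,(1-p) + 0.1} = \frac{1-p}{1/0.9 - p}, \qquad p = P(\text{totaltime} \le x).
$$

For n=1 and x=11 (the year 2011), a = 1/50, so:

$$
p = 1 - e^{-11/50} \approx 0.1975, \qquad \frac{1 - 0.1975}{1/0.9 - 0.1975} \approx \frac{0.8025}{0.9136} \approx 0.88.
$$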
We seem to lose confidence at a somewhat constant gradual rate. By 2150 our probability is still 0.30 though, and it’s only by 2300 that it drops to ~2%. Perhaps I need to just cut off the distribution by 2100. Anyway, for n=5 we have more conclusive results:
P(AI|not by 2011, n=5) = 0.8995
P(AI|not by 2030, n=5) = 0.88
P(AI|not by 2050, n=5) = 0.80
P(AI|not by 2070, n=5) = 0.61
P(AI|not by 2080, n=5) = 0.47
P(AI|not by 2099, n=5) = 0.22
P(AI|not by 2150, n=5) = 0.0077
So we don’t change our confidence much until 2050, then quickly lose confidence, as that area contains “most of the drawers”, metaphorically. We will have pretty much disproved strong AI by 2150.
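For anyone who wants to reproduce these numbers without the Statistics package, the following is a minimal sketch using only base. It is not the footnoted expression itself: instead of incompleteGamma it uses the closed form that holds when n is a whole number, 1 - p = exp(-a*x) * sum over k < n of (a*x)^k / k!. The names survival and posterior are made up for this sketch, and the 0.9 prior is read off the original expression.

```haskell
import Text.Printf (printf)

-- | P(total time > x) for the sum of n exponential steps.
-- Hypothetical helper; valid only for whole-number n, using the closed form
-- exp(-a*x) * sum_{k=0}^{n-1} (a*x)^k / k!.
survival :: Int -> Double -> Double
survival n x = exp (negate y) * sum (take n terms)
  where
    a = fromIntegral n / 50                     -- rate chosen so the mean total time is 50 years
    y = a * x
    terms = scanl (\t k -> t * y / k) 1 [1 ..]  -- y^k / k! for k = 0, 1, 2, ...

-- | P(AI | no AI after x years since 2000), assuming a 0.9 prior.
-- Algebraically the same as (1 - p) / (1/0.9 - p) with p = 1 - survival n x.
posterior :: Int -> Double -> Double
posterior n x = s / (s + 1 / 0.9 - 1)
  where
    s = survival n x

-- Print one line per (n, year) pair, in the same layout as the tables.
main :: IO ()
main = mapM_ row [(n, x) | n <- [1, 5], x <- [11, 30, 50, 70, 80, 99, 150]]
  where
    row :: (Int, Double) -> IO ()
    row (n, x) =
      printf "P(AI|not by %d, n=%d) = %.4f\n" (2000 + round x :: Int) n (posterior n x)
```

Run with runghc, this should print values that agree with the two tables up to rounding. For non-integer n the closed form no longer applies, and a real incomplete gamma implementation would be needed instead.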
¹ Using the Statistics package again, specifically Statistics.Math. Note that incompleteGamma is already normalized so we don’t need to divide by (n-1)!.
This is great. The fact that P(AI) is dropping off faster for large n than for small n is a little counterintuitive, right?
Isn’t it just the law of large numbers?
This isn’t even related to the law of large numbers, which says that if you flip many coins you expect to get close to half heads and half tails. This is as opposed to flipping 1 coin, where you expect to always get either 100% heads or 100% tails.
I personally expected that P(AI) would drop off roughly linearly as n increased, so this certainly seems counter-intuitive to me.