So my understanding then would be that initial skew tells you how fast you will approach the skew of a Gaussian (i.e. 0) and initial kurtosis tells you how fast you approach the kurtosis of a Gaussian (i.e. 3)?
Using my calibrated eyeball it looks like each time you convolve a function with itself the kurtosis moves half of the distance to 3. If this is true (or close to true) and if there is a similar rule for skew then that would seem super useful.
I do have some experience with distributions where kurtosis is very important. In one case I initially modelled the data with a normal distribution, but as more data became available I found it was better to replace that with a logistic distribution, which has thicker tails. This can be very important when analysing safety-critical components, where the tail of the distribution is key.
If you have two independent things with kurtoses $k_1, k_2$ and corresponding variances $v_1, v_2$ then their sum (i.e., the convolution of the probability distributions) has kurtosis $\left(\frac{v_1}{v_1+v_2}\right)^2 k_1 + \left(\frac{v_2}{v_1+v_2}\right)^2 k_2 + \frac{6 v_1 v_2}{(v_1+v_2)^2}$ (in general there are two more cross-terms involving “cokurtosis” values that equal 0 in this case, and the last term involves another cokurtosis that equals 1 in this case).
We can rewrite this as $\left(\frac{v_1}{v_1+v_2}\right)^2 (k_1-3) + \left(\frac{v_2}{v_1+v_2}\right)^2 (k_2-3) + 3\left(\left(\frac{v_1}{v_1+v_2}\right)^2 + \frac{2 v_1 v_2}{(v_1+v_2)^2} + \left(\frac{v_2}{v_1+v_2}\right)^2\right)$ which equals $\left(\frac{v_1}{v_1+v_2}\right)^2 (k_1-3) + \left(\frac{v_2}{v_1+v_2}\right)^2 (k_2-3) + 3$. So if both kurtoses differ from 3 by at most $\delta$ then the new kurtosis differs from 3 by at most $\frac{v_1^2+v_2^2}{(v_1+v_2)^2}\,\delta$ which is at most $\delta$, and strictly less provided both variances are nonzero. If $v_1 = v_2$ then indeed the factor is exactly 1⁄2.
So Maxwell’s suspicions and Bucky’s calibrated eyeball are both correct.
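For anyone who wants to check this numerically, here is a minimal sketch in plain Python. It is only an illustration: the two small discrete distributions `a` and `b` are made up for the purpose, and the helper names (`moments`, `convolve`, `kurtosis_of_sum`) are hypothetical. It compares the kurtosis formula against the kurtosis computed directly from the convolved distribution, then convolves `a` with itself repeatedly to watch the distance from 3.

```python
from itertools import product


def moments(pmf):
    """Return (variance, kurtosis) of a discrete distribution {value: probability}."""
    mean = sum(x * p for x, p in pmf.items())
    var = sum((x - mean) ** 2 * p for x, p in pmf.items())
    m4 = sum((x - mean) ** 4 * p for x, p in pmf.items())
    return var, m4 / var ** 2


def convolve(pmf_a, pmf_b):
    """Distribution of the sum of two independent discrete variables."""
    out = {}
    for (xa, pa), (xb, pb) in product(pmf_a.items(), pmf_b.items()):
        out[xa + xb] = out.get(xa + xb, 0.0) + pa * pb
    return out


def kurtosis_of_sum(k1, v1, k2, v2):
    """Kurtosis of a sum of two independent variables, per the formula above."""
    w1 = v1 / (v1 + v2)
    w2 = v2 / (v1 + v2)
    return w1 ** 2 * k1 + w2 ** 2 * k2 + 6 * w1 * w2


# Two arbitrary example distributions (assumptions, chosen only for illustration).
a = {0: 0.7, 1: 0.2, 5: 0.1}
b = {0: 0.5, 2: 0.5}

va, ka = moments(a)
vb, kb = moments(b)
_, k_direct = moments(convolve(a, b))
k_formula = kurtosis_of_sum(ka, va, kb, vb)
print("direct:", k_direct, "formula:", k_formula)

# Repeated self-convolution: record the distance of the kurtosis from 3 each time.
excess = []
pmf = a
for _ in range(4):
    excess.append(moments(pmf)[1] - 3)
    pmf = convolve(pmf, pmf)
print("kurtosis minus 3 after each self-convolution:", excess)
```

The two printed kurtosis values should agree up to floating-point error, and each entry in the `excess` list should be about half of the one before it, matching the equal-variance case.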
Wow! Cool—thanks!
Those possible approximate rules are interesting. I’m not sure about the answers to any of those questions.