It might be difficult to tell the difference between a periodic pattern and a random walk. A random walk also produces alternating “bad” and “good” periods, such that after every bad period there is a good period (because what else could there be?) and vice versa. The durations of the resulting “cycles” would also usually be of similar magnitude. And, a priori, a random walk seems more likely and requires less explanation.
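To make that concrete, here is a quick NumPy sketch (the “bad” threshold, walk length, and seed are all arbitrary choices of mine): a plain Gaussian random walk, split at its own median, necessarily alternates between “bad” and “good” runs, and you can look at how their lengths are distributed.

```python
import numpy as np

rng = np.random.default_rng(0)
walk = np.cumsum(rng.normal(size=10_000))  # plain Gaussian random walk

# Call a step "bad" when the walk sits below its own median.
bad = walk < np.median(walk)

# Lengths of the alternating "bad"/"good" runs.
boundaries = np.flatnonzero(np.diff(bad.astype(int))) + 1
runs = np.diff(np.concatenate(([0], boundaries, [len(bad)])))

print(f"alternating periods: {len(runs)}")
print(f"run lengths, 10th/50th/90th percentile: "
      f"{np.percentile(runs, [10, 50, 90]).round()}")
```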
Tangentially, I think it would be really great if there were standard benchmarks on which quantitative theories of society/history/economics could compete by measuring to what extent they can compress the data. That would make it much easier to understand what is real and what is overfitting.
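As a sketch of what such a benchmark could score, one could charge each theory the Shannon code length (negative log-likelihood, in bits) it assigns to the data, plus bits for its parameters. The series below is a synthetic stand-in, not real historical data, and the random-walk model is just one illustrative entrant.

```python
import numpy as np
from scipy import stats

def code_length_bits(data, logpdf):
    """Shannon code length of `data` under a model: -log2 likelihood."""
    return -logpdf(data).sum() / np.log(2)

# Synthetic stand-in for some historical index.
rng = np.random.default_rng(1)
series = np.cumsum(rng.normal(size=500))
steps = np.diff(series)

# Entrant A: "it's a random walk" -- i.i.d. Gaussian steps.
sigma = steps.std()
bits = code_length_bits(steps, lambda x: stats.norm.logpdf(x, loc=0, scale=sigma))
print(f"random-walk model needs {bits:.0f} bits for {steps.size} steps")

# A cyclic theory would enter the same way: fit it, add the bits needed
# to describe its extra parameters, and see whether it compresses better.
```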
You will probably be interested in Seshat: the Global History Databank, a project led by Turchin. We’re still a long way from standard benchmarks, mind you, but we’re at least on the way to a database upon which any kind of quantitative work can be done.
They recently cracked the problem of whether big gods or big societies came first.
That’s what I think every time I hear “history repeats itself”. I wish Scott had considered the idea.
The biggest claim Turchin is making seems to be about the variance of the time intervals between “bad” periods. A random walk would imply that this variance is high, while genuine cycles would imply that it is low.
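A toy version of that test (the thresholds and hysteresis band are arbitrary picks of mine): compare the coefficient of variation of the gaps between “bad” onsets for a random walk versus a noisy sine. On typical runs the walk’s CV comes out much higher.

```python
import numpy as np

def onset_gaps(series, lo, hi):
    """Gaps between entries into the 'bad' state. The hysteresis band
    (must rise above `hi` before a new spell can start below `lo`)
    keeps noise chatter from counting as separate episodes."""
    onsets, in_bad = [], False
    for i, x in enumerate(series):
        if not in_bad and x < lo:
            onsets.append(i)
            in_bad = True
        elif in_bad and x > hi:
            in_bad = False
    return np.diff(onsets)

rng = np.random.default_rng(2)
n = 20_000
walk = np.cumsum(rng.normal(size=n))
cycle = np.sin(2 * np.pi * np.arange(n) / 500) + 0.3 * rng.normal(size=n)

walk_gaps = onset_gaps(walk, *np.percentile(walk, [40, 60]))
cycle_gaps = onset_gaps(cycle, -0.5, 0.5)

print(f"random walk CV: {walk_gaps.std() / walk_gaps.mean():.2f}")
print(f"noisy cycle CV: {cycle_gaps.std() / cycle_gaps.mean():.2f}")
```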
How things are weighted would make a big difference, especially since a lot of these theories rely on (aggregates of) proxies: the quantities they actually care about are long lost to history.
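For instance (entirely made-up proxies, chosen only to show the mechanism), the trend of a composite index can change substantially depending on the weights:

```python
import numpy as np

rng = np.random.default_rng(4)

# Three made-up proxies for the quantity of interest, as z-scores;
# only the first one actually carries the underlying trend.
proxies = rng.normal(size=(3, 200))
proxies[0] += np.linspace(0, 2, 200)

equal = proxies.mean(axis=0)                             # weight each proxy equally
skewed = np.average(proxies, axis=0, weights=[5, 1, 1])  # trust proxy 0 more

x = np.arange(200)
print(f"composite trend, equal weights:  {np.polyfit(x, equal, 1)[0]:.4f}")
print(f"composite trend, skewed weights: {np.polyfit(x, skewed, 1)[0]:.4f}")
```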
WRT overfitting, it would not be too hard to measure the error on a holdout set. Turchin et al. essentially used China as their holdout set.
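A minimal sketch of that procedure on synthetic data (the two “regions”, the fixed-amplitude sinusoidal model family, and the grid of candidate periods are all my own toy assumptions, nothing like Turchin’s actual models): fit the free parameter on one region, report error only on the other.

```python
import numpy as np

rng = np.random.default_rng(3)
years = np.arange(0, 2000, 10)

def toy_region(noise):
    return np.sin(2 * np.pi * years / 300) + noise * rng.normal(size=years.size)

europe, china = toy_region(0.4), toy_region(0.4)  # stand-in "regions"

def sse(period, series):
    """Squared error of a fixed-amplitude cycle of the given period."""
    return ((series - np.sin(2 * np.pi * years / period)) ** 2).sum()

# Fit the model's only free parameter on Europe alone...
best = min(np.arange(100, 600, 10), key=lambda p: sse(p, europe))

# ...and report error exclusively on the untouched holdout region.
print(f"period fit on 'Europe': {best} years")
print(f"holdout RMSE on 'China': {np.sqrt(sse(best, china) / years.size):.2f}")
```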