Assigning 10^(-13), i.e., suggesting that you’re so good that you can do this and be wrong only one time in ten million million, is just obviously wrong.
And worse:
Claudia’s entries lead to an even smaller probability.
Yes… 10^-13 is incredibly absurd. It isn’t consistent with our background knowledge. There aren’t even that many books: Google estimates there are something like 400m, with only 2m being added per year. So even if each book had 1 unique author (no one published multiple books, etc.), there still wouldn’t be that many authors, and a single misidentified author (an error rate of 1/400m) would blow away any confidence interval claiming an error rate <10^-13. We don’t have that level of confidence even in undisputed authors being who we think they are, because they could turn out to be someone else (see: the entire ghostwriting industry for all of history)! Any method which produces such an extreme confidence has simply disproven itself.
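To make the arithmetic concrete, here is a rough back-of-the-envelope sketch in Python. It uses only the two figures quoted above (roughly 400m books, a claimed error rate of 10^-13); the variable and function names are my own, and the one-author-per-book simplification is the same deliberately generous assumption as in the comment.

```python
# Back-of-the-envelope check of the 10^-13 claim.
# Assumptions (illustrative): ~400m books in total, one unique author each.
BOOKS_TOTAL = 400_000_000   # rough book count quoted above
CLAIMED_ODDS = 10**13       # "wrong one time in 10^13"


def observed_error_rate(errors: int, attributions: int) -> float:
    """Naive error rate: mistaken attributions divided by all attributions."""
    return errors / attributions


# A single misattribution among every book ever written:
one_error_rate = observed_error_rate(1, BOOKS_TOTAL)

# How many times larger than the whole corpus a track record would have
# to be before a 1-in-10^13 error rate could even be observed:
corpus_multiples_needed = CLAIMED_ODDS / BOOKS_TOTAL

print(f"error rate after one mistake: {one_error_rate:.1e}")
print(f"corpus multiples needed:      {corpus_multiples_needed:,.0f}")
```

Under these assumptions, one mistake anywhere in the corpus already implies an error rate thousands of times larger than the claimed one.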
Without reading the book, my guess is that all the differentials are systematically exaggerated upwards by perhaps an order of magnitude. For example, the emphasis put on education strikes me as playing on naive beliefs and overconfidence; if anyone did a systematic sample of undisputed Elizabethan and pre-Elizabethan writers, education would be found to be far weaker evidence than whatever odds the characters give. My guess is also that each differential is assumed to be independent & uncorrelated, without awareness of how this biases the results upwards. (I also did this in my essay, but I specifically highlighted it as a serious issue and tried to counter it with low estimates; so I wound up with high but not absurdly high estimates that I found intuitively acceptable after adjusting down only a relatively small amount.)
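A small sketch of why per-factor exaggeration matters so much once the factors are multiplied together as if independent. The likelihood ratios below are invented purely for illustration (they are not the book's numbers), and the sketch only models the exaggeration, not the additional bias from correlated evidence.

```python
import math

# Hypothetical likelihood ratios for four pieces of evidence
# (made-up values, NOT taken from the book).
stated_ratios = [20, 50, 100, 10]

# Suppose each one is overstated by an order of magnitude.
EXAGGERATION = 10
deflated_ratios = [r / EXAGGERATION for r in stated_ratios]

# Treating the factors as independent means simply multiplying them.
stated_product = math.prod(stated_ratios)
deflated_product = math.prod(deflated_ratios)

# The overstatement compounds: 10x per factor becomes 10^n overall.
overall_inflation = stated_product / deflated_product

print(f"stated combined odds:   {stated_product:,}")
print(f"deflated combined odds: {deflated_product:,.0f}")
print(f"overall inflation:      {overall_inflation:,.0f}x")
```

With only four factors, a 10x overstatement in each turns into a 10,000x overstatement in the combined odds, which is how a chain of individually plausible-looking numbers can end up somewhere like 10^-13.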
Which of course is not to say that the book couldn’t be educational and interesting, but it should definitely be approached with an adversarial attitude of ‘this is wrong; let me see what I can learn from it and how it went wrong for my own analyses’.
The education thing is actually an old issue. When the idea that Shakespeare didn’t write the plays first started coming up in the 19th century, it was heavily based on the education argument, which to some extent was possibly a proxy for British class issues: people in the nobility and upper classes not liking the idea that he wrote the plays. That aspect is still heavily present in a lot of the arguments about this.
Yes, I’ve heard the claim made that a man with ‘small Latin and smaller Greek’ (or however that went) could not have written Shakespeare’s plays. Having read them, I don’t find the claim at all compelling, but my assumption is that by this point in the controversy, someone has compiled a representative selection of authors and estimated their education, which would allow a direct empirical estimate of what the true correlation is.
Thanks for the clarification. I generally consider that kind of ‘failure in the analysis process’ to be a second-order effect, something to be taught after the audience is familiar and comfortable with handling the numbers at all. While a little bit of knowledge is dangerous, it’s an unavoidable phase that everyone must pass through.