One thing I found really interesting about Ben’s submission is that his probabilities went as high as 88%. Ben’s was an outlier among the best entries in this respect: most of the top entries had a maximum probability in the 60-80% range. This confused me: could you really be 88% sure that a string was random, when you know you’re assigning something more like 60-65% to a typical random string? Can a string look really random (rather than merely random)? That seems like an oxymoron: almost by definition, random strings don’t stand out in any way.
In principle, at least, the explanation is straightforward. Suppose X measures how many tests of randomness a string passes (i.e. tests for some feature that we expect to occur more often in real strings than in fake ones), and we assign our probability as a monotonically increasing function of X.
A typical fake string fails many tests of randomness, whereas a typical real string fails few. Yet there is still variation in X within each category. Hence some real strings will pass nearly all the tests and get an unusually high probability of being real.
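To make this concrete, here is a minimal sketch under made-up assumptions: five independent tests, a real string passing each with probability 0.8, a fake string passing each with probability 0.6, and a 50/50 prior. None of these numbers come from the contest or from Ben’s entry; they are only there to show how a monotone rule based on X can give a typical real string a modest probability and a lucky one a much higher probability.

```python
from math import comb

def binom_pmf(k, n, p):
    """Probability of exactly k passes out of n independent tests."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)

def posterior_real(k, n_tests=5, p_real=0.8, p_fake=0.6, prior_real=0.5):
    """P(string is real | it passed k tests), by Bayes' rule.

    All parameter values are hypothetical, chosen for illustration only.
    """
    weight_real = binom_pmf(k, n_tests, p_real) * prior_real
    weight_fake = binom_pmf(k, n_tests, p_fake) * (1 - prior_real)
    return weight_real / (weight_real + weight_fake)

if __name__ == "__main__":
    for k in range(6):
        share_of_real = binom_pmf(k, 5, 0.8)  # how often a real string scores k
        print(f"X={k}: P(real)={posterior_real(k):.3f}, "
              f"share of real strings={share_of_real:.3f}")
```

In this toy model a real string passes four tests on average and gets about 61%, while one that passes all five gets about 81%, so the top probability sits well above the typical one without any string needing to look "extra random" in some mysterious sense.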
I think this is mainly a feature of the strings’ relatively short length: longer random strings would show less variation in X.
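A quick way to check the length effect is to simulate it. The sketch below assumes binary strings and uses one stand-in statistic, the fraction of adjacent positions that differ; both choices are mine for illustration, not something taken from the entries. It measures how much that statistic varies across truly random strings of a given length.

```python
import random
import statistics

def alternation_rate(bits):
    """Fraction of adjacent pairs that differ (about 0.5 for random bits)."""
    flips = sum(a != b for a, b in zip(bits, bits[1:]))
    return flips / (len(bits) - 1)

def spread(length, trials=500, seed=0):
    """Standard deviation of the alternation rate across random strings."""
    rng = random.Random(seed)  # fixed seed so runs are repeatable
    rates = [
        alternation_rate([rng.getrandbits(1) for _ in range(length)])
        for _ in range(trials)
    ]
    return statistics.pstdev(rates)

if __name__ == "__main__":
    for length in (50, 200, 1000):
        print(f"length {length}: spread {spread(length):.4f}")
```

For this statistic the spread should shrink roughly like 1/sqrt(length), so a long random string has much less room to look unusually random (or unusually fake) on any single feature.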