If we’re told that the data is a rule-generated sequence, then programs that are just lists can be excluded. That’s a conclusion from the data, not from the prior.
Plus, if you come across, out of nowhere, the sequence of prime numbers up to 11, what probability ought you assign that it is the sequence of primes instead of arbitrary numbers? I think the overhead of a computer program in comparison to a statement of mathematical requirement may be too great to give a realistic estimate. In particular, generating a nonexistence test in code carries a lot of overhead, while in math it’s quite compact.
Just listing them, we have “2,3,5,7,11”
That’s 10 symbols, if we let 11 count as 1 symbol (it’s short enough that a reasonable representation would leave it at its present length) but keep the commas as delimiters.
Via mathematical criterion, “a:!b,c>1|b*c=a”
That’s 14 characters, of which 2 can be considered overhead (could leave ‘a:’ implicit). Note, we dropped order information here. I suppose in the event of non-uniformly-increasing sequences, we’ll need the spot used here for ‘a:’ for something else.
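As a sanity check (my Python sketch, not part of the character count), the criterion transliterates directly:

```python
def satisfies_criterion(a):
    """Keep a iff there exist no b, c > 1 with b * c == a, i.e. a is prime."""
    return not any(b * c == a for b in range(2, a + 1) for c in range(2, a + 1))

# Scanning 2..11 (the scan itself supplies the order the criterion dropped)
# recovers the listed sequence.
sequence = [a for a in range(2, 12) if satisfies_criterion(a)]
```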
Now, code. Count each line as 2 characters, and I’m completely neglecting any runtime optimization.
Assign register 1 the value 2
(label NEXT) Assign register 2 the value 2.
(label CANDIDATE) test for equality between register 2 and register 1
(on success) jump to KEEP
Assign register 3 the value in register 1
apply modulo base register 2 to register 3
test for register 3 being 0.
(on success) jump to NEXT while incrementing register 1
jump to CANDIDATE while incrementing register 2
(label KEEP) output register 1
jump to NEXT while incrementing register 1
22 ‘characters’
It may be possible to trim a few more away, if the instruction set is optimized for efficient short-range jumps. If the test commands can include longer jumps than 1 step in them (e.g. a skip-2 test and a go-back-4 test) then you get to save 2 lines. If the modulo command takes 3 register arguments, you can save one more line.
With all of these, we’re down to 16 characters, still 4 more than the briefer variant of the mathematical criterion. There’s no way ‘ascending order’ costs 4 characters to make up that difference, or even 2 in the longer variant (since its 2 overhead characters could be taken to specify exactly that).
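To check that the register-machine listing above actually enumerates primes, here is a line-for-line Python transliteration (my sketch; I bound it with a limit so it terminates, whereas the original listing runs forever):

```python
def run_register_machine(limit):
    """Trial-division primes, mirroring the 11-line register-machine listing."""
    output = []
    r1 = 2                       # Assign register 1 the value 2
    while r1 <= limit:           # (the original loops forever; we bound it)
        r2 = 2                   # (NEXT) Assign register 2 the value 2
        while True:
            if r2 == r1:         # (CANDIDATE) equality test succeeds
                output.append(r1)    # (KEEP) output register 1
                break                # jump to NEXT, incrementing register 1
            r3 = r1 % r2         # copy r1 into r3, apply modulo base r2
            if r3 == 0:          # divisor found: composite
                break            # jump to NEXT, incrementing register 1
            r2 += 1              # jump to CANDIDATE, incrementing register 2
        r1 += 1
    return output
```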
Yet… I really suspect that if you run across the sequence “2, 3, 5, 7, 11” with no context other than that 2 was the beginning of the sequence, you ought to assign more than a 1/(sizeof char)^2 probability that it’s the sequence of primes—that would be somewhere around 0.0001 probability.
(edited: mathematical statement of primality did not exclude factors of 1)
Well, I dunno.
If I had a random C program print out 2,3,5,7,11, I would still assign a VERY low probability that it goes on to print primes correctly up to a reasonably big number. Even more so for Turing machines or anything of that kind. Ditto for any natural process. Ditto for seeing those numbers in, say, a child’s doodling when the child hasn’t learned primes yet. The child might have invented primes, but that list is not remotely enough evidence.
If a human test-writer gives me this sequence, I would guess that he knows of primes and is listing them to see if I know of primes too. But if he had never been taught primes, I would not assume he invented them from just this sequence.
Kolmogorov complexity really is language-dependent. If we do it informally in human language, then ‘primes’ is an answer shorter than ‘two three five seven eleven’. But then the complexity depends on the language and on expectations of what’s more complex.
Consider the ‘petals around the roses’ puzzle as a very extreme example. Most educated individuals just search some giant solution space and are blind to the dots themselves, seeing them as numbers. Unless they have encountered something of this kind before (e.g. the other puzzle about counting loops in numerals), in which case they solve it quite easily.
edit: There is another, slightly less related example of how the distribution differs for humans. If you look at natural data, the leading digit is most often 1. If you look at human-forged data, the first-digit frequencies are much more uniform, unless the person committing the forgery is very clever.
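That first-digit skew is Benford’s law; a quick sketch of the expected distribution for natural data:

```python
import math

# Benford's law: in many natural datasets the leading digit d occurs with
# probability log10(1 + 1/d), so about 30.1% of values start with 1,
# versus the uniform 1/9 ~ 11.1% a naive forger tends to produce.
benford = {d: math.log10(1 + 1 / d) for d in range(1, 10)}
```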
Very low probability, but much, much larger than most other specific sequences of comparable length!
All the cases in your first paragraph provide context. After the first few, the context essentially tells you whether it’s possible for the sequence to be an enumeration of primes.
In the first few cases, of unknown computer programs, do you really think that the prime number hypothesis should be struck with a 40 decibel probability penalty? I’d love to bet with you. Lots and lots of money, as often as possible.
Well, if the programming language had some primes() function that prints primes, then no, it shouldn’t. Ditto if it is a random choice among programs written by human beings.
If we are talking about a programming language like C, or assembly, or especially a Turing machine, and randomly generated programs, then I’m pretty sure that if you see 2,3,5,7,11 it is still quite unlikely (probability on the order of 10^-4 or less) that the program would go on to print primes correctly. (However, the chance that the program prints 13 next is way higher than 10^-4.)
In general, randomly generated programs have a tendency to do really weird stuff. There was an example posted right here:
http://lesswrong.com/lw/9pl/automatic_programming_an_example/
And the likelihood, by the way, depends on the programming language. In Wolfram Alpha, ‘5 primes’ will print you the primes. In x86 assembly, a division instruction may be used. In Z80 assembly, there is no division or multiplication instruction. On a Turing machine or anything of that sort, even addition needs to be ‘reinvented’.
In a language with a huge library of functions with Huffman-coded names (approximately the human language), the complexity will greatly depend on how often the people who made that library expected primes to be used.
I think if you want to write symbol-short programs for small problems, you should probably look at models that are more composition-based, so as to remove the identifiers/register names that are not really part of the complexity of the problem. To name some actually existing programming languages: (APL or J), or (Forth or Factor). APL in particular uses single-character symbols in a character set designed for the language.
Then there’s compression. If you want to keep it reasonably human-understandable and compositional, how about Huffman-coding the set of symbols?
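A minimal sketch of that idea: the standard Huffman construction over a symbol-frequency table (the symbols and counts here are made up for illustration):

```python
import heapq

def huffman_codes(freqs):
    """Build a prefix-free binary code; frequent symbols get shorter codewords."""
    # Heap entries: (total weight, tiebreaker, {symbol: codeword so far}).
    heap = [(w, i, {s: ""}) for i, (s, w) in enumerate(freqs.items())]
    heapq.heapify(heap)
    tiebreak = len(heap)
    while len(heap) > 1:
        w1, _, c1 = heapq.heappop(heap)   # two lightest subtrees
        w2, _, c2 = heapq.heappop(heap)
        merged = {s: "0" + code for s, code in c1.items()}
        merged.update({s: "1" + code for s, code in c2.items()})
        heapq.heappush(heap, (w1 + w2, tiebreak, merged))
        tiebreak += 1
    return heap[0][2]

# Hypothetical usage frequencies for a tiny symbol set.
codes = huffman_codes({"map": 50, "fold": 20, "primes": 5})
```

The rarely-expected symbol ends up with the longest codeword, which is exactly the point about library designers’ expectations above.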
As noted, language-dependent for sure. APL looks appropriate for this… but… Wikipedia says this code snippet finds all of the prime numbers up to R...
(~R∊R∘.×R)/R←1↓⍳R
which is 17 characters, and you need to feed it the top of the range. Machine code wins!
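For readers who don’t read APL, here is a rough Python gloss of what that 17-character snippet does (my paraphrase of the idiom):

```python
def apl_primes(R):
    # R ← 1↓⍳R : take 1..R and drop the first element, leaving 2..R.
    r = list(range(2, R + 1))
    # R∘.×R : outer product, i.e. every pairwise product of elements of r.
    products = {a * b for a in r for b in r}
    # (~R∊...)/R : keep the elements of r that do NOT appear in that table.
    return [x for x in r if x not in products]
```

Note it finds primes by brute-force table lookup rather than by trial division, which is why it composes so compactly.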
By the way, machine-code symbols are already pretty close to Huffman-coded.