Visual systems — and other aspects of our cognition about physical objects — exploit the fact that physical objects are highly redundant.
For my purposes as a human, every square foot of a house wall is pretty much identical to every other square foot of it. One piece of wallboard is pretty much the same as any other, and minuscule differences are irrelevant: removing an atom or a crystal from the wallboard will not alter it much. So it is safe, both physically and epistemologically, to visually encode a whole wall as a bounded surface of a single color, material, and texture, rather than as a huge number of tiny individual constituent particles. There are of course many differences between one square foot of wall and the next, but those discrepancies are highly unlikely to be of any particular consequence.
The same is not true for code. Code is not highly redundant. One kilobyte of code is not pretty much the same as any other kilobyte of code, and minuscule differences are significant: altering a single bit can completely change the outcome of running the code. (Indeed, programmers try hard to reduce the redundancy of code; introducing redundancy is “cut & paste programming”, which is considered very bad practice.) Tiny differences in code are of substantial consequence.
With physical objects, most of the atomic-level details cancel out. With code, all of the bit-level details matter.
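To make that concrete, here’s a toy C example of my own (purely illustrative, nothing deep): two functions that differ by a single character yet return opposite answers for every input.

    #include <stdio.h>

    /* These two functions differ by one character ("==" vs "!="),
       yet they return opposite answers for every input. */
    int is_even(int x)     { return x % 2 == 0; }
    int is_even_bad(int x) { return x % 2 != 0; }  /* one char changed */

    int main(void) {
        printf("%d %d\n", is_even(4), is_even_bad(4));  /* prints: 1 0 */
        return 0;
    }

No square foot of wallboard behaves like that when you move one crystal.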
I like the example Oscar handed me: the sieve of Eratosthenes can be “compressed” as “a program that finds primes.”
The bit-level details matter to a CPU. A human given the instruction “write a program that finds prime numbers” wouldn’t start with bit 1 and then go to bit 2 and bit 3. They would start with high-level structure, maybe a key algorithm, and then move to fine details to make the high-level structure work. Another human looking at the code would be able to see this structure, see the key algorithm, and probably wouldn’t remember all the fine details.
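For reference, the sieve itself is only a few lines. A minimal C sketch (the limit of 100 is an arbitrary choice for illustration):

    #include <stdio.h>
    #include <string.h>

    #define LIMIT 100  /* arbitrary upper bound for this sketch */

    int main(void) {
        char composite[LIMIT + 1];
        memset(composite, 0, sizeof composite);

        /* The high-level structure: for each prime p, cross off its
           multiples. Everything else is detail in service of that. */
        for (int p = 2; p * p <= LIMIT; p++)
            if (!composite[p])
                for (int m = p * p; m <= LIMIT; m += p)
                    composite[m] = 1;

        for (int n = 2; n <= LIMIT; n++)
            if (!composite[n])
                printf("%d ", n);
        printf("\n");
        return 0;
    }

The two nested loops are the “key algorithm”; the bounds, the initialization, and the output loop are the fine details that a reader sees once and then forgets.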
A human given the instruction “write a program that finds prime numbers” must use their background knowledge of prime numbers and algorithms to derive the sieve of Eratosthenes or an equivalent algorithm.
You can study a piece of code, discern that it is an algorithm for finding primes, and label it as such. However, recognizing at a glance that a new piece of code is prime-finding code, rather than defective prime-finding code, is not possible: it requires close inspection, not compression. This does not lend itself to the kind of optimizations suggested by the expression “sensory modality”, i.e., an analogy to human sensory processing, which compresses away lots of details — even relevant details, which is why we have optical illusions, etc.
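As a concrete illustration (a deliberately buggy toy of my own, not from any real codebase): the function below has exactly the surface shape of a primality test, and only close inspection of the loop bound reveals the defect.

    /* Looks like a primality test at a glance. But the bound is
       "d * d < n" where it should be "d * d <= n", so squares of
       primes (4, 9, 25, 49, ...) are reported as prime. */
    int is_prime(int n) {
        if (n < 2) return 0;
        for (int d = 2; d * d < n; d++)
            if (n % d == 0) return 0;
        return 1;
    }

Compressing this to “prime-finding code” throws away the one detail that matters.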
That sort of thing is quickly checkable, though—the trouble is if it looks like it finds primes, actually does something else, and that something else isn’t one of the hypotheses generated. That could definitely make this much less useful if it were common.
I dunno, maybe you’re right and a holistic approach is necessary. But I feel like I understand code modularly rather than holistically, and though I sometimes lose track of details I’m usually pretty good at knowing which ones will be important later.
Ever hear of the Underhanded C Contest?
It’s easy to understand code modularly when it was written straightforwardly. It’s a lot harder when it’s spaghetti code … or when it’s actually intentionally tricky. Past a certain point there’s no choice but to actually step through each possible branch of the code — which can be done, but it’s hardly the sort of automatic layers of approximation and pattern-matching that our senses use.
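A tiny illustration of that failure mode (far milder than real contest entries): indentation that lies about structure, so that reading by visual shape gives the wrong answer.

    #include <stdio.h>

    int main(void) {
        int total = 0;
        /* The indentation suggests both statements are in the loop
           body, but only the first one is. */
        for (int i = 1; i <= 3; i++)
            total += i;
            total += 100;           /* runs once, not three times */
        printf("%d\n", total);      /* prints 106, not 306 */
        return 0;
    }

Pattern-matching on the visual layout says 306; stepping through the actual branches says 106.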
Someone linked me to the 2008 contest once. I’m more familiar with the Obfuscated C Contest, which would also be a problem. But I think it’s fine to only be able to “intuit” about straightforward-ish code. The main use, after all, would be to understand code written by your own programmers in order to self-improve.