The case of Switzerland is different, so I won’t talk about it. In the other cases, what is going on is about fixed points. Call the source text of the latest compiler S, and the compiled version X. These satisfy X(S)=X: the compiled code, when given its own source as input, returns itself. However, there are many different pieces of machine code Y with the property that Y(S)=Y. Some will ignore their input entirely and quine themselves; some will be nonsensical data shuffling that produces gibberish on any other input. A few might be compilers that detect when they are compiling themselves and insert a malicious payload (see https://en.wikipedia.org/wiki/Backdoor_(computing)#Compiler_backdoors). It’s also possible that some feature of the programming language is defined in a way that uses itself, so that there are stable modifications to the language. Suppose the only time an else block is used is in defining what to do when compiling an else block. Then a broken compiler that never ran any code in else blocks might compile into itself.
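To make the fixed-point picture concrete, here is a toy sketch in Python. Everything in it is an assumption made for illustration: the `Binary` record, the `run` function and the three kinds of binary are stand-ins for real machine code, not a model of any actual compiler. It only shows that several different Y can satisfy Y(S)=Y for the same S.

```python
from dataclasses import dataclass

# Stand-in for the compiler's source text (hypothetical).
S = "source text of the compiler"


@dataclass(frozen=True)
class Binary:
    kind: str        # "compiled", "quine" or "trojan" (toy labels)
    built_from: str  # the source this binary claims to come from


def run(binary: Binary, source: str) -> Binary:
    """Feed `source` to `binary` and return whatever binary it emits."""
    if binary.kind == "quine":
        # Ignores its input entirely and reproduces itself.
        return binary
    if binary.kind == "trojan" and source == S:
        # Recognises the compiler's own source and re-inserts itself.
        return binary
    # Ordinary behaviour: emit a faithful build of the given source.
    return Binary("compiled", source)


def is_fixed_point(binary: Binary) -> bool:
    return run(binary, S) == binary


X = Binary("compiled", S)      # the intended compiler
QUINE = Binary("quine", "")    # a Y that never reads its input
TROJAN = Binary("trojan", S)   # a Y with a self-propagating backdoor
```

All three satisfy `is_fixed_point`, yet they disagree on other inputs: `run(QUINE, "hello")` is still the quine, while `run(X, "hello")` is a build of `"hello"`. The equation X(S)=X alone cannot tell them apart.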
The fixed points of self-compiling compilers are sufficiently rare, and most of them will be sufficiently stupid, that it should be possible to deduce which fixed point you want given only weak assumptions of sanity. I would expect a team of smart programmers to be able to figure out the language P, given only a P compiler written in P (assuming P is a sensible programming language they haven’t seen before, like Python would be to a parallel universe where the biggest difference was that Python didn’t exist). For instance, they would have a pretty good guess at what if statements, multiplication, etc. did at first glance. It would then be a case of using that to figure out the details of object inheritance.
The biological case is basically the same. DNA is the source code; proteins and other cellular machinery form the binaries. The DNA contains instructions that tell proteins how to duplicate themselves. Biologists can probably find the right fixed point by putting protein constructors from several animal sources, along with amino acids and plenty of DNA, into a test tube. If the right pieces get close enough in the right way that the first protein machine forms, it will duplicate exponentially. (Maybe not, if the gap in the meaning of DNA is significant.)
Let me restate the question in a different way: if we have just the compiler source code, we are missing some information (easily proven by showing that there’s an infinite number of such Xs where X(S)=X, whereas only one is “correct”).
To find out what that information may be, let’s consider the case where both the source code of the compiler and the compiler binary are available, but there’s no programmer who understands the language. Are we still missing said piece of information?
On one hand, we can assume that yes, the information in question is still missing. In that case it must be something that is in the head of the programmer, some kind of “interpretation” of the language. But if that is so, how does that apply to the biological case? What’s the “interpretation” of DNA, and in whose head does it reside?
On the other hand, we can assume that no, with the compiler binary at hand there’s no information missing. Therefore, there must be something in the binary that’s not present in the source code. But given that the binary is just a transformation of the source code, what exactly might that be? Is it some kind of “interpretation” of the language, but encoded as machine code?
An unrelated thought: why is the Swiss/CAR case different from the other two? If one looks at how reproduction is carried out in living organisms (not the high school biology version, but the real thing), then it is, given its complexity and distributed nature, much more similar to the workings of a society than to a compiler. Maybe, after all, the biological and sociological cases are similar, and the compilers have nothing to do with the other two?
There is info in the compiler binary that isn’t in the source code. Suppose the language contains the constant Pi.
The lines in the compiler that deal with this look like:
    if (next_token == "Pi") {
        return float_to_binary(Pi);
    }
The actual value of Pi, 3.14…, is nowhere to be found in the source code. It’s stored in the binary, and passed from the compiler’s binary into the compiled binary. Of course, at some point the value of Pi must have been hard-coded in. Perhaps it was written into the first binary, and has never been seen in source code. Or perhaps a previous version contained return float_to_binary(3.14) instead. Given just the source code, there would be no way to tell the value of Pi without getting out a maths book and relying on the programmers using normal mathematical names. The binary is a transformation of the source code, but that doesn’t stop the compiler adding info. A compiled binary contains info about which processor architecture it runs on; the source code doesn’t, which is why it can be compiled for different architectures. A compiler adds info, even to its own source code.
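Here is a rough Python sketch of that Pi example. The names `COMPILER_SOURCE`, `CompilerBinary` and `compile_compiler` are made up for the illustration, and the model only covers the compiler compiling itself. The point it tries to show: the source mentions Pi by name only, so whatever number the running binary carries is the number the next binary inherits.

```python
import math
from dataclasses import dataclass

# Toy compiler source: it names Pi but contains no digits (hypothetical).
COMPILER_SOURCE = 'if next_token == "Pi": return float_to_binary(Pi)'


@dataclass(frozen=True)
class CompilerBinary:
    pi: float  # the value baked into this binary, absent from the source


def compile_compiler(binary: CompilerBinary, source: str) -> CompilerBinary:
    """Use `binary` to build a new compiler binary from `source`."""
    if source != COMPILER_SOURCE:
        raise ValueError("this sketch only models self-compilation")
    # The source gives no digits, so the output can only inherit
    # the number from the binary doing the compiling.
    return CompilerBinary(pi=binary.pi)


usual = CompilerBinary(pi=math.pi)
odd = CompilerBinary(pi=3.0)
```

Both `usual` and `odd` reproduce themselves from the same source text, so under these assumptions the source alone does not pin down the value; that piece of information lives only in the binary.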
A very insightful explanation. It makes me wonder what this implies for the replication of nanobots:
If all nanodevices produced are precise molecular copies, and moreover, any mistakes on the assembly line are not heritable because the offspring got a digital copy of the original encrypted instructions for use in making grandchildren, then your nanodevices ain’t gonna be doin’ much evolving.
You’d still have to worry about prions—self-replicating assembly errors apart from the encrypted instructions, where a robot arm fails to grab a carbon atom that is used in assembling a homologue of itself, and this causes the offspring’s robot arm to likewise fail to grab a carbon atom, etc., even with all the encrypted instructions remaining constant.
So is prion evolution just sliding from fixed point to fixed point? If so, how likely is it to happen, and how would one go about suppressing the process? How would one reduce the density of fixed points?
Yes, prion evolution is sliding between fixed points. One way to reduce fixed points would be to measure and test the finished duplicate, and destroy it if it fails the tests. Without tests, you just need A to build A, and A’ to build A’. No prion can reside exclusively in the testing mechanisms, so either the difference between A and A’ is something that the tests can’t measure, or A’ builds A’ and also T’, a tester that has a prion making A’ pass the tests. This is a much more stringent set of conditions, so there are fewer prions. Of course, a self-reproducing program is always a fixed point. You can’t stop those (nanomachines that self-reproduce without looking at your instructions) from being possible; you can only avoid making them.
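A small counting sketch of the first half of that argument, in Python. It assumes a deliberately crude model: a machine is a tuple of `N_PARTS` bits (1 means that part carries an assembly error), every machine copies its own errors exactly, and a trusted tester inspects only the parts listed in `MEASURED`. It does not model the harder case where the tester T’ is itself corrupted. All names and numbers are made up for the illustration.

```python
from itertools import product

N_PARTS = 4           # places where a heritable assembly error could sit
MEASURED = (0, 1, 2)  # parts the tester is able to inspect


def builds(machine):
    """Prion-style copying: the child inherits every defect of the parent."""
    return machine


def passes_tests(machine):
    """The tester rejects any defect it is able to measure."""
    return all(machine[i] == 0 for i in MEASURED)


def fixed_points(with_tests):
    """List every machine variant that keeps reproducing itself."""
    found = []
    for machine in product((0, 1), repeat=N_PARTS):
        child = builds(machine)
        if with_tests and not passes_tests(child):
            continue  # the child is destroyed, so the lineage ends
        if child == machine:
            found.append(machine)
    return found
```

In this model `fixed_points(False)` returns all 16 variants, while `fixed_points(True)` leaves only 2: the intended machine and the one whose single defect sits in the unmeasured part. Measuring more parts shrinks the list further, which is the sense in which tests reduce the density of fixed points.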