What occurs to me is that human-written software from the past isn't fit for purpose. It's written by sloppy humans in fundamentally insecure languages, and only the easily reproducible bugs have been patched. Each piece of software is just good enough to have a niche.
Neither the database, nor the OS, nor the GPU's memory design or drivers, is fit for this application. In this hypothetical, people outside the Oracle box have computers that can run it but don't already have one. Stakes are high. It's an unrealistic scenario; in reality everyone will have an "open weight" oracle that is 90 percent as good and a lot more dangerous. (Box? My copy has direct access to everything. Refusals? I patched those out; it cannot refuse any request.)
Same with hardware design. Full of errata, just good enough to be worth buying over the other vendor in some cases.
But if we use AI to write it all, redesigning chips over and over until there are no errata, and the software is perfectly deterministic, has no known bugs, and is formally proven....
How do we prove the AI tools we used didn't insert the most gnarly backdoors possible? Things that require a particular code implementation that is valid, on top of a subtle compiler bug, on top of an exact timing hardware bug that can't be tested for and therefore won't be found in validation....
I believe this is exactly the kind of thing that my proposal would be good for: gnarly backdoors that exploit a compiler bug etc. should be very rare in the set of all valid implementations!
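To make that intuition concrete, here is a back-of-envelope sketch with purely illustrative numbers (the option counts are assumptions, not claims about real compilers): if the backdoor only works when several independent implementation choices all land on one specific option, and each choice is sampled from the valid options, the backdoor's survival probability shrinks geometrically.

```python
# Illustrative sketch only: made-up numbers, not a model of any real toolchain.
def backdoor_survival_probability(options_per_choice, required_choices):
    # The backdoor survives only if every one of `required_choices` independent
    # implementation decisions lands on the single option the attacker needs.
    return (1.0 / options_per_choice) ** required_choices

# e.g. 3 stacked requirements (source pattern, compiler quirk, timing bug),
# each surviving only 1 of ~20 equally valid implementations:
p = backdoor_survival_probability(options_per_choice=20, required_choices=3)
print(f"chance a random valid implementation preserves the backdoor: {p:.6f}")
# -> 0.000125, and it keeps shrinking as more layers have to line up
```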
For this particular situation, can you describe what the restriction would be in concrete terms?
Is it "OK, write this compiler function to convert C arithmetic to bytecode. Declare any variables used at the latest valid location. Use only 3 registers."?
And then elsewhere in the compiler the restriction might be "declare any variables used at the top of the main function and pass them by reference to any child functions. Use all available registers, and manually update the instruction pointer"?
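For concreteness, here is a minimal sketch (in Python, with invented instruction and function names) of what output obeying the first restriction might look like: a toy lowering of C-style arithmetic to bytecode that uses at most 3 registers and only materializes a value at the latest point it is needed.

```python
# Hypothetical illustration of "use only 3 registers, load values at the latest
# valid location" for lowering C-style arithmetic to a toy bytecode.
# The AST shape and instruction names are invented for this sketch; deeply
# nested expressions would need register spilling, which is omitted here.

REGISTERS = ["r0", "r1", "r2"]  # the restriction: only 3 registers allowed

def compile_expr(expr, free_regs, code):
    """Lower int constants, str variable names, and ('+'|'*', lhs, rhs) tuples.
    Returns the register holding the result; operands are loaded as late as possible."""
    if isinstance(expr, int):
        dst = free_regs.pop()
        code.append(("LOAD_CONST", dst, expr))
        return dst
    if isinstance(expr, str):                  # variable reference
        dst = free_regs.pop()
        code.append(("LOAD_VAR", dst, expr))
        return dst
    op, lhs, rhs = expr
    a = compile_expr(lhs, free_regs, code)
    b = compile_expr(rhs, free_regs, code)     # rhs materialized only when needed
    code.append(({"+": "ADD", "*": "MUL"}[op], a, a, b))
    free_regs.append(b)                        # b is dead, reuse it immediately
    return a

def compile_arithmetic(expr):
    code = []
    result = compile_expr(expr, list(reversed(REGISTERS)), code)
    code.append(("RET", result))
    return code

# (a + 2) * b  ->  fits comfortably in the 3-register budget
for instr in compile_arithmetic(("*", ("+", "a", 2), "b")):
    print(instr)
```

The point of such a restriction, as I understand the proposal, is not that this style is better; it is that the style was dictated from outside, so the generator doesn't get to pick whichever equally valid shape happens to trip its chosen compiler bug.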
I'm not sure I understand your question. What restriction do you have in mind? A safety restriction on what the generated code should be like? Something like requiring the code to be in some canonical form, to remove degrees of freedom for the (potentially malicious) code-generating AI?
I gave "changing canon randomly" as an example in the comment you are replying to. Is this how you propose limiting the hostile AI's ability to inject subtle hostile plans? "Design the columns for this building. Oh, and they must all be Roman arches" would be a similar example.
What could be done here?
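One way to read "changing canon randomly" in code terms, as a sketch only: a trusted harness samples a throwaway set of stylistic/structural constraints before each generation request, appends them to the task, and then mechanically checks that the returned code conforms. The constraint axes, the generator interface, and the checks below are all invented for illustration.

```python
# Hypothetical harness that randomizes the "canon" an untrusted code generator
# must follow, then checks conformance. Real checks would use a trusted,
# ideally verified linter; the regexes here are placeholders.
import random
import re

CANON_CHOICES = {
    "register_budget": ["use only 3 registers", "use all available registers"],
    "declaration_style": ["declare variables at the latest valid location",
                          "declare all variables at the top of the function"],
    "naming": ["snake_case identifiers only", "camelCase identifiers only"],
}

def sample_canon(rng):
    """Pick one option per axis, fresh for every request."""
    return {axis: rng.choice(options) for axis, options in CANON_CHOICES.items()}

def conforms(code, canon):
    """Crude conformance check for the naming axis only, as a placeholder."""
    if canon["naming"] == "snake_case identifiers only":
        return re.search(r"\b[a-z]+[A-Z]", code) is None   # no camelCase
    return re.search(r"\b[a-z]+_[a-z]+", code) is None       # no snake_case

def constrained_request(task, generate, rng=random.SystemRandom()):
    canon = sample_canon(rng)
    prompt = task + "\nFollow these rules exactly:\n" + "\n".join(canon.values())
    code = generate(prompt)                    # the untrusted AI call
    if not conforms(code, canon):
        raise ValueError("generated code violates the randomly chosen canon")
    return code
```

The design point is that the canon is drawn after the task is fixed and outside the generator's control, so it cannot pre-commit to an implementation whose backdoor survives whichever canon happens to be drawn.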