In particular, building “friendly AI” and assuming it will remain friendly seems naive at best, since it will be copied and the “friendly” part will then be modified by hostile actors.
This is correct. Any reasonable AGI development strategy must have strong closure and security measures in place to minimize the risk of leaks, and deployment has to meet the conditions in Dewey (2014).
It seems like someone with a security mindset will want to avoid making any assumption of friendliness and instead work on making critical systems that are simple enough to be mathematically proven secure.
This is also correct, if the idea is to ensure that developers understand their system (and safety-critical subsystems in particular) well enough for the end product to be “friendly”/”aligned.” If you’re instead saying that alignment isn’t a good target to shoot for, then I’m not sure I understand what you’re saying. Why not? How do we achieve good long-term outcomes without routing through alignment?
I think odds are good that, assuming general AI happens at all, someone will build a hostile AI and connect it to the Internet. I think a proper understanding of the security mindset is that the assumption “nobody will connect a hostile AI to the Internet” is something we should stop relying on. (In particular, maintaining secrecy and international cooperation both seem unlikely to hold up. We shouldn’t assume they will work.)
We should be looking for defenses that don’t depend on the IQ level of the attacker, similar to how the validity of a mathematical proof is independent of IQ. AI alignment is an important research problem, but it doesn’t seem directly relevant here.
In particular, I don’t see why you think “routing through alignment” is important for making sound mathematical proofs. Narrow AI should be sufficient for making advances in mathematics.
I think odds are good that, assuming general AI happens at all, someone will build a hostile AI and connect it to the Internet. I think a proper understanding of the security mindset is that the assumption “nobody will connect a hostile AI to the Internet” is something we should stop relying on. (In particular, maintaining secrecy and international cooperation both seem unlikely to hold up. We shouldn’t assume they will work.)
Yup, all of this seems like the standard MIRI/Eliezer view.
In particular, I don’t see why you think “routing through alignment” is important for making sound mathematical proofs. Narrow AI should be sufficient for making advances in mathematics.
I don’t know what the relevance of “mathematical proofs” is. Are you talking about applying formal methods of some kind to the problem of ensuring that AGI technology doesn’t leak, and saying that AGI is unnecessary for this task? I’m guessing that part of the story you’re missing is that proliferation of AGI technology is at least as much about independent discovery as it is about leaks, splintering, or espionage. You have to address those issues, but the overall task of achieving nonproliferation is much larger than that, and it doesn’t do a lot of good to solve part of the problem without solving the whole problem. AGI is potentially a route to solving the whole problem, not to solving the (relatively easy, though still very important) leaks/espionage problem.
I mean things like using mathematical proofs to ensure that Internet-exposed services have no bugs that a hostile agent might exploit. We don’t need to be able to build an AI to improve defenses.
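As a toy sketch of what such an attacker-independent guarantee can look like (the protocol, function names, and bounds here are all hypothetical, chosen only for illustration): a length-prefixed message parser whose safety property is checked over its entire input space, so the guarantee holds no matter how capable the attacker is.

```python
from itertools import product
from typing import Optional

MAX_LEN = 2  # tiny bound so the whole input space can be enumerated

def parse_message(data: bytes) -> Optional[str]:
    """Decode a length-prefixed payload, or return None for malformed
    input. Intended invariant: never raises, never reads out of bounds."""
    if len(data) < 1 or len(data) > MAX_LEN:
        return None  # too short or too long: reject
    declared, payload = data[0], data[1:]
    if declared != len(payload):
        return None  # inconsistent length prefix: reject
    return payload.hex()

def verify_total() -> bool:
    """Brute-force stand-in for a machine-checked proof: confirm the
    invariant for *every* byte string of length 0..MAX_LEN. Inputs
    longer than MAX_LEN are rejected by the first guard in
    parse_message, which closes the remaining case."""
    for n in range(MAX_LEN + 1):
        for combo in product(range(256), repeat=n):
            result = parse_message(bytes(combo))  # must not raise
            assert result is None or isinstance(result, str)
    return True
```

A real service would need an actual proof assistant or verified toolchain (brute-force enumeration stops scaling immediately), but the shape of the claim is the same: the property is established for all inputs, not just the ones a particular attacker thought of.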