OK, so I was saying here that software can optimize for something (e.g. predicting a string of bits on the basis of other bits) and it’s by default not particularly dangerous, as long as the optimization does not involve an intelligent foresight-based search through real-world causal pathways to reach the desired goal. My argument for this was (1) Such a system can do Level-1 optimization but not Level-2 optimization (with regard to real-world causal pathways unrelated to implementing the algorithm as intended), and (2) only the latter is unusually dangerous. From your response, it seems like you agree with (1) but disagree with (2). Is that right? If you disagree with (2), can you make up a scenario of something really bad and dangerous, something that couldn’t happen with today’s software, something like a Global Catastrophic Risk, that is caused by a future AI that is optimizing something but is not more specifically using a world-model to do an intelligent search through real-world causal pathways towards a desired goal?
Sure. Let’s construct the 0-optimizer. Its purpose is simply to cause there to be lots of 0s in memory (as opposed to 1s). It only knows about Algorithm Land, and even then it’s a pretty narrow model: it knows about memory and can read and write to it. Now at some point the 0-optimizer manages to get all the bits set to 0 in its addressable memory, so it would seem to have reached maximum attainment.
But it’s a hungry optimizer and keeps trying to find ways to set more bits to 0. It eventually stumbles upon a security gap in the operating system that gives it access to memory outside its address space, so it can now set those bits to 0 too. Obviously it does all of this “accidentally”: it never knows it’s using a security exploit, it just stumbles into it and sees memory getting written with 0s, so it’s happy. (This has plenty of precedent: human minds are great examples of complex systems with limited introspective access that do lots of complex things without knowing how or why they are doing them.) With some luck, it doesn’t immediately destroy itself and gets a chance to be hungry for more 0s.
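A minimal sketch of this kind of blind search, under loud assumptions: `ToyMachine`, its deliberately flawed bounds check, and `zero_optimizer` are all made up for illustration, and nothing here touches real memory or a real operating system. The optimizer has no model of the machine at all. It tries random writes, watches a single number (the count of 0s), and undoes anything that makes that number go down. It still ends up zeroing memory outside its own address space, because some of its random writes happen to slip past the broken check.

```python
import random


class ToyMachine:
    """Hypothetical stand-in for a computer: a bit array with a flawed bounds check."""

    def __init__(self, own_size=16, other_size=16, seed=0):
        rng = random.Random(seed)
        self.own_size = own_size
        # Bits [0, own_size) are the optimizer's address space;
        # the remaining bits belong to "other processes".
        self.bits = [rng.randint(0, 1) for _ in range(own_size + other_size)]

    def write(self, addr, value):
        # Deliberately buggy check (an assumption made for illustration):
        # it rejects addresses that are too large but forgets that negative
        # indices wrap around into memory the optimizer should not reach.
        if addr >= self.own_size:
            return
        try:
            self.bits[addr] = value
        except IndexError:
            pass  # far out-of-range writes are silently dropped

    def zeros(self):
        # The only feedback signal the optimizer ever sees.
        return self.bits.count(0)


def zero_optimizer(machine, steps=20000, seed=1):
    """Blind hill climbing: try a random write, undo it if the 0 count dropped."""
    rng = random.Random(seed)
    for _ in range(steps):
        before = machine.zeros()
        addr = rng.randrange(-64, 64)
        value = rng.randint(0, 1)
        machine.write(addr, value)
        if machine.zeros() < before:
            # The count can only drop if a 0 was overwritten with a 1,
            # so writing 0 to the same address restores the old state.
            machine.write(addr, 0)
    return machine.zeros()


if __name__ == "__main__":
    machine = ToyMachine()
    print("zeros before:", machine.zeros())
    print("zeros after: ", zero_optimizer(machine))
```

Nothing in `zero_optimizer` refers to the bug or to the boundary between its own memory and anyone else’s. The “exploit” exists only in the environment, and the search finds it just by keeping whatever happens to raise the count.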
Next it accidentally starts using the network interface on the computer. Although it doesn’t exactly understand what’s going on, it figures out how to get responses that contain lots of 0s. Unfortunately for us, what it is actually doing is performing a denial-of-service attack against other computers to get those 0s back. Now we have a powerful optimization process that’s hungry for 0s and satisfies its hunger by filling our networks with garbage traffic.
A couple of hops on, it has gone from denial-of-service attacks to wiping out our ability to use the Internet, then our ability to use any EM communication channel, and finally to generating dangerously high levels of radiation that kill all life on Earth.
This story involved a lot of luck, but my expectation is that we should not underestimate how “lucky” a powerful optimizer can be, given that evolution is a similarly ontologically simple process that nonetheless managed to produce some pretty complex results.