This is pretty thorough from an engineering standpoint. I still would not at all trust such a box to hold a superintelligent AI. The unknown unknowns remain too big a factor. Just because an unknown information channel does not have any humans listening to it directly, does not mean that it can’t be used to push around matter, or couple to some other channel, or otherwise leave an opening.
Also, the human and their clothes are definitely not the only potentially-useful tools in the box: there’s also the computer, the computer’s power supply, whatever’s cooling the room, whatever’s keeping the air fresh, whatever’s used for input, and whatever’s used for output. If any of those things were outside the room, then they would offer a side channel for information to escape. They can be buried in concrete, but they’ll still need to be within the vacuums and whatnot.
In general, trying to outsmart a superintelligent AI is a Bad Idea, and building an AI-proof box definitely falls in that category. If an AI is not safe with arbitrarily large amounts of power, then it is not safe, full stop.
The computer and everything else is in the inner concrete wall (separated from the glass box by a vacuum), as is the power supply. Nothing is cooling the room, except maybe some ice on the floor. I think you could allow an oxygen tank in the room too.
Yes, the computer is using power, but a computer can’t move matter except inside itself. This can generate sound and light, but the second vacuum chamber and Faraday cage will block that (and the rate at which power is drawn can be capped, which also gives a reasonable bound on how much noise it can generate).
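As a rough illustration of the power-cap point (every number below is a made-up assumption for the sketch, not a figure from the design), the worst-case bound is just arithmetic:

```python
# Back-of-envelope bound on signal power escaping the Faraday cage.
# Both figures are illustrative assumptions, not part of the proposal.

supply_cap_w = 50.0          # assumed cap on the computer's power draw (watts)
cage_attenuation_db = 100.0  # assumed shielding effectiveness of the cage (dB)

# Pessimistically assume *all* drawn power is radiated as sound/EM inside
# the box; the cage then attenuates it by the stated factor.
leaked_w = supply_cap_w * 10 ** (-cage_attenuation_db / 10)

print(f"worst-case leaked power: {leaked_w:.1e} W")  # on the order of 5e-09 W
```

The point is only that capping the supply turns "how much could it emit?" from an open question into a number you can drive arbitrarily low by adding shielding.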
whatever’s used for input, and whatever’s used for output.
For input, the human is just blocking lasers. For output, they’re looking through the inner vacuum at the screen on the inner wall of the inner concrete box.
They can be buried in concrete, but they’ll still need to be within the vacuums and whatnot.
Yes, that’s what the outer vacuum is for.
trying to outsmart a superintelligent AI is a Bad Idea
If I can construct a proof that Y is physically impossible, then I feel fine depending on the fact that an AI won’t be able to figure out how to do Y. It doesn’t feel to me like “trying to outsmart an AI.”
Just because an unknown information channel does not have any humans listening to it directly, does not mean that it can’t be used to push around matter, or couple to some other channel
Yes, you’re right. My afterthought was hasty. I still think it is unlikely that other forms of information transmission are physically possible, and quite unlikely that they could be generated by a human or a computer in isolation.
Sidenote: I think we’ve all generated a lot of our intuitions here from the AI Box experiments. In a sense, the AI “in the box” is not really boxed. There’s an information channel directly to the outside world!
We cannot “prove” that something is physically impossible, only that it is impossible under some model of physics. Normally that distinction would be entirely irrelevant, but when dealing with a superintelligent AI, it’s quite likely to understand the physics better than we do. For all we know, it may turn out that Alcubierre drives are possible, and if so then the AI could definitely break out that way and would have an incentive to do so.
I agree that the AI is not really boxed here; it’s the “myopia” that makes the difference. But one of two things should generally be true:
The AI doesn’t want to get out of the box, in which case the box doesn’t need to be secure in the first place.
The AI cannot get out of the box, in which case the AI doesn’t need to be safe (but also won’t be very useful).
This case seems like the former, so long as hacking the human is easier than getting out of the box. But that means we don’t need to make the box perfect anyway.
Whoops—when I said

In a sense, the AI “in the box” is not really boxed

I meant the “AI Box” scenario, where the AI is printing results to a screen in the outside world. I do think BoMAI is truly boxed.
We cannot “prove” that something is physically impossible, only that it is impossible under some model of physics.
Right, that’s more or less what I mean to do. We can assign probabilities to statements like “it is physically impossible (under the true models of physics) for a human or a computer in isolation with an energy budget of x joules and y joules/second to transmit information in any way other than via a), b), or c) from above.” This seems extremely likely to me for reasonable values of x and y, so it’s still useful to have a “proof” even if it must be predicated on such a physical assumption.
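The structure of a "proof predicated on a physical assumption" is essentially a union bound. A minimal sketch, with placeholder probabilities that are not claimed anywhere in this thread:

```python
# Union-bound sketch of the argument: P(information escapes) is at most
# P(the physical assumption is wrong) plus the per-channel leak probabilities
# for the enumerated channels a), b), c). All numbers are placeholders.

p_assumption_wrong = 1e-6  # P(true physics permits some unlisted channel)
p_channel_leak = {"a": 1e-9, "b": 1e-9, "c": 1e-9}  # the enumerated channels

p_escape_bound = p_assumption_wrong + sum(p_channel_leak.values())
print(f"P(escape) <= {p_escape_bound:.3e}")
```

Note how the bound is dominated by the assumption term: the engineering can make the per-channel terms tiny, but confidence in the physical assumption itself is the floor.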