Some hard problems with anti-tampering and their relevance for GPU off-switches
Background on GPU off-switches:
It would be nice if a GPU could be limited[1] or shut down remotely by some central authority such as the U.S. government[2] in case there's some emergency[3].
This shortform is mostly replying to ideas like "we'll have a CPU in the H100[4] which will expect a signed authentication and refuse to run otherwise. And if someone tries to remove the CPU, the H100's anti-tampering mechanism will self-destruct (melt? explode?)".
TL;DR: Getting the self-destruction mechanism right isn't really the hard part.
Some hard parts:
Noticing if the self-destruction mechanism should be used
A (fun?) exercise could be suggesting ways to do this (like "let's put a wire through the box, and if it's cut, we'll know someone broke in") and then thinking about how you'd subvert those mechanisms. I mainly think doing this 3 times (if you haven't before) could be a fast way to get some of the intuition I'm trying to gesture at.
Maintenance
If the H100s can never be opened after they’re made, then any problem that would have required us to open them up might mean throwing them away. And we want them to be expensive, right?
If the H100s CAN be opened, then what’s the allowed workflow?
For example, if they require a key, then what if there's a problem in the mechanism that verifies the key, or in something it depends on, like the power supply? We can't open the box to fix the mechanism that would allow us to open the box.
("breaking cryptography" and "breaking into the certificate authority" are out of scope for what I'm trying to gesture at)
False positives / false negatives
If the air conditioning in the data center breaks, will the H100s get hot[5], infer they're under attack, and self-destruct? Will they EXPLODE? Do we want them to only self-destruct if something pretty extreme is happening, meaning attackers will have more room to try things?
Is the design secret?
If someone gets the full design of our anti-tampering mechanism, will they easily get around it?
If so, are these designs being kept on a computer that can resist a state-level attacker? Are they being sent to a private company that will manufacture them?
Complexity
Whatever we implement here had better not have any important bugs. How much bug-free software and hardware do we think we can make? How are we going to test it to be confident it works? Are we okay with building 10,000 units when we only think our tests would catch 90% of the important bugs?
Tradeoffs here can be exploited in interesting ways
For example, if air conditioning problems will make our H100s self-destruct, an attacker might target our air conditioning on purpose. After the fifth time we have to replace the whole data center, I can imagine an executive saying goodbye to our anti-tampering ideas.
I'm trying to gesture at something like "making our anti-tampering more and more strict probably isn't a good move, even from a security perspective, unless we have a good idea of how to deal with problems like this".
In summary, if we build anti-tampering mechanisms into GPUs that adversaries (especially nation states) have easy access to, then I don't expect the mechanisms will be a significant obstacle for them to overcome.
(Maybe later—Ideas for off-switches that seem more promising to me)
Edit:
See "Anti-tampering in a data center you control provides very different tradeoffs" below.
I think it's worth exploring putting the authorization component on the same chip as the GPU.
[1] For example bandwidth, GPS location, or something else. The exact limitations are out of scope; I mainly want to discuss being able to impose limitations at all.
[2] The question of "who should be the authority" is out of scope; I mainly want to enable having some authority at all. If you're interested in this, consider checking out Vitalik's suggestion too; search for "Strategy 2".
[3] Or perhaps this could be used for export controls or some other purpose.
[4] I'm using "H100" to refer to the box that contains both the GPU chip and the CPU chip.
[5] I'm using "getting hot means being under attack" as a toy example of a "paranoid" condition that someone might suggest building in as one of the triggers for our anti-tampering mechanism. Other examples might include "the box shakes", "the power turns off for too long", and so on.
Anti-tampering in a data center you control provides very different tradeoffs
I’ll paint a picture for how this could naively look:
We put the GPUs in something equivalent to a military base. Someone can still break in, steal the GPUs, and break the anti-tampering, but (I'm assuming) using those GPUs usefully would take months, and meanwhile (for example) a war could start.
How do the tradeoffs change? What creative things could we do with our new assumptions?
Tradeoffs we don’t really care about anymore:
We don't need the anti-tampering to work reliably (it's nice if it works, but it now becomes "defense in depth")
Slowing down the attacker is already very nice
Our box can be maintainable
We don’t have to find all bugs in advance
...
“Noticing the breach” becomes an important assumption
Does our data center have cameras? What if they are hacked? And so on
(An intuition I hope to share: This problem is much easier than “preventing the breach”)
It doesn’t have to be in “our” data center. It could be in a “shared” data center that many parties monitor.
Any other creative solution to notice breaches might work
How about spot inspections to check if the box was tampered with?
"Preventing a nation state from opening a box and closing it without any visible change, given they can do whatever they want with the box, and given the design is open source" seems like it might be very hard, or might be a solved problem.
“If a breach is noticed then something serious happens” becomes an important assumption
Are the stakeholders on board?
Things that make me happy here:
Fewer hard assumptions to make
Fewer difficult tradeoffs to balance
The entire project requires less world-class, cutting-edge engineering.
I used to assume disabling a GPU in my physical possession would be impossible, but now I'm not so sure. There might be ways to make bypassing GPU lockouts comparable in difficulty to manufacturing the GPU (requiring nanoscale silicon surgery). Here's an example scheme:
Nvidia changes its business model from selling GPUs to renting them. The GPU is free, but to use your GPU you must buy Nvidia Dollars from Nvidia. Your GPU will periodically call Nvidia headquarters and get an authorization code to do 10^15 more floating-point operations. This rental model is actually kinda nice for the AI companies, who are much more capital-constrained than Nvidia. (Lots of industries have made this move from buying to renting, e.g. airplane engines.)
Question: “But I’m an engineer. How (the hell) could Nvidia keep me from hacking a GPU in my physical possession to bypass that Nvidia dollar rental bullshit?”
Answer: through public key cryptography and the fact that semiconductor parts are very small and modifying them is hard.
In dozens to hundreds or thousands of places on the GPU, Nvidia places toll units that block signal lines (like the ones that pipe floating-point numbers around) unless the toll units believe they have been paid with enough Nvidia Dollars.
The toll units have within them a random number generator, a public-key ROM unique to that toll unit, a 128-bit register for a secret challenge word, elliptic curve cryptography circuitry, and a $$$ counter which decrements every time the clock or signal line changes.
If the $$$ counter is positive, the toll unit is happy and will let signals through unabated. But if the $$$ counter reaches zero,[1] the toll unit is unhappy and will block those signals.
To add to the $$$ counter, the toll unit (1) generates a random secret <challenge word>, (2) encrypts the secret using that toll unit's public key, (3) sends <encrypted secret challenge word> to a non-secure part of the GPU,[2] which (4) through driver software and the internet, phones Nvidia saying "toll unit <id> challenges you with <encrypted secret challenge word>", (5) Nvidia looks up the private key for toll unit <id> and replies to the GPU "toll unit <id>, as proof that I, Nvidia, know your private key, I decrypted your challenge word: <challenge word>", and (6) after getting this challenge word back, the toll unit adds 10^15 or whatever to the $$$ counter.
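Here's a minimal Python sketch of that handshake, just to pin down the logic. It's a sketch under stated assumptions: RSA-OAEP stands in for the on-die ECC circuitry described above, there's a single toll unit rather than thousands, and names like TollUnit and nvidia_hq_respond are invented for illustration.

```python
# Sketch of the toll-unit handshake (steps 1-6 above).
# Assumptions: RSA-OAEP instead of ECC circuitry; invented names.
import os
from cryptography.hazmat.primitives import hashes
from cryptography.hazmat.primitives.asymmetric import rsa, padding

OAEP = padding.OAEP(mgf=padding.MGF1(algorithm=hashes.SHA256()),
                    algorithm=hashes.SHA256(), label=None)

# One key pair per toll unit: the private key never leaves Nvidia HQ,
# the matching public key is burned into that toll unit's ROM.
hq_private_key = rsa.generate_private_key(public_exponent=65537, key_size=2048)

class TollUnit:
    def __init__(self, public_key):
        self.public_key = public_key
        self.counter = 0            # the "$$$" counter
        self._challenge = None      # 128-bit secret challenge word

    def make_challenge(self) -> bytes:
        self._challenge = os.urandom(16)                       # (1) random secret
        return self.public_key.encrypt(self._challenge, OAEP)  # (2)+(3) encrypted blob

    def redeem(self, decrypted_word: bytes) -> None:
        # (6) only the holder of the private key could have decrypted the blob
        if self._challenge is not None and decrypted_word == self._challenge:
            self.counter += 10**15
        self._challenge = None  # single-use: replaying an old answer does nothing

    def tick(self) -> bool:
        # Counter decrements on every clock / signal-line change; block at zero.
        if self.counter > 0:
            self.counter -= 1
            return True             # signal passes
        return False                # signal blocked

def nvidia_hq_respond(encrypted_challenge: bytes) -> bytes:
    # (5) HQ looks up the toll unit's private key and decrypts the challenge.
    return hq_private_key.decrypt(encrypted_challenge, OAEP)

toll = TollUnit(hq_private_key.public_key())
blob = toll.make_challenge()          # (3)/(4): travels over untrusted driver + internet
toll.redeem(nvidia_hq_respond(blob))  # (6)
assert toll.counter == 10**15 and toll.tick()
```

The single-use challenge is what makes recorded traffic worthless: the toll unit forgets the challenge word after each redemption, so the same response never pays the toll twice.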
There are a lot of ways to bypass this kind of toll unit (fix the random number generator or $$$ counter to a constant, just connect wires to route around it). But the point is to make it so you can’t break a toll unit without doing surgery to delicate silicon parts which are distributed in dozens to hundreds of places around the GPU chip.
Implementation note: it's best if disabling the toll unit takes nanoscale precision, rather than micrometer-scale precision. The way I've written things here, you might be able to smudge a bit of solder over the whole $$$ counter and permanently tie the whole thing to high voltage, so the counter never goes down. I think you can get around these issues (make it so any "blob" of high or low voltage spanning multiple parts of the toll circuit will block the GPU) but it takes care.
[2] This can be done slowly, serially, with a single line.
I love the direction you’re going with this business idea (and with giving Nvidia a business incentive to make “authentication” that is actually hard to subvert)!
I can imagine reasons they might not like this idea, but who knows. If I can easily suggest this to someone from Nvidia (instead of speculating myself), I’ll try
I’ll respond to the technical part in a separate comment because I might want to link to it >>
The interesting/challenging technical parts, as I see them:
1. Putting the logic that turns off the GPU (what you called “the toll unit”) in the same chip as the GPU and not in a separate chip
2. Bonus: Instead of writing the entire logic (challenge-response and so on) in advance, I think it would be better to run actual code, but only if it's signed (for example, by Nvidia). That way they can send software updates with new creative limitations, and we don't need to think of all our ideas (limit bandwidth? limit GPS location?) in advance. (See the sketch after this list.)
Things that seem obviously solvable (not the hard part):
3. The cryptography
4. Turning off a GPU somehow (I assume there’s no need to spread many toll units, but I’m far from a GPU expert so I’d defer to you if you are)
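On point 2, here's a minimal sketch of "only run code that's signed", assuming Ed25519 signatures; the signature scheme and every name here (sign_update, verify_and_apply, and so on) are my own illustration, not anything Nvidia actually ships.

```python
# Sketch of "run updates only if signed by the vendor / central authority".
# Assumptions: Ed25519 signatures; all names are hypothetical.
from cryptography.exceptions import InvalidSignature
from cryptography.hazmat.primitives.asymmetric import ed25519

# The signing key lives with the vendor; only the public half is burned
# into the GPU's on-die authorization unit.
vendor_private = ed25519.Ed25519PrivateKey.generate()
VENDOR_PUBLIC = vendor_private.public_key()

def sign_update(firmware_blob: bytes) -> bytes:
    # Done at the vendor (or whichever central authority holds the key).
    return vendor_private.sign(firmware_blob)

def apply_firmware(firmware_blob: bytes) -> None:
    # Placeholder for handing control to the newly installed logic.
    print("running", len(firmware_blob), "bytes of signed firmware")

def verify_and_apply(firmware_blob: bytes, signature: bytes) -> bool:
    # Done inside the authorization unit: refuse anything unsigned.
    try:
        VENDOR_PUBLIC.verify(signature, firmware_blob)
    except InvalidSignature:
        return False
    apply_firmware(firmware_blob)
    return True

update = b"new limitation logic: cap interconnect bandwidth"
sig = sign_update(update)
assert verify_and_apply(update, sig)                  # signed update: accepted
assert not verify_and_apply(b"attacker firmware", sig)  # wrong blob for this signature: rejected
```

The design choice this illustrates: the verification key sits on-die, so whoever holds the signing key (Nvidia, or the central authority) decides which limitation logic the GPU will ever run, without having to anticipate every limitation in advance.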
Thanks! I'm not a GPU expert either. The reason I want to spread the toll units inside the GPU itself isn't to turn the GPU off; it's to stop replay attacks. If the toll thing is in a separate chip, then the toll unit must have some way to tell the GPU "GPU, you are cleared to run". To hack the GPU, you just copy that "cleared to run" signal and send it to the GPU. The same "cleared to run" signal will always make the GPU work, unless there is something inside the GPU to make sure it won't accept the same "cleared to run" signal twice. That's the point of the mechanism I outlined: a way to make it so the same "cleared to run" signal for the GPU won't work twice.
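To make the replay point concrete, here's a toy contrast; it's purely illustrative, the class names are invented, and a real on-die design would use the challenge-response crypto sketched earlier rather than comparing raw nonces.

```python
# Toy contrast: a fixed "cleared to run" message is replayable forever,
# while a per-authorization random nonce makes each clearance single-use.
# All names here are hypothetical.
import os

class NaiveGpu:
    """Separate-chip design: any copy of the magic message enables the GPU."""
    CLEARED = b"cleared to run"
    def authorize(self, message: bytes) -> bool:
        return message == self.CLEARED          # a recorded copy works every time

class NonceGpu:
    """On-die design: the GPU only accepts the answer to its current challenge."""
    def __init__(self):
        self.pending = None
    def challenge(self) -> bytes:
        self.pending = os.urandom(16)           # fresh 128-bit nonce
        return self.pending
    def authorize(self, response: bytes) -> bool:
        ok = self.pending is not None and response == self.pending
        self.pending = None                     # each challenge is consumed
        return ok

recorded = b"cleared to run"                    # attacker sniffed this once
assert NaiveGpu().authorize(recorded)           # replay succeeds forever

gpu = NonceGpu()
answer = gpu.challenge()                        # in reality, answered by the key holder
assert gpu.authorize(answer)                    # works once
assert not gpu.authorize(answer)                # replaying the same answer fails
```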
Bonus: Instead of writing the entire logic (challenge-response and so on) in advance, I think it would be better to run actual code, but only if it's signed (for example, by Nvidia). That way they can send software updates with new creative limitations, and we don't need to think of all our ideas (limit bandwidth? limit GPS location?) in advance.
Hmm okay, but why do I let Nvidia send me new restrictive software updates? Why don't I run my GPUs in an underground bunker, using the old, most broken firmware?
Oh yes the toll unit needs to be inside the GPU chip imo.
why do I let Nvidia send me new restrictive software updates?
Alternatively the key could be in the central authority that is supposed to control the off switch. (same tech tho)
Why don't I run my GPUs in an underground bunker, using the old, most broken firmware?
Nvidia (or whoever signs authorizations for your GPU to run) won't sign them for you if you don't update the software (and send them proof that you did, using similar methods; I can elaborate).