One “9” of uptime reliability for dangerously powerful AIs might suffice (for the most dangerous and/or important applications)
Currently, AI labs try to ensure that APIs for accessing their best models have quite good reliability, e.g. >99.9% uptime. If we had to maintain such high levels of uptime for future dangerously powerful AIs, we’d need our monitoring system to have very few false alarms: that is, very few cases in which the humans monitoring these dangerously powerful AIs pause their usage pending further investigation. Achieving that would make it harder to ensure that the AIs are controlled. But for the most important and dangerous usages of future powerful AIs (e.g. arbitrary R&D, and especially AI safety research), we can probably afford to have the powerful AIs down a moderate fraction of the time. For instance, it wouldn’t be that costly if the AIs always took a randomly selected day of the week off from doing research: this would probably reduce productivity by not much more than 1 part in 7[1]. More generally, a moderate fraction of downtime, like 10%, probably isn’t considerably worse than a 10% productivity hit, and it seems likely that we can tolerate such productivity hits to improve safety. Thus, we don’t need even one “9” of reliability, i.e., we can tolerate less than 90% uptime.
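To make the contrast concrete, here is a small back-of-the-envelope calculation. The monthly accounting and the eight-hour length of a false-alarm investigation are illustrative assumptions of mine, not figures from the post:

```python
# Back-of-the-envelope numbers behind the "one nine might suffice" claim.
# The monthly accounting and the 8-hour investigation length are
# illustrative assumptions, not figures from the post.

HOURS_PER_MONTH = 30 * 24  # ~720 hours

def downtime_budget_hours(uptime_target: float) -> float:
    """Monthly downtime allowed while still meeting an uptime target."""
    return HOURS_PER_MONTH * (1.0 - uptime_target)

INVESTIGATION_HOURS = 8  # assumed length of one false-alarm investigation

for label, target in [("three nines (99.9%)", 0.999),
                      ("two nines (99%)", 0.99),
                      ("one nine (90%)", 0.90)]:
    budget = downtime_budget_hours(target)
    print(f"{label}: {budget:5.1f} h/month of downtime, "
          f"~{budget / INVESTIGATION_HOURS:.1f} eight-hour investigations")

# three nines: ~0.7 h/month  -> can't absorb even one 8-hour pause
# one nine:    ~72 h/month   -> roughly nine 8-hour pauses per month
```

Under these assumptions, a three-nines target leaves well under an hour of slack per month, while a one-nine target leaves room for roughly nine day-long investigation pauses.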
This implies that we can afford to do things like:
Have semi-regular false alarms where AIs go down for a few hours or a day while we investigate.
Run AI shutdown drills. These drills would help ensure that our shutdown processes are working and that there aren’t unknown use cases where shutdown could cause very problematic failures. (Catching such usages early via shutdown drills is similar to scheduled brownouts in software engineering; see here for an example.) A rough sketch of how drills and false-alarm pauses might be charged against a downtime budget appears after this list.
Temporarily shut down some usages of the AI if we encounter unexpected operating conditions (AKA violations of our design basis).
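As a rough illustration of how the items above might fit together operationally, here is a minimal sketch of a deployment wrapper that pauses usage (for an alarm, a drill, or unexpected operating conditions) and charges the pause against an explicit downtime budget. All of the names and the budget-accounting scheme are hypothetical, not a description of any lab’s actual tooling:

```python
from dataclasses import dataclass
import time

@dataclass
class DowntimeBudget:
    """Tracks downtime spending against a target like 'at most 10% of hours this month'."""
    budget_hours: float  # e.g. 72.0 for one "9" over a ~720-hour month
    spent_hours: float = 0.0

    def spend(self, hours: float) -> None:
        self.spent_hours += hours

    @property
    def remaining_hours(self) -> float:
        return self.budget_hours - self.spent_hours

@dataclass
class PauseEvent:
    reason: str  # e.g. "monitor alarm", "shutdown drill", "unexpected conditions"
    started_at: float
    ended_at: float | None = None

class ControlledDeployment:
    """Hypothetical wrapper around a powerful model's deployment that can pause usage."""

    def __init__(self, budget: DowntimeBudget):
        self.budget = budget
        self.paused = False
        self.events: list[PauseEvent] = []

    def pause(self, reason: str) -> None:
        # Stop serving new requests pending human investigation (or for a drill).
        self.paused = True
        self.events.append(PauseEvent(reason=reason, started_at=time.time()))

    def resume(self) -> None:
        # Record how long we were down and charge it to the downtime budget.
        event = self.events[-1]
        event.ended_at = time.time()
        self.budget.spend((event.ended_at - event.started_at) / 3600.0)
        self.paused = False
```

The point of making the budget explicit is that, with roughly 72 hours per month to spend at one “9”, semi-regular investigation pauses and scheduled drills are affordable in a way they would not be under a three-nines budget of well under an hour.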
[For context, this is a point that seems important to me, but didn’t naturally make it into any of our posts on control thus far. So, I decided to write a quick standalone shortform post. I think Buck originally made this point to me.]
[1] And likely somewhat less than this due to substitution effects.
Thank you for this! A lot of us have a very bad habit of over-systematizing our thinking and treating all uses of AI (and even all interfaces to a given model instance) as a single thing. Different tool-level AI instances probably SHOULD strive for 4 or 5 nines of availability, in order to have mundane utility in places where even small amounts of downtime block a lot of value. Research AIs, especially self-improving or research-on-AI ones, don’t need that reliability, and both triggered downtime (scheduled, or enforced based on Schelling-line events) and unplanned downtime (it just broke, and we have to spend time figuring out why) can be valuable for giving humans time to react.