This is a control theory problem obscured by terminology like “oneshotness”.
I interpret the phenomenon EY is gesturing at as a stability margin failure. That is, a system going off course at a rate that exceeds a controller’s ability to correct. Most of the disagreement is not about this model at a high level, but about how the interaction dynamics play out and what levels of uncertainty to apply.
Controlling the Viking failed immediately upon losing the only correction channel. The control rate going to zero means game over.
The Mars Observer failed slowly as vapor accumulated over 11 months with no sensor detecting it as a problem. Zero control rate for a different reason. This time, the drift off course wasn’t even observed until too late.
The Maginot Line failed because France was miscalibrated on both rates. They assumed the Germans would advance (“off course”) more slowly and that their mobilization (“correction”) would be faster.
ASI fits the pattern but has increased levels of cursedness affecting both rates. An AI can act faster than humans can observe and respond, interfere with corrective mechanisms, and obfuscate observability (e.g., sandbagging and playing the training game). Trying to control a strategic adversarial opponent goes beyond classical control theory with its known engineering techniques into the territory of dynamic games.
The disagreement is not whether there is a level of criticality where the situation is unrecoverable (most reasonable people agree with that), but how fast the AI might take a “sharp left turn” or undergo an RSI loop phase change, as well as how fast humans can adapt scalable oversight and meaningful alignment strategies.
This is not a novel framing. Elija Perrier lays out a more formal description here: Out of Control—Why Alignment Needs Formal Control Theory (and an Alignment Control Stack), and Daniel Kokotajlo is making similar decompositions in other comments. Beren Millidge has a more optimistic take here: Maintaining Alignment during RSI as a Feedback Control Problem.
Let’s drop “oneshotness” and discuss in terms that can be modeled more precisely than debating what counts as “one shot”.
I can’t think of a single term for this, but it can be articulated as is-ought laundering through accountability sinks. The interlocutor describes what the structure is that diffuses local fault and treats that description as if it settles whether the outcome ought to happen.