Do you expect AI labs would actually run extensive experimental tests in this world? I would be surprised if they did, even if such a window does arise.
(To roughly operationalize: I would be surprised to hear that a major lab spent more than 5 FTE-years conducting such tests, or that the tests decreased the p(doom) of the average reasonably-calibrated external observer by more than 10%.)
Thanks! Edited to fix.