(I meant sampling x repeatedly from the distribution q̂_i; I agree that sampling x at random won’t help identify rare catastrophes.)
The main qualitative difference from sampling from q̂_i is that we’re targeting a specific tradeoff between catastrophes and reward, rather than zero probability of catastrophe. I agree that when τ=0 we’re just sampling from q̂_i.