Let’s say I want to evaluate an algorithmic Texas Hold’em player against a field of algorithmic opponents.
The simplest approach I could take would be pure Monte Carlo: run the strategy for 100 million hands and see how it does. This works, but it wastes compute.
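The baseline is just averaging results over many independent hands. A minimal sketch, where `play_hand` is a made-up stand-in for a full poker simulator (the Gaussian is obviously not real poker):

```python
import random

def play_hand():
    # Hypothetical placeholder for simulating one full hand;
    # returns the hero's result in big blinds.
    return random.gauss(0.05, 5.0)

def pure_monte_carlo(num_hands):
    # EV estimate = average winnings per hand over many independent hands.
    return sum(play_hand() for _ in range(num_hands)) / num_hands

random.seed(0)
ev = pure_monte_carlo(100_000)  # the post imagines 100 million
```

Every hand gets the same amount of compute, whether its outcome is a foregone conclusion or genuinely uncertain; that uniformity is the waste.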
Alternatively, I could use the importance sampled approach:
Start with 100,000 pre-flop scenarios (i.e., all players have received their pocket cards, button position is fixed, no community cards yet)
Do 100 rollouts from each scenario
Most rollouts won’t be “interesting” (e.g., player has 7-2o UTG, player folds every time → EV = 0BB). If the simulation hits these states, you can say with high confidence how they’ll turn out, so additional rollouts won’t significantly change your EV estimate—you’ve effectively “locked in” your EV for the “boring” parts of the pre-flop possibility space.
Pick the 10,000 highest-variance pre-flop scenarios. These are cases where your player doesn’t always fold (and opponents didn’t all fold to your player’s raise), e.g.
AA in position where you 3-bet and everyone folds → consistently +3BB → low variance despite the premium hand, so it gets locked in early.
KQs facing a 3-bet → sometimes win big, sometimes lose big → super high variance
Run 1,000 rollouts for each of these high-variance scenarios.
Figure out which flops generate the highest variance over those 1,000 rollouts.
If your player completely missed the flop while another player connected and bet aggressively, your player will fold every time—low variance, predictable EV.
If your player flopped two pair or your opponent is semi-bluffing a draw, those are high-variance situations where additional rollouts provide valuable information.
And so on through each major decision point in the game.
By skipping rollouts once I know what the outcome is likely to be, I can focus a lot more compute on the remaining scenarios and come to a much more precise estimate of EV (or whatever other metric I care about).
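The whole loop above can be sketched in a few lines. Everything here is a hypothetical stand-in (`rollout`, the Gaussian placeholder, the parameter names) for a real poker simulator; the point is just the allocation logic: screen everything cheaply, then spend extra rollouts only where the screening pass showed high variance:

```python
import random
import statistics

def rollout(scenario):
    # Placeholder: pretend each scenario has a true mean EV and a spread.
    # In a real evaluator this would play out one hand from the scenario.
    mean, spread = scenario
    return random.gauss(mean, spread)

def adaptive_ev(scenarios, screen_n=100, top_k=10, refine_n=1000):
    """Screen every scenario with a few rollouts, then refine only the
    highest-variance ones; low-variance EVs are 'locked in' as-is."""
    results = {}
    for s in scenarios:
        results[s] = [rollout(s) for _ in range(screen_n)]
    # Rank scenarios by the sample variance of their screening rollouts.
    ranked = sorted(results, key=lambda s: statistics.variance(results[s]),
                    reverse=True)
    for s in ranked[:top_k]:  # extra compute goes only to the noisy ones
        results[s].extend(rollout(s) for _ in range(refine_n))
    # EV per scenario = mean over all of its rollouts.
    return {s: statistics.fmean(v) for s, v in results.items()}

# Usage: two boring scenarios (tight spread) and one swingy one.
random.seed(0)
scenarios = [(0.0, 0.1), (3.0, 0.2), (0.5, 8.0)]
evs = adaptive_ev(scenarios, screen_n=100, top_k=1, refine_n=1000)
```

With `top_k=1`, the swingy scenario soaks up ten times the rollouts of each boring one, which is exactly the 100-rollouts-then-1,000-rollouts cascade described above applied at a single decision point.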
Thanks, I get it now.
Would this help with the simulation goal hypothesized in the OP? It’s asking how often different types of AGIs would be created. A lot of the variance is probably carried in what sort of species and civilization is making the AGI, but some of it is carried by specific twists that happen near the creation of AGI. Getting a president like Trump and having him survive the (fairly likely) assassination attempt(s) is one such impactful twist. So I guess sampling around those uncertain, impactful twists would be valuable in refining the estimate of, say, how frequently a relatively wise and cautious species would create misaligned AGI due to bad twists, and vice versa.
Hm.
New EA cause area just dropped: Strategic variance reduction in timelines with high P(doom).
BRB applying for funding
Great example!