From Algorithms to Live By, I vaguely recall the multi-armed bandit problem. Maybe that’s what you’re looking for? Or is that still too closely tied to the explore-exploit paradigm?
Right. The setup for my problem is the same as the 'Bernoulli bandit', but I only care about the information, not the reward. All I see on that page is about exploration-exploitation.
I got a good answer here: https://stats.stackexchange.com/q/579642/5751
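To make the "information, not reward" objective concrete, here is a minimal sketch of one pure-exploration heuristic for the Bernoulli bandit: keep a Beta posterior per arm and always pull the arm whose posterior variance is largest (i.e., the arm we are most uncertain about). This is an illustrative assumption of mine, not the method from the linked answer; the function names and the uncertainty criterion are my own.

```python
import random

def posterior_variance(a, b):
    """Variance of a Beta(a, b) posterior over an arm's success probability."""
    return a * b / ((a + b) ** 2 * (a + b + 1))

def pure_exploration(true_probs, n_pulls, seed=0):
    """Pull the arm with the largest posterior variance (most uncertain),
    ignoring reward entirely; return the Beta posterior parameters."""
    rng = random.Random(seed)
    # Start each arm at a uniform Beta(1, 1) prior.
    posteriors = [[1, 1] for _ in true_probs]
    for _ in range(n_pulls):
        arm = max(range(len(posteriors)),
                  key=lambda i: posterior_variance(*posteriors[i]))
        if rng.random() < true_probs[arm]:
            posteriors[arm][0] += 1  # observed a success
        else:
            posteriors[arm][1] += 1  # observed a failure
    return posteriors

posteriors = pure_exploration([0.2, 0.5, 0.9], n_pulls=300)
estimates = [a / (a + b) for a, b in posteriors]
```

Unlike an explore-exploit strategy, nothing here favors high-reward arms; pulls are spent wherever they shrink uncertainty the most, which is closer to the information-only objective described above.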