Or is that still too closely tied to the explore-exploit paradigm?
Right. The setup for my problem is the same as the ‘Bernoulli bandit’, but I only care about the information gained, not the reward. Everything I see on that page is about exploration-exploitation.