This work[1] was the first[2] foray into proving non-trivial regret bounds in the robust (infra-Bayesian) setting. The specific bound I obtained was later slightly improved in my later paper with Diffractor. This work studied a variant of linear bandits, for the usual reasons linear models are often studied in learning theory: it is a conveniently simple setting where we actually know how to prove things, even with computationally efficient algorithms. (Although we still don’t have a computationally efficient algorithm for the robust version: not because it’s very difficult, but (probably) just because nobody got around to solving it.) As such, this work was useful as a toy-model test that infra-Bayesianism doesn’t run into statistical intractability issues. As to whether linear-model algorithms or their direct descendants will actually play a role in the ultimate theory of learning, that is still an open question.
An abridged version was also published as a paper in JMLR.
Other than Tian et al, which technically is a robust regret bound, but was not framed by the authors as such (instead, their motivation was studying zero-sum games).
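For readers unfamiliar with the setting discussed above, here is a minimal pure-Python sketch of the standard (non-robust) linear bandit problem with an optimistic LinUCB-style learner. The action set, noise model, and all parameter values are illustrative assumptions for a two-dimensional toy instance; this is not the robust algorithm from the paper.

```python
import math
import random

def inv2(M):
    """Invert a 2x2 matrix given as nested lists."""
    (a, b), (c, d) = M
    det = a * d - b * c
    return [[d / det, -b / det], [-c / det, a / det]]

def matvec(M, v):
    return [M[0][0] * v[0] + M[0][1] * v[1],
            M[1][0] * v[0] + M[1][1] * v[1]]

def dot(u, v):
    return u[0] * v[0] + u[1] * v[1]

def linucb(actions, theta_star, T=2000, lam=1.0, beta=1.0, seed=0):
    """Toy LinUCB on a fixed 2-D action set; returns cumulative pseudo-regret.

    Rewards are linear in the action: r = <x, theta_star> + Gaussian noise.
    Each round, play the action maximizing estimated reward plus an
    exploration bonus based on the regularized Gram matrix.
    """
    rng = random.Random(seed)
    A = [[lam, 0.0], [0.0, lam]]  # regularized Gram matrix lam*I + sum x x^T
    b = [0.0, 0.0]                # sum of reward-weighted actions
    best = max(dot(x, theta_star) for x in actions)
    regret = 0.0
    for _ in range(T):
        Ainv = inv2(A)
        theta_hat = matvec(Ainv, b)  # ridge-regression estimate of theta_star
        # Optimistic score: estimated reward + beta * sqrt(x^T A^{-1} x).
        x = max(actions, key=lambda x: dot(x, theta_hat)
                + beta * math.sqrt(dot(x, matvec(Ainv, x))))
        r = dot(x, theta_star) + rng.gauss(0.0, 0.1)  # noisy linear reward
        # Rank-one update of the Gram matrix and the reward vector.
        for i in range(2):
            for j in range(2):
                A[i][j] += x[i] * x[j]
            b[i] += r * x[i]
        regret += best - dot(x, theta_star)
    return regret
```

On an instance like `linucb([[1.0, 0.0], [0.0, 1.0], [0.7, 0.7]], [1.0, 0.2])`, the cumulative regret stays far below the linear worst case, illustrating the sublinear-regret behavior that the robust setting aims to recover under ambiguity about the environment.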