Reading this post made the workings (and usefulness) of the counterfactual oracle clearer to me, specifically the fact that it learns based on the times when its prediction wasn’t read.
(The feedback being based on the world and performance in it, rather than on a training set, might also bear some relation to the “selection versus control (versus explicit solution)” dichotomy, which I’ll have to think about later.)
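
To check my understanding, here is a minimal Python sketch of that learning rule as I read it. Everything in it is an assumption made for illustration rather than something taken from the post: the names `run_episodes`, `erasure_prob`, and `TRUE_MEAN`, the Gaussian outcome, and the running-mean "oracle". The only point is that the update happens solely in episodes where nobody reads the prediction.

```python
import random

TRUE_MEAN = 3.0  # hypothetical quantity the oracle is asked to predict


def run_episodes(n_episodes, erasure_prob, seed=0):
    """Toy counterfactual-oracle loop (illustrative assumption, not the post's code).

    Returns the oracle's final estimate and the number of episodes it learned from.
    """
    rng = random.Random(seed)
    estimate = 0.0  # the oracle's current prediction
    updates = 0     # how many episodes produced feedback

    for _ in range(n_episodes):
        prediction = estimate
        erased = rng.random() < erasure_prob

        if erased:
            # Nobody reads the prediction, so the outcome cannot depend on it.
            outcome = rng.gauss(TRUE_MEAN, 1.0)
            updates += 1
            # Running-mean update driven by the prediction error.
            estimate += (outcome - prediction) / updates
        # Otherwise the prediction is read and used, and the oracle
        # receives no feedback at all for this episode.

    return estimate, updates


if __name__ == "__main__":
    final_estimate, n_updates = run_episodes(10_000, erasure_prob=0.05)
    print(f"learned from {n_updates} of 10000 episodes; estimate = {final_estimate:.2f}")
```

With `erasure_prob=0.0` the oracle never updates, and with `erasure_prob=1.0` it learns every time but nobody ever gets to use a prediction, so the interesting regime is a small nonzero probability.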