Thanks, Stuart, for your thought-provoking post! I think your point about the effects of the baseline choice on the subagent problem is very interesting, and it would be helpful to separate it more clearly from the effects of the deviation measure (the two are currently a bit conflated in the table). I expect that AU with the inaction baseline would also avoid this issue, just as RR with the inaction baseline does. I suspect that the twenty billion questions measure with the stepwise baseline would have the subagent issue too.
I’m wondering whether this issue is entirely caused by the stepwise baseline (which is indexed on the agent, as you point out), or whether the optionality-based deviation measures (RR and AU) contribute to it as well. So far I’m adding this to my mental list of issues with the stepwise baseline (along with the “car on a winding road” scenario) that need to be fixed.