...and is-correct basically just tests whether they are equal.
So here “equal” would presumably be “essentially equal in the judgement of complex process”, rather than verbatim equality of labels (the latter seems silly to me; if it’s not silly I must be missing something).
I think they need to be exactly equal. I think this is most likely accomplished by making something like pairwise judgments and only passing judgment when the comparison is a slam dunk (as discussed in section 3). Otherwise the instrumental policy will outperform the intended policy (since it will do the right thing when the simple labels are wrong).
Ok, that all makes sense, thanks.
So here “equal” would presumably be “essentially equal in the judgement of complex process”, rather than verbatim equality of labels (the latter seems silly to me; if it’s not silly I must be missing something).
I think they need to be exactly equal. I think this is most likely accomplished by making something like pairwise judgments and only passing judgment when the comparison is a slam dunk (as discussed in section 3). Otherwise the instrumental policy will outperform the intended policy (since it will do the right thing when the simple labels are wrong).