It sure would.
Would you give 3:1 odds against a major player showing the untrusted model the old CoT, conditional on them implementing a “replace suspicious outputs” flow on a model with hidden CoT and publicizing enough implementation details to determine whether or not they did that?
What if both the AI group and the control group have high success rates at this test?