On one hand, Olah et al.’s (2020) investigations find circuits which implement human-comprehensible functions.
At a higher level, they also find that different branches (when the modularity is enforced already by the architecture) tend to contain different features.
Another meaning could be: I want to raise the salience of the issue ‘Red vs Not Red’, I want to convey that ‘Red vs Not Red’ is an underrated axis. I think this is also an example of level 4?