philh comments on Practical Pitfalls of Causal Scrubbing

philh 29 Mar 2023 14:35 UTC
3 points
0
Not sure if this was deliberate on your part, but:

As a simple example, consider a graph $G$ that calculates whether $\frac{1}{x_{0}} < \frac{1}{x_{1}}$ , where $x$ is an input array of length 2 ( $x = [x_{0}, x_{1}]$ ). … We create a hypothesis that claims that the graph calculates $x_{0} > x_{1}$ . … This is technically a correct hypothesis as this is indeed what the graph computes.

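That's only correct as long as $x_{0}, x_{1}$ are both $> 0$ or both $< 0$ . With mixed signs it fails: for $x_{0} = -1, x_{1} = 1$ we get $\frac{1}{x_{0}} = -1 < 1 = \frac{1}{x_{1}}$ , so the graph outputs true even though $x_{0} > x_{1}$ is false (and if either input is $0$ the graph's computation isn't defined at all). Which illustrates your point that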

So while all the intensionally different implementations behave identically on the test set, they may behave very differently on out-of-distribution samples.
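
For anyone who wants to check the sign condition directly, here's a minimal sketch. The function names `graph` and `hypothesis` and the sample inputs are my own choices for illustration, not anything from the post.

```python
def graph(x0, x1):
    # What the graph G literally computes: 1/x0 < 1/x1.
    # Raises ZeroDivisionError if either input is 0.
    return 1 / x0 < 1 / x1


def hypothesis(x0, x1):
    # What the hypothesis claims the graph computes: x0 > x1.
    return x0 > x1


# Hypothetical sample inputs: pairs with matching signs, then pairs with mixed signs.
same_sign = [(2.0, 1.0), (1.0, 2.0), (-1.0, -2.0), (-2.0, -1.0)]
mixed_sign = [(-1.0, 1.0), (1.0, -1.0)]

# True if the two functions agree on every same-sign pair.
agree_same = all(graph(a, b) == hypothesis(a, b) for a, b in same_sign)

# One agreement flag per mixed-sign pair.
agree_mixed = [graph(a, b) == hypothesis(a, b) for a, b in mixed_sign]

print(agree_same, agree_mixed)
```

So a test set drawn only from same-sign inputs can't tell the two apart, while any mixed-sign input separates them.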