Am I right that the line of argument here is not about the generalization properties, but a claim about the quality of explanation, even on the restricted distribution?
Yes, I think that is a good way to put it. But faithful mechanistic explanations are closely related to generalization.
Like here, your causal model should have the explicit condition “x_1=x_2”.
That would be a sufficient condition for to make the correct predictions. But that does not mean that provides a good mechanistic explanation of on those inputs.
Fun project.
I think these kinds of pictures ‘underestimate’ models’ geographical knowledge. Just imagine having a human perform this task. The human may have very detailed geographical knowledge, may even be able to draw a map of the world from memory. This does not imply that they would be able to answer questions about latitude and longitude.