I think I basically hold disagreement (1), which I think is close to Owain’s comment. Specifically, I think a plausible story for a model learning causality is:
The model learns a lot of correlations, most real (causal) but many spurious.
The model eventually groks that there’s a relatively simple causal model explaining the real correlations but not the spurious ones. This gets favored by whatever inductive bias the training process/architecture encodes.
The model maintains uncertainty as to whether the spurious correlations are real or spurious, the same way humans do.
In this story the model learns both a causal model and the spurious correlations. It doesn’t dismiss the spurious correlations, but it still models the causal ones. Keeping both lets it minimize loss, which I think addresses the counterargument to (1).
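
To make the loss point concrete, here is a toy sketch. It is a linear illustration under assumptions I'm making up, not a claim about how real networks behave. The names `x` (a real cause of `y`), `s` (a feature that correlates with `y` only through shared noise) and the coefficient 2.0 are all hypothetical. It compares a least-squares fit on the cause alone with a fit on both features.

```python
import random

random.seed(0)
n = 2000

# Hypothetical data: x causes y; s does not cause y, it only shares noise with it.
x = [random.gauss(0, 1) for _ in range(n)]
eps = [random.gauss(0, 1) for _ in range(n)]
y = [2.0 * xi + ei for xi, ei in zip(x, eps)]
s = [ei + random.gauss(0, 1) for ei in eps]


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def mse(pred, target):
    return sum((p - t) ** 2 for p, t in zip(pred, target)) / len(target)


# Fit 1: cause only, y ~ a * x (closed-form least squares, no intercept).
a_only = dot(x, y) / dot(x, x)
mse_causal = mse([a_only * xi for xi in x], y)

# Fit 2: cause plus spurious feature, y ~ a * x + b * s.
# Solve the 2x2 normal equations with Cramer's rule.
sxx, sxs, sss = dot(x, x), dot(x, s), dot(s, s)
sxy, ssy = dot(x, y), dot(s, y)
det = sxx * sss - sxs ** 2
a_both = (sxy * sss - sxs * ssy) / det
b_both = (sxx * ssy - sxs * sxy) / det
mse_both = mse([a_both * xi + b_both * si for xi, si in zip(x, s)], y)

print(f"cause only: a={a_only:.3f}, mse={mse_causal:.3f}")
print(f"both:       a={a_both:.3f}, b={b_both:.3f}, mse={mse_both:.3f}")
```

In this setup the second fit keeps roughly the same weight on the real cause while also putting weight on the spurious feature, and its training loss can't be higher than the cause-only fit, since setting `b = 0` recovers that fit. That's the sense in which holding on to the spurious correlation doesn't conflict with modeling the causal one.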