re: 1: a data generating process is indeed a machine that produces the data we see. But, importantly, it also contains information about all sorts of other things, in particular what sort of data would come out had you given this machine completely different inputs. The fact that this machine encodes this sort of counterfactual information is what makes it “in the causal magisterium,” so to speak.
The machine itself is presumed to be “out there.” If we are trying to learn what that machine might be, we may wish to invoke assumptions akin to Occam’s razor. But this is about us trying to learn something, not about the machine per se. Nature is not required to be convenient!
To use an analogy our mutual friend Judea likes to use, ‘the joint distribution’ is akin to encoding how some object, say a vase, reflects light from every angle. This information is sufficient to render the vase in a computer graphics system. But it is not sufficient to render what happens if we smash the vase: for that we need, in addition to the surface information, information about the material of the vase, how brittle it is, etc. The ‘data generating process’ would contain, in addition to the surface information about light reflectivity, the information that lets us deduce how the vase would react to any counterfactual deformation we might perform, whether we drop it from a table, smash it with a hammer, or lightly nudge it.
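To make the vase point concrete, here is a minimal sketch of two toy "machines" that agree on everything you can observe but disagree once you force an input. The names (`machine_a`, `machine_b`, `do_x`, `distribution`) are made up for illustration and are not from any library. The two-variable setup is my own assumption, chosen only because it is the smallest case I could think of.

```python
from collections import defaultdict
from fractions import Fraction

HALF = Fraction(1, 2)


def machine_a(noise, do_x=None):
    """Hypothetical machine A: X comes from the coin flip, then Y copies X."""
    x = noise if do_x is None else do_x  # forcing X overrides the coin
    y = x
    return x, y


def machine_b(noise, do_x=None):
    """Hypothetical machine B: Y comes from the coin flip, then X copies Y."""
    y = noise
    x = y if do_x is None else do_x  # forcing X does not reach back to Y
    return x, y


def distribution(machine, do_x=None):
    """Exact distribution over (x, y), enumerating a fair coin as the noise."""
    dist = defaultdict(Fraction)
    for noise in (0, 1):
        dist[machine(noise, do_x)] += HALF
    return dict(dist)


# What we would see by just watching each machine run.
observed_a = distribution(machine_a)
observed_b = distribution(machine_b)

# What we would see if we reached in and set X to 1.
forced_a = distribution(machine_a, do_x=1)
forced_b = distribution(machine_b, do_x=1)

print("same joint distribution:", observed_a == observed_b)
print("same response to forcing X:", forced_a == forced_b)
```

Both machines produce X equal to Y with a fair coin deciding the value, so the joint distribution (the "surface" of the vase) cannot tell them apart. Forcing X pins Y in machine A but leaves Y a coin flip in machine B, and that difference is the extra information the data generating process carries.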
Thanks for your reply.
At least as I conceive of a computation, I can (theoretically) run the same computation with different inputs. So I think a computation would also capture that sort of counterfactual information.
So, in LW terms: beware of the mind projection fallacy.