I see two reasons not to treat every measurement from the survey as having zero weight.
First, you’d like an approach that still makes sense when you haven’t seen any data yet, so you don’t ignore the first person to tell you “humans are generally between 2 and 10 feet tall”.
Second, in a different application you may well believe there is a causal mechanism by which a new study could provide unique information about some effect size. Then there’s value in a model that updates a little on each new study but doesn’t update without bound on an unlimited supply of studies.
The approach I suggest is that you can model standard biases like p-hacking via shrinkage, and you can treat discrete, all-or-nothing systematic biases, like fraud or a methodological error such as confounding shared by every study, as a mixture model, where the mixture components correspond to the different discrete states. This lets you model the ‘flip-flop’ behavior of a single key node without going full Pearl DAG.
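To make the mixture concrete (the notation here is my own sketch, not a standard reference): each study’s estimate $y_i$ with standard error $\sigma_i$ gets a two-component likelihood,

$$p(y_i \mid \theta) = (1 - \pi_i)\,\mathcal{N}(y_i;\ \theta,\ \sigma_i^2) + \pi_i\,\mathcal{N}(y_i;\ \mu_{\text{junk}},\ \tau^2),$$

where $\theta$ is the effect you actually care about, $\pi_i$ is the (updatable) probability that study $i$ is junk, and the junk component is deliberately wide and does not depend on $\theta$, so a study assigned to it contributes nothing to inference about $\theta$.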
So for example: if I have a survey I think may be fraudulent (possibly just plain made up in a spreadsheet) and a much smaller survey which I trust but which has large sampling error, I can express this as a mixture model, and I will get a bimodal distribution over the estimate with a small diffuse peak and a big sharp peak, corresponding roughly to “here’s what you get if the big one is fake, and here’s what you get if it’s real and pooled with the other one”. If you can get more gold data, that further updates the switching parameter; at some point, if the small surveys keep disagreeing with the big one, the probability of it being fake approaches 1 and it stops visibly affecting the posterior, because it is almost always assigned to the ‘fake’ component and so no longer affects the posteriors of interest (those of the real components).
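Here’s a minimal runnable sketch of that two-survey setup (Python with numpy/scipy, posterior by grid approximation; all the numbers, such as survey means, standard errors, priors, and the junk distribution, are invented for illustration):

```python
# Minimal grid-approximation sketch of the fraud-mixture idea.
# All numbers (survey means, standard errors, priors) are invented
# for illustration; they are not from any real dataset.
import numpy as np
from scipy.stats import norm

theta = np.linspace(4.0, 7.0, 3001)           # grid over true mean height (ft)
prior = norm.pdf(theta, loc=5.5, scale=1.0)   # vague prior in the "2 to 10 ft" spirit

big_y, big_se = 6.2, 0.01     # huge, suspiciously precise survey (possible fraud)
p_fake = 0.5                  # prior probability that the big survey is junk
# If it's fake, its number is a made-up draw carrying no information about
# theta; model that as a wide distribution that doesn't involve theta at all.
junk_like = norm.pdf(big_y, loc=5.5, scale=2.0)

def posterior(gold_surveys):
    """Posterior over theta and P(fake), given trusted (estimate, se) surveys."""
    like_gold = np.prod([norm.pdf(y, loc=theta, scale=se)
                         for y, se in gold_surveys], axis=0)
    post_real = prior * like_gold * norm.pdf(big_y, loc=theta, scale=big_se)
    post_fake = prior * like_gold * junk_like   # flat in theta
    m_real = np.trapz(post_real, theta)         # marginal likelihood, real branch
    m_fake = np.trapz(post_fake, theta)         # marginal likelihood, fake branch
    pf = p_fake * m_fake / (p_fake * m_fake + (1 - p_fake) * m_real)
    mix = (1 - p_fake) * post_real + p_fake * post_fake
    return mix / np.trapz(mix, theta), pf

# One small trusted survey: the posterior over theta is bimodal --
# a sharp peak at the big survey's value, a diffuse peak near the small one.
_, pf1 = posterior([(5.9, 0.25)])
print(f"P(fake | 1 gold survey)  = {pf1:.2f}")

# Keep adding small gold surveys that disagree with the big one: the
# switching parameter climbs toward 1 and the big survey stops visibly
# affecting the posterior on theta.
_, pf4 = posterior([(5.9, 0.25), (5.8, 0.2), (5.85, 0.2), (5.75, 0.2)])
print(f"P(fake | 4 gold surveys) = {pf4:.2f}")
```

The exact numbers don’t matter; the point is that the posterior over theta is an honest mixture of “the big survey is real” and “the big survey is junk”, with the weight between the two branches updated by every new piece of gold data.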
You can take this approach with confounding too. A confounded study is not simply going to exaggerate the effect size by X%; it can deliver estimates that are arbitrarily different, even opposite in sign, and no matter how many confounded studies you combine, they will never converge on the causal estimate. They may even agree with each other very precisely if they are all collecting data confounded the same way. So if you have an RCT which contradicts all your cohort correlational results, you’re in the same situation as with the two surveys.
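The same grid machinery works here, with one twist: in the confounded branch, give all the cohort studies a single shared bias term b. Then however many cohorts you pool, integrating b out leaves their pooled estimate only weakly informative about theta. A sketch, under the same invented-numbers caveat:

```python
# "One RCT vs. many possibly-confounded cohorts", same grid machinery.
# Under the confounded branch the cohorts share a single unknown bias
# b ~ N(0, tau_b^2); integrating b out, their pooled estimate constrains
# theta only up to tau_b, no matter how many cohorts you add.
import numpy as np
from scipy.stats import norm

theta = np.linspace(-1.0, 1.5, 3001)          # grid over the causal effect size
prior = norm.pdf(theta, loc=0.0, scale=1.0)

rct_y, rct_se = 0.05, 0.10                    # one RCT (invented numbers)
cohorts = [(0.50, 0.03)] * 10                 # ten cohorts agreeing precisely
tau_b = 0.5                                   # prior scale of the shared bias
p_conf = 0.5                                  # prior probability of confounding

pooled_y = np.mean([y for y, _ in cohorts])
pooled_se = cohorts[0][1] / np.sqrt(len(cohorts))  # equal-variance pooling

like_rct = norm.pdf(rct_y, loc=theta, scale=rct_se)
like_clean = norm.pdf(pooled_y, loc=theta, scale=pooled_se)  # cohorts unbiased
like_conf = norm.pdf(pooled_y, loc=theta,
                     scale=np.hypot(pooled_se, tau_b))       # bias integrated out

post_clean = prior * like_rct * like_clean
post_conf = prior * like_rct * like_conf
m_clean = np.trapz(post_clean, theta)
m_conf = np.trapz(post_conf, theta)
pc = p_conf * m_conf / (p_conf * m_conf + (1 - p_conf) * m_clean)

mix = (1 - p_conf) * post_clean + p_conf * post_conf
mix /= np.trapz(mix, theta)
print(f"P(confounded | data)  = {pc:.3f}")                            # near 1
print(f"posterior mean effect = {np.trapz(theta * mix, theta):.3f}")  # hugs the RCT
```

Ten cohorts agreeing to three decimal places buy you almost nothing against one honest RCT, because under the confounded branch their precision is precision about theta + b, not about theta.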
Just to put your approach in terms a non-mathematician can follow: you’re proposing not to do any information-flow analysis, but instead to automatically detect cases where an information input, like the opinion poll, is not adding any useful information. And you name one way to do this.
Fair enough, but the problem is that with an information-flow analysis (“does any causal mechanism exist by which this source could provide information?”) you can discard the faulty input with 100% certainty, whereas under your proposed approach sheer chance can still show a correlation.