To me the obvious ‘gotcha’ is that you should probably treat every measurement from the survey as having zero weight, or at least every respondent who has not met Einstein in person, because there is no causal mechanism by which they could have information about this value (no unique information, that is: nothing not already published in data samples you considered previously). Huh, that also partially invalidates the efficient market hypothesis: if most market participants have no source of information for their valuation of a security beyond information you could simply analyze with a rational model*, then over the long term the price given by the rational model is more likely to be the true price than whatever the market votes it to be.
*say, one model that regresses disclosed public financials and other information against predicted long-term profits and revenue, and a second model that regresses actual profits/revenue against long-term security value.
(The EMH still works fine if some actors have insider information or are using rational models: as they systematically win, they accumulate more and more of the shares of a given security over time, so their votes count for greater and greater weight, converging on the true value of the security.)
I see two reasons not to treat every measurement from the survey as having zero weight.
First, you’d like an approach that makes sense when you haven’t considered any data samples previously, so you don’t ignore the first person to tell you “humans are generally between 2 and 10 feet tall”.
Second, in a different application you may well believe there is a causal mechanism by which a new study can provide unique information about some effect size. Then there’s value in a model that updates a little on each new study but doesn’t update infinitely on infinitely many studies.
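To make the “doesn’t update infinitely” point concrete, here is a minimal numeric sketch (all numbers are assumed purely for illustration): if every study shares a common bias on top of its own sampling error, averaging more studies shrinks the sampling error but not the shared bias, so the pooled uncertainty floors at the bias scale instead of going to zero.

```python
# Toy model (assumed for illustration): study i reports y_i = theta + b + eps_i,
# where b ~ N(0, tau^2) is a bias shared by every study (e.g. a field-wide
# p-hacking convention pushing all results the same way) and
# eps_i ~ N(0, sigma^2) is independent sampling error.
sigma = 2.0   # per-study sampling error
tau = 0.5     # spread of the shared bias

def pooled_sd(n_studies: int) -> float:
    # SD of the mean of n studies: the sigma term vanishes as n grows,
    # the tau term never does.
    return (tau**2 + sigma**2 / n_studies) ** 0.5

for n in (1, 10, 100, 10_000):
    print(n, round(pooled_sd(n), 3))
```

Each additional study still tightens the estimate a little, but the pooled SD converges to tau = 0.5 rather than to zero, which is exactly the bounded updating described above.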
The approach I suggest is that you can model standard biases like p-hacking via shrinkage, and you can treat sharply discrete systematic biases, like fraud or methodological errors (such as confounding, which is universal among all studies), as a mixture model, where the mixture components correspond to the different discrete possibilities. This lets you model the ‘flip-flop’ behavior of a single key node without going full Pearl DAG.
So for example: if I have a survey I think is fraudulent (possibly just plain made up in a spreadsheet) and a much smaller survey which I trust but which has large sampling error, I can express this as a mixture model, and I will get a bimodal distribution over the estimate with a small diffuse peak and a big sharp peak, corresponding roughly to “here’s what you get if the big one is fake, and here’s what you get if it’s real and pooled with the other one”. If you can get more gold-standard data, that further updates the switching parameter; and if the small surveys keep disagreeing with the big one, the probability of it being fake will approach 1, at which point it stops visibly affecting the posterior distribution, because it is almost always assigned to the ‘fake’ component and no longer affects the posteriors of interest (those of the real components).
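A minimal grid-approximation sketch of that two-survey mixture model (all the numbers, including the flat ‘fake’ density and the 50% fraud prior, are assumed purely for illustration):

```python
import numpy as np

# Assumed toy data: a big survey with a tiny SE that I suspect is fabricated,
# and a small survey with a large SE that I trust.
y_big, se_big = 7.0, 0.1
y_small, se_small = 4.0, 1.5
p_real = 0.5          # prior probability the big survey is genuine
fake_density = 0.05   # likelihood of the big survey's number if it was just
                      # typed into a spreadsheet: flat, so constant in theta

def normal_pdf(x, mu, sd):
    return np.exp(-0.5 * ((x - mu) / sd) ** 2) / (sd * np.sqrt(2 * np.pi))

theta = np.linspace(-5.0, 15.0, 4001)   # grid over the quantity being estimated
dx = theta[1] - theta[0]

# Mixture likelihood for the big survey: informative if real, flat if fake.
lik_big = p_real * normal_pdf(y_big, theta, se_big) + (1 - p_real) * fake_density
lik_small = normal_pdf(y_small, theta, se_small)

post = lik_big * lik_small              # flat prior on theta
post /= post.sum() * dx                 # normalize to a density on the grid

# Posterior probability that the big survey is real (the switching parameter):
real_mass = (p_real * normal_pdf(y_big, theta, se_big) * lik_small).sum() * dx
total_mass = (lik_big * lik_small).sum() * dx
print("P(big survey is real | data) =", round(real_mass / total_mass, 2))
```

The posterior over theta comes out bimodal: a sharp spike near 7, weighted by the probability the big survey is real, and a broad hump near 4 from the small survey alone. Move y_big further away from y_small and the switching probability drops toward 0, after which the big survey stops visibly affecting the posterior, as described above.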
You can take this approach with confounding too. A confounded study will not simply exaggerate the effect size by X%; it can deliver arbitrarily different, even opposite-signed, estimates. No matter how many confounded studies you combine, they will never converge on the causal estimate, and they may all agree with each other very precisely if they are collecting data confounded in the same way. So if you have an RCT which contradicts all your correlational cohort results, you’re in the same situation as with the two surveys.
Just to simplify your approach for a non-mathematician: you’re proposing not to do any explicit information-flow analysis, but instead to autonomously find the cases where an information input, like the opinion poll, is not adding any useful information. And you name one way to do this.
Fair enough, but the problem is that if you do an information-flow analysis (“does any causal mechanism exist by which this source could provide information?”), you can skip the faulty input with 100% probability, whereas under your proposed approach sheer chance can still show a correlation.