Houshalter comments on LINK: AI Researcher Yann LeCun on AI function

Houshalter 31 Mar 2014 16:47 UTC
2 points
Biased data is a real thing and this is a great example. No method can solve the problem you’ve given without additional information.
- gwern 31 Mar 2014 17:11 UTC
  7 points
  This is not biased data. No one tampered with it. No one preferentially left out some data. There is no Cartesian daemon tampering with you. It’s a perfectly ordinary causal problem for which one has all the available data. If you run a regression on the data, you will get accurate predictions of future similar data, just not of what happens when you intervene and realize the counterfactual. You can’t throw your hands up and disdainfully refuse to solve the problem, proclaiming, ‘oh, that’s biased’. The problem may be hard, and the best available solution may be weak or require strong assumptions, but if that is the case, the correct method should say as much and specify what additional data or interventions would allow stronger conclusions.
  - Houshalter 28 Feb 2015 5:08 UTC
    0 points
    I’m not certain why I used the word “bias”. I think I was getting at that the data isn’t representative of the population of interest.
    
    Regardless, no other method can solve the problem specified without additional information (which you claimed). And with additional information, it’s straightforward prediction again.
    
    That is, condition on the patients’ prior health status and on the prior probabilities, not just on the fact that they’ve been given the drug.
- Lumifer 31 Mar 2014 17:02 UTC
  0 points
  
  > No method can solve the problem you’ve given without additional information.
  
  What do you call “solving the problem”?
  
  Any method will output some estimates. Some methods will output better estimates, some worse. As people have pointed out, this was an example of a real problem and yes, real-life data is usually pretty messy. We need methods which can handle messy data and not work just on spherical cows in vacuum.
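
The disagreement in this thread (a regression predicts future observational data well, yet says little about what happens when you intervene, unless you condition on prior health status) can be illustrated with a small simulation. This is only a sketch under invented assumptions: the probabilities, the binary "sick" variable, and the function names `simulate`, `naive_effect`, and `adjusted_effect` are all hypothetical and do not come from the original discussion. The adjustment shown is plain stratification on the confounder, which is valid here only because prior health is the sole confounder by construction.

```python
import random


def simulate(n, seed=0):
    """Generate (sick, treated, recovered) rows from a made-up causal model."""
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        # Assumption: half the patients start out sick.
        sick = rng.random() < 0.5
        # Assumption: doctors give the drug mostly to sick patients.
        treated = rng.random() < (0.8 if sick else 0.2)
        # Assumption: the drug raises recovery probability by 0.2 for everyone.
        p_recover = (0.3 if sick else 0.7) + (0.2 if treated else 0.0)
        recovered = rng.random() < p_recover
        rows.append((sick, treated, recovered))
    return rows


def recovery_rate(rows):
    """Fraction of rows in which the patient recovered."""
    return sum(recovered for _, _, recovered in rows) / len(rows)


def naive_effect(rows):
    """Difference in recovery rates, treated minus untreated, ignoring health."""
    treated = [row for row in rows if row[1]]
    untreated = [row for row in rows if not row[1]]
    return recovery_rate(treated) - recovery_rate(untreated)


def adjusted_effect(rows):
    """Average the within-stratum differences, weighted by stratum size."""
    total = 0.0
    for sick in (True, False):
        stratum = [row for row in rows if row[0] == sick]
        weight = len(stratum) / len(rows)
        total += weight * naive_effect(stratum)
    return total


rows = simulate(200_000)
print(f"naive difference:    {naive_effect(rows):+.3f}")
print(f"adjusted difference: {adjusted_effect(rows):+.3f}")
```

Under these assumed numbers the naive comparison comes out slightly negative (about -0.04 in expectation), so the drug looks harmful, while the stratified estimate recovers roughly the +0.2 effect built into the simulation. The naive figure is still a correct prediction about similar observational data; it is simply not the answer to the interventional question, and without the health variable no amount of the same data would separate the two.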