Using a high-powered black-box technique to regress a one-dimensional continuous outcome against a one-dimensional continuous predictor seems misguided.
If you want to characterize how well your evolutionary learning idea works, try it on data that you’ve generated, where you know the “underlying math”. See if you can recover the program that generated the data or one that’s equivalent to it. Or try it on really big, messy data where no one knows the right answer and see if you/it can do better than the obvious competitors like SVM, k-NN, CART, etc.
The middle ground of working on an easy/messy problem, where any sane method will give you an adequate answer but there’s no known ground truth, is not going to make a very compelling story.
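A toy version of the suggested experiment, as a sketch: generate data from a known "program", then run a search over candidate programs and check whether it recovers the generator. The candidate set and the hidden generator here are made up for illustration; a real evolutionary learner would search a much larger program space.

```python
import random

random.seed(1)
xs = [random.uniform(-5, 5) for _ in range(50)]
ys = [2 * x + 3 for x in xs]  # the hidden ground-truth generator

# A hypothetical candidate pool standing in for the evolved population.
candidates = {
    "2*x + 3": lambda x: 2 * x + 3,
    "x**2":    lambda x: x ** 2,
    "3*x - 2": lambda x: 3 * x - 2,
    "x + 4":   lambda x: x + 4,
}

def sse(f):
    # Sum of squared errors of candidate f against the generated data.
    return sum((f(x) - y) ** 2 for x, y in zip(xs, ys))

best = min(candidates, key=lambda name: sse(candidates[name]))
print(best)  # the search should pick out the true generator
```

The point is that because you wrote the generator yourself, "success" is unambiguous: either the method recovers `2*x + 3` (or an equivalent program) or it doesn't.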
I don’t get this. You could have a rather complicated generator for this data set. A simple regression would imply the data points were independent, but the value at time T may well have (and likely has) a relation to the value at T-3. So it seems a good problem to me.
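A minimal sketch of that serial-dependence point, with made-up parameters: a series where each value is driven by the value three steps earlier, so treating the points as independent draws misses the structure entirely.

```python
import random

random.seed(0)
n = 2000
# First three values are just noise; after that, s[t] depends on s[t-3].
s = [random.gauss(0, 1) for _ in range(3)]
for t in range(3, n):
    s.append(0.8 * s[t - 3] + random.gauss(0, 1))

def autocorr(series, lag):
    # Sample autocorrelation at the given lag.
    m = sum(series) / len(series)
    num = sum((series[t] - m) * (series[t - lag] - m)
              for t in range(lag, len(series)))
    den = sum((x - m) ** 2 for x in series)
    return num / den

print(autocorr(s, 1), autocorr(s, 3))  # lag-3 correlation dominates
```

The lag-1 autocorrelation is near zero while the lag-3 autocorrelation is near 0.8, which is exactly the kind of structure a regression that assumes independent points would never see.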
http://lesswrong.com/lw/9pl/automatic_programming_an_example/
Was this better?
I always want the shortest possible generating algorithm. Everything else, including any “dimensionality”, is just irrelevant.
Yes, I think that was better, because the ground truth is Kepler’s third law and jimrandomh pointed out your method actually recaptures a (badly obfuscated and possibly overfit) variant of it.
“Dimensionality” is totally relevant in any approach to supervised learning. And it matters even before you consider the bias/variance trade-off, etc.
Imagine that you have a high-dimensional predictor, of which one dimension completely determines the outcome and the rest are noise. Your shortest possible generating algorithm is going to have to pick out the relevant dimension. So as the dimensionality of the predictor increases, the algorithm length will necessarily increase, just for information-theoretic reasons.
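The information-theoretic point can be made concrete with a small sketch: if exactly one of d input dimensions determines the output, the generating program must name that dimension, and naming one index out of d costs at least ceil(log2 d) bits, so the minimal program length grows with log d.

```python
import math

def index_cost_bits(d):
    # Bits needed to name one dimension out of d (fixed-length code).
    return math.ceil(math.log2(d))

for d in (2, 16, 1024, 10**6):
    # The shortest program "return x[k]" must encode k somewhere,
    # and that encoding alone grows logarithmically with d.
    print(d, index_cost_bits(d))
```

So “dimensions irrelevant for the output fall out” is true of the learned function, but the cost of saying *which* dimensions fall out still shows up in the program’s length.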
How do you overfit Kepler’s law?
edit: Retracted. I see now looking at the actual link the result wasn’t just obfuscated but wrong, and so the manner in which it’s wrong can overfit of course (and that matches the results).
To the extent that Kepler’s laws are exact only for two-body systems of point masses (so I guess calling Kepler’s third law the ground truth is a bit problematic) and to the extent that the data are imperfectly observed, there are residuals that over-eager models can try to match.
Edit: More generally, you don’t overfit the underlying law, you overfit noisy data generated by the underlying law.
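A sketch of that distinction, with invented data: points generated by Kepler’s third law (T² = a³, so log T = 1.5·log a + noise) can be fit either by the true straight line in log-log space or by a model flexible enough to memorize the noise, such as the interpolating polynomial through every point.

```python
import math
import random

random.seed(0)
# Hypothetical noisy observations generated by the underlying law.
a = [1.0 + 0.5 * i for i in range(8)]  # semi-major axes (arbitrary units)
logs = [(math.log(x), 1.5 * math.log(x) + random.gauss(0, 0.05)) for x in a]

# "Sane" model: least-squares line in log-log space (the true form).
n = len(logs)
mx = sum(x for x, _ in logs) / n
my = sum(y for _, y in logs) / n
slope = (sum((x - mx) * (y - my) for x, y in logs)
         / sum((x - mx) ** 2 for x, _ in logs))
intercept = my - slope * mx

# Over-eager model: the degree-7 Lagrange interpolant, which matches
# every noisy point exactly -- zero training error, pure memorization.
def interp(x):
    total = 0.0
    for i, (xi, yi) in enumerate(logs):
        term = yi
        for j, (xj, _) in enumerate(logs):
            if i != j:
                term *= (x - xj) / (xi - xj)
        total += term
    return total

train_err_line = sum((slope * x + intercept - y) ** 2 for x, y in logs)
train_err_interp = sum((interp(x) - y) ** 2 for x, y in logs)
print(slope, train_err_line, train_err_interp)
```

The line recovers an exponent near the true 1.5 while leaving small residuals; the interpolant drives training error to zero by fitting the observation noise, and would predict badly on fresh orbits. That is overfitting the data, not the law.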
Kepler’s law holds well here. The influence of the other planets is negligible at the precision we dealt with.
Dimensions irrelevant to the output will fall out, regardless of whether they are random or not. If they contribute in any way at all, their influence will remain in the evolved algorithm.
The simplest algorithm in the Kolmogorov sense is the best you can hope for.