I believe that most people hoping to do independent academic research vastly underestimate both the amount of prior work done in their field of interest, and the advantages of working with other very smart and knowledgeable people. Note that it isn’t just about working with other people, but with other very smart people. That is, there is a difference between “working at a university / research institute” and “working at a top university / research institute”. (For instance, if you want to do AI research in the U.S., you probably want to be at MIT, Princeton, Carnegie Mellon, Stanford, Caltech, or UC Berkeley. I don’t know about other countries.)
Unfortunately, my general impression is that most people on LessWrong are mostly unaware of the progress made in statistical machine learning (presumably the brand of AI that most LWers care about) and cognitive science in the last 20 years (I mention these two fields because I assume they are the most popular on LW, and also because I know the most about them). And I’m not talking about impressive-looking results that dodge around the real issues; I’m talking about fundamental progress towards resolving the key problems in artificial intelligence. Anyone planning to do AI research should probably at least understand these first, and what the remaining obstacles are.
You aren’t going to understand this without doing a lot of reading, and by the time you’ve done that reading, you’ll probably have identified a research group whose work clearly reflects your personal research goals. At this point the obvious next step is to apply to work with that group as a graduate student / post-doc. This circumvents the problem of having to work on research you aren’t interested in. As for other annoyances, while teaching can potentially be a time-sink, the rest of the “wasted” time seems to go to publishing your work; I really find it hard to justify not publishing, since (a) other people need to know about it, and (b) writing up your results formally often leads to a noticeably deeper understanding than you would otherwise reach. Of course, you can waste time trying to make your results look better than they are, but this certainly isn’t a requirement and has obvious ethical issues.
EDIT: There is the eventual problem that senior professors spend more and more of their time on administrative work / providing guidance to their lab, rather than doing research themselves. But this isn’t going to be an issue until you get tenure, which is, if you do a post-doc, something like 10-15 years out from starting graduate school.
This might not even be a significant problem when the time does come around. High fluid intelligence only lasts for so long, and thus using more crystallized intelligence later on in life to guide research efforts rather than directly performing research yourself is not a bad strategy if the goal is to optimize for the actual research results.
Those are roughly my thoughts as well, although I’m afraid that I only believe this to rationalize my decision to go into academia. While the argument makes sense, there are definitely professors that express frustration with their position.
What does seem like pretty sound logic is that if you could get better results without a research group, you wouldn’t form a research group. So you probably won’t run into the problem of achieving suboptimal results from administrative overhead (you could always just hire fewer people), but you might run into the problem of doing work that is less fun than it could be.
Another point is that plausibly some other profession (corporate work?) would have less administrative overhead per unit of research output, but I don’t actually believe this to be true.
Could you point me towards some articles here? I fully admit I’m unaware of most of this progress, and would like to learn more.
A good overview would fill up a post on its own, but some relevant topics are given below. I don’t think any of it is behind a paywall, but if it is, let me know and I’ll link to another article on the same topic. In cases where I learned about the topic by word of mouth, I haven’t necessarily read the provided paper, so I can’t guarantee the quality for all of these. I generally tried to pick papers that either gave a survey of progress or solved a specific clearly interesting problem. As a result you might have to do some additional reading to understand some of the articles, but hopefully this is a good start until I get something more organized up.
Learning:
Online concept learning: rational rules for concept learning [a somewhat idealized situation but a good taste of the sorts of techniques being applied]
Learning categories: Bernoulli mixture model for document classification, spatial pyramid matching for images
Learning category hierarchies: nested Chinese restaurant process, hierarchical beta process
Learning HMMs (hidden Markov models): HDP-HMMs. This is pretty new, so the details haven’t been hammered out, but the article should give you a taste of how people are approaching the problem. (I haven’t read this particular article; I forget where I first read about HDP-HMMs, but another paper on HDPs is this one, and I think the original article I read was one of Erik Sudderth’s, which are here.) An older algorithm for the same problem is the Baum-Welch algorithm.
Learning image characteristics: deep Boltzmann machines
Handwriting recognition: hierarchical Bayesian approach, basically the same as the previous research
Learning graphical models: a survey paper
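To make one of these concrete: the mixture-model flavor of learning that shows up in the document-classification item above can be sketched in a few lines of EM. This is a minimal illustration of fitting a Bernoulli mixture, not any particular paper’s method; the component count, smoothing constant, and initialization scheme are all choices I made up for the example.

```python
import numpy as np

def bernoulli_mixture_em(X, K, n_iter=50, seed=0):
    """Fit a K-component Bernoulli mixture to binary data X (n x d) via EM.

    Returns (pi, theta): mixing weights (K,) and per-component
    Bernoulli parameters (K, d).
    """
    rng = np.random.default_rng(seed)
    n, d = X.shape
    pi = np.full(K, 1.0 / K)
    theta = rng.uniform(0.25, 0.75, size=(K, d))
    for _ in range(n_iter):
        # E-step: responsibilities, computed in log space for stability
        log_p = (X @ np.log(theta).T
                 + (1 - X) @ np.log(1 - theta).T
                 + np.log(pi))
        log_p -= log_p.max(axis=1, keepdims=True)
        r = np.exp(log_p)
        r /= r.sum(axis=1, keepdims=True)
        # M-step: reestimate weights and parameters, with light smoothing
        # so theta stays strictly inside (0, 1)
        nk = r.sum(axis=0)
        pi = nk / n
        theta = (r.T @ X + 1e-3) / (nk[:, None] + 2e-3)
    return pi, theta
```

On well-separated synthetic data (e.g. two generating Bernoulli vectors, one biased toward the first half of the dimensions and one toward the second), the recovered rows of theta approximate the two generating vectors. The papers above tackle much richer versions of this basic setup.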
Planning:
Planning in MDPs: value iteration, plus LQR trees for many physical systems
Planning in POMDPs: I don’t actually know much about this; my impression is that more work is needed in this area, but approaches include reinforcement learning. A couple of interesting papers: a Bayes risk approach, plus a survey of hierarchical methods
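Of the planning items above, value iteration is the one that fits in a few lines, so here is a minimal sketch. The two-state MDP in the usage note is invented purely for illustration; real applications (like the LQR-trees work mentioned above) deal with far larger or continuous state spaces.

```python
import numpy as np

def value_iteration(P, R, gamma=0.95, tol=1e-8):
    """Value iteration for a finite MDP.

    P: (A, S, S) transition probabilities P[a, s, s'];
    R: (A, S) expected reward for taking action a in state s.
    Returns the optimal values (S,) and a greedy policy (S,).
    """
    V = np.zeros(P.shape[1])
    while True:
        # Bellman optimality backup: Q[a, s] = R[a, s] + gamma * E[V(s')]
        Q = R + gamma * (P @ V)
        V_new = Q.max(axis=0)
        if np.max(np.abs(V_new - V)) < tol:
            break
        V = V_new
    return V_new, Q.argmax(axis=0)
```

On a two-state chain where being in state 1 pays reward 1, action 0 stays put, and action 1 swaps states, this recovers V(1) = 1/(1 − γ) and a policy that moves to state 1 and then stays, which is what you’d compute by hand.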
I’m not planning to do AI research, but I do like to stay no more than ~10 years out of date regarding progress in fields like this, at least at the intelligent-outsider level of understanding. So, how do I go about getting and keeping almost up-to-date in these fields? Is MacKay’s book a good place to start on machine learning? How do I get an unbiased survey of cognitive science? Are there blogs that (presuming you follow the links) can keep you up to date on what is getting a buzz?
I haven’t read MacKay myself, but it looks like it hits a lot of the relevant topics.
You might consider checking out Tom Griffiths’ website, which has a reading list as well as several tutorials.
We should try to communicate with long letters (snail mail) more. Academics seem to have done that a lot in the past. From what I have seen these exchanges seem very productive, though this could be a sampling bias. I don’t see why there aren’t more ‘personal communication’ cites, except for them possibly being frowned upon.
Why use snail mail when you can use skype? My lab director uses it regularly to talk to other researchers.
Because it is written. Which makes it good for communicating complex ideas. The tradition behind it also lends it an air of legitimacy. Researchers who don’t already have a working relationship with each other will take each other’s letters more seriously.
Upvoted for the good point about communication. Not sure I agree with the legitimacy part (what is p(Crackpot | Snail Mail) compared to p(Crackpot | Email)? I would guess higher).
What I’m now wondering is, how does using email vs. snail mail affect the probability of using green ink, or its email equivalent...
Heh, you are probably right. It just seemed strange to me that researchers can’t simply communicate with each other as long as they have the same research interests. My first thought was that it might have something to do with status games, where outsiders are not allowed. I suppose some exchanges require rapid and frequent feedback. But then, like you mentioned, wouldn’t Skype do?
I’m not sure what the general case looks like, but the professors I have worked with (all of whom do applied-ish research at a top research university) are constantly barraged by more e-mails than they can possibly respond to. I suspect that as a result they limit communication to sources that they know will be fruitful.
Other professors in more theoretical fields (like pure math) don’t seem to have this problem, so I’m not sure why they don’t do what you suggest (although some of them do). And I am not sure that all professors run into the same problem as I have described, even in applied fields.
“In the past” as in before they had alternative methods of long distance communication, or after?