I believe that most people hoping to do independent academic research vastly underestimate both the amount of prior work done in their field of interest, and the advantages of working with other very smart and knowledgeable people. Note that it isn’t just about working with other people, but with other very smart people. That is, there is a difference between “working at a university / research institute” and “working at a top university / research institute”. (For instance, if you want to do AI research in the U.S., you probably want to be at MIT, Princeton, Carnegie Mellon, Stanford, Caltech, or UC Berkeley. I don’t know about other countries.)
Unfortunately, my general impression is that most people on LessWrong are mostly unaware of the progress made in statistical machine learning (presumably the brand of AI that most LWers care about) and cognitive science in the last 20 years (I mention these two fields because I assume they are the most popular on LW, and also because I know the most about them). And I’m not talking about impressive-looking results that dodge around the real issues; I’m talking about fundamental progress towards resolving the key problems in artificial intelligence. Anyone planning to do AI research should probably at least understand these first, and what the remaining obstacles are.
You aren’t going to understand this without doing a lot of reading, and by the time you’ve done that reading, you’ll probably have identified a research group whose work clearly reflects your personal research goals. At this point the obvious next step is to apply to work with that group as a graduate student / post-doc. This circumvents the problem of having to work on research you aren’t interested in. As for other annoyances, while teaching can potentially be a time-sink, the rest of the “wasted” time seems to go to publishing your work; I really find it hard to justify not publishing, since (a) other people need to know about it, and (b) writing up your results formally often leads to a noticeably deeper understanding than you would otherwise reach. Of course, you can waste time trying to make your results look better than they are, but this certainly isn’t a requirement and has obvious ethical issues.
EDIT: There is the eventual problem that senior professors spend more and more of their time on administrative work / providing guidance to their lab, rather than doing research themselves. But this isn’t going to be an issue until you get tenure, which is, if you do a post-doc, something like 10-15 years out from starting graduate school.
This might not even be a significant problem when the time does come around. High fluid intelligence only lasts for so long, and thus using more crystallized intelligence later on in life to guide research efforts rather than directly performing research yourself is not a bad strategy if the goal is to optimize for the actual research results.
Those are roughly my thoughts as well, although I’m afraid that I only believe this to rationalize my decision to go into academia. While the argument makes sense, there are definitely professors that express frustration with their position.
What does seem like pretty sound logic is that if you could get better results without a research group, you wouldn’t form a research group. So you probably won’t run into the problem of achieving suboptimal results from administrative overhead (you could always just hire fewer people), but you might run into the problem of doing work that is less fun than it could be.
Another point is that plausibly some other profession (corporate work?) would have less administrative overhead per unit of research output, but I don’t actually believe this to be true.
Could you point me towards some articles here? I fully admit I’m unaware of most of this progress, and would like to learn more.
A good overview would fill up a post on its own, but some relevant topics are given below. I don’t think any of it is behind a paywall, but if it is, let me know and I’ll link to another article on the same topic. In cases where I learned about the topic by word of mouth, I haven’t necessarily read the provided paper, so I can’t guarantee the quality for all of these. I generally tried to pick papers that either gave a survey of progress or solved a specific clearly interesting problem. As a result you might have to do some additional reading to understand some of the articles, but hopefully this is a good start until I get something more organized up.
Learning:
Online concept learning: rational rules for concept learning [a somewhat idealized situation but a good taste of the sorts of techniques being applied]
Learning categories: Bernoulli mixture model for document classification, spatial pyramid matching for images
Learning category hierarchies: nested Chinese restaurant process, hierarchical beta process
Learning HMMs (hidden Markov models): HDP-HMMs. This is pretty new, so the details haven’t been hammered out, but the article should give you a taste of how people are approaching the problem. (I haven’t read this particular article; I forget where I first read about HDP-HMMs, but another paper on HDPs is this one, and I think the original article I read was one of Erik Sudderth’s, which are here.) An older algorithm for the same problem is the Baum-Welch algorithm.
Learning image characteristics: deep Boltzmann machines
Handwriting recognition: hierarchical Bayesian approach, basically the same as the previous research
Learning graphical models: a survey paper
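To make one of these concrete: the mixture-model flavor of learning that shows up in the document-classification item above can be sketched in a few lines of EM. This is a minimal illustration of fitting a Bernoulli mixture, not any particular paper’s method; the component count, smoothing constant, and initialization scheme are all choices I made up for the example.

```python
import numpy as np

def bernoulli_mixture_em(X, K, n_iter=50, seed=0):
    """Fit a K-component Bernoulli mixture to binary data X (n x d) via EM.

    Returns (pi, theta): mixing weights (K,) and per-component
    Bernoulli parameters (K, d).
    """
    rng = np.random.default_rng(seed)
    n, d = X.shape
    pi = np.full(K, 1.0 / K)
    theta = rng.uniform(0.25, 0.75, size=(K, d))
    for _ in range(n_iter):
        # E-step: responsibilities, computed in log space for stability
        log_p = (X @ np.log(theta).T
                 + (1 - X) @ np.log(1 - theta).T
                 + np.log(pi))
        log_p -= log_p.max(axis=1, keepdims=True)
        r = np.exp(log_p)
        r /= r.sum(axis=1, keepdims=True)
        # M-step: reestimate weights and parameters, with light smoothing
        # so theta stays strictly inside (0, 1)
        nk = r.sum(axis=0)
        pi = nk / n
        theta = (r.T @ X + 1e-3) / (nk[:, None] + 2e-3)
    return pi, theta
```

On well-separated synthetic data (e.g. two generating Bernoulli vectors, one biased toward the first half of the dimensions and one toward the second), the recovered rows of theta approximate the two generating vectors. The papers above tackle much richer versions of this basic setup.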
Planning:
Planning in MDPs: value iteration, plus LQR trees for many physical systems
Planning in POMDPs: I don’t actually know much about this; my impression is that more work is needed in this area, but approaches include reinforcement learning. A couple of interesting papers: a Bayes risk approach, plus a survey of hierarchical methods
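Of the planning items above, value iteration is the one that fits in a few lines, so here is a minimal sketch. The two-state MDP in the usage note is invented purely for illustration; real applications (like the LQR-trees work mentioned above) deal with far larger or continuous state spaces.

```python
import numpy as np

def value_iteration(P, R, gamma=0.95, tol=1e-8):
    """Value iteration for a finite MDP.

    P: (A, S, S) transition probabilities P[a, s, s'];
    R: (A, S) expected reward for taking action a in state s.
    Returns the optimal values (S,) and a greedy policy (S,).
    """
    V = np.zeros(P.shape[1])
    while True:
        # Bellman optimality backup: Q[a, s] = R[a, s] + gamma * E[V(s')]
        Q = R + gamma * (P @ V)
        V_new = Q.max(axis=0)
        if np.max(np.abs(V_new - V)) < tol:
            break
        V = V_new
    return V_new, Q.argmax(axis=0)
```

On a two-state chain where being in state 1 pays reward 1, action 0 stays put, and action 1 swaps states, this recovers V(1) = 1/(1 − γ) and a policy that moves to state 1 and then stays, which is what you’d compute by hand.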
I’m not planning to do AI research, but I do like to stay no more than ~10 years out of date regarding progress in fields like this, at least at the intelligent-outsider level of understanding. So, how do I go about getting and keeping almost up-to-date in these fields? Is MacKay’s book a good place to start on machine learning? How do I get an unbiased survey of cognitive science? Are there blogs that (presuming you follow the links) can keep you up to date on what is getting a buzz?
I haven’t read MacKay myself, but it looks like it hits a lot of the relevant topics.
You might consider checking out Tom Griffiths’ website, which has a reading list as well as several tutorials.
We should try to communicate with long letters (snail mail) more. Academics seem to have done that a lot in the past. From what I have seen these exchanges seem very productive, though this could be a sampling bias. I don’t see why there aren’t more ‘personal communication’ cites, except for them possibly being frowned upon.
Why use snail mail when you can use skype? My lab director uses it regularly to talk to other researchers.
Because it is written. Which makes it good for communicating complex ideas. The tradition behind it also lends it an air of legitimacy. Researchers who don’t already have a working relationship with each other will take each other’s letters more seriously.
Upvoted for the good point about communication. Not sure I agree with the legitimacy part (what is p(Crackpot | Snail Mail) compared to p(Crackpot | Email)? I would guess higher).
What I’m now wondering is, how does using email vs. snail mail affect the probability of using green ink, or its email equivalent...
Heh, you are probably right. It just seemed strange to me that researchers can’t simply communicate with each other as long as they have the same research interests. My first thought was that it might have something to do with status games, where outsiders are not allowed. I suppose some exchanges require rapid and frequent feedback. But then, like you mentioned, wouldn’t Skype do?
I’m not sure what the general case looks like, but the professors I have worked with (all of whom do applied-ish research at a top research university) are constantly barraged by more e-mails than they can possibly respond to. I suspect that as a result they limit communication to sources that they know will be fruitful.
Other professors in more theoretical fields (like pure math) don’t seem to have this problem, so I’m not sure why they don’t do what you suggest (although some of them do). And I am not sure that all professors run into the same problem as I have described, even in applied fields.
“In the past” as in before they had alternative methods of long distance communication, or after?