Could machine learning be used to fruitfully classify academic articles?
The word “fruitfully” is doing all the heavy lifting here.
It is, of course, possible to throw an ML algorithm at a corpus of academic articles. Will the results be useful? That depends entirely on what you consider useful. You will certainly get some results.
Something in a related space, http://www.vosviewer.com/, is now being used by a few publishers and it is AWESOME. You can rearrange by researcher links (who published with whom), academic area links, citation links, institution, etc.
If you had a million labelled postmodern and non-postmodern papers, you could tell them apart reasonably well.
You could categorise most papers with fewer labels using citation graphs.
You can recommend papers, as you would Amazon books with a recommender system (using ratings).
There are hundreds of ways to apply machine learning to academic articles; it’s a matter of deciding what you want the machine learning to do.
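To make the citation-graph idea a bit more concrete, here is a minimal sketch in plain Python of spreading a handful of hand labels to unlabelled papers by majority vote among the papers they cite or are cited by. Everything in it is an assumption for illustration: the function name `propagate_labels`, the paper IDs, the two labels and the voting rule are all made up, and a real system would more likely use an established graph or semi-supervised learning library.

```python
from collections import Counter, defaultdict

def propagate_labels(citations, seed_labels, max_rounds=10):
    """Spread a few known labels over a citation graph by neighbour majority vote.

    citations: iterable of (citing_paper, cited_paper) pairs (hypothetical IDs).
    seed_labels: dict mapping a few paper IDs to hand-assigned labels.
    """
    # Treat a citation as an undirected link between the two papers.
    neighbours = defaultdict(set)
    for citing, cited in citations:
        neighbours[citing].add(cited)
        neighbours[cited].add(citing)

    labels = dict(seed_labels)  # once a paper has a label, it is kept
    for _ in range(max_rounds):
        updates = {}
        for paper in sorted(neighbours):
            if paper in labels:
                continue
            votes = Counter(labels[n] for n in neighbours[paper] if n in labels)
            if not votes:
                continue  # no labelled neighbours yet
            ranked = votes.most_common(2)
            if len(ranked) == 2 and ranked[0][1] == ranked[1][1]:
                continue  # tie: leave unlabelled rather than guess
            updates[paper] = ranked[0][0]
        if not updates:
            break  # nothing new was labelled this round
        labels.update(updates)
    return labels

# Toy example: two small citation chains, one hand label in each.
toy_citations = [("P2", "P1"), ("P3", "P2"), ("N2", "N1"), ("N3", "N2")]
toy_seeds = {"P1": "postmodern", "N1": "other"}
print(propagate_labels(toy_citations, toy_seeds))
```

Papers with no labelled neighbours, or with a tied vote, simply stay unlabelled, which is one crude way of admitting the graph doesn't tell you enough about them.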
Sure, I guess my question was whether you think it’d be possible to do this in a way that would resonate with readers. Would they find the estimates of quality, or level of postmodernism, intuitively plausible?
My hunch was that the classification would primarily be based on patterns of word use, but you’re right that it would probably be fruitful to look at patterns of citations.
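For what classifying on patterns of word use might look like, here is a rough sketch of a bag-of-words naive Bayes classifier in plain Python. It is only an illustration under stated assumptions: the function names, the two labels and the four "abstracts" are invented, and a real attempt would need a large hand-labelled corpus and probably an off-the-shelf ML library rather than this hand-rolled version.

```python
import math
import re
from collections import Counter

def tokenize(text):
    """Lowercase the text and split it into alphabetic words."""
    return re.findall(r"[a-z]+", text.lower())

def train(labelled_texts):
    """Count words per label. labelled_texts is a list of (text, label) pairs."""
    word_counts = {}        # label -> Counter of word frequencies
    doc_counts = Counter()  # label -> number of training texts
    vocab = set()
    for text, label in labelled_texts:
        doc_counts[label] += 1
        counts = word_counts.setdefault(label, Counter())
        for word in tokenize(text):
            counts[word] += 1
            vocab.add(word)
    return word_counts, doc_counts, vocab

def classify(model, text):
    """Return the label with the highest naive Bayes log-probability."""
    word_counts, doc_counts, vocab = model
    total_docs = sum(doc_counts.values())
    best_label, best_score = None, -math.inf
    for label in sorted(doc_counts):
        counts = word_counts[label]
        total_words = sum(counts.values())
        score = math.log(doc_counts[label] / total_docs)  # class prior
        for word in tokenize(text):
            if word not in vocab:
                continue  # ignore words never seen in training
            # Add-one (Laplace) smoothing so unseen words don't zero the score.
            score += math.log((counts[word] + 1) / (total_words + len(vocab)))
        if score > best_score:
            best_label, best_score = label, score
    return best_label

# Invented toy data, purely for illustration.
toy_training = [
    ("discourse hegemony deconstruction of the subject and narrative", "postmodern"),
    ("the text destabilises narrative and discourse of power", "postmodern"),
    ("we measure the regression coefficient in a randomized sample", "other"),
    ("the sample shows a significant coefficient in the experiment", "other"),
]
toy_model = train(toy_training)
print(classify(toy_model, "a deconstruction of discourse"))
```

A model like this only picks up vocabulary, so whether its output would strike readers as intuitively plausible depends heavily on how good and how numerous the labels are.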
If you get a well-labelled dataset, I think this is pretty thoroughly within the scope of current machine learning technologies, but that means spending perhaps hundreds of hours scoring papers for how postmodern they are out of 100. If you’re trying to single out the postmodernism that you’re convinced is total BS, then that’s more complex. Doable, but you’d need to make the case to me about why it would be worthwhile, and what exactly your aim would be.
Thanks Ryan, that’s helpful. Yes, I’m not sure one would be able to do something that has the right combination of accuracy, interestingness and low cost at present.