We could test this theory by using it on the existing data and selecting the best comments under this theory. I would be interested in reading the “top 20 (or 50) LW comments ever” found by this algorithm, posted as a separate article. It could give us an approximate idea of what exactly the new system would incentivize.
Is there a good canonical source for all LW comments ever? I’m interested in importing the data into Python and playing around with ranking algorithms. (I’m not sure what disclaimer to use so that others aren’t put off from doing the same just because I publicly said that I’m interested in it, but yeah, feel free to duplicate work and come up with other interesting analyses.)
You could ask matt to send you the necessary parts of the database.
There’s this and this. Maybe they allow you to go all the way back to the beginning.
It’s probably doable to use those to scrape comments and put them into some kind of list or database, but spending time looting LW comments that way seems like wasted effort compared to getting a full dump from an official source.
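For anyone who wants to play with this once the data is in hand, here is a rough sketch of the "put them into a database and pick the top N" step. Everything specific in it is an assumption: the record fields (`id`, `author`, `upvotes`, `downvotes`, `body`), the sample rows, and the helper names are made up for illustration, and `net_score` is only a placeholder, not the ranking algorithm discussed above. Swap in whatever scoring function you want to test.

```python
import heapq
import sqlite3

# Hypothetical record layout; a real dump or scrape may use different fields.
SAMPLE_COMMENTS = [
    {"id": 1, "author": "alice", "upvotes": 30, "downvotes": 2, "body": "first"},
    {"id": 2, "author": "bob", "upvotes": 5, "downvotes": 0, "body": "second"},
    {"id": 3, "author": "carol", "upvotes": 40, "downvotes": 25, "body": "third"},
    {"id": 4, "author": "dave", "upvotes": 12, "downvotes": 1, "body": "fourth"},
]


def store_comments(conn, comments):
    """Create the comments table if needed and insert the given records."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS comments ("
        "id INTEGER PRIMARY KEY, author TEXT, "
        "upvotes INTEGER, downvotes INTEGER, body TEXT)"
    )
    conn.executemany(
        "INSERT OR REPLACE INTO comments "
        "VALUES (:id, :author, :upvotes, :downvotes, :body)",
        comments,
    )
    conn.commit()


def net_score(upvotes, downvotes):
    """Placeholder ranking: plain net karma. Replace with the rule under test."""
    return upvotes - downvotes


def top_comments(conn, score, n=20):
    """Return the n highest-scoring rows as (id, author, up, down, body) tuples."""
    rows = conn.execute(
        "SELECT id, author, upvotes, downvotes, body FROM comments"
    )
    return heapq.nlargest(n, rows, key=lambda row: score(row[2], row[3]))


if __name__ == "__main__":
    conn = sqlite3.connect(":memory:")
    store_comments(conn, SAMPLE_COMMENTS)
    for row in top_comments(conn, net_score, n=2):
        print(row[0], row[1], net_score(row[2], row[3]))
```

Keeping the scoring function as a parameter means the same stored data can be re-ranked under several candidate rules and the resulting top-20 lists compared side by side.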