I wasn’t suggesting you spider everything associated with that tag, just look through it for more blogs. I guess maybe that’s too much work?
At this point, yeah. I now have 56k URLs in the queue, and at 20 seconds a URL… Pareto is the idea here: what are the main sites worth preserving?
I guess Ribbonfarm and Paul Graham would be the two big ones from my list.