LessWrong analytics (February 2009 to January 2017)
In January 2017, Vipul Naik obtained Google Analytics daily sessions and pageviews data for LessWrong from Kaj Sotala. Vipul asked me to write a short post giving an overview of the data, so here it is.
This post covers just the basics. Vipul and I are eager to hear thoughts on what sort of deeper analysis people are interested in; we may incorporate these ideas in future posts.
The data for both sessions and pageviews span from February 26, 2009 to January 3, 2017. LessWrong seems to have launched in February 2009, so this is close to the full duration for which LessWrong has existed.
Google Analytics recorded a total of 52.2 million pageviews for this period.
Google Analytics recorded a total of 19.7 million sessions for this period.
Both plots end with an upward swing, coinciding with the effort to revive LessWrong that began in late November 2016. However, as of early January 2017 (the latest period for which we have data), any recent increase in LessWrong usage is small relative to the general decline that began in early 2012.
The top 20 posts of all time (by total pageviews), with pageviews and unique pageviews rounded to the nearest thousand, are as follows:
| Title | Pageviews (thousands) | Unique Pageviews (thousands) |
|---|---:|---:|
| Don’t Get Offended | 681 | 128 |
| How to Be Happy | 551 | 482 |
| How to Beat Procrastination | 378 | 342 |
| The Best Textbooks on Every Subject | 266 | 233 |
| Do you have High-Functioning Asperger’s Syndrome? | 188 | 168 |
| The Quantum Physics Sequence | 157 | 130 |
| An Alien God | 125 | 113 |
| An Intuitive Explanation of Quantum Mechanics | 123 | 106 |
| Three Worlds Collide (0/8) | 121 | 93 |
| Bayes’ Theorem Illustrated (My Way) | 121 | 112 |
| 9/26 is Petrov Day | 121 | 115 |
| The Baby-Eating Aliens (1/8) | 109 | 98 |
| The noncentral fallacy—the worst argument in the world? | 107 | 99 |
| Advanced Placement exam cutoffs and superficial knowledge over deep knowledge | 107 | 94 |
| Guessing the Teacher’s Password | 102 | 96 |
| The Fun Theory Sequence | 102 | 90 |
Note that Google Analytics reports are subject to sampling when the number of sessions is large (as it is here), so the input numbers are not exact. More details can be found in a post at LunaMetrics. This doesn’t affect the estimates for the top posts, but those wishing to work with the exported data should be aware of it.
Each post on LessWrong can have numerous URLs. In the case of posts that were renamed, a significant number of pageviews could be recorded at both the old and new URL. To take an example, the following URLs all point to lukeprog’s post “How to Be Happy”:
All that matters for identifying this particular post is that we have the substring “/lw/4su” in the URL. In the above table, I have grouped the URLs by this identifying substring and summed to get the pageview counts.
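As a rough illustration of that grouping step, here is a minimal Python sketch. It is not the code from the Gist: the function name `pageviews_by_post`, the regular expression, and the sample rows (including the URL slugs, the `/lw/abc` identifier, and the counts) are all made up for this example, and it assumes post identifiers consist of lowercase letters and digits.

```python
import re
from collections import defaultdict

# Assumption: a post identifier is the run of lowercase letters and digits
# right after "/lw/", e.g. "4su" in "/lw/4su/how_to_be_happy/".
POST_ID = re.compile(r"/lw/([0-9a-z]+)(?:/|$)")


def pageviews_by_post(rows):
    """Sum pageviews over every URL that shares the same "/lw/<id>" substring."""
    totals = defaultdict(int)
    for url, pageviews in rows:
        match = POST_ID.search(url)
        if match:  # URLs without a "/lw/<id>" part are not posts; skip them
            totals["/lw/" + match.group(1)] += pageviews
    return dict(totals)


# Hypothetical (url, pageviews) rows; none of these numbers come from the data.
sample_rows = [
    ("/lw/4su/how_to_be_happy/", 100),
    ("/r/lesswrong/lw/4su/how_to_be_happy/", 40),
    ("/lw/4su/some_older_title/?sort=top", 5),
    ("/lw/abc/another_post/", 70),
    ("/about/", 9),
]

totals = pageviews_by_post(sample_rows)
```

With these made-up rows, the three “/lw/4su” URLs collapse into a single entry, and the non-post URL is ignored.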
In addition, each post has two “canonical” URLs that can be obtained by clicking on the post titles: one that begins with either “/r/lesswrong/lw” or “/r/discussion/lw” and one that begins with just “/lw”. I have used the latter in linking to the posts from my table.
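For readers working with the exported URLs, the relationship between the two canonical forms can be sketched as follows. This is only an illustration under the assumption that the longer form differs from the shorter one solely by its “/r/lesswrong” or “/r/discussion” prefix; the helper name `short_canonical` and the example path are hypothetical.

```python
import re

# Assumption: the longer canonical form is the shorter one with
# "/r/lesswrong" or "/r/discussion" prepended.
SUBREDDIT_PREFIX = re.compile(r"^/r/(?:lesswrong|discussion)(?=/lw/)")


def short_canonical(path):
    """Drop a leading "/r/lesswrong" or "/r/discussion" so the path begins with "/lw"."""
    return SUBREDDIT_PREFIX.sub("", path, count=1)


# Hypothetical path used only for illustration.
example = short_canonical("/r/lesswrong/lw/4su/how_to_be_happy/")
```

Paths that already begin with “/lw” pass through unchanged.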
The data, the source code used to generate the plots, and the Markdown source of this post are available in a GitHub Gist.
Clone the Git repository with:
“Effective Altruism Forum web traffic from Google Analytics”, a post by Vipul
gwern.net analytics by gwern
Here are a few related PredictionBook predictions:
Thanks to Kaj for providing the data used in this post. Thanks to Vipul for asking around for the data, for the idea of this post, and for sponsoring my work on this post.