Their motivation is public education & outreach:
“Vipul and I ultimately want to get a better sense of the value of a Wikipedia pageview (one way to measure the impact of content creation), and one way to do this is to understand how people are using Wikipedia. As we focus on getting more people to work on editing Wikipedia – thus causing more people to read the content we pay and help to create – it becomes more important to understand what people are doing on the site.”
This is a topic I’ve wondered about myself, as I occasionally spend substantial amounts of time trying to improve Wikipedia articles; most recently genetic correlations, GCTA, liability threshold model, result-blind peer review, missing heritability problem, Tominaga Nakamoto, & debunking urban legends (Rutherford, Kelvin, Lardner, bicycle face, Feynman IQ, MtGox). Even though I’ve been editing WP since 2004, it can be deeply frustrating (look at the barf all over the result-blind peer review article right now) and I’m never sure if it’s worth the time.
Results:
most people in LW/SSC/WP/general college-educated SurveyMonkey population/Vipul Naik’s social circles read WP regularly (with a skew to reading WP a huge amount), have some preference for it in search engines & sometimes search on WP directly, and every few months are surprised by a gap in WP which could be filled (sounding like a long tail of BLPs and foreign material; the latter being an area that the English WP has always been weak in)
reading patterns in the total sample match aggregate page-view statistics fairly well; respondents tend to have read the most popular WP articles
they primarily skim articles; reading usage tends to be fairly superficial, with occasional use of citations or criticism sections but not any more detailed evaluation of the page or editing process
At face value, this suggests that WP editing may not be that great a use of time. Most people do not read the articles carefully, and aggregate traffic suggests that the sort of niche topics I write on are not reaching all the people one might hope. For example, take the traffic statistics for threshold models & GCTA: 74/day and 35/day respectively, or about 40k page views a year total (109/day × 365 ≈ 39.8k). (Assuming, of course, that my contributions don’t get butchered.) This is not a lot in general; I get more like 1k page views a day on gwern.net. A blogpost making it to the front page of Hacker News frequently gets 20k+ page views within the first few days, for comparison.
I interpret this as implying that a case for WP editing can’t be made based on just the traffic numbers. I may get 1k page views a day, but relatively little of that is to pages using GCTA or threshold models even in passing. It may be that writing those articles is highly effective because when someone does need to know about GCTA, they’ll look it up on WP and read it carefully (even though they don’t read most WP pages carefully), and over the years, it’ll have a positive effect on the world that way. This is harder to quantify in a survey, since people will hardly remember what changed their beliefs (indeed, it sounds like most people find it hard to remember how they use WP at all; it’s almost like asking how people use Google searches, because it’s so ingrained).
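As a rough check on the traffic figures quoted above, here is a minimal sketch that scales the daily rates to yearly totals. It assumes traffic is flat across the year; the variable and function names are illustrative only and not part of the original comment.

```python
# Daily page-view figures as quoted in the comment above.
DAILY_VIEWS = {"threshold models": 74, "GCTA": 35}
GWERN_NET_DAILY = 1_000  # "more like 1k page views a day"
DAYS_PER_YEAR = 365


def yearly_views(daily: int, days: int = DAYS_PER_YEAR) -> int:
    """Scale a constant daily rate to a yearly total (assumes flat traffic)."""
    return daily * days


wp_yearly = yearly_views(sum(DAILY_VIEWS.values()))
site_yearly = yearly_views(GWERN_NET_DAILY)

print(f"Two WP articles: {wp_yearly:,} views/year")
print(f"gwern.net:       {site_yearly:,} views/year")
print(f"WP articles relative to gwern.net: {wp_yearly / site_yearly:.1%}")
```

Under these assumptions the two articles together draw roughly a tenth of the site's yearly traffic, which is the gap the comment is pointing at.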
My belief is that WP editing can have long-term effects like that, based primarily on my experiences editing Neon Genesis Evangelion and tracking down references and figuring out the historical context. I noticed that discussions of NGE online increasingly took on a much better-informed hue, and in particular, the misguided obsession with the Christian & Kabbalistic symbolism has died down a great deal, in part due to documenting staff quotes denying that the symbolism was important. On the downside, if you look through the edit history, you can see that a lot of terrific (and impeccably sourced) material I added to the article has been deleted over the years. So YMMV. Presumably working on scientific topics will be less risky.
I think Issa might write a longer reply later, and also update the post with a summary section, but I just wanted to make a quick correction: the college-educated SurveyMonkey population we sampled in fact did not use Wikipedia a lot (in S2, CEYP had fewer heavy Wikipedia users than the general population).
It’s worth noting that the general SurveyMonkey population as well as the college-educated SurveyMonkey population used Wikipedia very little, and one of our key findings was the extent to which usage is skewed to a small subset of the population that uses it heavily (although almost everybody has heard of it and used it at some point). Also, the responses to S1Q2 show that the general population rarely seeks Wikipedia actively, in contrast with the small subset of heavy users (including many SSC readers and people who filled out my survey through Facebook).
Your summary of the post is an interesting take on it (and consistent with your perspective and goals) but the conclusions Issa and I drew (especially regarding short-term value) were somewhat different. In particular, both in terms of the quantity of traffic (over a reasonably long time horizon) and the quality and level of engagement with pages, Wikipedia does better than a lot of online content. Notably, it does best in terms of having sustained traffic, as opposed to a lot of “news” that trends for a while and then drops sharply (in marketing lingo, Wikipedia content is “evergreen”).
I think WP editing can be a good idea, but one has to accept that the payoff is not going to arrive anytime soon. I don’t think I started noticing much impact from my NGE or Star Wars or other editing projects for years, though I eventually did begin noticing places where authors were clearly being influenced by my work (or in some cases, bordering on plagiarism). It’s entirely viable to do similar work off Wikipedia—my own website is quite ‘evergreen’ in terms of traffic.
This sort of long-term implicit payoff makes it hard to evaluate. I don’t believe the results of this survey help much in evaluating WP contributions because I am skeptical that people are able to meaningfully recall their WP usage or its causal impact on their beliefs. I bet that if one dumped respondents’ browser histories, one would find higher usage rates. One might need to try to measure the causal impact in some more direct way. I think I’ve seen over the years a few experiments along this line on the use of scientific publications or external databases: for example, one could randomly select particular papers or concepts, insert them into Wikipedia as appropriate, and look for impacts on subsequent citations or Google search trends. Just as HN referrals underestimate the traffic impact of getting to the front page of HN, WP pageviews may underestimate (or overestimate) the impact of WP edits.
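One way such a randomized experiment could be set up and analyzed is sketched below. This is only an illustration: the function names, the 50/50 split, and the choice of a permutation test are all assumptions, not something the experiments alluded to above are known to have used. The outcome values (for example, later citation counts per paper) would have to be collected by whoever runs the experiment.

```python
import random
from statistics import mean


def assign_treatment(items, seed=0):
    """Randomly split items into a treated half (to be added to Wikipedia)
    and a control half (left alone)."""
    rng = random.Random(seed)
    shuffled = list(items)
    rng.shuffle(shuffled)
    half = len(shuffled) // 2
    return shuffled[:half], shuffled[half:]


def permutation_p_value(treated, control, n_perm=10_000, seed=0):
    """Two-sided permutation test on the difference in mean outcomes.

    Returns (observed difference, estimated p-value)."""
    rng = random.Random(seed)
    observed = mean(treated) - mean(control)
    pooled = list(treated) + list(control)
    n_treated = len(treated)
    extreme = 0
    for _ in range(n_perm):
        # Reshuffle the group labels and recompute the difference in means.
        rng.shuffle(pooled)
        diff = mean(pooled[:n_treated]) - mean(pooled[n_treated:])
        if abs(diff) >= abs(observed):
            extreme += 1
    return observed, extreme / n_perm
```

A permutation test is used here only because it makes few distributional assumptions about heavy-tailed outcomes like citation counts; any standard two-sample comparison could be swapped in.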