2018 Review: Voting Results!

The votes are in!

59 of the 430 eligible voters participated, evaluating 75 posts. In addition, 39 users submitted a total of 120 reviews, and most posts received at least one review.

Thanks a ton to everyone who put in time to think about the posts: nominators, reviewers, and voters alike. Several reviews substantially changed my mind about many topics and ideas, and I was quite grateful to the authors who participated in the process. Special mention goes to Zack_M_Davis, Vanessa Kosoy, and Daniel Filan, who wrote the most upvoted reviews.

In the coming months, the LessWrong team will write further analyses of the vote data, and use the information to form a sequence and a book of the best writing on LessWrong from 2018.

Below are the results of the vote, followed by a discussion of how reliable the result is and plans for the future.

Top 15 posts

  1. Embedded Agents by Abram Demski and Scott Garrabrant

  2. The Rocket Alignment Problem by Eliezer Yudkowsky

  3. Local Validity as a Key to Sanity and Civilization by Eliezer Yudkowsky

  4. Arguments about fast takeoff by Paul Christiano

  5. The Costly Coordination Mechanism of Common Knowledge by Ben Pace

  6. Toward a New Technical Explanation of Technical Explanation by Abram Demski

  7. Anti-social Punishment by Martin Sustrik

  8. The Tails Coming Apart As Metaphor For Life by Scott Alexander

  9. Babble by alkjash

  10. The Loudest Alarm Is Probably False by orthonormal

  11. The Intelligent Social Web by Valentine

  12. Prediction Markets: When Do They Work? by Zvi

  13. Coherence arguments do not imply goal-directed behavior by Rohin Shah

  14. Is Science Slowing Down? by Scott Alexander

  15. A voting theory primer for rationalists by Jameson Quinn, and Robustness to Scale by Scott Garrabrant (tied)

Top 15 posts not about AI

  1. Local Validity as a Key to Sanity and Civilization by Eliezer Yudkowsky

  2. The Costly Coordination Mechanism of Common Knowledge by Ben Pace

  3. Anti-social Punishment by Martin Sustrik

  4. The Tails Coming Apart As Metaphor For Life by Scott Alexander

  5. Babble by alkjash

  6. The Loudest Alarm Is Probably False by orthonormal

  7. The Intelligent Social Web by Valentine

  8. Prediction Markets: When Do They Work? by Zvi

  9. Is Science Slowing Down? by Scott Alexander

  10. A voting theory primer for rationalists by Jameson Quinn

  11. Toolbox-thinking and Law-thinking by Eliezer Yudkowsky

  12. A Sketch of Good Communication by Ben Pace

  13. A LessWrong Crypto Autopsy by Scott Alexander

  14. Unrolling social metacognition: Three levels of meta are not enough. by Academian

  15. Varieties Of Argumentative Experience by Scott Alexander

Top 10 posts about AI

(The vote included 20 posts about AI.)

  1. Embedded Agents by Abram Demski and Scott Garrabrant

  2. The Rocket Alignment Problem by Eliezer Yudkowsky

  3. Arguments about fast takeoff by Paul Christiano

  4. Toward a New Technical Explanation of Technical Explanation by Abram Demski

  5. Coherence arguments do not imply goal-directed behavior by Rohin Shah

  6. Robustness to Scale by Scott Garrabrant

  7. Paul’s research agenda FAQ by zhukeepa

  8. An Untrollable Mathematician Illustrated by Abram Demski

  9. Specification gaming examples in AI by Vika

  10. 2018 AI Alignment Literature Review and Charity Comparison by Larks

The Complete Results

Click Here If You Would Like A More Comprehensive Vote Data Spreadsheet

To help users see the spread of the vote data, we’ve included swarmplot visualizations.

  • For space reasons, only votes with weights between −10 and 16 are plotted. This covers 99.4% of votes.

  • Gridlines are spaced 2 points apart.

  • Concrete illustration: The plot immediately below has 18 votes ranging in strength from −3 to 12.

(The per-post swarmplots are not reproduced in this text version; outlier votes that fell outside the plotted range are noted in the last column.)

| # | Post Title | Total | Notes |
|---|------------|-------|-------|
| 1 | Embedded Agents | 209 | One outlier vote of +17 not shown |
| 2 | The Rocket Alignment Problem | 183 | |
| 3 | Local Validity as a Key to Sanity and Civilization | 133 | |
| 4 | Arguments about fast takeoff | 98 | |
| 5 | The Costly Coordination Mechanism of Common Knowledge | 95 | |
| 6 | Toward a New Technical Explanation of Technical Explanation | 91 | |
| 7 | Anti-social Punishment | 90 | One outlier vote of +20 not shown |
| 8 | The Tails Coming Apart As Metaphor For Life | 89 | |
| 9 | Babble | 85 | |
| 10 | The Loudest Alarm Is Probably False | 84 | |
| 11 | The Intelligent Social Web | 79 | |
| 12 | Prediction Markets: When Do They Work? | 77 | |
| 13 | Coherence arguments do not imply goal-directed behavior | 76 | |
| 14 | Is Science Slowing Down? | 75 | |
| 15 | Robustness to Scale | 74 | |
| 15 | A voting theory primer for rationalists | 74 | |
| 17 | Toolbox-thinking and Law-thinking | 73 | |
| 18 | A Sketch of Good Communication | 72 | |
| 19 | A LessWrong Crypto Autopsy | 71 | |
| 20 | Paul’s research agenda FAQ | 70 | |
| 21 | Unrolling social metacognition: Three levels of meta are not enough. | 69 | |
| 22 | An Untrollable Mathematician Illustrated | 65 | |
| 23 | Specification gaming examples in AI | 64 | |
| 23 | Will AI See Sudden Progress? | 64 | |
| 23 | Varieties Of Argumentative Experience | 64 | |
| 26 | Meta-Honesty: Firming Up Honesty Around Its Edge-Cases | 62 | |
| 27 | My attempt to explain Looking, insight meditation, and enlightenment in non-mysterious terms | 60 | |
| 27 | Naming the Nameless | 60 | |
| 27 | Inadequate Equilibria vs. Governance of the Commons | 60 | |
| 30 | 2018 AI Alignment Literature Review and Charity Comparison | 57 | |
| 31 | Noticing the Taste of Lotus | 55 | |
| 31 | On Doing the Improbable | 55 | |
| 31 | The Pavlov Strategy | 55 | |
| 31 | Being a Robust, Coherent Agent (V2) | 55 | |
| 35 | Spaghetti Towers | 54 | |
| 36 | Beyond Astronomical Waste | 51 | |
| 36 | Research: Rescuers during the Holocaust | 51 | |
| 38 | Open question: are minimal circuits daemon-free? | 48 | |
| 38 | Decoupling vs Contextualising Norms | 48 | One outlier vote of +23 not shown |
| 40 | On the Loss and Preservation of Knowledge | 47 | |
| 41 | Is Clickbait Destroying Our General Intelligence? | 46 | |
| 42 | What makes people intellectually active? | 43 | |
| 43 | Why everything might have taken so long | 40 | |
| 44 | Challenges to Christiano’s capability amplification proposal | 39 | |
| 45 | Public Positions and Private Guts | 38 | |
| 46 | Clarifying “AI Alignment” | 36 | |
| 46 | Expressive Vocabulary | 36 | |
| 48 | Bottle Caps Aren’t Optimisers | 34 | |
| 49 | Argue Politics* With Your Best Friends | 32 | |
| 50 | Player vs. Character: A Two-Level Model of Ethics | 30 | |
| 51 | Conversational Cultures: Combat vs Nurture (V2) | 29 | |
| 51 | Act of Charity | 29 | |
| 53 | Optimization Amplifies | 27 | |
| 53 | Circling | 27 | One outlier vote of −17 not shown |
| 55 | Realism about rationality | 25 | Two outliers of −30 and +18 not shown |
| 55 | Caring less | 25 | |
| 57 | Lessons from the Cold War on Information Hazards: Why Internal Communication is Critical | 24 | |
| 57 | The Bat and Ball Problem Revisited | 24 | |
| 59 | Argument, intuition, and recursion | 21 | |
| 59 | Unknown Knowns | 21 | |
| 61 | Competitive Markets as Distributed Backprop | 18 | |
| 62 | Towards a New Impact Measure | 14 | |
| 62 | Explicit and Implicit Communication | 14 | |
| 62 | On the Chatham House Rule | 14 | |
| 62 | Historical mathematicians exhibit a birth order effect too | 14 | |
| 66 | Everything I ever needed to know, I learned from World of Warcraft: Goodhart’s law | 13 | |
| 67 | The funnel of human experience | 11 | |
| 68 | Understanding is translation | 9 | |
| 69 | Preliminary thoughts on moral weight | 7 | |
| 70 | Metaphilosophical competence can’t be disentangled from alignment | 3 | |
| 71 | Two types of mathematician | 2 | |
| 72 | How did academia ensure papers were correct in the early 20th Century? | -2 | |
| 73 | Birth order effect found in Nobel Laureates in Physics | -5 | |
| 74 | Give praise | -10 | |
| 75 | Affordance Widths | -142 | One outlier of −29 not shown |

How reliable is the output of this vote?

Most posts were voted on by 10 to 20 people (median 17). A change of 10-15 points in a post's score is enough to move it up or down around 10 positions in the rankings. That is equivalent to a few moderate-strength votes from two or three people, or one exceedingly strong vote from a single strongly-feeling voter. The system is therefore somewhat noisy, though it seems very unlikely to me that the posts at the very top could have ended up placed much differently.
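As a rough sanity check on that sensitivity claim, the sketch below uses totals copied from the mid-table region of the results table above (overall ranks ~15-61, where scores are densest); the `rank` helper is mine, not part of the actual vote-counting code:

```python
# Mid-table totals copied from the results table above.
scores = [74, 74, 73, 72, 71, 70, 69, 65, 64, 64, 64, 62, 60, 60, 60,
          57, 55, 55, 55, 55, 54, 51, 51, 48, 48, 47, 46, 43, 40, 39,
          38, 36, 36, 34, 32, 30, 29, 29, 27, 27, 25, 25, 24, 24, 21,
          21, 18]

def rank(score, scores):
    """1-based rank a post with this score would hold (ties share a rank)."""
    return sum(s > score for s in scores) + 1

# A post scoring 27 sits at rank 39 within this slice; 12 more points
# lift it to rank 30, i.e. about nine places.
print(rank(27, scores), rank(27 + 12, scores))
```

So in the crowded middle of the table, a 12-point swing really does move a post roughly ten positions.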

The vote was also affected by two technical mistakes the team made:

  1. The post order was not randomized. For the first half of the voting period, the posts on the voting page appeared in order of nomination count (fewest to most) rather than randomly, giving more visual attention to roughly the first 15 posts (those with 2 nominations). Ruby looked into it and found that 15-30% more people cast votes on these earlier-appearing posts than on those appearing elsewhere in the list. Thanks to gjm for identifying this issue.

  2. Users were given some free negative votes. When calculating the cost of users' votes, we used a simple equation but missed that it produced an off-by-one error for negative numbers. In effect, users got one unit of negative vote weight for free on every post they voted against. To correct for this, we reduced the strength of the negative votes by a single unit for the users who had exceeded their budget (18 in total); votes from users who had not spent all their points were unaffected. This didn't change the rank-ordering much: a few posts moved by 1 position, and a smaller number moved by 2-3 positions.
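The post doesn't give the exact equation, but a hypothetical triangular cost rule (a vote of strength s costs s(s+1)/2 points) reproduces the reported behaviour: evaluated naively on the signed vote value, a negative vote comes out exactly one strength-unit cheaper than it should.

```python
def buggy_cost(vote: int) -> int:
    # Hypothetical reconstruction of the bug: correct for vote >= 0, but
    # for vote < 0 it charges as though the strength were one unit lower.
    return vote * (vote + 1) // 2

def correct_cost(vote: int) -> int:
    # Charge negative votes symmetrically with positive ones.
    strength = abs(vote)
    return strength * (strength + 1) // 2

# A vote of -1 is free under the buggy rule, matching the
# "free 1-negative-vote-weight" described above.
print(buggy_cost(-1), correct_cost(-1))   # 0 vs 1
print(buggy_cost(-4), correct_cost(-4))   # 6 vs 10
```

Under this reconstruction, the applied fix (treating each over-budget negative vote as one strength-unit weaker) charges each voter exactly what they actually paid.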

The effect size of these errors is not certain since it’s hard to know how people would have voted counterfactually. My sense is that the effect is pretty small, and that the majority of noise in the system comes from elsewhere.

Finally, we discarded exactly one ballot, which spent 10,000 points on voting instead of the allotted 500. Had a user gone over by a small amount (e.g. 1-50 points), we had planned to scale their votes down to fit the budget. But with an overspend this extreme, we were honestly unsure what adjustment the voter would have wanted: normalising their points down to 500 would have rounded the majority of their votes to zero. (This decision was made without knowing who cast the ballot or which posts were affected.)
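The post doesn't specify how the planned scaling would have worked; the sketch below is one plausible version, assuming a triangular cost rule (strength s costs s(s+1)/2 points) and a square-root scale factor, since cost grows roughly quadratically with strength. For a 10,000-point ballot the factor is about sqrt(500/10,000) ≈ 0.22, which is why most small votes would round to zero.

```python
import math

def scale_ballot(votes, budget=500):
    """Shrink all vote strengths by a common factor until the total
    (assumed triangular) cost fits the budget."""
    def cost(v):
        s = abs(v)
        return s * (s + 1) // 2

    if sum(cost(v) for v in votes) <= budget:
        return list(votes)
    alpha = math.sqrt(budget / sum(cost(v) for v in votes))
    scaled = [round(v * alpha) for v in votes]
    # Rounding can still overshoot the budget, so trim the strongest
    # remaining votes one unit at a time until the ballot fits.
    while sum(cost(v) for v in scaled) > budget:
        i = max(range(len(scaled)), key=lambda j: abs(scaled[j]))
        scaled[i] -= 1 if scaled[i] > 0 else -1
    return scaled

# 500 weak votes of strength 2 cost 1,500 points; scaling to a
# 500-point budget reduces every one of them to strength 1.
print(scale_ballot([2] * 500) == [1] * 500)
```

This illustrates the dilemma: under any rule of this shape, an extreme ballot survives only as a handful of weak votes that may not reflect the voter's intent.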

Overall, I think the vote is a good indicator to within about 10 places in the rankings; I wouldn't, for example, agonise over whether a post sits at position #42 or #43.

The Future

This has been the first LessWrong Annual Review. This project was started with the vision of creating a piece of infrastructure that would:

  1. Create common knowledge about how the LessWrong community feels about various posts and topics and the progress we’ve made.

  2. Improve our long-term incentives, feedback, and rewards for authors.

  3. Help create a highly curated “Best of 2018” Sequence and Book.

The vote revealed a lot of disagreement between LessWrongers. Every post received at least five positive votes, and every post received at least one negative vote (except An Untrollable Mathematician Illustrated by Abram Demski, which was evidently just too likeable); many people had strongly different feelings about many posts. Many of these disagreements seem more interesting to me than a given post's exact ranking.

In total, users wrote 207 nominations and 120 reviews, and many authors updated their posts with new thinking or clearer explanations, showing that both readers and authors reflected a lot (and, I think, changed their minds a lot) during the review period. I think all of this is great, and I like the idea of having a Schelling time in the year for this sort of thinking.

Speaking for myself, this has been a fascinating and successful experiment—I’ve learned a lot. My thanks to Ray for pushing me and the rest of the team to actually do it this year, in a move-fast-and-break-things kind of way. The team will be conducting a Review of the Review where we take stock of what happened, discuss the value and costs of the Review process, and think about how to make the review process more effective and efficient in future years.

In the coming months, the LessWrong team will write further analyses of the vote data, award prizes to authors and reviewers, and use the vote to help design a sequence and a book of the best writing on LW from 2018.

I think it’s awesome that we can do things like this, and I was honestly surprised by the level of community participation. Thanks to everyone who helped out in the LessWrong 2018 Review—everyone who nominated, reviewed, voted and wrote the posts.