Musings on LessWrong Peer Review
In the spirit of writing down conversations, this is a rough summary of some recent conversations with Oliver Habryka and Zvi Mowshowitz about how and why to implement peer review on Less Wrong.
This is not meant to argue for a point or outline a specific plan, just to highlight a bunch of the thoughts we’re currently thinking about. I haven’t put much effort towards polishing it, but it’s here for people who want to follow along with our thought process.
Curation, Canon and Common Knowledge
If 90% of the people around have the idea, but I’m not confident that 100% do, then I often explain the basic idea for everyone. This often costs a lot of time.
– Ben Pace on Common Knowledge
Right now we curate about 3 posts a week. This feels about right from the perspective of readers checking in on good content regularly, and authors having a reasonable chance of getting into curated if their posts are good. But it means curation isn’t that strong a signal of quality or importance. A 90th percentile LW post doesn’t necessarily mean “6 months later, we’d expect this idea to still seem important.”
Our intention was for curated to be a fairly big deal and to only curate things we’re confident are “important”, but in practice it seems hard.
We’ve considered either raising standards for Curated, or introducing a new category above it. Early ideas here were renaming “Curated” to “Common Knowledge” with the intent to slightly raise standards and imply that if you want to stay up to date on “what it’s expected you know if you’re participating in LW discourse”, you should read things in the “Common Knowledge” section.
As cute as this name was, a major strike against it was that common knowledge is a useful technical term, and we don’t want to degrade its meaning.
We considered other terms like “Canon”, with slightly different connotations: maybe once every few months we could look back and see which posts (curated or otherwise) seemed like they were likely to enter the longterm rationalsphere lexicon.
I chatted separately with Zvi and Oliver about this idea, and found that we had different intuitions about what “Canon” means.
Canon as History
To Zvi and me, the main association for canon was “famous works you’re expected to have read, not necessarily because they’re the clearest or most rigorous but because of their historical context.” Maybe someone has written a better version of Hamlet, or a better version of Plato’s Republic, but you may still want to read the originals if you’re part of subcultures that value their historical legacy.
In the rationalsphere, Inadequate Equilibria is (sort of) a more formal and rigorous version of Meditations on Moloch, but you might still find value in the poetry, emotional oomph and historical legacy of Meditations, and knowing about it may help you understand a lot of conversations going on since longtime readers will be using chunks of it as shorthand.
Therefore, you might want Meditations on Moloch to be recognized as a part of the Rationalist Canon.
Canon as the Best Reference
Oliver’s take was more like “Canonical status is about which writing you want to be the canonical reference point for a thing”, and you may want this to change over time. (He pointed out the canonization process of the Bible literally involved deciding which stories seemed important, revising their order, etc)
Part of the reason Ben wrote his common knowledge post was that the closest thing he could find to a canonical common knowledge introduction was Scott Aaronson’s essay, which involved a tricky logic puzzle that I still personally struggle to understand even after stepping through it carefully over several hours.
There’s value in stepping through that tricky logic puzzle, but meanwhile, common knowledge is a really useful concept that seemed like it could be explained more intuitively. Ben spent several weeks writing a post that he hoped stood a chance of becoming the Canonical Less Wrong Post on Common Knowledge.
Peer Review as Canon, and Upping Our Game
Meanwhile, one problem in Real Science™ is that it’s hard to remove things from canon. Once something’s passed peer review, made its way into textbooks and entered the public zeitgeist… if you have a replication crisis or a paradigm shift, it may be hard to get people to realize the idea is now bogus.
This suggests two things:
Naively, this suggests you need to be really careful about what you allow into Canon in the first place (if using the Canon as Best Reference frame).
You may even want to aspire higher, and create a system where removing outdated things from Canon is actively incentivized. This is probably harder.
LessWrong 2.0 is aiming to be a platform for intellectual progress. Oliver and I are optimistic about this because we think LessWrong 1.0 contributed a lot of genuine progress in the fields of rationality, effective altruism, AI safety and x-risk more generally. But while promising, the progress we’ve seen so far isn’t as great as what we could be doing.
In the Canon as History frame, everything in the Sequences should be part of Canon. In the Canon as Best Reference or Peer Review as Canon frames, there are a lot of sequence posts that might not make the cut, for a few reasons:
The replication crisis happened, so some bits of evidence are no longer as compelling.
Some concepts haven’t turned out to be as important after several years of refining instrumental or epistemic rationality.
Some of the writing just isn’t that great.
Similarly, on Slate Star Codex, there are a lot of posts that introduce great ideas, but in a highly politicized context that makes it fraught to link to them in an unrelated discussion. It’d be useful to have a canonical reference point that introduces an idea without riling up people who have strong opinions on feminism.
Meanwhile, other problems in Real Science™ include peer review being:
Thankless for the reviewers
Intertwined with conferences and journals that come with weird social games, rent-seeking, and gatekeeping problems
Highly variable in quality
On top of that, something about academia makes people write in styles that are really hard to understand, and people don’t even see this as a problem. (See Chris Olah’s Research Debt.)
...all of these ideas bumping around has me thinking we shouldn’t just be trying to add another threshold of quality control. As long as we’re doing this, let’s try to solve a bunch of longstanding problems in academia, at least in the domains that LessWrong has focused on.
I’ve recently written about making sure LessWrong is friendly to idea generation. I’ve heard many top contributors talk about feeling intense pressure to make their ideas perfect before publishing them, and this results in ideas staying locked inside organizations and in-person conversations.
I think LessWrong still needs more features that enable low-pressure discussion of early stage thoughts.
But, if we’re to become a serious platform for intellectual progress, we also need to incentivize high caliber content that is competitive with mainstream science and philosophy – some combination of “as good or better at rigor and idea generation” and “much better at explaining things.”
I think a challenging but achievable goal might be to become something like distill.pub, but for theoretical rationality and the assorted other topics that LessWrongers have ended up interested in.
[Note: a different goal would be to become prestigious in a fashion competitive with mainstream science journals. This seems harder than “become a reliable internal ecosystem for the rationality and effective altruism communities to develop and vet good ideas, without regard for external prestige.”
I’m not even sure becoming prestigious would be useful, and insofar as it is, it seems much better to first become good and worry about translating that into prestige later. In any case, it’s not a goal I’m personally interested in.]
This is the first of a few posts I have planned on this subject. Some upcoming concepts include:
[Edit: alas, I never wrote these up, although we’re still thinking about it a bunch]
What is Peer Review for? [The short answers are A) highlighting the best content and B) forcing you to contend with feedback that improves your thinking.]
What makes for good criticism at different stages of idea development
What tools we might build eventually, and what tools exist now that authors may want to use more. (I think a lot of progress can be made just changing some of the assumptions around existing tools on LW and elsewhere)
How all of this influences our overall design choices for the site