Thoughts on the Scope of LessWrong's Infohazard Policies

Recently, I was asked about my opinion on deleting posts that had potentially bad consequences. In particular, whether I should delete a post criticising the CDC’s response to the coronavirus, during a time of crisis where the CDC was the relevant US government authority on how to respond. I spent a few hours writing a response, and I’ve reproduced it here with light editing so it can be linked to and discussed out of the context of the thread. It’s not tightly edited, but it has a bunch of things about the current policies that I think are helpful to say rather than not say.

This post doesn’t discuss all aspects of how to deal with infohazards and secrecy on LessWrong. In areas such as AI capabilities, bioweapons, and more, the LessWrong engineering team has thought more about features to build to allow for multi-stage publication practises. But this was an instance of a user asking that a specific post be deleted because it posed an infohazard, so that’s what I address below.

I’ve spent multiple hundreds of hours thinking about information hazards, publication norms, and how to avoid unilateralist action, and I regularly use those principles explicitly in decision-making. I’ve spent quite some time thinking about how to re-design LessWrong to allow for private discussion and vetting for issues that might lead to e.g. sharing insights that lead to advances in AI capabilities. But given all of that, on reflection, I completely disagree that this post should be deleted, or that the authors were taking worrying unilateralist action.

Let me give my thoughts on the issue of infohazards.

I am honestly not sure what work you think the term is doing in this situation, so I’ll recap what it is for everyone following. Historically, there has been a notion that all science is fundamentally good, that all knowledge is good, and that science need not ask ethical questions of its exploration. Much of Bostrom’s career has been to draw the boundaries of this idea and show where it is false. For example, one can build technologies that a civilization is not wise enough to use correctly, that lead to degradation of society and even extinction (you and I are both building our lives around increasing the wisdom of society so that we don’t go extinct). Bostrom’s infohazards paper is a philosophical exercise, asking at every level of organisation what kinds of information can hurt you. The paper itself has no conclusion, and ends with an exhortation toward freedom of speech; its point is simply to help you conceptualise this kind of thing and be able to notice it in different domains. Then you can notice the tradeoff and weigh it properly in your decision-making.

So, calling something an infohazard merely means that it’s damaging information. An argument that has a false conclusion is an infohazard, because it might cause people to believe a false conclusion. Publishing private information is an infohazard, because it allows adversaries to attack you better. Yet we allow people to make arguments that have false conclusions, and we often publish infohazardous private material because it contributes to the common good, e.g. I list my home address on occasional public Facebook events. This helps people burgle my house, but it’s worth it to let lots of friends find me.

Now, in the x-risk community that focuses on biosecurity, I believe there are two kinds of infohazards that there is a consensus we should suppress. These are: sharing (even rough) technological designs for pathogens that could kill millions of people, and sharing information about system weaknesses that are presently subject to attack by adversaries. I won’t give any current examples because I think they could be very damaging, but Davis Kingsley helpfully published an old example that is no longer applicable in this post if anyone is interested. So I assume that this is what you are talking about, as I know of no other classes of infohazard in the bio-x-risk space where there is a consensus that one should take great pains to silence and punish people who publish on them.

The main reason Bostrom’s paper is brought up in biosecurity is in the context of arguing that specific technological designs for various pathogens and/or damaging systems shouldn’t be published or sketched out in great detail. Just as Churchill was shocked by Niels Bohr’s plea to share the nuclear designs with the Russians, which Bohr claimed would lead to the end of all war (Churchill said no and wondered if Bohr was a Russian spy), we shouldn’t freely publish designs for pathogens that terrorists or warring states could use to hurt a lot of people or potentially cause an existential catastrophe. It would be wise to (a) have careful publication practises that involve the option of not publishing details of such biological systems and (b) not publicise how to discover such information.

Bostrom has spent a lot of his career saying this is a worrying problem that you need to understand carefully. If someone on LessWrong were sharing e.g. their best guess at how to design and build a pathogen that could kill 1%, 10% or possibly 100% of the world’s population, I would be in strong agreement that as an admin of the site I should preliminarily move the post back into their drafts, talk with the person, encourage them to think carefully about this, and connect them to people I know who’ve thought about this. I can imagine that the person has reasonable disagreements. But if it seemed like the person was actively indifferent to the idea that it might cause damage, then, while I can’t stop them writing elsewhere on the internet, LessWrong has very good SEO and I don’t want that content to be widely accessible, so it would be the right call to remove their content of this type from LessWrong. This seems sensible for the case of people posting mechanistic discussion of how to build pathogens that would be able to kill 1%+ of the population.

Now, you’re asking whether we should treat criticism of governmental institutions during a time of crisis in the same category as someone posting pathogen designs or speculating on how to build pathogens that can kill 100 million people. We are discussing something very different, a potential infohazard that has a fairly different set of associated intuitions.

Is there an argument here that is as strong as the argument that sharing pathogen designs can lead to an existential catastrophe?

Let me list some reasons why this action is in fact quite useful.

Helping people inform themselves about the virus. As I am writing this sentence, I’m in a house meeting attempting to estimate the number of people in my area with the disease, what levels of quarantine we need to be at, and when we need to do other things (e.g. can we go to the grocery store, can we accept Amazon packages, can we use Uber, etc). We’re trying to use various advice from places like the CDC and the WHO, and it’s helpful to know when I can just trust them to have done their homework versus when I should treat them as helpful but re-do their thinking with my own first-principles models in some detail.
Helping necessary institutional change happen. The coronavirus is not likely to be an existential catastrophe. I expect it will likely kill over 1 million people, but it is exceedingly unlikely to kill a couple percent of the population, even given hospital overflow and failures of countries to quarantine. This isn’t the last hurrah from that perspective, and so a naive maxipok utilitarian calculus would say it is more important to improve the CDC for future existential biorisks than to make sure not to hinder it in any way today. I think the standard policy advice is that stuff gets done quickly in crisis time, and I think that right now, not ten years later when someone writes a historical analysis, is when creating public, common knowledge of the severe inadequacies of our current institutions is most likely to lead to improvements and changes. I want the CDC to be better than this when it comes to future bio-x-risks, and now is a good time to very publicly state very clearly what it’s failing at.
Protecting open, scientific discourse. I’m always skeptical of advice to not publicly criticise powerful organisations because it might cause them to lose power. I always feel that if their continued existence and power is threatened by honest and open discourse… then it’s weird to think that it’s me who’s defecting on them when I speak openly and honestly about them. I really don’t know what deal they thought they could make with me (and every other free-thinking person who notices these things?!) where I would silence myself. I’m afraid that was not a deal that was on offer, and they’re picking the wrong side. Open and honest discourse is always controversial and always necessary for a healthy, scientific culture.

So the counterargument here is that there is a possible downside strong enough to outweigh these benefits. Importantly, when Bostrom shows that information should be hidden and made secret, it is because sharing it might lead to an existential catastrophe.

Could criticising the government here lead to an existential catastrophe?

I don’t know your position, but I’ll try to paint a picture, and let me know if this sounds right. I think you think that something like the following is a possibility. The post in question, or a successor like it, goes viral (pun not intended) on Twitter, leading to a consensus that the CDC is incompetent. Later on, the CDC recommends mass quarantine in the US, and the population follows the letter but not the spirit of the recommendation, and this means that many people break quarantine and die.

So that’s a severe outcome.

My first point is that it isn’t an existential catastrophe.

(Is the coronavirus itself an existential catastrophe? As I said above, this doesn’t seem like it’s the case to me. Its death rate seems to be around 2% when given the proper medical treatment (respirators and the like), and so given hospital overload will likely be higher, perhaps 3-20% (depending on the variation in age of the population). My understanding is that it will likely peak at a maximum of 70% of any given highly connected population, and it’s worth remembering that much of humanity is spread out and not based in cities where people see each other all of the time.

I think the main world in which this is an existential catastrophe is the world where getting the disease does not confer immunity after you lose the disease. This means a constant cycle of the disease amongst the whole population, without being able to develop a vaccine. In that world, things are quite bad, and I’m not really sure what we’ll do then. That quickly moves me from “The next 12 months will see a lot of death and I’m probably going to be personally quarantined for 3-5 months and I will do work to ensure the rationality community and my family is safe and secured” to “This is the sole focus of my attention for the foreseeable future.”

Importantly, I don’t really see any clear argument for which way criticism of the CDC plays out in this world.)

I know there are real stakes here. Even though you need to go against CDC recommendation today and stockpile, in the future the CDC will hopefully be encouraging mass quarantine, and if people ignore that advice then a fraction of them will die. But there are always life-and-death stakes to speaking honestly about failures of important institutions. Early GiveWell faced the exact same situation, criticising charities saving lives in developing countries. One can argue that this kills people by reducing funding for these important charities. But this was worth it a million times over, because we’ve coordinated around far more effective charities and saved way more lives. We need to discuss governmental failure here in order to save more lives in the future.

(Can I imagine taking down content about the coronavirus? Hm, I thought about it for a bit, and I can imagine that, if a country was under mass quarantine, and people were writing articles with advice about how to escape quarantine and meet people, that would be something we’d take down. There’s an example. But criticising the government? It’s like a fundamental human right, and not because it would be inconvenient to remove, but because it’s the only way to build public trust. The standard answer here is just correct, that public criticism of powerful institutions is good and healthy.)

(Added: Looking back on this example, I realise that my example is not as clear cut as I thought at the time. For example, often people advising how to break out of quarantine would be doing so in part because they do not trust the government – there have been quarantine-breaking protests for exactly this purpose. I do not think the above stands so clearly, and would need to think about it more to come up with a rule I was confident in.)

The reason we mustn’t silence discussion when we think the consequences are bad is that the truth is powerful and has surprising consequences. Bostrom has argued that if it’s an existential risk, this principle no longer holds, but if you think he thinks this applies to anything else, let me quote the end of his paper on infohazards.

Even if our best policy is to form an unyielding commitment to unlimited freedom of thought, virtually limitless freedom of speech, an extremely wide freedom of inquiry, we should realize not only that this policy has costs but that perhaps the strongest reason for adopting such an uncompromising stance would itself be based on an information hazard; namely, norm hazard: the risk that precious yet fragile norms of truth-seeking and truthful reporting would be jeopardized if we permitted convenient exceptions in our own adherence to them or if their violation were in general too readily excused.

Footnote on Unilateralism

I don’t see a reasonable argument that this was close to being a situation where writing the post on the CDC was a dangerous unilateralist action. This isn’t a situation where 95% of people think it’s bad but 5% think it’s good.

If you want to know whether we’ve lifted the unilateralist’s curse here on LessWrong, well, you need look no further than the Petrov Day event that we ran, and see what the outcome was. That was indeed my attempt to help LessWrong practise and self-signal that we don’t take unilateralist action. But this case is neither an x-risk infohazard nor worrisome unilateralist action. It’s just two people doing their part in helping us draw an accurate map of the territory.
