Postmortem to Petrov Day, 2020

The first underwater test of an atom bomb, by the US at Bikini Atoll in the Pacific, July 25, 1946.
Also, the LessWrong.com Frontpage this Petrov Day.

We failed.

The Frontpage was taken down for 24 hours.

Launch Attempts

At 1:26 am, ~275 users got an email (and 20 minutes later a PM) from me, with their unique launch codes. At this time, the Big Red Button was on the Frontpage, and codes could be submitted.

Two users attempted to submit launch codes. It was made clear that any such attempts would be deanonymised, as the button itself said “This is not an anonymous action”.

The users were: Grotace at 2:09 am, who submitted the random string “kdjssndksjnd” to no effect, and Chris_Leong at 4:33 am, who submitted his personalised launch codes, taking down the site.

What happened?

Note that the following includes private messaging metadata about a number of users. All users (except for the attacker) were asked for their explicit consent for it to be included, and it would have been anonymized had they declined, even if it was clearly deducible from the public record. The LessWrong team only looks into private messaging metadata when there is a suspicion of sockpuppeting, phishing, or other forms of abuse, or when permission has been given by the user (e.g. for debugging). Additionally, Chris Leong commented on a draft of this post before it was published.

On receiving his codes, Chris Leong posted on LessWrong and Facebook asking whether he should take down the site.

He also wrote a comment on LessWrong:

Should I press the button or not? I haven’t pressed the button at the current time as it would be disappointing to people if they received the email, but someone pressed it while they were still asleep.

He did not get replies on the LessWrong comment before he took the site down. The first two comments on the Facebook post were as follows.

Commenter 1

I kind of feel like you should take it down for a day. No individual should be trusted with the ability to destroy the world. We are all driven by personal incentive mechanisms and other systems must be developed to ensure it’s not that easy for one person to bring so much chaos into the world.

Also… when’s the next time you’re going to get the opportunity to take down a massive site for a day legally?

Commenter 2

LessWrong is archived elsewhere so the damage by bringing it down (especially just for a day) is not that much. Equally, there isn’t much benefit that you would get by bringing it down. As the website says, if you decide to enter your launch codes, it’s not anonymous. So the question becomes one of social capital.

After the second comment, the following happened:

  • A LW account was created with the username “petrov_day_admin_account”.

  • From 4:21 to 4:34 am (Pacific Time), this user opened 6 conversations, sending only 4 messages.

  • They sent one message each to Chris_Leong, Raemon, Habryka, and Liron.

  • They all contained identical body text with the name changed for each user (although Liron’s message began with “Hello Chris_Leong”).

  • Chris Leong’s message was sent at 4:26 am.

  • A conversation was opened with adamzerner, and a second conversation was opened with Chris_Leong, but no messages were sent, as by that time it was no longer necessary.

Here were the next four comments on Chris’s Facebook post. (By all accounts the relevant text below matches the text in the messages sent to the above users.)

Commenter 2

Well, I sent out this message to a lot of users, now the site has gone down, but it doesn’t say who pressed the button so it could still be a coincidence:

Hello <NAME>,

You are part of a smaller group of 30 users who has been selected for the second part of this experiment. In order for the website not to go down, at least 5 of these selected users must enter their codes within 30 minutes of receiving this message, and at least 20 of these users must enter their codes within 6 hours of receiving the message. To keep the site up, please enter your codes as soon as possible. You will be asked to complete a short survey afterwards.

Chris Leong

Damn it, you got me! Well done

Commenter 2

No problem :D Maybe the lesson is that if you can destroy the world you shouldn’t let people know? Not really sure, just thought it would be fun.

Chris Leong

Well played!

And that is the story of how LessWrong was taken down for 24 hours by Chris Leong, on Petrov Day.

What went wrong?

One of the most important facts about this year is how many users were given codes. Last year succeeded with ~125 users getting codes, and this year the total was more than doubled to ~275 users. This weakened the selection filter, which ultimately failed.
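To make the effect of batch size concrete, here is a minimal sketch of the arithmetic. It assumes every user independently has the same small chance of entering their codes, which is a simplification I am introducing for illustration; the per-user probability below is hypothetical and was not measured.

```python
def p_at_least_one(n_users: int, p_each: float) -> float:
    """Chance that at least one of n_users enters their codes.

    Assumes each user acts independently with the same probability p_each.
    """
    return 1.0 - (1.0 - p_each) ** n_users


# Hypothetical per-user probability, chosen only for illustration.
P_EACH = 0.002

for n_users in (125, 275):
    risk = p_at_least_one(n_users, P_EACH)
    print(f"{n_users} users: {risk:.1%} chance the site goes down")
```

Under these assumptions, the risk rises with every added user even when each individual is very unlikely to press the button, so a larger batch needs a correspondingly stricter filter to keep the overall risk the same.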

Well-intentioned does not mean secure

The first mistake was in how I chose which users to entrust with codes.

The main buckets I made were high karma users I know and whose character I can judge (~140), high karma users I don’t currently know very well (~50), and low karma users I know and whose character I can judge (~80).

I weighted my selection toward users using real names instead of pseudonyms, as reputation is a factor people track when deciding whether their actions are good or bad.

For high karma users I don’t know very well, I spent about 5 hours reading through their comments and posts, asking myself questions like the following:

  • Does this user write comments in good faith?

  • Does this user put in time and effort to explain their ideas in the comment sections?

  • Does this user write short, snarky comments?

  • Does this user try to help the site grow, such as by proposing and giving feedback on new features?

  • Does this user have a grudge against LessWrong?

  • Does this user seem to understand and partake in the culture and sense-of-humor of LessWrong?

Overall, I picked users on the basis I thought they would not choose to take the website down. I didn’t consider for a moment that it would not be a matter of direct choice at all, but that someone would fall for a standard phishing attack.

This was a classic mistake on my part, in that it is the subject of the opening paragraph of Eliezer’s classic paper Cognitive Biases Potentially Affecting Judgment of Global Risks:

All else being equal, not many people would prefer to destroy the world. Even faceless corporations, meddling governments, reckless scientists, and other agents of doom require a world in which to achieve their goals of profit, order, tenure, or other villainies. If our extinction proceeds slowly enough to allow a moment of horrified realization, the doers of the deed will likely be quite taken aback on realizing that they have actually destroyed the world. Therefore I suggest that if the Earth is destroyed, it will probably be by mistake.

So the first lesson is that if you want to entrust others with powers that can be easily abused or used destructively, it is not sufficient for them to be well-intentioned. They must also be hard enough to trick that the payoff isn’t worth it for any external adversary.

Users saw different meanings to the ritual

The launch codes gave users the ability to take down the Frontpage for 24 hours, a page that a few thousand people visit every day. A single user being unable to use it is a simple irritation; once you multiply this by thousands it becomes a non-trivial communal resource.

However, while many users understood that they were being given responsibility for the commons, many users did not see a clear connection to Petrov’s situation, whose choices differed in many ways from the LessWrong Big Red Button. Furthermore, Neel Nanda commented that the communications around it read like “RPG flavor text”, which suggested it was not to be taken seriously. This confusion made the day less meaningful for a number of people, and in particular some people felt confused about how it could possibly symbolize Petrov’s choice and thus thought it was “just a game”.

To respond to that briefly: when you are given responsibility for a communal resource, even if the person giving it to you says “It’s fine to destroy it – play along!” then you’re still supposed to think for yourself about whether they’re right. One of the core virtues of Petrov Day is how Petrov took personal responsibility for a decision that could have led to nuclear armageddon, and he didn’t “assume the people in charge knew best” or “just do what was expected of him”. So in some ways I feel like this challenge was unfair in the same way that reality is unfair, and it is a question about whether people noticed their responsibility to the commons without being told that they were supposed to take responsibility. On the other hand, in some ways this was a harder challenge than Petrov’s, because the stakes were measured in a much smaller communal resource and not in the lives of a billion people, and because Petrov had more lead time to think before the time came when he had to make the decision.

Nonetheless, several users requested that the relation between the exercise and Petrov’s decision be made more explicit before next year, which seems quite reasonable to me. So in advance of next year, I’ll write up an explanation of why I think the Big Red Button is an accurate symbolic setup for Petrov’s situation.

I did not check who submitted launch codes last year

Note that this section includes private data about a user (Chris Leong). The user was asked for their explicit consent before it was included, and had they declined then this section would have been redacted. It is further the case that Chris Leong commented on a draft of this post before it was published.

Last year, Chris_Leong pressed the Big Red Button, guessed an incorrect launch code, and submitted it.

(He was not given real launch codes last year.)

I did not think to check the group of 7 accounts with non-zero karma who submitted random launch codes last year, and ensure that they were not in this year’s batch. I will be doing so in future.
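The check described above amounts to screening this year’s candidate list against last year’s submitters. A minimal sketch follows; the function name, the placeholder usernames, and the case-insensitive matching are all my assumptions for illustration, not the site’s actual tooling or data.

```python
def screen_candidates(candidates, prior_submitters):
    """Split candidates into those who submitted codes before and those who did not.

    Usernames are compared case-insensitively (an assumption of this sketch).
    """
    prior = {name.lower() for name in prior_submitters}
    flagged = sorted(name for name in candidates if name.lower() in prior)
    cleared = sorted(name for name in candidates if name.lower() not in prior)
    return flagged, cleared


# Placeholder usernames, not real accounts.
last_year_submitters = ["user_b", "user_x", "user_y"]
this_year_candidates = ["user_a", "User_B", "user_c"]

flagged, cleared = screen_candidates(this_year_candidates, last_year_submitters)
print("Exclude or review:", flagged)
print("No prior submissions:", cleared)
```

A flagged name need not mean automatic exclusion; it could instead prompt a closer look before codes are sent.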

Chris was the only one in that batch who was given codes this year. To me, this is suggestive evidence that of the people in this year’s batch Chris was the most likely to press the Big Red Button, and causes me to update that had he in particular not been in the batch, there quite likely would not have been a next person to press the button instead.

In a comment on a draft of this post, Chris noted that he doesn’t recollect doing this last year, and also asked me what message was displayed next to the button last year.

The answer is that there was no message next to the button last year; it looked like this:

The Big Red Button in 2019.

Indeed, last year I know of a user (not Chris Leong) who visited the site, clicked the red button, and entered and submitted codes, before finding out what the button did.

As a result of that user, this year we changed the red button to the following, so that mistake would not happen again.

The Big Red Button in 2020.

I nonetheless think that playing with the button is a clear sign that I shouldn’t have included someone in this year’s batch.

I should not have given launch codes to Chris

Chris is well-intentioned, but this did not mean he could be entrusted with this communal resource. A number of factors in retrospect show that I should not have given the codes to Chris.

  • He didn’t think he had a responsibility to the commons in this situation. He treated it like a game, even though he was given the ability to destroy a non-trivial communal resource.

  • He failed to model that he was not secure and that people would like to trick him on this matter, so he posted that he had codes in a place where someone tricked him into using them.

  • He didn’t realize that a lot of users cared seriously about this tradition and this exercise, which could have led him to rethink the above two bullets.

By the way, if you’d like to read Chris’s account of what happened, see his post On Destroying The World.

Looking Ahead

This is the 14th year of Petrov Day, and the 2nd year that we have observed Petrov Day with the Big Red Button ritual on LessWrong. On many continents there were a number of annual ceremonies celebrating Petrov Day, even in this socially distanced time, remembering how fragile the world is and our responsibility to not destroy it.

I learned a lot this year. I think many of us did.

(Thank you to everyone who participated in the lively discussion.)

I definitely updated upward on how many counterfactual worlds ended this way in the Cold War, where some third party to the conflict attempted to trick one of the players into instigating an attack, perhaps by posing as a relevant authority and saying it was a ‘training exercise’.

I think this is one of the more innocuous ways that the site could’ve gone down (not through malice, but through lack of taking it seriously, light trolling, and a security failure).

I hadn’t thought about red-teaming the site. We often learn the most important principles the hard way. While pentesting your friends for their private details and other low-consequence effects is great, I don’t think much of people who unilaterally try to take down something more like a public utility for 24 hours. But it is certainly a valuable addition to the setup, and I’ll look into setting up some red-teaming next year.

Last year I assigned a 40% probability to the site going down, and it didn’t. This year I assigned a 20% probability to the site going down, and it did.
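For readers who want to score those two forecasts, here is a minimal sketch using the Brier score (the mean squared error between forecast probability and outcome, where lower is better). The function name is mine, and two data points say very little about calibration, so treat this as arithmetic rather than a verdict.

```python
def brier_score(forecasts):
    """Mean squared error between forecast probabilities and binary outcomes."""
    return sum((p - float(happened)) ** 2 for p, happened in forecasts) / len(forecasts)


# (probability assigned to the site going down, whether it went down)
forecasts = [
    (0.40, False),  # 2019: the site stayed up
    (0.20, True),   # 2020: the site went down
]

print(f"Brier score over two years: {brier_score(forecasts):.2f}")
```

The result is roughly 0.40, compared with 0.25 for someone who answered 50% both years.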

I think had this happened the first year we tried it, I would expect a common response to be that it was obvious that we would fail, and that we were far too credulous if we thought we could give codes to over one hundred rationalists and one of them wouldn’t take down the site.

Then, after last year’s success, I received many messages saying that it was clear nobody would take it down this year. So in some ways this has been good for making the challenge level clear – this is a real coordination problem, that we can fail, and isn’t overdetermined in either direction.

Onwards and upwards. This was a collective coordination problem for LessWrong, and we’re 1 for 2. We’ll return next Petrov Day, for another round of sitting around and not pressing destructive buttons, alongside our ceremonies and celebrations.

So far we’ve had one Petrov Day on LessWrong without pressing the Big Red Button.

I hope there are many more to come.


Afterword

In spite of Chris taking down the site this week, I still appreciate Chris’s many positive contributions to LessWrong and our associated communities, and for everyone else’s sake I’ll briefly list some of them here.

I’m grateful for all of the above contributions Chris has made; his contribution to LessWrong has been clearly net positive.