Today we celebrate not destroying the world. We do so today because 38 years ago, Stanislav Petrov made a decision that averted tremendous calamity. It's possible that an all-out nuclear exchange between the US and USSR would not have actually destroyed the world, but there are few things with an equal chance of doing so.
As a Lieutenant Colonel of the Soviet Army, Petrov manned the system built to detect whether the US government had fired nuclear weapons on Russia. On September 26th, 1983, the system reported five incoming missiles. Petrov’s job was to report this as an attack to his superiors, who would launch a retaliatory nuclear response. But instead, contrary to the evidence the systems were giving him, he called it in as a false alarm, for he did not wish to instigate nuclear armageddon.
Petrov is not alone in having made decisions that averted destruction — presidents, generals, commanders of nuclear submarines, and similar also made brave and fortunate calls — but Petrov's story is salient, so today we celebrate him and all those who chose equally well.
As the world progresses, it's likely that many more people will face decisions like Petrov's. Let's hope they'll make good decisions! And if we expect to face decisions ourselves, let us resolve to decide wisely!
Mutually Assured Destruction (??)
The Petrov Day tradition is to celebrate Petrov's decisions and also to practice not destroying things, even when it's tempting.
In both 2019 and 2020, LessWrong placed a large red button on the frontpage and distributed "launch codes" to a few hundred "trustworthy" people. A launch would bring down the frontpage for the duration of Petrov Day, denying hundreds to thousands of people access to LessWrong. In 2019, all was fine. In 2020... let's just say some bad decisions were made.
And yet, having a button on your own page that brings down your own site doesn't make much sense! Why would you have nukes pointed at yourself? It's also not very analogous to the cold war nuclear scenario between major world powers.
For those reasons, in 2021, LessWrong is teaming up with the Forum to play a game of mutual destruction. Two buttons, two sets of codes, and two sets of hopefully trustworthy users.
(The button will appear on the homepage on Sunday morning, 8 AM PST.)
If LessWrong chose any launch code recipients they couldn't trust, the EA Forum will go down, and vice versa. One of the sites going down means that people are blocked from accessing important resources: the destruction of significant real value. What's more, it will damage trust between the two sites ("I guess your most trusted users couldn't be trusted to not take down our site") and also for each site itself ("I guess the admins couldn't find a hundred people who could be trusted").
For exact rules of the game, see the final section below.
Last year, it emerged that there was ambiguity about how serious the Petrov Day exercise was. I'll be as clear as I can via text: there is real value on the line here, and this is a real trust-building exercise that was not undertaken lightly by either LessWrong or the Forum. Both sites have chosen recipients who we hope will understand this.
How Do I Celebrate?
If you were one of the two hundred people to receive launch codes for LessWrong or the Forum, celebrate by doing nothing!
And you can also play on hard mode: "During said ceremony, unveil a large red button. If anybody presses the button, the ceremony is over. Go home. Do not speak."
This has been a common practice at Petrov Day celebrations in Oxford, Boston, Berkeley, New York, and in other rationalist communities. It is often done with pairs of celebrations, each with a button that can end the other.
Rules of the Exercise
The following email was sent last night to 100 users from the EA Forum. 100 LessWrong users received a similar message.
I invite you to participate in an exercise to determine whether the EA Forum can find 100 users it can trust with a genuinely high-stakes decision.
This year, we’re joining LessWrong in celebrating Petrov Day, a holiday where we celebrate the non-destruction of the world, and practice not destroying it ourselves.
To prove the goodwill and trust between the two sites, each site is sending “nuclear launch codes” to 100 users we think we can trust. I chose you personally to receive this message.
If you enter your launch codes into the launch console on the Forum’s homepage, they will cause LessWrong’s homepage to go down for the duration of Petrov Day. For the rest of the day, thousands of people will have a hard time using the site; some posts and comments will likely go unwritten. And I’ll have failed in my mission to find 100 people I could trust not to take down our friendly compatriots.
Your code is personalized; if someone enters it, we’ll know whose code took down the site.
This is your code: [CODE]
LessWrong and the Forum both have second-strike capability that will last for one hour after one of the sites is taken down. If the Forum’s homepage disappears, please consider very carefully whether or not you think it is correct to retaliate.
I hope you’ll help us all keep LessWrong safe, and that they’ll do the same for us.
First of all I'd like to thank the Forum team for their hard work producing this nuclear deterrent. We have been extremely lucky that LessWrong did not heed Bertrand Russell's advice during their period of nuclear monopoly. However, I am concerned that we have not yet tested these weapons, and hence we cannot be entirely sure they will function as intended. Perhaps a test strike against a lightly populated military target like the https://www.nytimes.com/ would make an effective demonstration?
I had one of the EA Forum's launch codes, but I decided to permanently delete it as an arms-reduction measure. I no longer have access to my launch code, though I admit that I cannot convincingly demonstrate this.
Thanks for this interesting exercise. Three things I want to say:
#1: For people unaware, pressing the red button means you cannot un-press it, though nothing bad will happen unless you enter a launch code.
After I read this page carefully, I thought it was going to be fine and harmless/reversible for me to press the red button, since I had not received a launch code. I have no intention of bringing the LessWrong site down, and don't plan on entering any launch code, whether random or someone else's, into the page. I also thought pressing the button would be anonymous, but entering launch codes would not be.
I was just curious about what the user interface/experience would be if I pressed the button but did not enter anything into it. Anyway, apparently pressing the button means you cannot "un-press" the button. So if you're as curious as I was, here's what it looks like after pressing the button:
It was only after I read this LessWrong postmortem about Petrov Day 2020 that I learned that pressing the red button, even without entering anything into it, will likely be known to the Forum/LW team as something you did, though probably not announced publicly. So I'm posting this ahead of time to say that I pressed the button out of curiosity, not out of some bad intention.
#2: I think the EA Forum/LW teams should not publicly name the people who enter random codes into either webpage.
In the same LW postmortem, I also found out that the name of someone entering random codes could be revealed. I think both the EA Forum and LW team should probably not do this. (I haven't entered a random code, but maybe someone else would, without knowing they'd possibly be publicly named.)
You probably don't know whether the person entering random codes had read this post and understood the exercise before doing so. And they might feel some distress about being publicly named as having entered a random code.
#3: The EA Forum/LW teams should not read too much into the number of people who press the red button without entering a correct launch code.
I assume they probably won't read into it, but I thought it would still be worth saying.
As seen in another comment here (which I assume is not a joke), someone accidentally pressed the button. And there are probably others who didn't understand what the exercise was about, and were just tempted to click a big red button when they first landed on the Forum today. And then there are probably a few people like me who wanted to see what would happen if the button were pressed but no launch codes were entered.
You shouldn't read too much into the number of people pressing the button in terms of malice, but you can read into it in terms of negligence, lack of caution or impulsiveness. It's how many people saw a big red button and pressed it without first checking what it does. It's how many people took the chance that pressing it may do something bad even without the launch codes.
I was also curious what happens if you press the button and don't enter the code, but didn't check, because I view pressing the button as something you just don't do - I wouldn't do it even if a site admin specifically told me "you can press the button without any consequence".
Though, having pressed the button, it was a good idea to publish how it looks, and you satisfied my curiosity.
I'd fairly strongly disagree with that take. I think it's an extremely reasonable assumption that a somewhat cartoony red button someone put at the top of a website deliberately does not do harm to press. Someone deliberately chose to put it there, and most features on websites are optimised for user interaction. This only looks unreasonable within the strong frame of having cultural context about Petrov Day.
Fair, though I still tend to check what things do before I press them. If there's no explanation I might still press them, but if it says "learn more" right there I will probably learn more before I do.
I'm coming to the conclusion that a private Petrov day game is good but a public one without community buy-in leads to a lot of tense disagreements as to what the game means. In some ways that's a nice analogy for the human condition, in other ways it feels like afterwards we should have some kind of group therapy.
I think I'm softly in favour but I'm glad this only happens once a year. Also I'm 1% worried this is going to end in reputational damage to the community.
People offering forecasting questions like this is really cool, but is there any way to resolve these questions later and give people track records? Or at that point are we just re-inventing Metaculus too much?
Probably a question for Aaron Gertler / the EA Dev team. Semi-relatedly, is there a way to tag Aaron? That might be another good feature.
Another feature request: Is it possible to make other people's predictions invisible by default and then reveal them if you'd like? (Similar to how blacked-out spoilers work, which you can hover over to see the text.)
I wanted to add a prediction but then noticed that I heavily anchored on the previous responses and didn't end up doing it.
I'm also interested in people's predictions had the codes been anonymous (not personalized). In this case, individual reputational risk would be low, so it would mostly be a matter of community reputational risk, and we'd learn more about whether EAs or LWers would stab each other in the back (well, inconvenience each other) if they could get away with it.
As of this comment: 40%, 38%, 37%, 5%. I haven't taken into account time passing since the button appeared.
With 395 total codebearer-days, a launch has occurred once. This means that, with 200 codebearers this year, the Laplace prior for any launch happening is 40% ($1 - (1 - \frac{1}{396})^{200} \approx 0.40$). The number of participants is about in between 2019 (125 codebearers) and 2020 (270 codebearers), so doing an average like this is probably fine.
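For anyone who wants to check the arithmetic, here is a minimal Python sketch of the estimate above. The per-codebearer rate of 1/396 and the codebearer counts are taken from the comment as written; the function and variable names are my own, and treating each codebearer as an independent trial is an assumption of the sketch, not something the exercise guarantees.

```python
# Codebearer counts quoted in the comment above.
CODEBEARERS_2019 = 125
CODEBEARERS_2020 = 270
CODEBEARERS_2021 = 200


def any_launch_probability(per_codebearer_rate: float, codebearers: int) -> float:
    """Chance that at least one of `codebearers` launches.

    Assumes (hypothetically) that each codebearer launches independently
    with the same probability.
    """
    return 1 - (1 - per_codebearer_rate) ** codebearers


# 125 + 270 = 395 codebearer-days of history, with one launch observed.
past_codebearer_days = CODEBEARERS_2019 + CODEBEARERS_2020

# Per-codebearer rate as used in the comment: 1 / (395 + 1) = 1/396.
rate = 1 / (past_codebearer_days + 1)

prior = any_launch_probability(rate, CODEBEARERS_2021)
print(f"{prior:.1%}")  # roughly 40%
```

Changing `rate` or `CODEBEARERS_2021` shows how sensitive the 40% figure is to the chosen prior and to the number of people holding codes.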
I think there's a 5% chance that there's a launch but no MAD, because Peter Wildeford has publicly committed to MAD, says 5%, and he knows himself best.
I think the EA forum is a little bit, but not vastly, more likely to initiate a launch, because the EA Forum hasn't done Petrov day before and qualitatively people seem to be having a bit more fun and irreverence over here, so I'm giving 3% of the no-MAD probability to EA Forum staying up and 2% to LessWrong staying up.
Also, the reference class of launches doesn't fully represent the current situation: last launch was more of a self-destruct. This time, it's harming another website/community, which seems more prohibitive. So I think the prior is lower than 40%.
I think it would be good for CEA to provide a clear explanation, that it (not LW) stands behind as an organization, of exactly what real value it views as being on the line here, and why it thinks it was worthwhile to risk that value.
I don't think this is a terrible question. Personally, somewhere like $2k to $20k; $2k if one only considers the object level value (say $1M/year ÷ 365 days), $20k if one thinks that the value is higher than $1M/year or if one really values the intangibles. And because of the unilateralist curse &c, one should probably tend towards the higher amounts anyways.
The appropriate response to someone with the launch codes to a real nuke suggesting we sell them to terrorists is to shoot them, not to wait to see if the terrorists could pay a lot of money; by comparison a downvote seems very apt!
[W]hen you are given responsibility for a communal resource, even if the person giving it to you says “It’s fine to destroy it – play along!” then you’re still supposed to think for yourself about whether they’re right. One of the core virtues of Petrov day is how Petrov took responsibility for launching nuclear armageddon, and he didn’t “assume the people in charge knew best” or “just do what expected of him”. So in some ways I feel like this challenge was unfair in the same way that reality is unfair, and it is a question about whether people noticed their responsibility to the commons without being told that they were supposed to take responsibility.
Just because the site admins gave us the ability to shut down the site does not mean that it is harmless or permissible to do so. Even if they were to tell us it's a game and it's permissible to do so (which they did not) that still would not make it harmless nor necessarily permissible. The stakes still affect the permissibility regardless of what they were to say.
He didn't invite anyone to shut it down. He simply gave more people the power to shut it down than already had it and invited us to practice not using that power. (I think this was permissible.)
But for the sake of argument, even if Aaron did invite us to shut it down, that would not mean that Aaron's action was necessarily permissible. Maybe it would be since service providers have the right to stop providing services, but when the stakes are sufficiently high suddenly deciding to just stop providing a service to harm all your customers seems unethical (e.g. if Bezos and/or whoever else has the authority at Amazon decided to just shut down Amazon without warning).
An obvious question which I'm keen to hear people's thoughts on - does MAD work here? Specifically, does it make sense for the EA forum users with launch codes to commit to a retaliatory attack? The obvious case for it is deterrence. The obvious counterarguments are that the Forum could go down for a reason other than a strike from LessWrong, and that once the Forum is down, it doesn't help us to take down LW (though this type of situation might be regular enough that future credibility makes it worth it).
Though of course it would be really bad for us to have to take down LW, and we really don't want to. And I imagine most of us trust the 100 LW users with codes not to use them :)
The question is whether precommitment would actually change behavior. In this case, anyone shutting down either site is effectively playing nihilist, and doesn't care, so it shouldn't.
In fact, if it does anything, it would be destabilizing - if "they" commit to pushing the button if "we" do, they are saying they aren't committed to minimizing damage overall, which should make us question whether we're actually on the same side. (And this is a large part of why MAD only works if you are both selfish, and scared of losing.)
Everyone cares about something, so maybe we should precommit to something more... deterring? It should likely be something that's not really bad, but still somewhat uncomfortable for the person to experience. (I realize that going down this path of thinking might produce actual outside-game harm.)
I know we're trying to remember when the US and USSR had their weapons pointed at each other but it feels more like the North and South islands of New Zealand are trying to decide whether to nuke each other!
Edit: Not even something so violent - just temporarily inconvenience each other
I briefly saw a "Missile Incoming" message with a 60:00 timer (that wasn't updating) on the buttons on the front pages of both LW and the EA Forum, at around 12pm EST, on mobile. Both messages were gone when I refreshed. Was this a bug or were they testing the functionality, testing us or preparing to test us?
They should have left it up longer if they wanted to test us with it, since it was gone when I reloaded the pages and the timer was never updated while it was up, even though each side would have an hour to retaliate (or it was supposed to give the impression that the hour was over, and it was already too late).
On the one hand, in order for MAD to work, decision-makers on both sides must be able to give credible threats for a retaliatory strike scenario. This is also true in this experiment’s case: if we assume that this will be iterated on future Petrov Days, then we must show that any tit-for-tat precommitments made are followed through.
But at the same time, if LessWrong takes down the EA Forum, it just seems like wanton destruction to similarly take it down, too. I know that, as a holder of the codes, I should ensure that I’m making a fully credible threat by precommitting to a retaliatory strike, but I want to take precommitments seriously and I don’t feel confident enough to precommit to such an action.
After giving this much thought, I decided to present the perhaps-too-weak claim that if the EA Forum goes down due to a LessWrong user pressing the button, I may press in retaliation. While this is not an idle threat, and I am serious about potentially performing a retaliatory strike, I am falling short of committing myself to that action in advance. I give more of my reasoning in my blog post on this.
(Ultimately, this is moot, since others are already willing to make such a precommitment so I don’t have to.)
Attention EA Forum - I am a chosen user of LessWrong and I have the codes needed to destroy the EA Forum. I hereby make a no first use pledge and I will not enter my codes for any reason, even if asked to do so. I also hereby pledge to second strike - if LessWrong is taken down, I will retaliate.
I downvoted this. I'm not sure if that was an appropriate way to express my views about your comment, but I think you should lift your pledge to second strike, and I think it's bad that you pledged to do so in the first place.
I think one important disanalogy between real nuclear strategy and this game is that there's kind of no reason to press the button, which means that for someone pressing the button, we don't really understand their motives, which makes it less clear that this kind of comment addresses their motives.
Consider that last time LessWrong was persuaded to destroy itself, it was approximately by accident. Especially considering the context of the event we're commemorating was essentially another accident, I think the most likely story for why one of the sites gets destroyed is not intentional, and thus not affected by precommitments to retaliate.
Yeah, that did occur to me. I think it's more likely that he's telling the truth, and even if he's lying, I think it's worth engaging as if he's sincere, since other people might sincerely believe the same things.
Surely after the site has been nuked you will no longer be able to enter the codes, because your silos will have been destroyed? And prior to that you risk mis-classifying our civilian space exploration vehicles, whose optimal launch trajectory just happens to go over LessWrong airspace, as weapons?
I hope we invested in secure second strike capabilities. I think LessWrong has a nuclear triad - we have guest posts on other websites that can launch nukes even after LessWrong itself has been destroyed.
remove Peter Wildeford's launch codes from the list of valid launch codes for both this forum and LessWrong. Reason: he clearly does not understand that this precommitment is unlikely to deter any of the 'trusted' LW users from pressing the button (see David Mannheim's comment and the discussion below)
evaluate our method of choosing 'trusted users'. We may want to put specific users that take dangerous actions like these on a black list for future instances of Petrov Day.
I would ask how users are chosen, but I imagine that making that knowledge more available increases the risk that it will be misused by nefarious actors.
I can't parse the concept of 'precommitment'. I don't intend to launch a first strike, but maybe something will happen in the next few hours to change my intention, and I don't have any way to restructure my brain to reduce that possibility to 0. The reverse applies for second striking.
Sure, precommitments are not certain, but they're a way of raising the stakes for yourself (putting more of your reputation on the line) to make it more likely that you'll follow through, and more convincing to other people that this is likely.
In other words: of course you don't have any way to reach probability 0, but you can form intentions and make promises that reduce the probability (I guess technically this is "restructuring your brain"?)
This is not how I understand the term. What you're describing is how I would describe the word "commitment". But a "precommitment" is more strict; the idea is that you have to follow through in order to ensure that you can get through a Newcomb's paradox situation.
You can use precommitments to take advantage of time-travel shenanigans, to successfully one-box Newcomb, or to ensure that near-copies of you (in the multiverse sense) can work together to achieve things that you otherwise wouldn't.
With that said, it may make sense to say that we humans can't really precommit in these kinds of ways. But to the extent that we might be able to, we may want to try, so that if any of these scifi scenarios ever do come up, we'd be able to take advantage of them.
Yeah, if precommitment is to be distinguished from regular 'intending to do a thing' or 'stating such intention', it must be ripping out your steering wheel in a game of chicken.
Making a promise not to do something I didn't intend to do - and where doing it would already harm me socially - doesn't seem to add much beyond the value of stating my intentions (and the statement could still be a lie).