The left-hand side of the example deliberately makes the mistake described in your article, as a way to build intuition for why it is a mistake.
(Adding instead of averaging in the update summaries was an unintended mistake)
Thanks for explaining how to summarize updates. It took me a bit to see why averaging works.
Seeing the equations, it was hard to intuitively grasp why updates work this way. This example made things more intuitive for me:
If an event can have 3 outcomes, and we encounter strong evidence against outcomes B and C, then the update looks like this:
The information about what hypotheses are in the running is important, and pooling the updates can make the evidence look much weaker than it is.
I found the postmortem over-focuses on what went wrong or was sub-optimal. I would like to point out that I found the event fun, despite being a lurker with no code.
There were some reports of people seeing a frozen countdown on the button, which disappeared when the page was refreshed. Was this an intentional false alarm? I had assumed that was the case, as a false alarm with some evidence that it was false echoes some parts of Petrov’s situation nicely.
Just be aware that other users have already noticed messages which could be deliberate false alarms: https://www.lesswrong.com/posts/EW8yZYcu3Kff2qShS/petrov-day-2021-mutually-assured-destruction?commentId=JbsutYRotfPDLNskK
I had not noticed my own Gell-Mann amnesia when reading that bit, and therefore find your response quite convincing. I had thought that Ziv’s answer to (D) made sense due to the FDA being over-cautious about approving things, but both the scope of the precedent and the kinds/directions of errors had not registered with me.
One possible strategy would be to make AI more dangerous as quickly as possible, in the hope that it provokes a strong reaction and the addition of safety protocols. Doing this with existing tools, so that it is not an AGI, makes it survivable. This reminds me a bit of Robert Miles’s facial recognition and blinding laser robot. (Which of course is never used to actually cause harm.)
If the AGI can simply double its cognitive throughput, it can just repeat the action “sleuth to find an under-priced stock” as needed. This does not exhaust the order book until the entire market is operating at AGI-comparable efficiency, at which point the AGI probably controls a large (or majority) share of the trading volume.
Also, the other players would have limited ability to imitate the AGI’s tactics, so its edge would last until they left the market.
A hypothesis I had was that the US was sticking to an exact formula due to higher vaccine hesitancy, in order to “play it safe” and give anti-vaxxers less to criticize. After looking at a small handful of countries, I think this is not a significant cause of the difference in responses.
If this were true, I would expect countries that have higher vaccine hesitancy to be less likely to do first doses first (FDF).
Checking [this data](https://www.thelancet.com/cms/10.1016/S0140-6736(20)31558-0/attachment/720358f5-8df0-405b-b06f-7734cf542a58/mmc1.pdf), which was near the top of search results, and using eyeballed values for the share who strongly agree with “I think vaccines are safe” as the measure:
Canada: 75%, Yes FDF (March 3)
US: 66%, No FDF
Mexico: 60%, Yes FDF (Jan 22)
UK: 50%, Yes FDF (Jan 4)
Germany: 50%, Yes FDF (March 5)
Obviously a really small sample and I am being loose with the data, but it does not support this hypothesis, with no obvious correspondence between vaccine-confidence and when FDF started. I chose the countries in question off the top of my head.
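As a rough sanity check of the comparison above, here is a minimal Python sketch. It only restates the five eyeballed values from the list; the function name and data layout are my own, and nothing here adds new data or any statistical weight.

```python
# Eyeballed share who "strongly agree" that vaccines are safe, plus the
# reported first-doses-first (FDF) start date, copied from the list above.
# None means no FDF policy was found.
countries = {
    "Canada": (75, "March 3"),
    "US": (66, None),
    "Mexico": (60, "Jan 22"),
    "UK": (50, "Jan 4"),
    "Germany": (50, "March 5"),
}

def fdf_countries_less_confident_than(confidence):
    """Names of countries that adopted FDF despite lower vaccine confidence."""
    return sorted(
        name
        for name, (conf, start) in countries.items()
        if start is not None and conf < confidence
    )

# The hypothesis predicts that a non-FDF country should be less confident
# than the FDF countries. Here the opposite holds for three of the four.
us_confidence = countries["US"][0]
print(fdf_countries_less_confident_than(us_confidence))
```

Of the four FDF countries in this sample, only Canada has higher confidence than the US, which is the opposite of what the hypothesis would predict.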
Dates and sources were found by searching online; I have not carefully checked them.
https://www.statista.com/statistics/1195560/coronavirus-covid-19-vaccinations-number-germany/ This graph looks like there is about a 3-week lag in 2nd doses.
https://www.thetimes.co.uk/article/germany-follows-uk-by-delaying-second-dose-of-covid-vaccine-mk65kkh9w March 05, Germany starts FDF.
https://abcnews.go.com/Health/wireStory/mexico-russias-sputnik-shortages-limited-2nd-doses-77617433 Mexico does first doses first due to supply issues with the Sputnik vaccine; the first dose can be produced faster. The article does not mention the “save more lives” argument.
https://www.nasdaq.com/articles/mexico-may-delay-second-vaccine-doses-and-allow-private-orders-to-tame-raging-pandemic The tone of this piece seems to suggest FDF out of desperation.
From my understanding of the Canada situation, it may have been motivated by less access to vaccines initially. The US did very well in terms of getting lots of vaccines soon (https://ourworldindata.org/covid-vaccinations) while Canada took about 4 months after the US to really get going. Canada may have been more desperate to prevent Covid (or have their numbers stop lagging the US), and thus been less risk-averse.
This argument does not work for the UK, as they have been ahead of the US the whole time.
https://www.msn.com/en-ca/news/canada/vaccine-panel-says-canada-can-delay-second-dose-of-covid-19-vaccine-if-shortage/ar-BB1cIJaG This article cites the decision being partly justified by limited supplies and how bad things were.
I like how this proposal makes explicit the player strategies, and how they are incorporated into the calculation. I also think that the edge case where the agent’s actions have no effect on the result
I think that this proposal making alignment symmetric might be undesirable. Taking the prisoner’s dilemma as an example, if s = always cooperate and r = always defect, then I would say s is perfectly aligned with r, and r is not at all aligned with s.
The result of 0 alignment for the Nash equilibrium of PD seems correct.
I think this should be the alignment matrix for pure-strategy, single-shot PD:
Here the first of each ordered pair represents A’s alignment with B. (assuming we use the [0,1] interval)
I think in this case the alignments are simple, because A can choose to either maximize or to minimize B’s utility.
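One way to make this reading concrete is the sketch below. It assumes standard PD payoffs (T=5, R=3, P=1, S=0, which are my choice and not from the discussion) and scores a player's alignment as 1 when their move maximizes the other player's payoff given the other's move, and 0 when it minimizes it. The names `PAYOFF` and `alignment` are hypothetical.

```python
# Hypothetical standard PD payoffs for the player making the first move
# in each key: T=5 > R=3 > P=1 > S=0.
PAYOFF = {("C", "C"): 3, ("C", "D"): 0, ("D", "C"): 5, ("D", "D"): 1}
ACTIONS = ("C", "D")

def alignment(actor_move, other_move):
    """Where actor_move puts the other player's payoff, rescaled to [0, 1]."""
    # The other player's payoff for each move the actor could have made.
    outcomes = {a: PAYOFF[(other_move, a)] for a in ACTIONS}
    lo, hi = min(outcomes.values()), max(outcomes.values())
    return (outcomes[actor_move] - lo) / (hi - lo)

# Ordered pairs: (A's alignment with B, B's alignment with A).
matrix = {
    (a, b): (alignment(a, b), alignment(b, a))
    for a in ACTIONS
    for b in ACTIONS
}

for moves, pair in matrix.items():
    print(moves, pair)
```

Under these assumptions, cooperating always scores 1 and defecting always scores 0. That makes the measure asymmetric when one player cooperates and the other defects, and gives (0, 0) at the defect/defect Nash equilibrium.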
I have put the preferred state for each player in bold. I think by your rule this works out to 50% aligned. However, the Nash equilibrium is both players choosing the 1⁄1 result, which seems perfectly aligned (intuitively).
In this game, all preferred states are shared, yet there is a Nash equilibrium where each player plays the move that can get them 1 point 2⁄3 of the time, and the other move 1⁄3 of the time. I think it would be incorrect to call this 100% aligned.
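The 2⁄3 and 1⁄3 mix can be checked with a short sketch. The board is an assumption on my part: a coordination game where both players get 1 if both pick move X, 2 if both pick move Y, and 0 otherwise. That board is consistent with the description above, but it is not necessarily the exact one intended.

```python
from fractions import Fraction

# Assumed board: both players receive the same payoff in every cell.
PAYOFF = {("X", "X"): 1, ("Y", "Y"): 2, ("X", "Y"): 0, ("Y", "X"): 0}

def expected_payoff(my_move, p_other_plays_x):
    """Expected payoff of my_move when the other player picks X with this probability."""
    return (
        p_other_plays_x * PAYOFF[(my_move, "X")]
        + (1 - p_other_plays_x) * PAYOFF[(my_move, "Y")]
    )

# The other player picks X (the 1-point move) 2/3 of the time.
p = Fraction(2, 3)
ev_x = expected_payoff("X", p)
ev_y = expected_payoff("Y", p)

# Equal expected payoffs mean neither move is a profitable deviation,
# so mixing 2/3 and 1/3 is a best response; the game is symmetric, so the
# same holds for the other player.
print(ev_x, ev_y)
```

Both moves earn 2⁄3 in expectation, so the mix is an equilibrium, yet each player does worse than in either of the shared preferred outcomes.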
(These examples were not obvious to me, and tracking them down helped me appreciate the question more. Thank you.)
Another point you could fix using intuition would be complete disinterest. It makes sense to put it at 0 on the [-1, 1] interval.
Assuming rational utility maximizers, a board that results in a disinterested agent would be:
Then each agent cannot influence the rewards of the other, so it makes sense to say that they are not aligned.
More generally, if arbitrary changes to one player’s payoffs have no effect on the behaviour of the other player, then the other player is disinterested.
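A minimal sketch of the "cannot influence the rewards of the other" condition follows. The board here is hypothetical, chosen so that each player's payoff depends only on their own move; it is not necessarily the board intended above. The helper names are also my own.

```python
from itertools import product

MOVES = ("X", "Y")

# Hypothetical board: (A's payoff, B's payoff), keyed by (A's move, B's move).
# A's payoff depends only on A's move, and B's payoff only on B's move.
BOARD = {
    ("X", "X"): (1, 1),
    ("X", "Y"): (1, 0),
    ("Y", "X"): (0, 1),
    ("Y", "Y"): (0, 0),
}

def a_can_influence_b(board):
    """True if changing A's move ever changes B's payoff with B's move held fixed."""
    return any(
        board[(a1, b)][1] != board[(a2, b)][1]
        for a1, a2, b in product(MOVES, repeat=3)
    )

def b_can_influence_a(board):
    """True if changing B's move ever changes A's payoff with A's move held fixed."""
    return any(
        board[(a, b1)][0] != board[(a, b2)][0]
        for a, b1, b2 in product(MOVES, repeat=3)
    )

print(a_can_influence_b(BOARD), b_can_influence_a(BOARD))
```

On this board both checks come out false, which matches the intuition that the two agents are neither aligned nor opposed.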