My hypothesis—Apologising is low status.
Possibly this is no longer the case (apologising first is often seen as a sign of maturity) but I can certainly see this being the case in the ancestral environment.
This would match my experience that apologising feels awful even when it is entirely my fault.
I guess if the other person has already apologised then me also apologising just puts our status back roughly where we started, which is why going second feels much easier.
A previous calculation on LW gave 2.4 x 10^24 for AlphaStar (using values from the original AlphaStar blog post) which suggested that the trend was roughly on track.
The differences between the 2 calculations are (your values first):
Agents: 12 vs 600
Days: 44 vs 14
TPUs: 32 vs 16
Utilisation: 33% vs 50% (I think this is just estimated in the other calculation)
Do you have a reference for the values you use?
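To get a feel for how far apart the two sets of inputs are, here is a rough sketch. It assumes (my assumption, not something either calculation states) that total compute scales as agents × days × TPUs × utilisation with the same peak throughput per TPU, so the per-TPU figure cancels out of the ratio. The function name is just illustrative.

```python
# Rough comparison of the two AlphaStar compute estimates.
# Assumption: compute is proportional to agents * days * TPUs * utilisation,
# with the same per-TPU peak throughput in both cases (so it cancels).

def relative_compute(agents, days, tpus, utilisation):
    """Compute in arbitrary units (TPU-days scaled by utilisation)."""
    return agents * days * tpus * utilisation

yours = relative_compute(agents=12, days=44, tpus=32, utilisation=0.33)
other = relative_compute(agents=600, days=14, tpus=16, utilisation=0.50)

ratio = other / yours
print(f"other / yours = {ratio:.1f}x")
```

Under that assumption the other calculation comes out roughly 12x higher, almost entirely because of the agent count (600 vs 12).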
Or, alternatively, did Oxford really find a pharmaceutical company so incompetent that they did this by mistake, on top of giving an entire trial segment the wrong dose of vaccine the first time around? These are some rather epic screwups.
My experience working for a large company makes me not particularly surprised by this and I would give a decent amount of probability to this being an accident. I don’t know enough about the specific procedures to be hugely confident but it does seem most likely to me.
If we’re fairly confident that the wrong dose thing was an accident—I can’t think of any reason to do this deliberately and then try to cover it up—then AstraZeneca obviously have the potential to make big mistakes.
One scenario would be that the person requesting / approving the press release is not the same as the person running the project but rather their boss or their boss’s boss or even someone in another department. The press release approver is less involved in the minutiae and has remembered the 79% figure, and maybe even goes so far as to check their e-mails to confirm that this is the correct figure (or checks with someone else who checks their e-mails). Probably none of these people were in the meeting with the safety board.
I have had this experience myself on many occasions where my superiors have given information to customers that is outdated just from them not being as up to date or forgetting the latest results. Obviously I’d like to think something like this would have more care taken about it but the dosing debacle is suggestive that checking things isn’t AstraZeneca’s strong suit.
That combined with the 0% chance of this not being noticed suggests to me that this wasn’t on purpose.
There is a system for describing human facial expressions—Facial Action Coding System.
This has also been expanded for some animals (chimps, macaques, gibbons, orangutans, dogs, cats, horses). Alas, no dolphins.
I wondered whether a decent amount of the cost increase was in changing from a hatchback to a sedan but I see that this is only $1,000 to go from the Mirage hatchback to sedan. And the Mirage sedan is the same size as a ’90s Ford Escort sedan/station wagon so size doesn’t explain it either.
Yeah, I didn’t actually answer q18 either (possibly knite maybe used my list as a basis?) for exactly that reason. Scott just put me in as the same as him for that question for the purposes of making an apples-to-apples comparison which seemed fine—no idea what I would have put if I had answered!
I’m kicking myself on #16 - I don’t know enough about epidemiology to make such a strong guess.
Yeah, I did a similar thing on #38 where I was similarly overconfident on an economy question which I don’t know nearly enough about.
On #16 itself I was lower than I should have been because I was using “virus” as a reference class rather than “respiratory virus” which was an obvious mistake looking back at it.
It looks like you’re using the correct formula but maybe with a mistake of what the “p” in the formula means so that your scores on questions where the result was “false” are incorrect.
I think you maybe used ln(probability put on “true”)-ln(.5) and then multiplied the result by −1 if the actual answer was false?
The formulation Scott used was ln(probability put on the correct answer)-ln(.5)
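A small sketch of the difference between the two formulations, in case it helps. The 0.8 is a made-up probability purely for illustration (not anyone's actual answer), and the function names are mine.

```python
import math

def log_score(p_true, outcome):
    """ln(probability put on the correct answer) - ln(0.5)."""
    p_correct = p_true if outcome else 1.0 - p_true
    return math.log(p_correct) - math.log(0.5)

def suspected_score(p_true, outcome):
    """The variant guessed at above: always score the probability put on
    "true", then multiply by -1 if the actual answer was false."""
    s = math.log(p_true) - math.log(0.5)
    return s if outcome else -s

p = 0.8  # hypothetical probability put on "true"
for outcome in (True, False):
    print(outcome,
          round(log_score(p, outcome), 3),
          round(suspected_score(p, outcome), 3))
```

The two agree when the answer is "true" and only diverge when it is "false", where the sign-flipped version understates the penalty for a confident miss.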
So for q3 for example the calculation shouldn’t be
but should be
One for older / more interested kids—the Monty Hall problem.
I remember my uncle spending a long time going through this with me and having to actually run the scenario a few times for me to believe he was right!
Welcome to the predictions fun!
I’m impressed with how little you put on 14&15, those were particularly good predictions IMO.
I think there might be an error on your calculation sheet—for instance your score for 3 should be the same as your score for 5?
Looking at the study it doesn’t look like the participants in the trial were randomised—rather if you wanted to use Taffix you could.
If I’m right I’m not sure what to make of it; you could have selection bias either way. Either more conscientious/concerned people took it, or people with jobs where they had higher exposure levels took it. I would guess the former effect would be larger but I’m not sure.
Yes, I agree Russia was unlikely to be above US for population reasons, I mentioned them more as an example of how bad under-reporting can be—I can’t think of a way other than Covid to get 147k unaccounted for excess deaths but I could be missing something. I had concerns about this in all 3 of China, India and Brazil (although I guess there’s the chance that we wouldn’t get (accurate) excess deaths numbers anyway). 85% for 6 seems right but only dropping 5% for 17 seems low.
A commenter on Scott’s post has made a case for India deaths being higher than US (enough to convince Scott it seems).
p(17|16) = p(17) / p(16) = 0.2 / 0.7 ~ 0.29 (as p(17|¬16) = 0)
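Spelling out the step behind that line: by the law of total probability, p(17) = p(17|16)·p(16) + p(17|¬16)·p(¬16), and the second term drops out when p(17|¬16) = 0. A minimal sketch (the function name is just for illustration):

```python
def p17_given_16(p17, p16, p17_given_not16=0.0):
    """Solve p17 = p(17|16)*p16 + p(17|not 16)*(1 - p16) for p(17|16)."""
    return (p17 - p17_given_not16 * (1.0 - p16)) / p16

print(round(p17_given_16(p17=0.2, p16=0.7), 2))
```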
It’s possible / likely that I’m still missing how difficult it is to win a parlay but:
Given Covid is seen as seasonal by the end of the year, there was very likely some wave in Autumn; the main question is whether it meets the conditions set out in 17.
At the time of prediction it seemed almost certain that we would get below the thresholds within the next month or two.
I expected (but wasn’t certain) that a second wave would take us back above one of those thresholds.
There remains the question of having a wave in the middle (in which case the Autumn wave is not the second wave). This was somewhere that my model was expecting a profile in the US more like what happened in the UK/Europe, where cases/deaths were at a very low level for most of the Summer. This is a common thread in a few of my other predictions about US numbers: I generally underpredicted slightly but noticeably, and this was a significant cause of that. So yeah, definitely an oversight from me in that regard.
I was going to write up my thoughts on this but it would be easier to just comment here.
I agree with your assessments for almost all of these. I was most impressed by your understanding of the politics in Q9 & 11 (China and Hydroxychloroquine) and by your predicting the lack of consensus for Q14 & 15.
A couple where I have a question:
1. On 6/7 (US highest toll official & unofficial) I had a bit more probability on Brazil (similar to India, more than China), given its large population (2/3rds of the US) and the approach of the government.
Regarding official vs unofficial, you only mention deliberate lying but I had more expectation of insufficient / bad testing hiding true amounts than lying. According to WSJ Russia’s excess deaths are 4.8x higher than their official deaths (compared to 1.7x for US). This isn’t enough to overtake the US but I think this gives an idea of the scale of the potential problem. Mexico’s excess deaths are higher than Brazil’s despite having 35% fewer official cases. (India isn’t included in those numbers—excess deaths stats aren’t available I think).
Does that change your mind as to what a good prediction would have been?
2. On q17 (second wave) your prediction for p(17|16) is ~29%. Given that we are in a world where there is a general consensus that summer made things less bad, 29% seems low for a second wave even given the difficult operationalisation? My corresponding number was 50% which still seems better to me (although I messed up q16 so we actually predicted the same for 17 itself). In terms of which way it resolves, I think just numbers of deaths resolves this as clearly true (assuming by Autumn we mean 22 Sep – 21 Dec), both in terms of official result and intent:
Was there a second wave in Autumn? Yes, in late Autumn running into early Winter.
The problem is the notice given, which results in the low correlation you mention. (By audit I don’t really mean financial audits as I don’t have experience of those; I’m more thinking of quality audits.)
I find it interesting that company audits (that I’ve experienced anyway) suffer from the same problem as Ofsted inspections.
It is perhaps worth noting that Ofsted inspections are nowadays done with a day’s advance warning and can be done with no warning.
The question of how much more infectious B.1.1.7 is is pretty useless without also referencing a generation time estimate.
Generally true, but in using contact tracing data the English analysis is answering the “how much more infectious” question directly rather than relying on inferring from relative growth rates and estimated generation times.
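To illustrate why the generation time matters when you do have to infer from growth rates: under the simplifying assumption of a fixed generation interval T, R ≈ exp(r·T), so the ratio of reproduction numbers for two strains is exp(Δr·T). The growth-rate difference and the two generation times below are made-up numbers for illustration only, not estimates from any of the analyses discussed.

```python
import math

def relative_infectiousness(delta_r_per_day, generation_time_days):
    """Ratio R_new / R_old, assuming R = exp(r * T) with a fixed
    generation interval T (a simplification)."""
    return math.exp(delta_r_per_day * generation_time_days)

delta_r = 0.07  # hypothetical extra daily growth rate of the new strain
for t in (4.0, 6.5):  # hypothetical generation times in days
    ratio = relative_infectiousness(delta_r, t)
    print(f"T = {t} days -> {100 * (ratio - 1):.0f}% more infectious")
```

With these illustrative inputs the same growth advantage reads as roughly 32% or roughly 58% more infectious depending on which generation time you pick, which is the gap that contact tracing data sidesteps.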
The 37% error does revise my estimate a bit for how confident I should be that it is <50% (although even correcting that it is probably still under 50% according to Zvi) but I still expect it to end up that side of the equation. If I was answering that survey now I’d be at 20% or so.
If, however, you previously had accepted that the English strain was more infectious, and the question was how much more infectious, then news of the answer could be good or bad. In this case, it’s good. This is an estimated 37% increase in infectiousness.
Interesting to look back on the English strain prediction from December. It looks likely now that this will resolve in the negative (37% agrees with contact tracing results from England (30-45%) and Netherlands data (40%)).
This is interesting.
Trying to think when logarithmic thinking makes sense and why humans might often think like that:
If I am in control of all (or almost all) of my risks then a pandemic where I am only taking a 0 person risk is very different from a pandemic where I am taking a 99 person risk. So moving from 0 to 1 in the presumably-very-deadly-pandemic where I am being super-cautious is a very bad thing. Moving from 99 to 100 in the probably-not-too-bad-pandemic where I’m already meeting up with 99 people is probably not too bad.
So thinking logarithmically makes sense if my base level of risk is strongly correlated to the deadliness of the pandemic. The more sensible route is to skip the step of looking at what risk I’m taking to give me evidence of how bad things are and just look directly at how bad things are.
In the 2 examples you give there are external reasons for additional base risk and these are not (strongly) correlated with the deadliness of the pandemic.
Trying to quantify these effects:
1. Salaries can’t add much, especially if you’re looking at mass producing. If you’re creating 500 vaccines then maybe it takes a couple of hours? Say $20/hour (looking at local job listings for this kind of role) we get 8c/dose on salary. As you scale this is only going to go down.
2. It seems like vaccine trials can be done for a few hundred million although there is a big variation and I’m not completely sure whether the numbers given there include some manufacturing build up. If a large pharma company is going to be making lots of vaccine it seems like they should be able to achieve that for less than $1/dose.
3a. Taxes may add a decent few percent but can’t be a main driver of cost.
3b. Shipping costs for refrigerated goods are maybe 5c per 1000 miles per kg. That data is from a while back (1988!) and costs might be a bit higher for colder temperatures but I can’t see this being a large fraction of the cost.
4a. For liability I note that at least AstraZeneca have struck deals in most countries to be exempt from such liabilities. It seems that in the US all COVID vaccines will benefit from this.
4b. Some companies (at least AstraZeneca and Johnson & Johnson) have said that they will be selling their COVID vaccines at cost. Even lacking this, I wouldn’t expect corporate profits to be huge, even just from a PR point of view.
5. Risk of failed vaccine trials. If you only expect to have a 1 in 3 chance of a successful stage 3 trial then the $1/dose from 2 becomes $3/dose to expect to break even. I’m not sure whether this risk is covered by governments; I think it was to some extent but am not confident.
Given Dentin’s comment that the material cost is something like 10c/dose (which makes sense given how little it cost to double John’s peptides order), I think most of the cost looks like it is in the trials and the risk of failure thereof, but this isn’t enough to explain why companies aren’t doing this. It’s probably too late now anyway, as vaccines already approved should have the pandemic under control before any new trials would be complete.
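Putting the rough numbers above into one place, as a back-of-envelope sketch only. It uses the $1/dose upper bound for trials from point 2 and the 1 in 3 success chance from point 5, and it leaves out taxes and shipping because the inputs for those (tax rate, dose mass, distance) aren't pinned down above. Variable names are mine.

```python
# Back-of-envelope cost per dose from the estimates in this comment.
salary_per_dose = (2 * 20.0) / 500   # ~2 hours at $20/hour for 500 doses
material_per_dose = 0.10             # Dentin's rough materials figure
trial_per_dose = 1.00                # upper bound from point 2
p_trial_success = 1 / 3              # illustrative success chance from point 5

# Expected trial cost per dose that actually gets sold.
risk_adjusted_trial = trial_per_dose / p_trial_success

total = salary_per_dose + material_per_dose + risk_adjusted_trial
print(f"salary   ${salary_per_dose:.2f}")
print(f"material ${material_per_dose:.2f}")
print(f"trials   ${risk_adjusted_trial:.2f} (risk-adjusted)")
print(f"total    ${total:.2f} per dose, excluding taxes and shipping")
```

On these inputs the risk-adjusted trial cost is well over 90% of the total, which is the sense in which most of the cost sits in the trials.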