Pirates have a bimodal distribution (around 20% and 40% damage) and only the 40% part of the distribution seems to have declined. So, this looks like two different populations and theoretically, the 20% pirates could be the strong, smart pirates who win a lot and back off early if they won’t get an easy win, while the 40% pirates could be weak, stupid pirates who go all out every time.
Still all totally speculative of course.
I haven’t looked much into dependence on time and direction yet, apart from noticing that the pirates decline in relative frequency. Separately, I would like some clarification about whether the 100gp budget is for a single set of interventions we’ll use on all trips, or is spent each time on a (potentially varying) arrangement. (Edit: I see abstractapplic already responded to such a question from Measure: it’s a single set obtained up front and not changed.)
My current thoughts:
From looking at the probability distributions, I mostly agree with gjm; my current recommendations are the same as GuySrinivasan’s and Measure’s.
The Demon Whale distribution scares the crap out of me, and I probably panic-buy all 20 oars allowed. I mostly agree that the distribution looks like it could be peaking near 100% damage, but I am not at all confident of this and think it could be consistent with something growing a lot bigger beyond the cutoff. I expect 250-1000 of the destroyed vessels to have been destroyed by demon whales. While it looks like the threat would be under control before needing all 20 oars, I am uncertain enough to value oars pretty highly. Budget spent = 20gp.
Merpeople have a very weird-looking distribution. It doesn’t seem to be tailing off at the end, and so (like gjm) I am very uncertain about what happens after the 100% cutoff. I think (like GuySrinivasan) there’s a possibility that it is a multimodal distribution with another peak beyond 100% (not saying bimodal, since there even looks like there might be a small peak around 25% distorting the main peak around 50%, though this could very easily be random). I figure merpeople are responsible for around 200 (assuming no extra peak) or potentially vastly more (assuming an extra peak) of the losses. Merpeople are potentially another solid choice for mitigation imo, unless removing them from the encounter table puts demon whales in as the substitute. Budget spent = 45+20=65gp.
I do not agree with gjm that only about 1% of crabmonster encounters are terminal. The distribution seems to be tailing off very slowly, visually more or less consistent with a triangle-shaped distribution. A simple linear extrapolation would suggest a few percent, which I would take as a lower bound. But the tail might not be linear; if it tapers off more and more slowly, the true figure could be much, much higher. For all I know (apart from the finite number of sinkings), it might not even sum to a finite value. On the other hand, we only really care about crabmonster attacks that do less than 200% damage, since the only relevant intervention reduces damage by 50%. I estimate that between about 50 and about 250 of the destroyed vessels were destroyed by the relevant part of the crabmonster damage distribution, with potentially unlimited numbers destroyed by crabmonsters outside that range. Despite the expected maximum of about 250 mitigatable losses, I consider arming carpenters a pretty solid choice for 20gp. Budget spent = 20+65=85gp.
Nessie looks like a pretty straightforward distribution where I assume about 10-20 losses are from the bit of the distribution we can’t see. A single cannon (which is all that we can afford after the other purchases) should suffice. Budget spent = 10+85=95gp.
Conclusion: 20 oars (20gp) + pay off merpeople (45gp) + arm carpenters (20gp) + one cannon (10gp), total 95 gp spent. This is the same plan previously recommended by GuySrinivasan and Measure.
Nothing else looks like it can kill us, unless e.g. some bimodal distribution has one of its humps located entirely within the >=100% zone.
However, stepping out of the pure data analysis and into reasoning about the fantasy world, it seems strange that pirates would bother to attack us if they only ever do 64% damage. They are intelligent, after all. Maybe they run away if in a hard fight, not sticking around to do more damage than that, and take over the ship entirely if they win? I might consider a second cannon (which also provides extra insurance in case Nessie’s distribution extends further than expected). Dropping 3 oars would provide the funds and would probably still be enough for the demon whale.
(and...also anticipated by Measure on the pirate theory).
They say that now, but perhaps they would change their mind in a hypothetical future where they actually regularly interacted with ems.
I think the idea is that the different Earths have the same landmasses, and in particular the landmass corresponding to our Japan has in Dath Ilan an exile-location function that is vaguely analogous to Australia in the past in our timeline.
It’s possible that natural selection has historically kept the quantity of transposons down to small levels relative to the amount that one gains in non-gonad cells during aging. While this may change now that selection is relaxed, if the transposon suppression in gonads is good enough, it may take a long time. (And selection may not really be relaxed in our case, given our tendency toward late reproduction.)
It’s a stochastic process, not a clock. One person gets an extra transposon copy at location A, another gets one at location B, sexual reproduction drops both 1⁄4 of the time.
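A minimal sketch of that 1⁄4 figure (a hypothetical toy model of my own, not from the original comment): each parent is heterozygous for one new insertion at a different locus, and a child inherits each insertion independently with probability 1⁄2, so it carries neither about a quarter of the time.

```python
import random

def fraction_with_neither(trials=100_000, seed=0):
    """Toy model: parent 1 carries an extra transposon copy at locus A,
    parent 2 at locus B, each heterozygous. A child inherits each
    insertion independently with probability 1/2."""
    rng = random.Random(seed)
    neither = 0
    for _ in range(trials):
        got_a = rng.random() < 0.5  # inherit parent 1's copy at A?
        got_b = rng.random() < 0.5  # inherit parent 2's copy at B?
        if not got_a and not got_b:
            neither += 1
    return neither / trials  # ~0.25
```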
If the suppression of transposons in the gonads is good enough, the reset could be the same as with any other harmful mutation—shuffling by sexual reproduction and natural selection.
Which may suggest a reason why sexual reproduction exists in the first place.
We see evidence reported that climate change may increase the likelihood of extreme weather events, both hot and cold, in the coming years.
Your source does not seem to support this claim, and as far as I am aware (largely, admittedly, from skeptic sources, but I haven’t seen credible mainstream sources contradicting this), global warming is expected to reduce, not increase, overall temperature variation (high temperatures increase, but low temperatures moderate by a larger amount).
With (1) (total number of 1′s) excluded, but all of (2), (3), (4) included:
Confidence level: 61.8, Score: 20.2
With (2) (total number of runs) excluded, but all of (1), (3), (4) included:
Confidence level: 59.4, Score: 13.0
With ONLY (1) (total number of 1′s) included:
Confidence level: 52.0, Score: −1.8
With ONLY (2) (total number of runs) included:
Confidence level: 57.9, Score: 18.4
So really it was the total number of runs doing the vast majority of the work. All calculations here do include setting the probability for string 106 to zero, both for the confidence level and final score.
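For concreteness, here is a minimal sketch (my own illustrative code, assuming the strings are lists of 0/1 bits) of how the run count relates to the XOR derivative defined below: the number of runs is one more than the number of adjacent unequal pairs, i.e. one plus the sum of the first XOR derivative.

```python
def xor_derivative(bits):
    """First XOR derivative: bit k is 1 iff bits k and k+1 of the
    original string differ."""
    return [a ^ b for a, b in zip(bits, bits[1:])]

def num_runs(bits):
    """Number of runs = 1 + number of adjacent unequal pairs."""
    return 1 + sum(xor_derivative(bits))
```

For example, `num_runs([0, 0, 1, 1, 1, 0])` gives 3, matching the three runs 00, 111, 0; the second XOR derivative used in test (3) would be `xor_derivative(xor_derivative(bits))`.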
I think it depends a lot more on the number of strings you get wrong than on the total number of strings, so I think GuySrinivasan has a good point that deliberate overconfidence would be viable if the dataset were easy. I was thinking the same thing at the start, but gave it up when it became clear my heuristics weren’t giving enough information.
My own theory though was that most overconfidence wasn’t deliberate but simply from people not thinking through how much information they were getting from apparent non-randomness (i.e. the way I compared my results to what would be expected by chance).
Whoops, missed this post at the time.
In response to:
(4) average XOR-correlation between bits and the previous 4 bits (not sure what this means -Eric)
This is simply XOR-ing each bit (starting with the 5th one) with the previous 4 and adding it all up. This test was to look for a possible tendency (or the opposite) to end streaks at medium range (other tests were either short range or looked at the whole string). I didn’t throw in more tests using any other numbers than 4 since using different tests with any significant correlation on random input would lead to overconfidence unless I did something fancy to compensate.
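A minimal sketch of one plausible reading of that test (my own reconstruction, assuming 0/1 bit lists; the exact aggregation used in the original spreadsheet may differ):

```python
def xor_correlation(bits, lag=4):
    """XOR each bit (from position `lag` onward) with each of the
    previous `lag` bits and sum everything. High totals suggest
    streaks tend to end at this range; low totals the opposite."""
    return sum(bits[i] ^ bits[i - k]
               for i in range(lag, len(bits))
               for k in range(1, lag + 1))
```

An all-identical string scores 0, since every XOR term vanishes.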
“XOR derivative” refers to the 149-bit substring where the k-th bit indicates whether the k-th bit and the (k +1)-th bit of the original string were the same or different. So this is measuring the number of runs … (3) average value of second XOR derivative and (4) average XOR-correlation between bits and the previous 4 bits...
I’m curious how much, if any, of simon’s success came from (3) and (4).
Values below. Confidence level refers to the probability of randomness assigned to the values that weren’t in the tails of any of the tests I used.
Confidence level: 63.6, Score: 21.0
With (4) excluded:
Confidence level: 61.4, Score: 19.4
With (3) excluded:
Confidence level: 62.2, Score: 17.0
With both (3) and (4) excluded:
Confidence level: 60.0, Score: 16.1
Score in each case was calculated using the probabilities rounded to the nearest percent (as they were or would have been submitted ultimately). Oddly, in every single case the rounding improved my score (20.95 v. 20.92, 19.36 v. 19.33, 16.96 v. 16.89, and 16.11 v. 16.08).
So, it looks like I would have gone down only to fifth place if I had looked only at the total number of 1′s and the number of runs. I’d put that down to not messing up calibration too badly, but it looks like that would have still put me sixth in terms of post-squeeze scores? (I didn’t calculate the squeeze, just compared my hypothetical raw score with others’ post-squeeze scores.)
True, the typical argument for the great silence implying a late filter is weak, because an early filter is not all that a priori implausible.
However, the OP (Katja Grace) specifically mentioned “anthropic reasoning”.
As she previously pointed out, an early filter makes our present existence much less probable than a late filter does. So, given our current experience, we should weight the probability of a late filter much higher than the prior would be without anthropic considerations.
Individuals may be bad at foresight, but if there’s predictably going to be a good price for 100000 coats in a few months, someone’s likely to supply them, unless of course there’s some anti “price gouging” legislation.
If you didn’t account for selection effects, you may have correctly avoided boosting DEX because you thought it was actively harmful instead of merely useless.
I immediately considered a selection effect, but then I tricked myself into believing it did matter by a method that corrected for the selection effect but was vulnerable to randomness/falsely seeing patterns. Oops. Specifically I found the average dex for successful and failed adventurers for each total non-dex stat value, but had them listed in an inconvenient big column with lots of gaps. I looked at some differences and it seemed that for middle values of non-dex stats, successful adventurers consistently had lower average dex than failed ones, while that reversed for extreme values. When I (now—I didn’t at the time) make a bar chart out of the data it’s a lot more clear that there’s no good evidence for any effect of dex on success:
If you didn’t look for interactions, you may have dodged the WIS<INT penalty just because WIS seemed like a better place to put points than INT.
Yep. Thing is, I *did* look for interactions—with DEX. I had the idea that DEX might be bad due to such interactions, and when I didn’t find anything more or less stopped looking for such interactions.
And I’m pretty sure even the three people who submitted optimal answers on the last post (good job simon, seed, and Ericf) didn’t find them by using the right link function
For sure in my case. I calculated the success/fail ratios for each value of each stat individually (no smoothing), and found the reachable stat combo that maximized the product of those ratios. This method found the importance of reaching 8. I was never confident that this wasn’t random, though.
When I did later start simming guesses, what I simmed would have given smoothed results: a bunch of stat checks with a d20, success if the total number of passed stat checks exceeded a threshold. The actual test would have been pretty far down the list of things I would have checked given infinite time.
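A minimal sketch of the odds-ratio-product search described above (my own reconstruction with hypothetical names and toy data; the actual work was spreadsheet ad-hockery):

```python
from itertools import product as allocations

def best_allocation(records, base, points):
    """records: stat name -> {stat value: (successes, failures)},
    from the per-stat tallies (no smoothing).
    base: starting value for each stat; points: points to spend.
    Scores a stat line by the product of per-stat odds ratios
    (successes / failures), i.e. assuming stats act independently."""
    def odds(stat, value):
        s, f = records[stat][value]
        return s / f  # assumes no stat value with zero failures
    stats = list(base)
    best, best_score = None, float("-inf")
    for alloc in allocations(range(points + 1), repeat=len(stats)):
        if sum(alloc) != points:
            continue  # only spend exactly `points`
        try:
            score = 1.0
            for stat, extra in zip(stats, alloc):
                score *= odds(stat, base[stat] + extra)
        except KeyError:
            continue  # no data at that stat value: skip this combo
        if score > best_score:
            best, best_score = dict(zip(stats, alloc)), score
    return best, best_score
```

The brute-force enumeration is fine at this scale; with six stats and ten points it is only a few thousand combinations.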
In reply to:
Graduate stats likely come from 2d10 drop anyone under 60 total
I think you’re right. The character stats data seems consistent with starting with 10000 candidates, each with 6 stats independently chosen by 2d10, and tossing out everything with a total below 60.
One possible concern with this is the top score being the round number of 100, but I tested it and got only one score above 100 (it was 103), so this seems consistent with the 100 top score being coincidence.
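A minimal sketch of the hypothesized generation process (my own code; the 10000-candidate and 60-point figures are the ones conjectured above):

```python
import random

def generate_graduates(n_candidates=10_000, cutoff=60, seed=0):
    """Roll six stats of 2d10 each per candidate; discard any
    candidate whose six stats total below `cutoff`."""
    rng = random.Random(seed)
    graduates = []
    for _ in range(n_candidates):
        stats = [rng.randint(1, 10) + rng.randint(1, 10)
                 for _ in range(6)]
        if sum(stats) >= cutoff:
            graduates.append(stats)
    return graduates
```

Each stat ranges 2-20 (totals 12-120, mean 66), so the 60 cutoff removes very roughly the bottom quarter of candidates; comparing the survivors’ stat histograms against the real data is the consistency check described above.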
You do indeed miss out on some gains from a jump—WIS gets you a decline in success at +1 but a big gain at +3. (Edit: actually my method uses odds ratio (successes divided by failures) not probabilities (successes divided by total). So, may not be equivalent to detecting jump gains for your method. Also my method tries to maximize multiplicative gain, while your words “greatest positive” suggest you maximize additive gain.)
STR: 8 (increased by 2)
CON: 15 (increased by 1)
DEX: 13 (no change)
INT: 13 (no change)
WIS: 15 (increased by 3)
CHA: 8 (increased by 4)
Calculation method: spreadsheet ad-hockery resulting in, for each stat, a table of:
per-point gain = ((success odds ratio at current stat + n)/(success odds ratio at current stat))^(1/n); find the n and table giving the highest per-point gain, generate a new table for that stat from the new starting value, and repeat.
str +2 points to 8, con +1 point to 15, cha +4 points to 8, wis +3 points to 15, based on assuming that a) different stats have multiplicative effect (no other stat interactions) and b) that the effect of any stat is accurately represented by looking at the overall data in terms of just that stat and that c) the true distribution is exactly the data distribution with no random variation. I have not done anything to verify that these assumptions make sense.
dex looks like it actually has a harmful effect. I don’t know whether the apparent effect is or is not too large to be explained by it helping bad candidates meet the college’s apparent 60-point cutoff.
I would worry in a lot of these cases that there’s some risk that your model isn’t taking account of, so you could be “picking up pennies in front of a steamroller”. Not in all cases though − 70-200% isn’t pennies.
But things like supposedly equivalent assets that used to be closely priced now diverging seem highly suspicious.
You need to have a private key to sign, otherwise it would be useless as a “signature”.
For signing (in the non-ring case), you encrypt with your private key and they decrypt with your public key, whereas in normal encryption (again, non-ring) you encrypt with their public key and they decrypt with their private key.
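A toy textbook-RSA illustration of that symmetry (tiny numbers, no padding or hashing, so insecure and for intuition only; real signature schemes hash and pad the message, and the “encrypt with the private key” framing is an RSA-specific intuition, not a general description of signatures):

```python
p, q = 61, 53                       # toy primes
n = p * q                           # modulus, part of both keys
e = 17                              # public exponent
d = pow(e, -1, (p - 1) * (q - 1))   # private exponent (Python 3.8+)

m = 42                              # message, must be < n

# Signing: transform with the PRIVATE key; anyone verifies with the PUBLIC key.
signature = pow(m, d, n)
assert pow(signature, e, n) == m

# Encryption: transform with the PUBLIC key; only the PRIVATE key decrypts.
ciphertext = pow(m, e, n)
assert pow(ciphertext, d, n) == m
```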
It’s not necessarily structural inefficiency at PredictIt specifically that is causing most of this, but to a large extent bettors pricing in the odds of Trump still winning the election. Apparently Betfair’s odds of Trump winning are still around 10% (per a link I found by searching for articles on betting odds from the last day; I wasn’t able to find the odds at Betfair itself).