D&D.Sci: The Choosing Ones
This is a D&D.Sci scenario: a puzzle where players are given a dataset to analyze and an objective to pursue using information from that dataset.
Thank you to Juan Vasquez for playtesting.
Intended Difficulty: ~3.5/5
The fairy in your bedroom explains that she is a champion of Fate, tasked with whisking mortals into mysterious realms of wonderment and (mild) peril; there, they forge friendships with mythical creatures, do battle with ancient evils, and return to their mundane lives having gained the confidence that comes with having saved a world[1]. But there’s an unusually large, important world experiencing an unusually non-mild amount of peril (she’d even go so far as to call it moderate peril!), and in this circumstance, only the best will suffice. For this reason, she fervently and humbly entreats you . . .
. . . to use your Data Science skills to help her decide which potential Chosen One she should Choose.
(Upon hearing this last part, your mounting concern immediately dissipates. Adventures in a mysterious realm of wonderment would be hard to schedule around, but you’re always up for spending an evening on a Data Science problem, especially if it might mean a chance to collect favors from the fae.)
Seeing your face brighten, the fairy hands you a table of her colleagues’[2] predictions of percentage success rates[3] for past heroes, alongside said heroes’ actual outcomes, and a table of their predictions for the prospective Chosen. Whom will you have her send?
Bonus Objectives
The fairy[4] figures that, if she’s breaking the taboos prohibiting seeking advice from a mortal, she might as well not do things by halves. Her further questions:
Which of her colleagues makes the best / most useful predictions, when considered in isolation?
Which (if any) of her colleagues could be considered redundant?
If your choice of Chosen were unavailable, which of the forty candidates would you consider the second- and third-best options?
I’ll post an answer key, along with an explanation of how I generated the dataset, sometime on Monday 26th May. I’m giving you nine days, but the task shouldn’t take more than an evening or two; use Excel, R, Python, or whatever other tools you think are appropriate. Let me know in the comments if you have any questions about the scenario.
If you want to investigate collaboratively and/or call your choices in advance, feel free to do so in the comments; however, please use spoiler blocks or rot13 when sharing inferences/strategies/decisions, so people intending to fly solo can look for clarifications without being spoiled.
[1] . . . except when they fail.
[2] The fairy notes, in passing, that before her workplace hired a full-time Healer her colleagues would occasionally take sick days; the predictions for fairies who couldn’t be there to make them are recorded as zeroes.
[3] You try asking for the information they used to make those predictions, but she just stonewalls you, muttering something about “unquantifiable truths”, “incomprehensible to mortal minds” and “data protection legislation”. She does, however, offer a larger table representing everyone who could have been Chosen and whether they actually were.
[4] You didn’t think to ask her name early on, and saying anything now would just be awkward.
Amy’s score is always 1 or 99, and is completely independent of all other scores, and seems almost uncorrelated with success. She might just be flipping a coin, but she only gives 99 about 1⁄4 of the time. Flipping two coins?
Holly’s score is moderately-well-correlated with all other scores except Amy’s. I suspect her of knowing Amy is flipping a coin, and of just averaging out all the other faeries’ scores to get her own, but I have no proof yet.
Bella, Colleen, Liboulen and Linestra’s scores all heavily correlate with one another. Starting to disentangle them:
Colleen is copying Linestra: she gives a score 1.7 more than Linestra’s, or a 50 if Linestra is sick.
Bella and Liboulen’s scores appear to suggest the following world model:
Each hero has three stats (for lack of anything better I will call them A, B and C, standing for...uh...Attractiveness, Beauty, and Charm).
Each of these stats are integers from 1 to 10.
Bella gives a hero a score of A + B − 1.
Liboulen gives a hero a score of 5A − B + C + 40.7.
Linestra is clearly doing something related as well, but I haven’t figured out what yet. Her scores charted against Liboulen’s are particularly bizarre. And sadly, I will need to figure her out in order to reconstruct A, B and C for each hero.
UPDATE: Linestra is something along the general lines of 4A + 1.2B + 2.5C + 22 + a tiny bit of noise.
FURTHER UPDATE: my desire for neatness has caused me to settle on (3.6*A + 1.2*B + 2.4*C) + 23 plus or minus at most 1.
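To make the inference above concrete, here is a minimal sketch of how A, B and C could be reconstructed by brute force from the three scores. It assumes the Liboulen formula reads 5A − B + C + 40.7, uses the "neat" Linestra coefficients with a tolerance of 1, and treats a Linestra score of 0 as a sick day; the function names are made up for illustration.

```python
from itertools import product

def bella(a, b, c):
    return a + b - 1

def liboulen(a, b, c):
    # Assumption: the B term is subtracted.
    return 5 * a - b + c + 40.7

def linestra(a, b, c):
    # Noise-free centre of Linestra's score (she is off by at most about 1).
    return 3.6 * a + 1.2 * b + 2.4 * c + 23

def colleen(linestra_score):
    # Colleen copies Linestra plus 1.7, or says 50 when Linestra is out sick (recorded as 0).
    return 50.0 if linestra_score == 0 else linestra_score + 1.7

def recover_abc(bella_score, liboulen_score, linestra_score, tol=1.0):
    """Return every (A, B, C) in 1..10 consistent with the three observed scores."""
    matches = []
    for a, b, c in product(range(1, 11), repeat=3):
        if (abs(bella(a, b, c) - bella_score) < 1e-6
                and abs(liboulen(a, b, c) - liboulen_score) < 1e-6
                and abs(linestra(a, b, c) - linestra_score) <= tol):
            matches.append((a, b, c))
    return matches

print(recover_abc(10, 78.7, 69.8))  # [(7, 4, 7)]
```

Bella and Liboulen alone leave some pairs of stat triples indistinguishable (they only pin down A + B and 6A + C), which is why Linestra's score is needed as a tie-breaker.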
Fizz, Ister and Ziqual again all correlate with one another. I haven’t dug into them yet.
Fizz, Ister and Ziqual appear to be driven by three different variables: let’s call them D, E and F (Doubt, Envy and Fear?).
Ister gives 50+D
Ziqual gives (D*E). He then subtracts 1 about half of the time, but never if E==1. (Hopefully also not if D==1, but it’s hard to be certain on that side).
Fizz gives Min(D, E) + 2F + 41.
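The three formulas above can be inverted in the same spirit. This is a sketch under the assumption that they hold exactly and that all three scores are integers; `recover_def` is a hypothetical helper name, and a sick-day zero simply yields no candidates.

```python
def recover_def(ister, ziqual, fizz):
    """Return every (D, E, F) in 1..10 consistent with the three scores."""
    d = round(ister) - 50                      # Ister = 50 + D
    candidates = []
    for e in range(1, 11):
        shortfall = d * e - round(ziqual)      # 0, or 1 if Ziqual subtracted one
        if shortfall not in (0, 1):
            continue
        if shortfall == 1 and e == 1:
            continue                           # he never subtracts when E == 1
        twice_f = round(fizz) - 41 - min(d, e)  # Fizz = min(D, E) + 2F + 41
        if twice_f % 2 == 0 and 1 <= twice_f // 2 <= 10:
            candidates.append((d, e, twice_f // 2))
    return candidates

print(recover_def(60, 99, 65))  # [(10, 10, 7)]
print(recover_def(51, 4, 50))   # [(1, 4, 4), (1, 5, 4)]
```

The second call shows the one case that stays ambiguous: when D is 1, a Ziqual score of 4 could be E = 4 with no subtraction or E = 5 with one, and Fizz cannot break the tie because min(D, E) is 1 either way.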
We now have six variables, which makes me suspect that actually these are meant to be STR/DEX/CON/INT/WIS/CHA in some order. I can’t reconstruct which order, though. (Though if five of them seem valuable and one seems useless I am going to be open to the possibility that this is the same winrate calc as in the original D&D.Sci).
The obvious next step is going to be taking the success/failure data and evaluating it based on these six derived variables. Back soon...
A, B and C (the three stats that Bella/Liboulen/Linestra care about) are all slightly positively correlated with one another. D, E and F (the three stats that Fizz/Ister/Ziqual care about) are again all slightly positively correlated with one another.
However, each of A-C is slightly negatively correlated with each of D-F. This is true in the candidate data too, so it’s not an artefact of how the fairies choose.
My current theory is that e.g. A-C are Physical stats, and D-F are Mental stats (or vice versa), and that these are correlated between potential heroes. This also suggests some faerie politics, with the Physical Stats Caucus and the Mental Stats Caucus pushing for different types of hero.
Most stats seem straightforwardly beneficial to increase. D-F seem slightly more valuable than A-C.
Given that Fizz/Ziqual sound like male names, while Bella/Linestra sound like female names, and our faerie is female, she’s more likely to be in the A-C Caucus than in the D-F caucus: don’t tell her that D-F are more valuable until you figure out her name.
In particular, it looks like A-C have diminishing returns while D-F have increasing returns. Increasing A from 9 to 10 actually might be actively bad. Increasing B/C from 9 to 10 is good, but nowhere near as good as increasing them from 1 to 2. On the other hand, increasing D-F seems to get even better as they get higher (though E in particular looks a bit odd).
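One simple way to look for diminishing or increasing returns is to tabulate the success rate at each level of a stat and then difference adjacent levels. The sketch below assumes each past hero is a dict with reconstructed stats and a 0/1 `success` field; that layout and the function names are illustrative choices, not the dataset's actual format.

```python
from collections import defaultdict

def success_rate_by_level(heroes, stat):
    """Map each observed level of `stat` to the fraction of heroes who succeeded."""
    totals = defaultdict(lambda: [0, 0])       # level -> [wins, count]
    for hero in heroes:
        bucket = totals[hero[stat]]
        bucket[0] += hero["success"]
        bucket[1] += 1
    return {level: wins / count for level, (wins, count) in sorted(totals.items())}

def marginal_gains(rates):
    """Change in success rate between each pair of adjacent observed levels."""
    levels = sorted(rates)
    return {(lo, hi): rates[hi] - rates[lo] for lo, hi in zip(levels, levels[1:])}

# Toy data, invented purely to show the call pattern.
toy = [{"A": 1, "success": 0}, {"A": 1, "success": 1}, {"A": 2, "success": 1}]
print(marginal_gains(success_rate_by_level(toy, "A")))  # {(1, 2): 0.5}
```

Gains that shrink toward the top of the range would match the A-C pattern described above, and gains that grow would match D-F. With few heroes at the extreme levels, the rates there will be noisy.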
Still to do:
Check whether Amy or Holly knows anything that isn’t encapsulated in stats.
Check for interactions between stats: is there a breakpoint on e.g. STR > CON or INT > WIS? We could see the diminishing returns on A-C if they were penalized for being higher than D-F?
No stat pairs exhibit interesting effects.
Holly’s score is given by the sum of all 6 stats, plus 20, plus a number from 1 to 12. Despite my initial hope that this was a seventh stat, it is not: or, at least, it exhibits no correlation with success.
Amy’s score actually does seem to have some small but non-zero predictive power that isn’t related to stats. I’ve included it in my regression, though it doesn’t actually change my top three list. It does, however, make me suspicious. There are two possible explanations for this:
Amy might be observing some trait of heroes that is not one of the six stats and nevertheless predictive of their success.
Amy might be slipping some quiet help to her preferred candidates/sabotaging her non-preferred candidates. Votes of 1 and 99 suggest that she’s trying to have as large an effect as possible on the selection of Chosen, and so she might be doing something else sneaky.
Current answer:
My current top candidate is #11 (stats of 7-4-7-10-10-7). If they should Refuse The Call, my current second place is #19 (5-2-5-10-9-10, also supported by Amy), and my current third place is #7 (10-2-9-7-9-6).
I’ll tweak the regression a bit and see if anything changes, but #11 is very far ahead of the pack, with the highest stat total and a skew towards the D/E/F stats that are more valuable, so I don’t expect them to stop being at the top.
Sadly, these are also the same top three candidates, in the same order, as you get by doing none of this work and just running a linear regression.
:(
CONTAINS FINAL ANSWER
On further examination, it looks like there are bonuses assigned for the minimum of the three stats A-C (I’ve been calling these the ‘physical stats’) and the maximum of the three stats D-F (I’ve been calling these the ‘mental stats’).
This doesn’t dislodge #11 from the top of the list, but it does move up #2 (whose minimum physical stat is 6) and worsen #19 and #7 (whose minimum physical stat is 2).
My final top 3 is #11, then #19, then #2. (If the fairy in question seems disappointed to see #11, it’s probably Amy, and I’ll recommend her #19).
I appreciate your analysis. It was fun to try my best and then check your comments for the real answer, more so than just getting it from the creator.
Could you please explain how you inferred the existence of A B and C? I’d like to know more.
So this boils down to interpreting scatter charts.
Say you plot two normally-distributed numbers against one another. You get something that looks like this:
If instead you plot two d6 rolls against one another, you see this:
with sharp cutoffs because the d6 roll is bounded at 1 below and 6 above, and with a regular grid because the d6 roll is always an integer.
Various relationships between the variables can show up in the scatter chart.
If Y is the sum of two d6 rolls, and X is the first roll, you see this:
You can think of this graph as being made up of various stripes:
The vertical green line is ‘every value the second die can roll, given that the first die rolled a 2’.
The diagonal orange line is ‘every value the first die can roll, given that the second die rolled a 4’.
Suppose that X = twice the first die plus the second die, and Y = twice the second die plus the first die:
Again the points form a grid, and again we can see patterns. Since the green line has 6 points on it and moves [up 2 and right 1] each step, we can see something that takes 6 discrete values and applies 2x its value to Y and 1x its value to X.
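For anyone who prefers to check the grid argument numerically rather than by eye, the sketch below enumerates every pair of d6 rolls for this last example and records the step between neighbouring points when one die is held fixed. The helper name is made up for illustration.

```python
from itertools import product

# X = 2 * first die + second die, Y = 2 * second die + first die
points = {(d1, d2): (2 * d1 + d2, 2 * d2 + d1)
          for d1, d2 in product(range(1, 7), repeat=2)}

def steps_when_varying(which):
    """Set of (dx, dy) steps between neighbouring points as one die goes up by 1."""
    steps = set()
    for fixed in range(1, 7):
        for moving in range(1, 6):
            if which == "second":
                here, there = points[(fixed, moving)], points[(fixed, moving + 1)]
            else:
                here, there = points[(moving, fixed)], points[(moving + 1, fixed)]
            steps.add((there[0] - here[0], there[1] - here[1]))
    return steps

print(steps_when_varying("second"))  # {(1, 2)}: right 1, up 2
print(steps_when_varying("first"))   # {(2, 1)}: right 2, up 1
```

Each family of stripes has a single constant step, and that step is exactly the pair of coefficients the hidden variable contributes to X and Y. Reading the slopes off the real scatter charts is the same exercise with three hidden variables instead of two.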
Now plot Bella’s scores against Liboulen’s:
This is a bit more complicated because there are three variables rather than two. But you can still imagine the same lines:
and you can disentangle the corresponding variables.
Thank you very much! This is very clear!
I was able to deduce them by making a scatter-plot of Colleen vs Liboulen’s predictions. You can see that this plot has the points on a “flattened prism” in 3 directions, and manually count the shifts and see that each of the underlying components has 10 possible values.
Once you have that structure, you can pick out points on the extremes and use their slopes to calculate some of the relevant slopes. Finally, I brought in Bella’s info and used that to work out the remaining stats. (I used chatGPT for some help throwing together some linear regressions, but they needed a good bit of tweaking to be functional, and mostly agreed with the slopes that I had calculated by just looking at the scatterplots.)
OK: so, based on doing a bunch of calibration plots, mutual information plots, and two-way scatter plots to compare candidates, this is what I have.
Candidate 11 is the best choice. 7 and 34 are my second choices, though 19 also looks pretty good.
Holly gives the most information; she’s the best predictor overall, followed by Ziqual. Amy is literally useless. Colleen and Linestra are equivalent. Holly and Ziqual both agree on candidate 11, so I’ll choose them.
Interestingly, some choosers like to rank clusters of individuals at exactly the same value, and it isn’t clear why. None of our current candidates fall into those weird clusters, so maybe it’s historical?
Also, lots of the numbers end in .7; I guess the faeries just love the number 7. I think there are at least three stats going on, and each predictor is seeing some function of the stats, since many of the heatmaps look like a discrete grid.
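A calibration check of the kind mentioned at the start of this comment can be sketched as a table of observed success rate per band of predicted percentage. This is only one possible version, not necessarily what was plotted; the 10-point bin width is an arbitrary choice, and zeros are skipped on the assumption that they are the sick-day placeholders from the scenario.

```python
def calibration_table(predictions, outcomes, bin_width=10):
    """Map each prediction band's lower edge to (observed success %, sample count)."""
    bins = {}
    for p, won in zip(predictions, outcomes):
        if p == 0:                 # recorded zero = fairy was out sick, not a real prediction
            continue
        lo = int(p // bin_width) * bin_width
        wins, count = bins.get(lo, (0, 0))
        bins[lo] = (wins + won, count + 1)
    return {lo: (100 * wins / count, count) for lo, (wins, count) in sorted(bins.items())}

# Toy numbers, invented to show the output shape.
print(calibration_table([55, 58, 0, 91], [1, 0, 1, 1]))
# {50: (50.0, 2), 90: (100.0, 1)}
```

A well-calibrated predictor would show observed percentages close to the band it falls in; a predictor that is informative but on a different scale would show a rate that rises steadily across bands without matching them.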
Did a little data exploration and then tried k nearest neighbors. That seemed good enough to provide answers, though it didn’t provide deep understanding. I’ll pick candidate 11, and for the bonus questions: Holly has the best ratings in isolation, Colleen is a redundant copycat and Amy is noise, and as backups I like 7 and then 19.
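For readers unfamiliar with the approach, a bare-bones k-nearest-neighbours estimate might look like the sketch below. It is not the code used for the picks above: the Euclidean distance, the value of k, and the (predictions, outcome) row layout are all assumptions, and rows containing sick-day zeros would need to be dropped or imputed first since a zero would distort the distances.

```python
import math

def knn_success_estimate(history, candidate, k=25):
    """Average outcome of the k past heroes whose prediction vectors are closest.

    history:   list of (predictions, success) pairs, success being 0 or 1
    candidate: sequence of the same fairies' predictions for one prospective hero
    """
    def distance(p, q):
        return math.sqrt(sum((x - y) ** 2 for x, y in zip(p, q)))

    nearest = sorted(history, key=lambda row: distance(row[0], candidate))[:k]
    return sum(success for _, success in nearest) / len(nearest)

# Toy history with two fairies, invented for illustration.
toy = [((50, 50), 0), ((51, 49), 0), ((90, 90), 1), ((88, 92), 1)]
print(knn_success_estimate(toy, (89, 91), k=2))  # 1.0
```

Because the fairies report on different scales, rescaling each fairy's column before measuring distance would probably matter on the real data.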
I am going for number 11, mainly because other adventurers with predictions similar to 11 did unusually well.
A few miscellaneous observations:
Ister, Ziqual and Fizz seem to have some pretty deterministic structure connecting them.
Ister always predicts an integer between 51 and 60 inclusive.
Ziqual’s prediction is equal to (Ister − 50) * (Integer from 1 to 10) - (one of 0, 1). Multipliers in the 5 to 7 range are most common.
Fizz’s prediction is less than or equal to (Ister’s prediction + 10). Fizz’s prediction is greater than or equal to 44.
Separately, a scatterplot of Liboulen and Colleen’s predictions has a lot of structure: [Scatterplot removed since it seems to show up through the spoiler. Message me if you want to see it.]
Note that each of the 3 “axes” of this “prism” has 10 separate blobs of points. This makes me suspect that Liboulen and Colleen are each a weighted sum of 3 underlying integer variables that each range from 1 to 10. (There would also need to be a small noise term or other factor, since the points do not perfectly fit this pattern.) The noise term seems to only apply to Colleen’s estimates, as Liboulen’s estimates have far fewer distinct values.
Bella seems to have some interactions with these two. Linestra is an almost perfect clone of Colleen, but her estimates are either 1.7 or (occasionally) 1.9 lower.
It feels like it should be easy enough to find the coefficients corresponding to the 3 visible slopes formed by the edges of this figure. Based on some data slicing and eyeballing the graph above, I think the coefficients for L are 5, 1, 1, and the coefficients for C are approximately 3.6, −1.25, and 2.47
I am sure that there is a linear algebra regression to find the exact values, but I haven’t figured it out yet.
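One candidate for that regression is ordinary least squares, which becomes available once tentative integer values for the three underlying variables have been assigned to each hero (for instance by counting grid positions in the scatterplot). The sketch below solves the normal equations in plain Python; the function name is hypothetical and the demonstration coefficients are arbitrary, not estimates from the dataset.

```python
from itertools import product

def least_squares(rows, targets):
    """Fit targets ~ intercept + sum(coef_i * x_i); returns [intercept, coef_1, ...]."""
    design = [[1.0, *row] for row in rows]
    n = len(design[0])
    # Augmented normal equations [X^T X | X^T y].
    aug = [[sum(r[i] * r[j] for r in design) for j in range(n)]
           + [sum(r[i] * y for r, y in zip(design, targets))]
           for i in range(n)]
    # Gauss-Jordan elimination with partial pivoting.
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(n):
            if r != col:
                factor = aug[r][col] / aug[col][col]
                aug[r] = [x - factor * p for x, p in zip(aug[r], aug[col])]
    return [aug[i][n] / aug[i][i] for i in range(n)]

# Synthetic check with made-up coefficients: 10 + 2*x1 - 1*x2 + 0.5*x3.
stats = list(product(range(1, 11), repeat=3))
scores = [10 + 2 * a - b + 0.5 * c for a, b, c in stats]
print([round(v, 6) for v in least_squares(stats, scores)])
```

With noisy scores the fit returns the best linear approximation rather than exact values, so rounding the result toward "neat" coefficients remains a judgment call.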
A summary of some interesting results. I am leaving how I found some of this out for now, for brevity’s sake.
I have managed to extract 6 integer variables that range from 1-10.
3 of them are from the components of (Colleen, Linestra, Liboulen, Bella), the other 3 are from (Fizz, Ister, Ziqual).
Each of them has a very similar histogram, sort of like a truncated normal distribution. A linear regression of them with Holly gives approximately 1 as their coefficient, except for 1 variable (which I am calling X2 for now) which has a coefficient of roughly −1.
All of these underlying variables have a magnitude of correlation with the candidate succeeding, between 0.11 and 0.17 , with X2 being the only negative correlation.
When looking at the Fae council, I noticed:
When Linestra is gone, Colleen predicts exactly 50 each and every time. This suggests that she is plagiarizing Linestra. She only ever gives a rating of 50 when Linestra is missing or predicting 48.3.
When Colleen is gone, the correlation between Linestra’s rating and the candidate being chosen is almost halved. This is consistent with the council using a voting process.
At this point I am throwing everything that I found in a linear regression, because I ran out of time. My pick is:
Candidate 11, with an estimated 0.91 chance of success.
Candidates 19 and 7 would be my next choices, with 0.87 and 0.85 estimated chances of success respectively.
If I had had more time to work on this, I would have liked to look at:
Why do 3 of the “stats” have diminishing returns, while the other 3 have increasing returns?
Are there any temporal trends?
Can I find anything else out from the sick days?
Why does adding the Bella/L/L stats together result in a spike for very low stats? (aphyer seems to have figured this one out.)
How does the voting system of the fae council work?
What is up with Amy’s ratings?
What is up with Ziqual’s off-by-one ratings?
Is the “noise” in Linestra’s ratings actually related to the other stats?
If I were designing the puzzle, I would try to have one of the possible choices be someone the council would be unlikely to select, but who is actually the best. Looking at aphyer’s comments, the reversal on the “physical” stats might be a setup for the optimal answer.