Reflections on my performance:
This stings my pride a little; I console myself with the fact that my “optimize conditional on Space and Life” allocation got a 64.7% success rate.
If I’d allocated more time, I would have tried a wider range of ML algorithms on this dataset, instead of just throwing XGBoost at it. I’m . . . not actually sure if that would have helped; in hindsight, trying the same algorithms on different subsets (“what if I built a model on only the 4-player games?”) and/or doing more by-hand analysis (“is Princeliness like Voidliness, and if so, what does that mean?”) might have provided better results.
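For concreteness, here’s a minimal sketch of that “same model, narrower subset” idea in Python, assuming a hypothetical flattened table with 0/1 feature columns, a party_size column, and a binary won label (none of these names are from the real dataset):

```python
import pandas as pd
import xgboost as xgb

# Hypothetical schema: one row per game, 0/1 classpect indicator
# columns, plus "party_size" and a binary "won" label.
df = pd.read_csv("games_features.csv")

# Fit on 4-player games only, instead of one model over everything.
four_player = df[df["party_size"] == 4]
X = four_player.drop(columns=["won", "party_size"])
y = four_player["won"]

model = xgb.XGBClassifier(n_estimators=200, max_depth=4)
model.fit(X, y)
```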
Reflections on the challenge:
I found this one hard to get started with because it had 144 de facto explanatory columns (“does this party include a [Class] of [Aspect]?”) on top of its 1.4m rows, and the effect of each column was mediated by the effects of every other column. This made it difficult (and computationally intensive!) to figure out which classpect combinations affected the outcome.
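To illustrate the shape of the problem, here’s a sketch of building those de facto columns, assuming a hypothetical long-format table with one row per (game, party member) pair; the real dataset’s layout may well differ:

```python
import pandas as pd

# Hypothetical long format: one row per (game_id, classpect) pair.
long = pd.DataFrame({
    "game_id":   [0, 0, 1, 1],
    "classpect": ["Knight of Blood", "Mage of Time",
                  "Knight of Blood", "Maid of Void"],
})

# One row per game, one 0/1 column per classpect present
# (12 classes x 12 aspects = 144 possible columns).
features = pd.crosstab(long["game_id"], long["classpect"]).clip(upper=1)
print(features)
```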
That said, I appreciated this scenario. The premise was fun, the writing was well-executed, and the challenge was fair. Also, it served as a much-needed proof-by-example that “train one ML model, then optimize over inputs” isn’t a perfect skeleton key for solving problems shaped like this. If it was a little obtuse on top of that . . . well, I can chalk that up to realism.
Good to know, thank you! I think my main takeaway is that I am really bad at judging difficulty levels on these: I actually expected this scenario to be easier than the previous Dwarves & D.Sci scenario, but that one had three different near-perfect solutions while this one only had one noticeably-better-than-random solution.
Long-winded and empirically incorrect argument that led me to that expectation follows:
I was aware of the large number of possible characters; that’s why the dataset ended up being so big, since I wanted to be sure it was large enough to let simple analyses work in spite of that. One sample approach I tried out on my end as part of designing the scenario was this:
Take only teams that contained a Knight of Blood and a Mage of Time (but of any size).
For each possible classpect, find its winrate on those teams.
This would have given you ~4k teams, with ~120 containing each possible other classpect. That wasn’t enough to get an optimal solution, but it would have been an excellent first step (a rough pandas sketch of this procedure follows the winrate list below):
Page of Heart has a 59.46% winrate
Maid of Heart has a 57.01% winrate
Maid of Breath has a 51.55% winrate
...
...
Heir of Hope has a 27.10% winrate
Heir of Rage has a 26.85% winrate
Maid of Void has a 22.64% winrate
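Here’s the promised sketch of that procedure in pandas, assuming a hypothetical wide layout with hero1…hero12 classpect columns and a boolean won column; the real schema may differ:

```python
import pandas as pd

df = pd.read_csv("games.csv")
hero_cols = [c for c in df.columns if c.startswith("hero")]
fixed = {"Knight of Blood", "Mage of Time"}

# Step 1: keep only teams (of any size) containing both starting characters.
members = df[hero_cols].apply(lambda row: set(row.dropna()), axis=1)
sub = df[members.apply(fixed.issubset)]

# Step 2: winrate of each other classpect on those teams.
rows = []
for cp in pd.unique(df[hero_cols].values.ravel()):
    if pd.isna(cp) or cp in fixed:
        continue
    has = sub[hero_cols].eq(cp).any(axis=1)
    if has.any():
        rows.append((cp, sub.loc[has, "won"].mean(), int(has.sum())))

winrates = (pd.DataFrame(rows, columns=["classpect", "winrate", "n"])
              .sort_values("winrate", ascending=False))
print(winrates.head(3))   # e.g. Page of Heart near the top
print(winrates.tail(3))   # e.g. Maid of Void near the bottom
```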
As I envisioned things playing out:
Just running this approach and grabbing the two highest-winrate characters you could:
You would have picked a Page of Heart (3-9-3) and a Maid of Breath (2-12-1).
This would have given you stats of 18-25-17, for a lowest stat of 17 and a 64% winrate.
This isn’t optimal (it over-invests in Friendship, since you’ve picked two different high-Friendship characters), but it’s noticeably better than random; the stat arithmetic is sketched below.
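A quick check of that arithmetic, with the starting pair’s combined stats back-solved from the quoted 18-25-17 total (an assumption on my part, not a figure from the scenario):

```python
# Stat triples as quoted above; Friendship is presumably the middle
# number, given the over-investment noted in the text.
page_of_heart  = (3, 9, 3)
maid_of_breath = (2, 12, 1)
# Back-solved as 18-25-17 minus the two picks -- an assumption.
starting_pair  = (13, 4, 13)

team = tuple(map(sum, zip(page_of_heart, maid_of_breath, starting_pair)))
print(team, "lowest stat:", min(team))   # (18, 25, 17) lowest stat: 17
```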
Additionally, looking at the high/low scores might point you further in useful directions:
For instance, Heart/Breath/Life showed up an awful lot near the top across a variety of different classes.
This might have pointed you in the direction of ‘there’s a specific thing I’m missing’ and gotten you to bring only one Heart-like hero.
Sadly, it seems I overestimated how obvious a thing to try that was. Based on the answers, it looks like:
simon did something fairly similar to this, requiring 4-person teams but only requiring one of your two starting characters on the team, and ended up with a similar outcome of ‘generally good, but overinvested a bit in Friendship’.
Yonge ran some analysis that did a good job of finding ‘generally strong characters’ but wasn’t specific to the two characters you started with.
You did some kind of ML thing I didn’t understand.
Reflections x3 combo:
Just realized this could have been a perfect opportunity to show off that modelling library I built, except:
A) I didn’t have access to the processing power I’d need to make it work well on a dataset of this size.
B) I was still thinking in terms of “what party archetype predicts success”, when “what party archetype predicts failure” would have been more enlightening. Or in other words . . .
. . . I forgot to flip the problem turn-ways.