I’m not a data scientist, but I love these. I’ve got a four-hour flight ahead of me and a copy of Microsoft Excel; maybe now is the right time to give one a try!
!It seems like the combination of materials determines the cost of the structure.
!Architects who apprenticed with Johnson or Stamatin always produce impossible buildings; architects who apprenticed with Geisel, Penrose, or Escher NEVER do. Self-taught architects sometimes produce impossible buildings, and sometimes they do not.
!This lets us select 5 designs from our proposals which will certainly produce impossible buildings. To do better, we need to understand how to tell when a proposal by a self-taught architect is likely to produce an impossible building.
!~44% of designs by self-taught architects are impossible. This more-or-less matches the 2⁄5 of masters whose apprentices reliably produce impossible buildings. So I hypothesize that self-taught students pick a favorite master at random and crib their style, acting (illegibly) like a typical apprentice thereafter. So now I need to see if there are particular materials, structure types, or blueprint types which are favored by students of any of the known master architects. By choosing designs by self-taught architects which have those properties, maybe I can tease out whose style they’re probably using.
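Outside Excel, the per-master tally behind these numbers could be sketched in Python roughly as follows. The function name `impossible_rate_by_mentor` and the toy records are made up for illustration; they are not the puzzle's dataset, and the real figure for self-taught architects is the ~44% quoted above, not the 50% this toy data gives.

```python
from collections import defaultdict

def impossible_rate_by_mentor(records):
    """Fraction of impossible buildings per mentor.

    records: iterable of (mentor, is_impossible) pairs.
    """
    totals = defaultdict(int)
    hits = defaultdict(int)
    for mentor, is_impossible in records:
        totals[mentor] += 1
        hits[mentor] += int(is_impossible)
    return {mentor: hits[mentor] / totals[mentor] for mentor in totals}

# Hypothetical records for illustration only.
toy_records = [
    ("Johnson", True), ("Stamatin", True),
    ("Geisel", False), ("Penrose", False), ("Escher", False),
    ("Self-Taught", True), ("Self-Taught", False),
]
rates = impossible_rate_by_mentor(toy_records)
```

A rate of exactly 1.0 or 0.0 for a mentor is what makes that mentor's apprentices a sure bet one way or the other; anything in between is where the guessing starts.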
!A structure can contain either dreams or nightmares, but not both.
!I’m too smooth-brained to tease out complex correlations on this flight while just using Excel: if there’s something weird going on (like, buildings made with either Dreams -or- Glass are likely to be impossible, but if you use both at once they cancel one another out somehow), I don’t know how to find it. So I’ll just assume everything is independent of everything else and do a Bayes to it.
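For reference, "assume everything is independent and do a Bayes to it" is what is usually called naive Bayes. Below is a minimal sketch of one way to set it up, not necessarily how the spreadsheet did it: train on buildings by apprenticed architects, with the label being whether the building came out impossible, then score a self-taught proposal. The function name, the add-one smoothing (`alpha`), and the toy rows are all assumptions for illustration.

```python
from collections import defaultdict

def naive_bayes_posterior(rows, candidate, features, alpha=1.0):
    """P(impossible | candidate), treating each feature as independent
    given the class (Bernoulli naive Bayes with Laplace smoothing).

    rows: list of (set_of_features, is_impossible) training examples.
    candidate: set of features present in the proposal being scored.
    features: the feature names to consider (present or absent).
    """
    n = {True: 0, False: 0}
    count = {True: defaultdict(int), False: defaultdict(int)}
    for feats, label in rows:
        n[label] += 1
        for f in feats:
            count[label][f] += 1

    total = n[True] + n[False]
    score = {}
    for label in (True, False):
        p = n[label] / total  # class prior
        for f in features:
            # Smoothed estimate of P(feature present | class).
            p_f = (count[label][f] + alpha) / (n[label] + 2 * alpha)
            p *= p_f if f in candidate else (1.0 - p_f)
        score[label] = p
    return score[True] / (score[True] + score[False])

# Hypothetical training rows, not the puzzle's data.
toy_rows = [
    ({"Tower", "Glass"}, True),
    ({"Tower", "Dreams"}, True),
    ({"Mansion", "Steel"}, False),
    ({"Mansion", "Wood"}, False),
]
toy_features = ["Tower", "Mansion", "Glass", "Steel"]
p_good = naive_bayes_posterior(toy_rows, {"Tower", "Glass"}, toy_features)
p_bad = naive_bayes_posterior(toy_rows, {"Mansion", "Steel"}, toy_features)
```

The independence assumption is exactly the weakness described above: an interaction like "Dreams and Glass cancel each other out" is invisible to this model.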
!We can down-select our variables to match those which appear in the Self-Taught proposals; it does us no good to learn whether the “good” architects make use of Nightmares or not, if none of the proposals before us make use of Nightmares.
!Good properties: Towers; buildings of Dreams and/or Glass; Hastily-Sketched blueprints. Bad properties: Mansions, Mechanisms; buildings of Wood and/or Steel; Obsessively Detailed blueprints.
!So I choose proposals D, E, G, H, and K (probability 1); and also proposal A (probability ~62%) if we’ve got room.
!Ok, I just got off the plane and checked the puzzle description. Turns out we only get to choose 4 buildings, and there was no reason to try and tease out what Self-Taught architects are doing. In that case, I need to rank proposals D, E, G, H, and K by likely price.
!Structure price looks vaguely exponential, so I’ll do a linear fit to minimize RMS(log10(error)). If I minimize RMSE directly then it always screws up the low-price structures to get marginally better fits on high-priced ones.
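To make the difference between the two error measures concrete, here is a small Python sketch. It reads RMS(log10(error)) as the root-mean-square of log10(predicted) minus log10(actual), which is an assumption about what was meant; the function names and the two-structure example are invented for illustration.

```python
import math

def rmse(pred, actual):
    """Root-mean-square error in raw price units."""
    return math.sqrt(sum((p - a) ** 2 for p, a in zip(pred, actual)) / len(actual))

def rms_log10_error(pred, actual):
    """Root-mean-square of log10(pred / actual): penalizes ratio errors."""
    return math.sqrt(
        sum((math.log10(p) - math.log10(a)) ** 2 for p, a in zip(pred, actual))
        / len(actual)
    )

# Hypothetical prices: one cheap structure, one expensive one.
actual = [10_000, 500_000]
fit_a = [20_000, 500_000]  # cheap one off by 2x, expensive one exact
fit_b = [10_000, 520_000]  # cheap one exact, expensive one off by 4%

prefers_a_by_rmse = rmse(fit_a, actual) < rmse(fit_b, actual)
prefers_b_by_log = rms_log10_error(fit_b, actual) < rms_log10_error(fit_a, actual)
```

Plain RMSE rates `fit_a` as the better fit even though it doubles the price of the cheap structure, because a 4% miss on the expensive one is larger in absolute terms. The log version ranks them the other way round.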
!It really looks like for each structure, you pick two materials; each material contributes a random amount to the price, with every material having its own distribution of price contributions. I can’t figure out what dice or whatever are being rolled for each material, but the fit gives me the average contribution for each one.
!So I choose proposals K, E, D, and H, with expected prices 30k, 73k, 78k, and 78k. Proposal G should be impossible too, but it’ll probably cost about 572k.
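The final pick is just "sort the sure things by expected price and keep the cheapest four". As a tiny Python sketch, using the expected prices quoted above (the variable names are only for illustration):

```python
# Expected prices for the five proposals that should certainly be impossible.
expected_price = {
    "D": 78_000,
    "E": 73_000,
    "G": 572_000,
    "H": 78_000,
    "K": 30_000,
}

# Cheapest first; sorted() is stable, so the D/H tie keeps dictionary order.
cheapest_four = sorted(expected_price, key=expected_price.get)[:4]
total_expected = sum(expected_price[p] for p in cheapest_four)
```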