A while back I read How to Measure Anything and found it fascinating. In my day job I work on software systems, so I spend quite a bit of time trying to make sense of the world through dashboards of requests, latencies, error rates, and so on.
After finishing the book and taking copious notes, I understood that it gave me a prepackaged process that I could apply as-is, but I found it very difficult to adapt to everyday situations. In other words, I don’t think I picked up a good intuition for stats.
I’m looking to change that. Specifically, I want to learn to apply stats in these two situations:
Measuring things. Mostly software systems, but I’m open to little experiments too. Dan Luu used to measure a lot of fun things.
Understanding how others measure things. I’d like to be able to judge whether the claims made in a paper about covid spread or social media addiction are backed up by the math/data in the paper.
The challenge I’m facing is that I know a bunch of techniques, but not how they relate to each other or to the problems they’re meant to solve. To illustrate what I mean: I know how to get percentiles and calculate means, but until this morning I didn’t know why averaging percentiles is usually a bad idea. I’m missing the map.
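To make the percentile example concrete, here is a minimal sketch of the problem. The two hosts and their latency numbers are made up for illustration, and `percentile` is a hypothetical nearest-rank helper written for this post, not any particular library’s definition.

```python
def percentile(samples, pct):
    """Nearest-rank percentile for an integer pct in 1..100 (assumed definition)."""
    ordered = sorted(samples)
    # Integer ceiling of pct * n / 100, to avoid floating-point rounding surprises.
    rank = max(1, (pct * len(ordered) + 99) // 100)
    return ordered[rank - 1]


# Hypothetical data: a busy, fast host and a nearly idle, slow host.
host_a_ms = [10] * 1000   # 1000 requests at 10 ms each
host_b_ms = [1000] * 10   # 10 requests at 1000 ms each

# What a dashboard does if it averages per-host p99 values.
averaged_p99 = (percentile(host_a_ms, 99) + percentile(host_b_ms, 99)) / 2

# The p99 computed over all requests pooled together.
pooled_p99 = percentile(host_a_ms + host_b_ms, 99)

print("average of per-host p99s:", averaged_p99)
print("p99 over all requests:   ", pooled_p99)
```

With these made-up numbers the average of the per-host p99s is 505 ms, while the p99 over all requests is 10 ms, because the slow host serves only about 1% of the traffic. Averaging treats both hosts as equally important and throws away how many requests each one actually handled.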
I’ve seen these books recommended as a good way to start:
Statistics, 4th Edition, by Freedman, Pisani, and Purves
Probability Theory: The Logic of Science, by Jaynes
An Introduction to Error Analysis: The Study of Uncertainties in Physical Measurements, by Taylor
Think Stats, by Downey
But I also wanted to ask someone familiar with the field:
Is it best to start with an introductory textbook and branch out from there?
Are there specific subfields / topics I should be focusing on (or avoiding)?
Is what I’m looking to learn labeled in some way? For example, I can’t tell if this is data analytics or data science or X.