Zac Hatfield-Dodds comments on New Water Quality x Obesity Dataset Available

Zac Hatfield-Dodds 27 May 2022 22:45 UTC
5 points
Since it looks like you’re using Pandas, I’d recommend adding Seaborn for simple statistical plots, to avoid the saturation effect from having so many points on a scatterplot. It’s a lovely toolkit for producing specific kinds of useful plots quickly and easily, with minimal customisation.

(Specifically, for these plots I’d reach for a joint distribution plot, i.e. seaborn’s jointplot, with kind="hex", kind="kde", or kind="reg".)
- Natália 6 Jun 2022 1:08 UTC
  12 points
  Here are a few jointplots I created with seaborn (“percent_obese” is from Elizabeth’s original dataset, “OBESITY_CrudePrev” is from the PLACES 2021 Zip Code Tabulation Area-level obesity prevalence estimates):
  - Natália 22 Jun 2022 4:57 UTC
    3 points
    Note: controlling for SES, altitude and race essentially eliminates this correlation (it becomes 0.026642, p=0.26, n=1764 counties, which is no different from what you’d expect by random chance).
  - DanielFilan 28 Jun 2022 21:34 UTC
    2 points
    Just noticed this—thanks for fixing the plots!
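
For readers wondering what "controlling for" covariates can mean in practice, one common approach is a partial correlation: regress both variables on the covariates and correlate the residuals. The sketch below is not necessarily the method used for the figure quoted in the thread; the data is synthetic and the partial_corr helper is a hypothetical name introduced here for illustration.

```python
import numpy as np

def partial_corr(x, y, covariates):
    """Correlate x and y after removing the linear effect of the covariates.

    Hypothetical helper: both variables are regressed on the covariates
    (plus an intercept) by ordinary least squares, and the residuals
    are then correlated.
    """
    Z = np.column_stack([np.ones(len(x)), covariates])
    rx = x - Z @ np.linalg.lstsq(Z, x, rcond=None)[0]
    ry = y - Z @ np.linalg.lstsq(Z, y, rcond=None)[0]
    return np.corrcoef(rx, ry)[0, 1]

# Synthetic example: x and y are related only through a shared confounder.
rng = np.random.default_rng(0)
n = 2000
confounder = rng.normal(size=n)
x = confounder + rng.normal(size=n)
y = confounder + rng.normal(size=n)

raw = np.corrcoef(x, y)[0, 1]                         # clearly positive
adj = partial_corr(x, y, confounder.reshape(-1, 1))   # close to zero
print(f"raw r = {raw:.3f}, partial r = {adj:.3f}")
```

In this toy setup the raw correlation is substantial while the partial correlation is near zero, which is the same qualitative pattern as a correlation that disappears once confounders are accounted for.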