Since it looks like you’re using Pandas, I’d recommend adding Seaborn for simple statistical plots, to avoid the saturation effect from having so many points on a scatterplot. It’s a lovely toolkit for producing specific kinds of useful plots quickly and easily, with minimal customisation.
(Specifically for these plots, I’d reach for a joint distribution plot with kind="hex", or kde or reg.)
Here are a few jointplots I created with seaborn (“percent_obese” is from Elizabeth’s original dataset, “OBESITY_CrudePrev” is from the PLACES 2021 Zip Code Tabulation Area-level obesity prevalence estimates):
Note: controlling for SES, altitude and race essentially eliminates this correlation (it becomes 0.026642, p=0.26, n=1764 counties, essentially no different from what you’d get by random chance).
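To illustrate what "controlling for" a variable can look like in practice, here is one common approach: regress both variables on the covariates and correlate the residuals (a partial correlation). This is only a sketch under assumptions. The comment above does not say which method was actually used, the helper `partial_corr` is a hypothetical name, and the data are synthetic, built so that a single shared confounder drives both variables.

```python
import numpy as np

def partial_corr(x, y, covariates):
    """Correlation between x and y after regressing both on the covariates."""
    # Design matrix: an intercept column plus the covariate columns
    Z = np.column_stack([np.ones(len(x)), covariates])
    # Residuals of x and y after ordinary least-squares fits on Z
    rx = x - Z @ np.linalg.lstsq(Z, x, rcond=None)[0]
    ry = y - Z @ np.linalg.lstsq(Z, y, rcond=None)[0]
    return np.corrcoef(rx, ry)[0, 1]

# Synthetic data (assumption): x and y are linked only through a shared confounder
rng = np.random.default_rng(0)
n = 1764  # same sample size as quoted above; the values themselves are made up
confounder = rng.normal(size=n)
x = confounder + rng.normal(size=n)
y = confounder + rng.normal(size=n)

raw_r = np.corrcoef(x, y)[0, 1]
adj_r = partial_corr(x, y, confounder.reshape(-1, 1))
print(f"raw r = {raw_r:.3f}, partial r = {adj_r:.3f}")
```

In this toy setup the raw correlation is substantial while the partial correlation sits near zero, which is the same pattern as the result described above.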
Just noticed this—thanks for fixing the plots!