Here’s why it matters:
If we add 100 to everything, the size of that shift will differ from value to value once we take the log (base 10). 0s go from -infinity to +2, a jump of infinity (...plus 2, to the degree that makes any sense); 100s go from 2 to 2.3, a jump of .3. If we added 1 instead, 0s would go from -infinity to 0, and 100s would go from 2 to 2.004. If we added .01, 0s would go to -2, and 100s would go to 2.00004.
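To make those numbers easy to check, here is a minimal Python sketch of the add-a-constant-then-log step. The function name `log_shift` is mine, and base-10 logs are assumed because that is what the figures above imply.

```python
import math

def log_shift(amount, offset):
    """Return log10(amount + offset), i.e. the offset-then-log transform."""
    return math.log10(amount + offset)

# Compare where a 0 donation and a 100 donation land for each offset.
for offset in (100, 1, 0.01):
    zero = log_shift(0, offset)
    hundred = log_shift(100, offset)
    print(f"offset={offset}: 0 -> {zero:.5f}, 100 -> {hundred:.5f}")
```

The 100s barely move whichever offset is used; nearly all of the freedom is in where the 0s end up.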
But what does that do to our trendline? Suppose that 40% of EAs gave 0, and 60% of non-EAs gave 0. Then when I calculate the mean difference in log-scale, the extra 20% of non-EAs whose score I can pick with my choice of added constant is a third of the differing sample. The gulf between the groups (i.e. the difference between the trendlines) will be smaller if I choose 100 than if I choose 0.01. (I can't pick a constant that makes the groups switch which one donated more, since that's the order preservation property. But if I add a trillion to all of the donations, the difference between the groups will become invisible because both groups will look like a flat line, and if I add a trillionth to all of the donations, it'll look much more like a graph of percent donating.)
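A rough sketch of that effect in Python, under a simplifying assumption of mine that is not in the survey data: every donor in both groups gave exactly 100, so the groups differ only in their share of zeros (40% vs. 60%). The names `mean_log` and `group_gap` are hypothetical.

```python
import math

def mean_log(share_zero, donor_amount, offset):
    """Mean of log10(x + offset) when share_zero of the group gave 0
    and everyone else gave donor_amount (a deliberately crude model)."""
    zero_part = share_zero * math.log10(0 + offset)
    donor_part = (1 - share_zero) * math.log10(donor_amount + offset)
    return zero_part + donor_part

def group_gap(offset, donor_amount=100.0):
    """EA mean minus non-EA mean on the log scale, for a given offset."""
    ea = mean_log(0.40, donor_amount, offset)      # 40% of EAs gave 0
    non_ea = mean_log(0.60, donor_amount, offset)  # 60% of non-EAs gave 0
    return ea - non_ea

for offset in (1e-12, 0.01, 1, 100, 1e12):
    print(f"offset={offset:g}: gap={group_gap(offset):.6g}")
```

In this toy model the gap works out to 0.2 × log10(1 + 100/offset): it stays positive for any offset, but shrinks toward zero as the offset grows and keeps growing as the offset shrinks.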
And so it seems to me that there are three potentially interesting comparisons: percent not donating by age for the two groups (it seems likely EA will have fewer non-donors than non-EA at each age / age group), per-person and per-donor amounts donated for each age group (not sure about per-donor because of the previous effect, but presumably per-person amounts are higher), and then the overall analysis you did, where either an offset or a direct 0->something mapping is applied so that the two effects can be aggregated.
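For concreteness, the first two comparisons could be computed per group (or per age bucket) with something like the Python sketch below. The `summarize` helper and the toy donation list are made up for illustration and are not from the survey.

```python
def summarize(donations):
    """Share of non-donors, mean per person, and mean per donor for one group."""
    n = len(donations)
    donors = [d for d in donations if d > 0]
    return {
        "share_not_donating": (n - len(donors)) / n,
        "mean_per_person": sum(donations) / n,
        # Undefined if nobody in the group donated.
        "mean_per_donor": sum(donors) / len(donors) if donors else float("nan"),
    }

# Toy values only, to show the shape of the output.
example = summarize([0, 0, 50, 100, 250])
print(example)
```

Running this once per group and age bucket keeps the "did they donate at all" effect separate from the "how much did donors give" effect, which the offset-then-log analysis blends together.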
(I don’t have R on this computer, or I would just generate the graphs I would have liked for you to make. Thanks for putting in that effort!)