There is a serious underlying causal model difference here that cannot be addressed in a purely statistical manner.
The video proposes a model in which some fraction of the population have baseline allergies (yielding a 1% baseline allergic reaction rate over the course of the study), and independently from that the treatment causes allergic reactions in some fraction of people (just over 1% during the same period). If this model is universally correct, then the argument is reasonable.
Do we know that the side effects are independent in this way? It seems to me that we cannot assume this model of independence for every possible combination of treatment and side effect. If there is any positive correlation, then the assumption of independence will yield an extrapolated risk that is too low.
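To make the extrapolation concrete, here is a minimal sketch of what the independence model implies. Only the 1% baseline and the "just over 1%" treatment-caused rate come from the video's description; the exact value 1.01% and the 10% baseline for a higher-risk patient are assumed numbers for illustration.

```python
def treated_risk_independent(baseline_risk: float, treatment_risk: float) -> float:
    """Risk of side effect S on treatment, assuming T causes S
    independently of whatever causes S at baseline."""
    return 1.0 - (1.0 - baseline_risk) * (1.0 - treatment_risk)


study_baseline = 0.01      # control-group rate described in the video
treatment_effect = 0.0101  # "just over 1%" (exact value is an assumption)
patient_baseline = 0.10    # hypothetical higher-risk patient

study_treated = treated_risk_independent(study_baseline, treatment_effect)
patient_treated = treated_risk_independent(patient_baseline, treatment_effect)

print(f"study population on treatment: {study_treated:.4f}")
print(f"high-risk patient on treatment: {patient_treated:.4f}")
```

Under independence the high-risk patient's risk rises by only about one percentage point. If susceptibility to the treatment effect is positively correlated with baseline risk, the true figure would be higher than this.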
Putting this in terms of causal models, the independence model looks like: Various uncontrolled and unmeasured factors E in the environment sometimes cause some observed side effect S, as observed in the control group. Treatment T also sometimes causes S, regardless of E.
This seems too simplistic. It seems much more likely that E interacts with unknown internal factors I that vary per-person to sometimes cause side effect S in the control group. Treatment T also interacts with I to sometimes produce S.
If you already know that your patient is in a group observed to be at extra risk of S compared with the study population, it is reasonable to infer that this group has a different distribution of I putting them at greater risk of S than the studied group, even if you don’t know how much or what the gears-level mechanism is. Since T also interacts with I, it is reasonable as a general principle to expect a greater chance of T causing the side effect in your patient than independence would indicate.
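A toy version of this argument can be sketched as follows. Everything here is hypothetical: I is reduced to a single yes/no susceptibility, only susceptible people can get S, and the probabilities and group fractions are made up. It is one way the shared-factor model could look, not a claim about any real treatment.

```python
def group_risks(frac_susceptible, p_env=0.2, p_trt=0.2):
    """Return (control risk, treated risk) of S for a group.

    Hypothetical toy model: only people with internal factor I can get S.
    Among them, E triggers S with probability p_env and T with p_trt.
    """
    control = frac_susceptible * p_env
    treated = frac_susceptible * (1.0 - (1.0 - p_env) * (1.0 - p_trt))
    return control, treated


# Study population: 5% carry I. Patient's group: 50% carry I.
study_control, study_treated = group_risks(0.05)
patient_control, patient_true = group_risks(0.50)

# Extrapolate to the patient's group as if T acted independently of baseline:
# carry over the ratio of "no side effect" probabilities seen in the study.
survival_ratio = (1.0 - study_treated) / (1.0 - study_control)
patient_extrapolated = 1.0 - (1.0 - patient_control) * survival_ratio

print(f"extrapolated under independence: {patient_extrapolated:.4f}")
print(f"risk under the shared-factor model: {patient_true:.4f}")
```

With these made-up numbers the independence extrapolation gives roughly 0.107 while the shared-factor model gives 0.18, so the extrapolated risk is too low, in the direction described above.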
So the question “shall we count the living or the dead” seems misplaced. The real question is to what degree side effects are expected to have common causes or susceptibilities that depend upon persistent factors.
Allergic reactions in particular are known to be not independent, so an example video based on some other side effect may have been better. I can’t really think of one though, which does sort of undermine the point of the article.
You are correct that someone who has one allergy may be more likely to have another allergy, and that this violates the assumptions of our model. Our model relies on a strong independence assumption, and there are many realistic cases where this assumption will not hold. I also agree that the video uses an example where the assumption may not hold. The video is oversimplified on purpose, in an attempt to get people interested enough to read the arXiv preprint.
If there is a small correlation between baseline risk and effect of treatment, this will have a negligible impact on the analysis. If there is a moderate correlation, you will probably be able to bound the true treatment effect using partial identification methods. If there is strong correlation, this may invalidate the analysis completely.
The point we are making is not that the model will always hold exactly. Any model is an approximation. Let’s suppose we have three choices:
(1) Use a template for a causal model that “counts the living”, think about all the possible biological reasons that this model could go wrong, represent them in the model if possible, and account for them as best you can in the analysis.
(2) Use a template for a causal model that “counts the dead”, think about all the possible biological reasons that this model could go wrong, represent them in the model if possible, and account for them as best you can in the analysis.
(3) Use a model that is invariant to whether you count the living or the dead. This cannot be based on a multiplicative (relative risk) parameter.
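The difference between the three choices can be illustrated with a short calculation. The risks below (1% on control, 2% on treatment, 10% baseline in the target patient) are assumed round numbers, and the odds ratio is used only as one familiar example of an invariant measure, not as the specific method anyone here is recommending.

```python
def odds(p):
    return p / (1.0 - p)


def from_odds(o):
    return o / (1.0 + o)


p0, p1 = 0.01, 0.02  # assumed study risks: control, treated
target = 0.10        # assumed baseline risk of the patient of interest

# Relative risk of the event ("counting the dead").
rr_dead = p1 / p0
pred_dead = target * rr_dead

# Relative risk of not having the event ("counting the living").
rr_living = (1.0 - p1) / (1.0 - p0)
pred_living = 1.0 - (1.0 - target) * rr_living

# Odds ratio: same prediction whichever outcome is counted.
pred_or_event = from_odds(odds(target) * odds(p1) / odds(p0))
pred_or_survival = 1.0 - from_odds(
    odds(1.0 - target) * odds(1.0 - p1) / odds(1.0 - p0)
)

print(f"relative risk, counting the dead:   {pred_dead:.4f}")
print(f"relative risk, counting the living: {pred_living:.4f}")
print(f"odds ratio, either way:             {pred_or_event:.4f}")
```

The two relative risk models give roughly 0.20 and 0.11 for the same patient, which is why the choice of base case matters, while the odds ratio gives about 0.18 regardless of which outcome is counted.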
The third approach will not be sensitive to the particular problems that I am discussing, but all the suggested methods of this type have their own problems. As I have written earlier, my view is that these problems are more troubling than the problems with the relative risk models.
What we are arguing in this preprint is that if you decide to go with a relative risk model, you should choose between (1) and (2) based on the principles suggested by Sheps, and then reason about the problems with this model and how they can be addressed in the analysis, based on the principles that you have correctly outlined in your comment.
I can assure you that if you decide to go with a multiplicative model but choose the wrong “base case”, then all of the problems you have discussed in your comments will be orders of magnitude more difficult to deal with in any meaningful way. In other words, it is only after you make the choice recommended by Sheps that it even becomes possible to meaningfully analyze the reasons for deviation from effect homogeneity...