Counterfactual outcome state transition parameters

Today, my paper “The choice of effect measure for binary outcomes: Introducing counterfactual outcome state transition parameters” has been published in the journal Epidemiologic Methods. The version of record is behind a paywall until December 2019, but the final author manuscript is available as a preprint at arXiv.

This paper is the first publication about an ambitious idea which, if accepted by the statistical community, could have significant impact on how randomized trials are reported. Two other manuscripts from the same project are available as working papers on arXiv. This blog post is intended as a high-level overview of the idea, to explain why I think this work is important.

Q: What problem are you trying to solve?

Randomized controlled trials are often conducted in populations that differ substantially from the clinical populations in which the results will be used to guide decision making. My goal is to clarify the conditions that must be met in order for a randomized trial to be informative about what will happen if the drug is given to a target population that differs from the population that was studied.

As a first step, one could attempt to construct a subgroup of the participants in the randomized trial, such that the subgroup is sufficiently similar to the patients you are interested in, in terms of some observed baseline covariates. However, this leaves open the question of how one can determine what baseline covariates need to be accounted for.

In order to determine this, we would need a priori biological facts that imply the effect in one population is equal to the effect in another population. For example, suppose we somehow knew that the effect of a drug is entirely determined by some gene whose prevalence differs between two countries. It is then possible that the effect is equal between the relevant groups when we compare people in Country A who have the gene with people in Country B who also have the gene, and people in Country A who don’t have the gene with people in Country B who don’t have the gene. Using an extension of this approach, we can try to look for a set of baseline covariates such that the effect can be expected to be approximately equal between two populations once we make the comparisons within levels of the covariates.
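A toy numerical sketch of this idea (all numbers invented): within each gene stratum the effect is the same in both countries, but because gene prevalence differs, the population-level effects differ, which is why the comparison has to be made within strata.

```python
# Hypothetical risks of the outcome by gene stratum, assumed identical
# in both countries (the gene fully determines the effect).
risk_untreated = {"gene": 0.40, "no_gene": 0.10}
risk_treated = {"gene": 0.20, "no_gene": 0.10}  # stratum RR: 0.5 with gene, 1.0 without

gene_prevalence = {"Country A": 0.8, "Country B": 0.2}  # assumed prevalences

marginal_rr = {}
for country, p in gene_prevalence.items():
    r0 = p * risk_untreated["gene"] + (1 - p) * risk_untreated["no_gene"]
    r1 = p * risk_treated["gene"] + (1 - p) * risk_treated["no_gene"]
    marginal_rr[country] = r1 / r0  # population-level risk ratio

# The stratum-specific effects transport between countries;
# the marginal (population-level) effects do not.
```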

Unfortunately, things are more complicated than this. Specifically, we need to be more precise about what we mean by the word “effect”. When investigators measure effects, they have several options available to them: they can use multiplicative parameters (such as the risk ratio and the odds ratio), additive parameters (such as the risk difference), or several other alternatives that have fallen out of fashion (such as the arcsine difference). If the baseline risks differ between two populations (for example, between men and women), then at most one of these parameters can be equal between the two groups. Therefore, a biological model that ensures equality of the risk ratio cannot also ensure equality of the risk difference. The logic that determines whether a set of covariates is sufficient to obtain effect equality is therefore necessarily dependent on how we choose to measure the effect.
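A small calculation (invented numbers) makes the point concrete: if men and women have different baseline risks but the same risk ratio, their risk differences and odds ratios necessarily differ.

```python
def effect_measures(risk_untreated, risk_treated):
    """Compute the three common effect measures for one population."""
    odds = lambda p: p / (1 - p)
    return {
        "risk_ratio": risk_treated / risk_untreated,
        "risk_difference": risk_treated - risk_untreated,
        "odds_ratio": odds(risk_treated) / odds(risk_untreated),
    }

men = effect_measures(0.20, 0.30)    # baseline risk 0.20
women = effect_measures(0.10, 0.15)  # baseline risk 0.10, same risk ratio of 1.5

# Risk ratios are equal (1.5 in both groups), but the risk differences
# (0.10 vs 0.05) and odds ratios are not.
```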

Making things even worse, the commonly used risk ratio is not symmetric in the coding of the outcome variable: generalizations based on the ratio of the probability of death will give different predictions from generalizations based on the ratio of the probability of survival. In other words, when using a risk ratio model, your conclusions are not invariant to an arbitrary decision that was made when the person who constructed the dataset decided whether to encode the outcome variable as (death=1, survival=0) or as (survival=1, death=0).
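A sketch of this asymmetry, with made-up numbers: take a trial with a control-arm death risk of 0.20 and a treated-arm death risk of 0.10, and extrapolate each coding of the risk ratio to a sicker target population.

```python
# Trial results (hypothetical): death risks by arm.
rr_death = 0.10 / 0.20     # risk ratio for death: 0.5
rr_survival = 0.90 / 0.80  # risk ratio for survival: 1.125

# Extrapolate each ratio to a target population with baseline death
# risk 0.50. The two codings predict different treated-arm death risks.
pred_death_coding = 0.50 * rr_death            # predicts death risk 0.25
pred_survival_coding = 1 - 0.50 * rr_survival  # predicts death risk 0.4375
```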

The information that doctors (and the public) extract from randomized trials is often in the form of a summary measure based on a multiplicative parameter. For example, a study will often report that a particular drug “doubled” the risk of a particular side effect, and this then becomes the measure of effect that clinicians use to inform their decision making. Moreover, the standard methodology for meta-analysis is essentially a weighted average of the multiplicative parameter from each study. Any conclusion drawn from these studies would have been different if investigators had chosen a different effect parameter, or a different coding scheme for the outcome variable. These analytic choices are rarely justified by any kind of argument, and instead rely on a convention to always use the risk ratio based on the probability of death. No convincing rationale for this convention exists.

My goal is to provide a general framework that allows an investigator to reason from biological facts about what set of covariates is sufficient to condition on, in order for the effect in one population to be equal to the effect in another, in terms of a specified measure of effect. While the necessary biological conditions can at best be considered approximations of the underlying data generating mechanism, clarifying the precise nature of these conditions will be useful for reasoning about how much uncertainty there is about whether the results will generalize to other populations.

Q: What are the existing solutions to this problem, and why do you think you can improve on them?

Recently, much attention has been given to a solution by Judea Pearl and Elias Bareinboim, based on an extension of causal directed acyclic graphs. Pearl and Bareinboim’s approach is mathematically valid and elegant. However, the conditions that must be met in order for these graphs to be a reasonable approximation of the data generating mechanism are much more restrictive than most trialists are comfortable with.

Here, I am going to skip a lot of details about these selection diagrams, and instead focus on the specific aspect that I find problematic. These selection diagrams abandon measures of effect completely, and instead consider the counterfactual distribution of the outcome under the intervention separately from the counterfactual distribution of the outcome under the control condition. This resolves a lot of the problems associated with effect measures, but it also fails to make use of information that is contained in how these two counterfactuals relate to each other.

Consider for example an experiment to determine the effect of homeopathy on heart disease. Suppose this experiment is conducted in men, and determines that there is no effect. If we use selection diagrams to reason about whether these conclusions also hold in women, we will have to construct a causal graph that contains every cause of heart disease whose distribution differs between men and women, measure these variables, and control for them. Most likely, this will not be possible, and we will conclude that we are unable to make any prediction for what will happen if women take homeopathic treatments. The approach simply does not allow us to try to extrapolate the effect size (even when it is null), since it cannot make use of information about how what happens under treatment relates to what happens under the control condition. The selection diagram approach therefore leaves key information on the table: in my view, the information that is left out is exactly the information that could most reliably be used to make generalizations about causal effects.

A closely related point is that the Bareinboim-Pearl approach leads to the conclusion that meta-analysis can be conducted separately in the active arm and the control arm. Most meta-analysts would consider this idea crazy, since it arguably abandons randomization (which is an objective fact about how the data was generated) in favor of unverifiable and questionable assumptions encoded in the graph, essentially claiming that all causes of the outcome have been measured.

Q: What are counterfactual outcome state transition parameters?

Our goal is to construct a measure of effect that captures the relationship between what happens if treated and what happens if untreated. We want to do this in a way that avoids the mathematical problems with standard measures of effect, and such that the magnitude of the parameters has a biological interpretation. If we succeed in doing this, we will be able to determine which covariates to control for by asking what biological properties are associated with the magnitude of the parameters.

Counterfactual outcome state transition parameters are effect measures that quantify the probability of “switching” outcome state if we move between counterfactual worlds. We define one parameter which measures the probability that the drug kills the patient, conditional on being someone who would have survived without the drug, and another parameter which measures the probability that the drug saves the patient, conditional on being someone who would have died without the drug.
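A minimal illustration of these two parameters, using a hypothetical (and in practice unobservable) joint distribution of counterfactual outcomes:

```python
# Counts of people by (outcome if untreated, outcome if treated),
# with 1 = death and 0 = survival. The numbers are invented, and the
# joint distribution is never observable in a real study.
counts = {
    (0, 0): 700,  # would survive either way
    (0, 1): 0,    # killed by the drug
    (1, 0): 100,  # saved by the drug
    (1, 1): 200,  # would die either way
}

would_survive_untreated = counts[(0, 0)] + counts[(0, 1)]
would_die_untreated = counts[(1, 0)] + counts[(1, 1)]

# P(drug kills | would have survived without the drug)
p_kills = counts[(0, 1)] / would_survive_untreated
# P(drug saves | would have died without the drug)
p_saves = counts[(1, 0)] / would_die_untreated
```

Here the drug kills no one (p_kills = 0) and saves a third of those who would otherwise die (p_saves = 1/3).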

Importantly, these parameters are not identified from the data, except under strong monotonicity conditions. For example, if we believe that the drug helps some people, harms other people and has no effect on a third group, there is no monotonicity and the method cannot be used. However, it is sometimes the case that the drug only operates in one direction. For example, for most drugs, it is very unlikely that the drug prevents someone from getting an allergic reaction to it. Therefore, its effect on allergic reactions is monotonic.

If the effect of treatment is monotonic, one of the COST parameters is equal to 0 or 1, and the other parameter is identified as the risk ratio. If this is a treatment that reduces incidence, the COST parameter associated with a protective effect is equal to the standard risk ratio based on the probability of death. If on the other hand the treatment increases incidence, the COST parameter associated with a harmful effect is identified as the recoded risk ratio based on the probability of survival. Therefore, if we determine which risk ratio to use on the basis of the COST model, the risk ratio is constrained between 0 and 1.
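This selection rule can be sketched as a small function (my own illustration, assuming monotonicity holds and the outcome is coded 1 = death):

```python
def identified_risk_ratio(risk_control, risk_treated):
    """Return the risk ratio that, under monotonicity, identifies the
    non-degenerate COST parameter. The result is always in [0, 1]."""
    if risk_treated <= risk_control:
        # Protective: standard risk ratio based on probability of death
        return risk_treated / risk_control
    # Harmful: recoded risk ratio based on probability of survival
    return (1 - risk_treated) / (1 - risk_control)
```

For a protective treatment with death risks 0.30 (control) and 0.20 (treated) this returns 2/3; for a harmful treatment with risks 0.20 and 0.30 it returns 0.70/0.80 = 0.875. Either way the chosen ratio stays between 0 and 1.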

Q: Is this idea new?

The underlying intuition behind this idea is not new. For example, Mindel C. Sheps published a remarkable paper in the New England Journal of Medicine in 1958, in which she works from the same intuition and reaches essentially the same conclusions. Sheps’ classic paper has more than 100 citations in the statistical literature, but her recommendations have not been adopted to any detectable extent in the applied statistical literature. Jon Deeks provided empirical evidence for the idea of using the standard risk ratio for protective treatments, and the recoded risk ratio for harmful effects, in Statistics in Medicine in 2002.

What is new in this paper is that we formalize the intuition Sheps was working from in terms of a formal counterfactual causal model, which is used as a bridge between the background biological knowledge and the choice of effect measure. Formalizing the problem in this way allows us to clarify the scope and limits of the approach, and points the direction to how these ideas can be used to inform future developments in meta-analysis.