is it generally best to take just one med (e.g. antidepressant, adhd med, anxiolytic), or is it best to take a mix of many meds, each at a lower dosage? my intuition suggests that the latter could be better. in particular, consider the following toy model: your brain has parameters θ0 that should be at some optimum θ∗, and your loss function is a quadratic around θ∗. each dimension in this space represents some aspect of how your brain is configured: they might for instance represent your level of alertness, impulsivity, risk aversion, or motivation. each med is some vector vi that you can add to your current state θ0, and the optimal dosage of that med in isolation is whatever quantity gets you closest to θ∗; but unless θ∗−θ0 happens to be exactly collinear with vi, you basically can't do any better by tuning the dosage of that one med. this seems especially important because most meds don't seem to be exactly monosemantic, and different people start out with substantially different θ0 and loss landscapes, such that you often get paradoxical reactions to meds.
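a minimal numerical sketch of this toy model (all numbers are illustrative, assuming a quadratic loss ‖θ − θ∗‖² and two hypothetical meds):

```python
import numpy as np

# illustrative numbers: a 3-dimensional brain state and 2 hypothetical meds
theta0 = np.array([0.0, 0.0, 0.0])        # current state
theta_star = np.array([1.0, 2.0, 0.5])    # the optimum
V = np.array([[1.0, 0.0],                 # columns are med direction vectors:
              [0.0, 1.0],                 # med 1 only moves dim 0,
              [0.0, 0.5]])                # med 2 moves dims 1 and 2 together

def loss(theta):
    return float(np.sum((theta - theta_star) ** 2))

delta = theta_star - theta0

# best dosage of each med alone: project delta onto that med's direction
best_single = min(
    loss(theta0 + (delta @ V[:, i]) / (V[:, i] @ V[:, i]) * V[:, i])
    for i in range(V.shape[1])
)

# best joint dosages: least squares over all meds simultaneously
doses, *_ = np.linalg.lstsq(V, delta, rcond=None)
best_combo = loss(theta0 + V @ doses)

print(best_single, best_combo)  # the combo can only do as well or better
```

under this (purely linear) model the combination is never worse than the best single med, and is strictly better whenever δ = θ∗−θ0 isn't collinear with any single vi.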
A huge percentage of a pharmacist's job is to keep track of potential negative interactions between different drugs, of which there are an incomprehensible number. I don't think linearity is a reasonable assumption here; the interaction terms between multiple interventions should be thought of as, on average, big. Augmentation and synergistic effects exist, but they are in general risky and quite hard to find. Even the effects of a single drug are not linear; there are significant nonlinearities in dose-response for most drugs.
i'm not really making any strong linearity assumptions, only local linearity. this doesn't seem that different from ML, where hyperparameters can sometimes interact heavily nonlinearly, but often they don't. i also don't think the quadratic assumption is crazy; we assume loss landscapes are locally quadratic all the time, even though they are obviously highly nonconvex, and it's still a very useful intuition pump.
also, my understanding is most of the really bad interactions are pretty well known, so the probability of having a really weird surprising interaction that nobody has ever catalogued is small.
I think our mental models here might be different enough that it’s hard for me to understand what you’re saying here. By nonlinearity here I mean that, in addition to nonlinear interactions between drugs, there are interacting systems, equilibration mechanisms, etc., to the point that I think intuitions about ML systems basically shouldn’t transfer at all. But then I know your intuitions about ML are better than mine, so it’s hard to be sure of that.
Re: interactions specifically, this definitely isn't true in polypharmacy situations. We know most of the bad drug pairs in the normal population, and because doctors are wary of prescribing many different medications, we rarely encounter new bad interactions in the normal population. But there are drug combinations that only become dangerous in triples (search term: the Triple Whammy, a combination of three drug classes, any two of which are generally safe together but which cause kidney failure in combination; the interaction was discovered in 2000, though the drugs had been available since around 1980), and there are interactions that are only dangerous in the context of certain mutations (for example, ultrarapid metabolizers can't safely use prodrugs like codeine, because they convert them to the active form too quickly).
Interactions like this are rare right now largely because doctors are wary of prescribing too many drugs at once, but polypharmacy is becoming more common and more bad interactions are emerging as a result, basically just for combinatorial reasons. It’s definitely possible for combinations of drugs to be prescribed safely and for them to just not interact, but if we push this further, I suspect there are very few combinations of, say, 10 drugs that are simultaneously safe for most people (even if we ignore cholinergic response).
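the combinatorial point can be made concrete: the number of interaction subsets you'd have to rule out grows fast with the number of drugs prescribed together (a quick sketch):

```python
from math import comb

# with n drugs taken together, the number of k-way interactions to rule out:
n = 10
pairs = comb(n, 2)          # possible pairwise interactions
triples = comb(n, 3)        # possible "Triple Whammy"-style interactions
all_subsets = 2**n - n - 1  # every subset of size >= 2
print(pairs, triples, all_subsets)
```

so even if every pair among 10 drugs has been checked, there are well over a thousand higher-order combinations that mostly haven't been.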
Changing the dose of a medication does not necessarily result in linear effects. There are nonlinearities introduced by e.g. one receptor type being saturated before another one. This phenomenon also applies to polypharmacy.
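one standard way this dose nonlinearity is modeled is a sigmoidal Emax (Hill) curve; a sketch with made-up parameters (the ed50 and hill coefficient here are illustrative, not tied to any real drug):

```python
def emax_response(dose, emax=1.0, ed50=10.0, hill=2.0):
    """Sigmoidal Emax (Hill) dose-response: effect saturates at high doses."""
    return emax * dose**hill / (ed50**hill + dose**hill)

# at low doses the curve is superlinear (hill > 1): doubling the dose
# roughly quadruples the effect...
low_gain = emax_response(2.0) / emax_response(1.0)
# ...but near saturation, doubling the dose barely changes anything
high_gain = emax_response(100.0) / emax_response(50.0)
print(round(low_gain, 2), round(high_gain, 2))
```

receptor saturation is one mechanism that produces exactly this shape: once the high-affinity receptor type is saturated, additional dose spills over into other targets instead of scaling the original effect.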
I would also like to note that θ∗ is estimated not against some objective standard but from the vantage point of θ0. There's no guarantee that it stays in place as you start shifting θ.
In practice, we track our level of suffering and respond to it by trying to reduce it to acceptable levels, which is easier than trying to converge onto a hypothetical global optimum. For some, this state is reached with just one medication, for others it takes more, and for some this paradigm doesn’t produce any results.
I would see it more as causal learning with the do-operator, so it isn't necessarily about fitting an MSE objective but rather about testing different combinations of interventions?
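a sketch of that framing: treat the outcome as a black box you can only query by intervening, and search dose combinations directly rather than fitting a parametric model (the wellbeing function below is entirely made up, including its interaction term):

```python
import itertools

# hypothetical outcome of do(dose_a, dose_b); only observable by intervening
def wellbeing(doses):
    a, b = doses
    return -(a - 1)**2 - (b - 2)**2 - 0.5 * a * b  # nonzero interaction term

# grid of interventions to actually try, one at a time
grid = [0.0, 0.5, 1.0, 1.5, 2.0]
best = max(itertools.product(grid, repeat=2), key=wellbeing)
print(best)
```

note that the joint optimum differs from the per-drug optima (a=1 alone, b=2 alone) precisely because of the interaction term, which is the kind of thing intervention testing can find but a separable model can't.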
Something something gears level models