I see now I could’ve described the model better. In Stan I don’t think you can literally write the observed data as the sum of the signal and the noise; I think the data always has to be incorporated into the model as something sampled from a probability distribution, so you’d actually translate the simplest additive model into Stan-speak as something like
data {
  int<lower=1> N;
  int<lower=1> Ncities;
  int<lower=1> Nwidgets;
  int<lower=1, upper=Ncities> city[N];
  int<lower=1, upper=Nwidgets> widget[N];
  real<lower=0> sales[N];
}
parameters {
  real<lower=0> alpha;
  real beta[Ncities];
  real gamma[Nwidgets];
  real<lower=0> sigma;
}
model {
  // put code here to define explicit prior distributions for parameters
  for (n in 1:N) {
    // the tilde means the left side's sampled from the right side
    sales[n] ~ normal(alpha + beta[city[n]] + gamma[widget[n]], sigma);
  }
}
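To make that prior placeholder concrete, here's one weakly-informative choice I might drop in — these particular scales are my own illustrative picks, not anything the model above dictates:

```stan
// illustrative weakly-informative priors (scales are assumptions)
alpha ~ normal(0, 10);
beta ~ normal(0, 5);
gamma ~ normal(0, 5);
sigma ~ cauchy(0, 5);  // acts as a half-Cauchy, since sigma is declared <lower=0>
```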
which could give you a headache because a normal distribution puts nonzero probability density on negative sales values, so the sampler might occasionally try to give sales[n] a negative value. When this happens, Stan notices that’s inconsistent with sales[n]’s zero lower bound, and generates a warning message. (The quality of the sampling probably gets hurt too, I’d guess.)
And I don’t know a way to tell Stan, “ah, the normal error has to be non-negative”, since the error isn’t explicitly broken out into a separate term on which one can set bounds; the error’s folded into the procedure of sampling from a normal distribution.
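That said, Stan does have a truncation syntax that gets at something adjacent: it truncates the whole sampling distribution of sales at zero (renormalizing the density over [0, ∞)), rather than bounding the error term as such. A sketch, reusing the names from above:

```stan
for (n in 1:N) {
  // truncated normal: zero density below 0, renormalized above it
  sales[n] ~ normal(alpha + beta[city[n]] + gamma[widget[n]], sigma) T[0, ];
}
```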
The way to avoid this that clicks most with me is to bake the non-negativity into the model’s heart by sampling sales[n] from a distribution with non-negative support:
for (n in 1:N) {
  // note: beta and gamma would need <lower=0> declarations for the log to be defined
  sales[n] ~ lognormal(log(alpha * beta[city[n]] * gamma[widget[n]]), sigma);
}
Of course, bearing in mind the last time I indulged my lognormal fetish, this is likely to have trouble too, for the different reason that a lognormal excludes the possibility of exactly zero sales, and you’d want to either zero-inflate the model or add a fixed nonzero offset to sales before putting the data into Stan. But a lognormal does eliminate the problem of sampling negative values for sales[n], and aligns nicely with multiplicative city & widget effects.
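The fixed-offset variant can even live inside the Stan program rather than in preprocessing. A sketch, where eps is a hypothetical constant I'm picking purely for illustration (since sales is data, not a parameter, the shift needs no Jacobian adjustment):

```stan
transformed data {
  real eps = 0.5;  // hypothetical offset, e.g. half the smallest recordable sale
}
model {
  for (n in 1:N) {
    (sales[n] + eps) ~ lognormal(log(alpha * beta[city[n]] * gamma[widget[n]]), sigma);
  }
}
```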