I see now I could’ve described the model better. In Stan I don’t think you can literally write the observed data as the sum of the signal and the noise; I think the data always has to be incorporated into the model as something sampled from a probability distribution, so you’d actually translate the simplest additive model into Stan-speak as something like
data {
  int<lower=1> N;
  int<lower=1> Ncities;
  int<lower=1> Nwidgets;
  int<lower=1, upper=Ncities> city[N];
  int<lower=1, upper=Nwidgets> widget[N];
  real<lower=0> sales[N];
}
parameters {
  real<lower=0> alpha;
  real beta[Ncities];
  real gamma[Nwidgets];
  real<lower=0> sigma;
}
model {
  // put code here to define explicit prior distributions for parameters
  for (n in 1:N) {
    // the tilde means the left side's sampled from the right side
    sales[n] ~ normal(alpha + beta[city[n]] + gamma[widget[n]], sigma);
  }
}
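To make that prior placeholder concrete, here's one weakly-informative choice I might drop in — these particular scales are my own illustrative picks, not anything the model above dictates:

```stan
// illustrative weakly-informative priors (scales are assumptions)
alpha ~ normal(0, 10);
beta ~ normal(0, 5);
gamma ~ normal(0, 5);
sigma ~ cauchy(0, 5);  // acts as a half-Cauchy, since sigma is declared <lower=0>
```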
which could give you a headache because a normal distribution puts nonzero probability density on negative sales values, so the sampler might occasionally try to give sales[n] a negative value. When this happens, Stan notices that’s inconsistent with sales[n]’s zero lower bound, and generates a warning message. (The quality of the sampling probably gets hurt too, I’d guess.)
And I don’t know a way to tell Stan, “ah, the normal error has to be non-negative”, since the error isn’t explicitly broken out into a separate term on which one can set bounds; the error’s folded into the procedure of sampling from a normal distribution.
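That said, Stan does have a truncation syntax that gets at something adjacent: it truncates the whole sampling distribution of sales at zero (renormalizing the density over [0, ∞)), rather than bounding the error term as such. A sketch, reusing the names from above:

```stan
for (n in 1:N) {
  // truncated normal: zero density below 0, renormalized above it
  sales[n] ~ normal(alpha + beta[city[n]] + gamma[widget[n]], sigma) T[0, ];
}
```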
The way to avoid this that clicks most with me is to bake the non-negativity into the model’s heart by sampling sales[n] from a distribution with non-negative support:
for (n in 1:N) {
  // note: beta and gamma would need <lower=0> declarations for the log to be defined
  sales[n] ~ lognormal(log(alpha * beta[city[n]] * gamma[widget[n]]), sigma);
}
Of course, bearing in mind the last time I indulged my lognormal fetish, this is likely to have trouble too, for the different reason that a lognormal excludes the possibility of exactly zero sales, and you’d want to either zero-inflate the model or add a fixed nonzero offset to sales before putting the data into Stan. But a lognormal does eliminate the problem of sampling negative values for sales[n], and aligns nicely with multiplicative city & widget effects.
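The fixed-offset variant can even live inside the Stan program rather than in preprocessing. A sketch, where eps is a hypothetical constant I'm picking purely for illustration (since sales is data, not a parameter, the shift needs no Jacobian adjustment):

```stan
transformed data {
  real eps = 0.5;  // hypothetical offset, e.g. half the smallest recordable sale
}
model {
  for (n in 1:N) {
    (sales[n] + eps) ~ lognormal(log(alpha * beta[city[n]] * gamma[widget[n]]), sigma);
  }
}
```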