OT (except that I ran into this while visiting that page): That Beeline thing is really annoying. (I got the “blues” variant. I was annoyed enough by it that I modified the cookie to serve me a different version. I asked it for variant 3 (“gray1”) and actually it doesn’t appear to be doing anything; maybe that’s a bug somewhere. Anyway, my apologies if this introduces noise into your A/B/.../I testing.)
‘Blues’ is actually the best-performing variant so far! I have no idea why, I hate it too. If it succeeds, I’ll probably have to run another to try to find a version I can live with. ‘gray1’ is, IIRC, probably the subtlest of the running versions, so unless you set up a second identical tab set to ‘none’ and flicker back & forth, I suspect you simply weren’t noticing.
EDIT: ‘Blues’ eventually succumbed, and the final result was that no version clearly outperformed no-BLR. See http://www.gwern.net/AB%20testing#beeline-reader-text-highlighting
Does the metric you’re using (fraction of visitors staying at least N seconds?) actually measure what you care about? (A few possible confounding factors, off the top of my head: visitors may be intrigued by the weird colours and stay around while they try to work out what it is, but this doesn’t indicate that they got any actual value from the page content; if the Beeline thing works, visitors may find the one bit of information they’re looking for faster and then leave; if it’s just annoying, annoyance may show up in reduced repeat visits rather than likelihood of disappearing quickly.)
I think it’s a reasonable metric. It’s not perfect (I’d rather measure average time on page, not a cutoff), but I don’t know how to do any better: I am neither a JavaScript programmer nor a Google Analytics expert.
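To make the cutoff-versus-average distinction concrete, here is a minimal sketch in Python. The dwell times are made-up numbers chosen for illustration, not data from the actual test, and the function names and the 30-second cutoff are hypothetical. It only shows that the two metrics can rank the same pair of variants in opposite orders.

```python
def fraction_staying(dwell_seconds, cutoff):
    """Share of visits lasting at least `cutoff` seconds (the cutoff metric)."""
    return sum(t >= cutoff for t in dwell_seconds) / len(dwell_seconds)


def average_time(dwell_seconds):
    """Mean time on page in seconds (the metric preferred in the comment above)."""
    return sum(dwell_seconds) / len(dwell_seconds)


# Hypothetical dwell times in seconds; invented purely for illustration.
control = [5, 10, 40, 45, 300]
variant = [5, 35, 35, 40, 40]

CUTOFF = 30  # assumed value of N; the real threshold is not stated here

for name, visits in (("control", control), ("variant", variant)):
    print(name, fraction_staying(visits, CUTOFF), average_time(visits))
```

With these invented numbers the variant wins on the cutoff metric while the control wins on average time, because one very long visit moves the mean but counts only once against the cutoff.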