The new dot com bubble is here: it’s called online advertising

Link post

If you want to understand Goodharting in advertising, this is a great article for that.

At the heart of the problems in online advertising are selection effects, which the article explains with this cute example:

Picture this. Luigi’s Pizzeria hires three teenagers to hand out coupons to passersby. After a few weeks of flyering, one of the three turns out to be a marketing genius. Customers keep showing up with coupons distributed by this particular kid. The other two can’t make any sense of it: how does he do it? When they ask him, he explains: “I stand in the waiting area of the pizzeria.”
It’s plain to see that junior’s no marketing whiz. Pizzerias do not attract more customers by giving coupons to people already planning to order a quattro stagioni five minutes from now.

The article goes through an extended case study at eBay, where selection effects caused particularly expensive mistakes that went unnoticed for years:

The experiment continued for another eight weeks. What was the effect of pulling the ads? Almost none. For every dollar eBay spent on search advertising, they lost roughly 63 cents, according to Tadelis’s calculations.
The experiment ended up showing that, for years, eBay had been spending millions of dollars on fruitless online advertising excess, and that the joke had been entirely on the company.
To the marketing department everything had been going brilliantly. The high-paid consultants had believed that the campaigns that incurred the biggest losses were the most profitable: they saw brand keyword advertising not as a $20m expense, but a $245.6m return.

The problem, of course, is Goodharting: optimizing for something that’s easy to measure rather than for what you actually care about:

The benchmarks that advertising companies use – intended to measure the number of clicks, sales and downloads that occur after an ad is viewed – are fundamentally misleading. None of these benchmarks distinguish between the selection effect (clicks, purchases and downloads that are happening anyway) and the advertising effect (clicks, purchases and downloads that would not have happened without ads).
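
To make the distinction concrete, here’s a minimal sketch with made-up numbers, assuming the one thing these benchmarks lack: a randomized holdout group that never sees the ads.

```python
# Purely hypothetical numbers, chosen for illustration.
users_per_group = 100_000

purchases_with_ads = 2_060  # purchases by users shown the ads
purchases_holdout = 2_000   # purchases by a same-sized holdout shown none

# What the standard benchmark credits to the campaign:
# every purchase that occurred after an ad was viewed.
benchmark_effect = purchases_with_ads

# Selection effect: purchases that were happening anyway,
# estimated from the holdout group.
selection_effect = purchases_holdout

# Advertising effect: purchases that would not have
# happened without the ads.
advertising_effect = purchases_with_ads - purchases_holdout

print(benchmark_effect, advertising_effect)  # 2060 vs. 60
```

On these numbers the benchmark overstates the campaign’s contribution by a factor of about 34, which is the same shape of error as eBay’s $20m expense showing up as a $245.6m return.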

And unsurprisingly, there’s an alignment problem hidden in there:

It might sound crazy, but companies are not equipped to assess whether their ad spending actually makes money. It is in the best interest of a firm like eBay to know whether its campaigns are profitable, but not so for eBay’s marketing department.
Its own interest is in securing the largest possible budget, which is much easier if you can demonstrate that what you do actually works. Within the marketing department, TV, print and digital compete with each other to show who’s more important, a dynamic that hardly promotes honest reporting.
The fact that management often has no idea how to interpret the numbers is not helpful either. The highest numbers win.

To this I’ll just add that this problem is somewhat solvable, but it’s tricky. I previously worked at a company whose entire business model revolved around calculating the lift from online advertising spend by matching online ad activity with offline purchase data, and a lot of that depended on having a large, reliable control group against which to calculate lift. The bad news, as we discovered, was that the data was often statistically underpowered: at best it could distinguish between negative, neutral, and positive lift, and it could only detect non-neutral lift in cases where the evidence was strong enough that you could have eyeballed it anyway. The worse news was that we had to tell people their ads were not working or, worse yet, were lifting the performance of competitors’ products.
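
For a sense of why the power problem bites, here’s a minimal sketch of that kind of lift calculation with hypothetical numbers; the real pipeline, which matched online ad exposure to offline purchases, was far messier than this.

```python
# Hypothetical exposed/control purchase counts for a lift test.
from statsmodels.stats.proportion import proportions_ztest

n_exposed, n_control = 50_000, 50_000
buyers_exposed, buyers_control = 1_030, 1_000  # 2.06% vs. 2.00%

rate_exposed = buyers_exposed / n_exposed
rate_control = buyers_control / n_control
lift = rate_exposed / rate_control - 1  # relative lift over control

# Two-sample z-test on the difference in purchase rates.
z_stat, p_value = proportions_ztest(
    count=[buyers_exposed, buyers_control],
    nobs=[n_exposed, n_control],
)
print(f"lift = {lift:.1%}, p = {p_value:.2f}")
# lift = 3.0%, p = 0.50: a 3% lift on a 2% purchase rate is
# statistically invisible even with 100,000 users in the test.
# Only lifts large enough to eyeball in the raw counts clear
# significance, which is exactly the limitation described above.
```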

Some marketers’ reactions to this were pretty much as the authors capture it:

Leaning on the table, hands folded, he gazed at his hosts and told them: “You’re fucking with the magic.”