In 2007, the Department of Children, Youth, and Families (DCYF) held a seminar for the nonprofits vying for a piece of $78 million in funding. Grant seekers were told that in the next funding cycle, they would be required — for the first time — to provide quantifiable proof their programs were accomplishing something.
The room exploded with outrage. This wasn’t fair. “What if we can bring in a family we’ve helped?” one nonprofit asked. Another offered: “We can tell you stories about the good work we do!” Not every organization is capable of demonstrating results, a nonprofit CEO complained. He suggested the city’s funding process should actually penalize nonprofits able to measure results, so as to put everyone on an even footing. Heads nodded: This was a popular idea.
Actually, these objections may not be quite as insane as they sound at first.
The issue is that rigorously measuring results is hard, and frequently when people try to quantify results, they screw it up and force people to spend their time gaming a dysfunctional metric instead of doing real work. Just look at everyone who complains about academia forcing researchers to publish everything they can in as small bites as possible in order to maximize citations, instead of being able to do things in a way that’d be more useful for everyone. Or look at the software companies that used to measure programmer productivity in terms of lines of code written, and—as far as I know—still haven’t managed to come up with any very good objective metric for comparing their workers.
The fact is that there are plenty of cases where we know something, but don’t have any way of showing it in an objective and easy-to-quantify way. A boss might know for sure who’s a valuable researcher or programmer on the basis of her interactions with them, but be unable to prove it rigorously. And these are still relatively simple domains—take something very open-ended like “the impact of nonprofits”, and things get even worse.
Given that people are generally bad at designing good ways of quantifying such things, and that bad measures will produce worse results than no measures at all, it can actually make perfect sense for somebody interested in helping people to object to the creation of such measures. Better (the thought goes) to give everyone money and end up funding both useless and high-impact organizations than to concentrate all the money in a few organizations which are good at gaming the metrics and most probably all useless.
But there is always some metric to be gamed. There is always some causality chain which results in a specific distribution of money to organizations. Just because we close our eyes, it does not make the causality go away.
Instead of organizations wasting money and time on a dysfunctional official metric, there will be organizations wasting money and time on alternative ways (bribery, fraud, advertising, lobbying, propaganda...) of convincing government that it’s they who should get a bite from the budget.
But there is always some metric to be gamed. There is always some causality chain which results in a specific distribution of money to organizations. Just because we close our eyes, it does not make the causality go away.
This doesn’t necessarily mean that the system is gameable. If we suppose that what we’re measuring is actually exactly what we want to get out of the program, then the only way the program can get ahead is by providing more of what we want. The system can only be “gamed” to the extent that there’s a divorce between what we want from it, and what sort of output we measure.
I’m leery of organizations providing their own statistics on how effective they are, which may just be another form of lobbying and propaganda. I’d lean towards carving out from the budget a group that independently assesses the effectiveness of each organization. It’s admittedly imperfect, but it would be more impartial than what seems to be in place now, and agencies wouldn’t necessarily be at a disadvantage for lacking their own internal measurement tools. That still leaves the problem of choosing the right metrics. Something simple like budget percentages and ratios would be a good place to start. There are a lot of hard-to-compare types of services out there; after-school programs aren’t like Meals on Wheels programs. It’s hard to come up with outcome-based metrics to say which service is better than another when there are so many different categories. Adopting something along the lines of the financial ratings at Charity Navigator could at least get everyone on the same page for controlling costs at their organizations.
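As a rough illustration of what “budget percentages and ratios” could look like in practice, here is a minimal sketch. The three expense buckets, the `Expenses` class, the `expense_ratios` function, and all the figures are assumptions made up for this example; this is not Charity Navigator’s actual rating method.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Expenses:
    """Annual expenses of one organization, split into three hypothetical buckets."""
    program: float         # money spent directly on services
    administration: float  # overhead such as management and office costs
    fundraising: float     # money spent on raising more money

    @property
    def total(self) -> float:
        return self.program + self.administration + self.fundraising


def expense_ratios(e: Expenses) -> dict[str, float]:
    """Return each bucket as a fraction of total expenses."""
    if e.total <= 0:
        raise ValueError("total expenses must be positive")
    return {
        "program": e.program / e.total,
        "administration": e.administration / e.total,
        "fundraising": e.fundraising / e.total,
    }


# Made-up figures, for illustration only.
example = Expenses(program=700_000, administration=200_000, fundraising=100_000)
ratios = expense_ratios(example)
for name, value in ratios.items():
    print(f"{name}: {value:.0%}")
```

Ratios like these say something about cost control but nothing about outcomes, which is the limitation the comment already concedes.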
It’s hard to come up with outcome-based metrics to say which service is better than another when there are so many different categories.
At least we could compare organizations within the same category. Splitting the budget across categories would remain a political decision, but within a category, inefficient organizations should be ignored.
Even in the absence of measurements, it would be good if all organizations had to write and publish reports, each of which would have to be approved by a person who could point at different places in the report and say “be more specific”.
For example, if the first approximation is “Our organization supports world peace”, the 20th approximation would be “We have spent $1,000,000 on salaries of our employees; $500,000 on our building; $100,000 on food; and $100 on printing 200 flyers with big colored letters ‘World Peace is a Great Thing’. Then we used volunteers to distribute those flyers on the university campus. Also, we paid $10,000 for design.” And then the Chief Specificity Officer would say: “OK, this is specific enough, you may publish it online.”
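Once a report is that specific, the arithmetic is easy for anyone to do. The sketch below uses only the figures from the example report; the decision to count the flyers (printing plus design) as the only mission-facing spending is an assumption made for illustration, as are the names `spending` and `spending_shares`.

```python
# Figures taken from the example report above (US dollars).
spending = {
    "salaries": 1_000_000,
    "building": 500_000,
    "food": 100_000,
    "flyer design": 10_000,
    "flyer printing": 100,
}


def spending_shares(items):
    """Return each line item as a fraction of total spending."""
    total = sum(items.values())
    return {name: amount / total for name, amount in items.items()}


total = sum(spending.values())
shares = spending_shares(spending)
for name, share in shares.items():
    print(f"{name:>15}: {share:8.3%}")

# Assumption: only the flyers count as spending on the stated mission.
flyer_spending = spending["flyer design"] + spending["flyer printing"]
flyer_share = flyer_spending / total
print(f"flyers as a share of all spending: {flyer_share:.2%}")
```

Under that assumption the flyers come to well under one percent of the total, which is the kind of fact a vague “we support world peace” report would never reveal.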
Instead of organizations wasting money and time on a dysfunctional official metric, there will be organizations wasting money and time on alternative ways (bribery, fraud, advertising, lobbying, propaganda...) of convincing government that it’s they who should get a bite from the budget.
Sure, but the existence of bribery and corruption can actually be viewed as an argument for not establishing any rigorous metrics: the worse things are, the more likely it is that any supposedly objective metrics get turned into instruments for giving specific organizations plenty of money and shutting everyone else out. That happens all the time in the public sector, e.g. with lucrative contracts supposedly offered to anyone who meets the requirements and bids the lowest, but with the requirements intentionally crafted so that only a few favored organizations can meet them. Without such a setup, other organizations might actually get their offers considered.
That happens all the time in the public sector, e.g. with lucrative contracts supposedly offered to anyone who meets the requirements and bids the lowest, but with the requirements intentionally crafted so that only a few favored organizations can meet them. Without such a setup, other organizations might actually get their offers considered.
If someone is corrupt enough to design the requirements so that they match a favored organization… what will happen if the same person is allowed to make the same decision, without the condition of choosing the lowest price? I guess the same favored organization will get the contract, only the price will be much higher.
(Case study: This is how highways are built in Slovakia with the European Union’s PPP money. When one specific political party is in government, only 20 km per year are built for the same money that was enough to build 100 km per year in other years. The reason is that all contracts go to companies belonging to the boss of the given party. All competing companies are refused, officially because their prices are “suspiciously low”.)
The issue is that rigorously measuring results is hard, and frequently when people try to quantify results, they screw it up and force people to spend their time gaming a dysfunctional metric instead of doing real work.
This is a problem in business as well. Marketo is able to charge companies thousands per month for tracking online advertising outcomes in companies with long, relationship-based B2B sales cycles (which might be aiming to make only a few huge sales per year).
John Wanamaker: “Half the money I spend on advertising is wasted; the trouble is I don’t know which half.”
A system can also be “gamed” to the extent that gaming it is easier than legitimately providing services.