The Costs of Reliability

A question that used to puzzle me is “Why can people be so much better at doing a thing for fun, or to help their friends and family, than they are at doing the exact same thing as a job?”

I’ve seen it in myself and I’ve seen it in others. People can be hugely more productive, creative, intelligent, and efficient on just-for-fun stuff than they are at work.

Maybe it’s something about coercion? But it happens to people even when they choose their work and have no direct supervisor, as when a prolific hobbyist writer suddenly gets writer’s block as soon as he goes pro.

I think it has a very mundane explanation: it’s always more expensive to have to meet a specific commitment than merely to do something valuable.

If I feel like writing sometimes and not other times, then if writing is my hobby I’ll write when I feel like it, and my output per hour of writing will be fairly high. Even within “writing”, if my interests vary, and I write about whatever I feel like, I can take full advantage of every writing hour. By contrast, if I’ve committed to write a specific piece by a specific deadline, I have to do it whether or not I’m in the mood for it, and that means I’ll probably be less efficient, spend more time dithering, and I’ll demand more external compensation in exchange for my inconvenience.

The stuff I write for fun may be valuable! And if you simply divide the value I produce by my hours of labor or the amount I need to be paid, I’m hugely more efficient in my free time than in my paid time! But I can’t just trivially “up my efficiency” in my paid time; reliability itself has a cost.

The costs of reliability are often invisible, but they can be very important. The cost (in time, office supplies, and software tools) of tracking and documenting your work so that you can deliver it on time. The cost (in labor and equipment) of quality assurance testing. The opportunity cost of creating simpler and less ambitious things so that you can deliver them on time and free of defects.

Reliability becomes more important with scale. Large organizations have more rules and procedures than small ones, and this is rational. Accordingly, they pay more costs in reliability.

One reason is that the attack surface for errors grows with the number of individuals involved. For instance, large organizations often have rules against downloading software onto company computers without permission. The chance that any one person downloads malicious software that seriously harms the company is small, but the chance that at least one person does rises with the number of employees.
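To make the arithmetic behind that concrete, here’s a minimal sketch in Python; the per-employee probability is a made-up illustrative number, not a real estimate. If each of n employees independently has a small chance p of a harmful download in a given year, the chance that at least one does is 1 - (1 - p)^n, which climbs steeply with n.

```python
# Illustrative only: probability that at least one of n employees causes a
# harmful download, assuming each does so independently with probability p.
def p_at_least_one(p: float, n: int) -> float:
    return 1 - (1 - p) ** n

for n in (10, 100, 1_000, 10_000):
    print(f"{n:>6} employees: {p_at_least_one(0.001, n):.1%}")
# 10 -> 1.0%, 100 -> 9.5%, 1,000 -> 63.2%, 10,000 -> ~100%
```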

Another reason is that coordination becomes more important with more people. If a project depends on many people cooperating, then you as an individual aren’t simply trying to do the best thing, but rather the best thing that is also understandable and predictable and capable of achieving buy-in from others.

Finally, large institutions are more tempting to attackers than small ones, since they have more value to capture. For instance, large companies are more likely to be targeted by lawsuits or public outcry than private individuals, so it’s strategically correct for them to spend more on defensive measures like legal compliance procedures or professional PR.

All of these types of defensive or preventative activity reduce efficiency — you can do less in a given timeframe with a given budget. Large institutions, even when doing everything right, acquire inefficiencies they didn’t have when small, because they have higher reliability requirements.

Of course, there are also economies of scale that increase efficiency. There are fixed expenses that only large institutions can afford, that make marginal production cheaper. There are ways to aggregate many risky components so that the whole is more robust than any one part, e.g. in distributed computation, compressed sensing, or simply averaging. Optimal firm size is a balance.
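The averaging example can be made concrete with a small sketch; the numbers here are purely illustrative. Any single noisy reading of a quantity is unreliable, but the mean of many independent readings has roughly 1/√n of the error, so the aggregate is more robust than any one part.

```python
import random

# Illustrative only: each "component" is a noisy estimate of a true value;
# averaging many of them yields a far more reliable aggregate.
random.seed(0)
true_value = 10.0

def noisy_reading() -> float:
    return random.gauss(true_value, 5.0)  # any single reading is unreliable

single_errors = [abs(noisy_reading() - true_value) for _ in range(1_000)]
averaged_errors = [
    abs(sum(noisy_reading() for _ in range(100)) / 100 - true_value)
    for _ in range(1_000)
]

print(f"typical error of one reading:        {sum(single_errors) / 1_000:.2f}")
print(f"typical error of a 100-reading mean: {sum(averaged_errors) / 1_000:.2f}")
# The averaged estimate's error is roughly 10x smaller (about sqrt(100)).
```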

This framework tells us when we ought to find it possible to get better-than-default efficiency easily, i.e. without any clever tricks, just by accepting different tradeoffs than others do. For example:

1.) People given an open-ended mandate to do what they like can be far more efficient than people working to spec…at the cost of unpredictable output with no guarantees of getting what you need when you need it. (See: academic research.)

2.) Things that come with fewer guarantees of reliable performance can be cheaper in the average use case…at the cost of completely letting you down when they occasionally fail. (See: prototype or beta-version technology.)

3.) Activities within highly cooperative social environments can be more efficient…at the cost of not scaling to more adversarial environments where you have to spend more resources on defending against attacks. (See: Eternal September.)

4.) Having an “opportunistic” policy of taking whatever opportunities come along (for instance, hanging out in a public place and chatting with whoever comes along and seems interesting, vs. scheduling appointments) allows you to make use of time that others have to spend doing “nothing” … at the cost of never being able to commit to activities that need time blocked out in advance.

5.) Sharing high-fixed-cost items (like cars) can be more efficient than owning…at the cost of not having a guarantee that they’ll always be available when you need them.

In general, you can get greater efficiency for things you don’t absolutely need than for things you do; if something is merely nice-to-have, you can handle it if it occasionally fails, and your average cost-benefit ratio can be very good indeed. But this doesn’t mean you can easily copy the efficiency of luxuries in the production of necessities.

(This suggests that “toys” are a good place to look for innovation. Frivolous, optional goods are where we should expect it to be most affordable to experiment, all else being equal; and we should expect technologies that first succeed in “toy” domains to expand to “important, life-and-death” domains later.)