Anatomy of a Gear

Thankyou to Sisi Cheng (of the Working as Intended comic) for the excellent drawings.

When modelling a physical gearbox, why do we chunk the system into gears? We’re essentially treating each gear as a black box, a discrete component without internal structure of its own, interesting mainly for its interactions with other components. Why not zoom out, and model the entire gearbox as a black box? Why not zoom in, and model the individual chunks of metal which comprise each gear, or even each individual atom?

It seems like gears are a natural choice of abstraction for a gearbox. But why that choice? In general, what makes a good “gear”—a good lowest-level component in a “gears-level” model?

Why choose “gears” as the components of our model? Why chunk the world that way, rather than chunk multiple gears into subsystems, or break up gears into even smaller subsystems?

More generally: we want to build gears-level models, but we don’t want to model things down to the last atom. At some point, we have to accept some black boxes in our model components. So how do we decide where that point is?

This post answers that question. We’ll start with a relatively-detailed discussion of a physical gear, then extend the reasoning to “gears” in other kinds of models.

A Physical Gear

Picture a gear, turning in a gearbox. We can pick out a little patch of metal on that gear, zoom in, and see lots of atoms vibrating around. Those vibrations aren’t completely random—vibrations of one atom are correlated with the atoms next to it, although the correlations do typically fall off over a short distance. If we look at the motions of all the atoms in one little chunk of the gear, it won’t tell us much about the motions of the atoms in some other chunk far away in the gear.

… with one exception. If we average together the motion of all the atoms in our little chunk, we should get a decent estimate of the overall rotation speed of the gear. And that does tell us something about all the other little chunks of metal: it tells us roughly what the average motion in all those other chunks is.

If we look at all the atoms in one little chunk of one gear, only the average motion of all the atoms will tell us much about motion of atoms in a neighboring gear.
Caution: physical accuracy not a priority in this visual.

Likewise, if we look at the motion of all the individual atoms in one little chunk of our gear, what does that tell us about the motion of atoms in other gears? Mostly nothing… except, again, the average motion of all the atoms tells us the overall rotation speed of our gear, which tells us something about the overall rotation speed of the other gears.

So, if we were to somehow model the whole system at atomic scales, we’d find that all the information in one little chunk of atoms, relevant to some other chunk of atoms far away, was summarized by the rotation speed of the gear.

In terms of dimensionality: the atom motions are extremely high dimensional. Every atom’s speed is a 3-dimensional vector, so with n atoms we have 3n dimensions. Even a tiny patch of metal has an awful lot of atoms, so that’s an awful lot of dimensions.

Yet most of that information is irrelevant to everything far away. For purposes of predicting things far away, that huge number of dimensions can be summarized by a one-dimensional number: the gear’s rotation speed. That rotation speed, in turn, informs the motions of huge numbers of other atoms elsewhere in the system.

It’s an “information bottleneck”: (high dimensional atom motions) → (one dimensional rotation speed) → (high dimensional atom motions elsewhere). We can think of the abstract object—the “gear”—as an interface to all the lower-level components which comprise the gear (e.g. atoms, in a physical gear).

In general, a good “gear” in a model picks out some chunk of an object for which we have a good low-dimensional summary. All the information from that chunk which is relevant to predicting things elsewhere in the system should be summarized by a few low-dimensional parameters. For a physical gear, it’s the rotation speed. The next few sections give other examples.

Rigid Body Dynamics

One direct generalization of a physical gearbox is rigid body dynamics: we have rigid, solid objects which move around, push each other, bounce off each other, etc. This is a good model for most of the non-flexible solid objects around us most of the time: tables and chairs, wheels, hammers and screwdrivers and screws and nails, pens and pencils, sticks and rocks, pots and pans and dishes and silverware, etc.

When thinking about the mechanics of rigid bodies (i.e. how they move), all the positions and motions of individual atoms in the object can be summarized by the overall position, orientation and motion of the object. It’s just like gears, though slightly more general—gears mostly just rotate in place, while rigid bodies in general can also move around. So, it’s natural to chunk rigid bodies into individual “objects”, i.e. treat them as single gears in our models. This is exactly what we do in our everyday lives: we treat a pencil as an object, a plate as an object, a rock as an object, …. These things make natural “objects” because the low-level dynamics of all their atoms can be summarized by just the position, orientation and motion of the whole object, for purposes of predicting how things will move.

Note, however, that this is not sufficient for all questions we might ask about the object. For instance, if I want to know what sound an object makes when struck, then the vibrations of all the little parts become relevant. Rigid bodies make natural “gears” because we can answer a very broad range of questions just given a low-dimensional summary, not because we can answer all questions just given a low-dimensional summary.

Things get more complicated for non-rigid bodies, like cloth or rope. Sometimes we can use a low-dimensional summary for the dynamics, like a rope in a pulley, but not always; cloth moves around in complicated ways. We can still answer a broad range of questions using only summary information, just not necessarily questions about dynamics. For instance, looking at the atoms in one little chunk of cloth tells us very little about other atoms in the same chunk of cloth, except that the material composition is probably the same. This is quite similar to the “gears” used in chemical models.

Chemical Equations

In chemistry, we summarize the state of a huge number of atoms with just a handful of chemical concentrations (plus temperature and pressure, depending on the application). Why? Well, as long as the system is mixed up, it doesn’t matter exactly where particular molecules are—they’re all more-or-less evenly distributed throughout the system. The exact positions of atoms in one patch of fluid at one time don’t tell us much about the exact positions of atoms in another patch of fluid at a later time, except insofar as they tell us about the average concentrations of each molecule type throughout the system as a whole.

High-dimensional atom positions in one patch of fluid tell us low-dimensional average concentrations, which is all the information relevant to high-dimensional atom positions in some other patch of fluid (assuming everything is well-mixed).

Note that this changes if the system isn’t well-mixed, e.g. a layer of oil on top of water, or a diffusion gradient. Then we either need to keep track of more information (e.g. concentrations in small patches of fluid) or we need to further restrict the questions we want to answer.

Electronic Circuits

Electronic circuits are a particularly interesting example, because avoiding “crosstalk” between circuit components is an explicit design goal. We generally don’t want e.g. a resistor to behave differently if there’s a magnet nearby.

In other words: the design of a resistor is chosen so that we can summarize all the information about the component using just its overall resistance and the overall current (or voltage) through it at any given moment. We don’t need to worry about the details of the physical connection between the resistor and a wire. We don’t need to worry about whether the resistor is upside down, or spinning around, or a little hotter/​colder than room temperature. We don’t need to worry about the low-level behavior of individual atoms within the resistor. We just need a one dimensional summary: overall current is proportional to overall voltage delta.

The same applies to other electronic components—transistors, wires, capacitors, transformers, diodes, etc. Components are designed to make good “gears”: their behavior has a simple, low-dimensional summary.

API Functions

In software design, we often draw high-level diagrams of the software in which each component is a function in some API, and lines/​arrows between components show which information is passed between them. This is useful mainly to the extent that the inputs and outputs of each component can be summarized without needing a detailed representation of the internal behavior of each function.

This is often considered a major criterion of good software design: function should have simple interfaces. Even if there’s lots of internal complexity, like some function with many internal calculations (i.e. high dimensional, with many intermediate values calculated), the inputs and outputs should be low-dimensional (at least compared to the internals). When functions satisfy this criterion, they make good “gears” in our conceptual models of the software’s behavior.

Companies

In economics, companies work a lot like functions in software. The “interface” they provide is their catalogue of products. The product itself is relatively low-dimensional, compared to the complicated process which produced the product. It’s the same idea behind the classic “I, Pencil” essay: even something as simple as a pencil is produced by hundreds of people and machines with specialized functions. The pencil-user does not understand all the details of the pencil production process, but they don’t need to—they just need to know how to use a pencil.

As an anti-example, imagine some preindustrial ironsmith. Without reliable standards for metal composition and quality, the smith might buy very different “iron” from different suppliers at different times. The smith would either have to know quite a bit about the production of the raw iron, or else accept an inconsistent product. If the smith needs to keep track of the whole production process, then the iron supplier would be a bad gear in an economic model.

Organs

The physiology of the kidney or liver or other internal organs is rather involved. Each contains a wide variety of specialized cell types and substructures, all in large numbers; their internal structure is high-dimensional. Yet the overall function of most organs allows a relatively low-dimensional summary: they regulate levels of specific hormones or metabolites or cell types.

Takeaway

A good “gear”—i.e. a lowest-level component in a model—should offer a (relatively) low-dimensional summary of its own internal subcomponents. That low-dimensional summary makes it practical to treat the gear as a black box, even though its internal components may be complicated and high dimensional (e.g. made of lots of atoms). The information which needs to be included in the summary depends on what questions we want to answer about the system, but a common theme is that broad classes of questions about behavior not-too-close to a particular subcomponent depend only on a few summary dimensions.

Finally, I’ll note that some of the examples above brush some subtleties under the rug (though the main idea does generally work as advertised). People are invited to try to spot some of those subtleties, and possibly propose the resolutions, in the comments. I’ll point to one to kick things off: how does temperature fit in to our physical gearbox example?