Specializing in Problems We Don’t Understand
Most problems can be separated pretty cleanly into two categories: things we basically understand, and things we basically don’t understand. Some things we basically understand: building bridges and skyscrapers, treating and preventing infections, satellites and GPS, cars and ships, oil wells and gas pipelines and power plants, cell networks and databases and websites. Some things we basically don’t understand: building fusion power plants, treating and preventing cancer, high-temperature superconductors, programmable contracts, genetic engineering, fluctuations in the value of money, biological and artificial neural networks. Problems we basically understand may have lots of moving parts and require many people with many specialties, but they’re generally problems which can be reliably solved by throwing resources at them. There usually isn’t much uncertainty about whether the problem will be solved at all, or a high risk of unknown unknowns, or a need for foundational research in order to move forward. Problems we basically don’t understand are the opposite: they are research problems, problems which likely require a whole new paradigm.
In agency terms: problems we basically understand are typically solved via adaptation-execution rather than goal-optimization. Problems we basically don’t understand are exactly those for which existing adaptations fail.
Main claim underlying this post: it is possible to specialize in problems-we-basically-don’t-understand, as a category in its own right, in a way which generalizes across fields. Problems we do understand mainly require relatively-specialized knowledge and techniques adapted to solving particular problems. But problems we don’t understand mainly require general-purpose skills of empiricism, noticing patterns and bottlenecks, model-building, and design principles. Existing specialized knowledge and techniques don’t suffice—after all, if the existing specialized knowledge and techniques were sufficient to reliably solve the problem, then it wouldn’t be a problem-we-basically-don’t-understand in the first place.
So… how would one go about specializing in problems we basically don’t understand? This post will mostly talk about how to choose what to formally study, and how to study it, in order to specialize in problems we don’t understand.
Specialize in Things Which Generalize
Suppose existing models and techniques for hot plasmas don’t suffice for fusion power. A paradigm shift is likely necessary. So, insofar as we want to learn skills which will give us an advantage (relative to existing hot plasma specialists) in finding the new paradigm, those skills need to come from some other area—they need to generalize from their original context to the field of hot plasmas. We want skills which generalize well.
Unfortunately, a lot of topics which are advertised as “very general” don’t actually add much value on most problems in practice. A lot of pure math is like this—think abstract algebra or topology. Yes, they can be applied all over the place, but in practice the things they say are usually either irrelevant or easily noticed by some other path. (Though of course there are exceptions.) Telling us things we would have figured out anyway doesn’t add much value.
There are skills and knowledge which do generalize well. Within technical subjects, think probability and information theory, programming and algorithms, dynamical systems and control theory, optimization and microeconomics, linear algebra and numerical analysis. Systems and synthetic biology generalize well within biology; mechanics and electrodynamics are necessary for Fermi estimates in most physical sciences; continuum mechanics and PDEs are useful for a wide variety of areas in engineering and science.
But just listing subjects isn’t all that useful—after all, a lot of the most generally-useful skills and techniques don’t explicitly appear in a university course catalogue (or if they do, they appear hidden in a pile of more-specialized information). Many aren’t explicitly taught at all. What we really need is an outside-view criterion or heuristic, some way to systematically steer toward generalizable knowledge and skills.
To Build General Problem-Solving Capabilities, Tackle General Problems
It sounds really obvious: if we want to build knowledge and skills which will apply to a wide variety of problems, then we should tackle a wide variety of problems. Then, steer toward knowledge and skills which address bottlenecks relevant to multiple problems.
Early on in a technical education, this will usually involve fairly basic things, like “how do I do a Fermi estimate for this design?” or “what are even the equations needed to model this thing?” or “how do the systems involved generally work?”—questions typically answered in core classes in physics or engineering, and advanced classes in biology or economics. Propagating back from that, it will also involve the math/programming skills needed to both think about and simulate a wide variety of systems.
But even more important than coursework, having a wide variety of problems in mind is directly useful for learning to actually use the relevant skills and knowledge. A lot of the value of studying generalizable knowledge/skills comes from being able to apply them in new contexts, very different from any problem one has seen before. One needs to recognize, without prompting, situations-in-the-wild in which a model or technique applies.
A toy example which I encountered in the wild: Proset is a variant of the game Set. We draw a set of cards with random dots of various colors, and the goal is to find a (nonempty) subset of the cards such that each color appears an even number of times.
How can we build a big-O-efficient algorithmic solver for this game? Key insight (spoiler follows):
Write down a binary matrix in which each column is a card, each row is a color, and the 0/1 in each entry says whether that color is present on that card. Then, the game is to find a nonzero vector in the nullspace of that matrix, in arithmetic mod 2. We can solve it via row-reduction.
(Read the spoiler text before continuing, but don’t worry if you don’t know what the jargon means.)
If we’re comfortable with linear algebra, then finding a nullspace via row-reduction is pretty straightforward. (Remember the claim from earlier that the things abstract algebra says are “usually either irrelevant or easily noticed by some other path”? The generalization of row reduction to modular arithmetic is the sort of thing you’d see in an abstract algebra class, rather than a linear algebra class, but if you understand row-reduction then it’s not hard to figure out even without studying abstract field theory.) Once we have a reasonable command of linear algebra, the rate-limiting step to figuring out Proset is to notice that it’s a nullspace problem.
This requires its own kind of practice, quite different from the relatively rote exercises which often show up in formal studies.
Keeping around 10 or 20 interesting problems on which to apply new techniques is a great way to practice this sort of thing. In particular, since the point of all this is to develop skills for problems which we don’t understand or know how to solve, it’s useful to keep around 10 or 20 problems which you don’t understand or know how to solve. For me, it used to be things like nuclear fusion energy, AGI, aging, time travel, solving NP-complete problems, government mechanism design, beating the financial markets, restarting Moore’s law, building a real-life flying broomstick, genetically engineering a dragon, or factoring large integers. Most classes I took in college were chosen for likely relevance to at least one of these problems (usually more than one), and whenever I learned some interesting new technique or theorem or model I’d try to apply it to one or more of these problems. When I first studied linear algebra, one of the first problems I applied it to was constructing uncorrelated assets to beat the financial markets, and I also tried for quite some time to apply it to integer factorization (and later various NP-complete problems). Those were the sorts of experiences which built the mental lenses necessary to recognize a modular nullspace problem in Proset.
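To make the Proset example concrete, here is a minimal Python sketch of the row-reduction idea. It rests on assumptions not stated above: each card is encoded as an integer bitmask with one bit per color, and the function name find_proset is hypothetical. Rather than building the matrix explicitly, the sketch keeps a reduced basis of the cards seen so far (an incremental form of Gaussian elimination mod 2) and tracks which original cards were combined into each basis row. When a card reduces to zero, the tracked combination is a nonempty subset in which every color appears an even number of times.

```python
from typing import Dict, List, Optional, Sequence, Tuple


def find_proset(cards: Sequence[int]) -> Optional[List[int]]:
    """Return indices of a nonempty subset of cards whose XOR is 0, or None.

    Assumed encoding (illustrative): bit c of a card is 1 if color c
    appears on that card. XOR of bitmasks is vector addition mod 2.
    """
    # Maps pivot bit -> (reduced vector, bitmask of original card indices
    # that were XORed together to produce that vector).
    basis: Dict[int, Tuple[int, int]] = {}

    for i, card in enumerate(cards):
        vec, combo = card, 1 << i
        while vec:
            pivot = vec.bit_length() - 1  # highest set bit
            if pivot not in basis:
                # New independent direction: store it and move on.
                basis[pivot] = (vec, combo)
                break
            # Eliminate the pivot bit using the stored basis row.
            basis_vec, basis_combo = basis[pivot]
            vec ^= basis_vec
            combo ^= basis_combo
        else:
            # vec reduced to 0: the cards in combo XOR to zero.
            # (A blank card would trivially qualify on its own.)
            return [j for j in range(len(cards)) if (combo >> j) & 1]

    return None  # the cards are linearly independent mod 2


if __name__ == "__main__":
    # Three colors; cards 0, 1, 2 together show each color exactly twice.
    print(find_proset([0b011, 0b101, 0b110, 0b100]))  # [0, 1, 2]
```

With k colors, each card takes at most k elimination steps, so the search is polynomial in the number of cards and colors, whereas brute force would check up to 2^n subsets of n cards. It also shows why any hand with more cards than colors must contain a solution: that many vectors cannot be linearly independent mod 2.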
Use-Cases of Knowledge and Suggested Exercises
If the rate-limiting step is to notice that a particular technique applies (e.g. noticing that Proset is a nullspace problem), then we don’t even necessarily need to be good at using the technique. We just need to be good at noticing problems where the technique applies, and then we can google it if and when we need it. This suggests exercises pretty different from exercises in a lot of classes. For instance, a typical intro linear algebra class involves a lot of practice executing row reduction, but not as much recognizing linear systems in the wild.
More generally: we said earlier that problems-we-basically-understand are usually solved by adaptation-execution, i.e. executing a known method which usually works. In that context, the main skill-learning problem is to reliably execute the adaptation; rote practice is a great way to achieve that. But when dealing with problems we basically don’t understand, the use-cases for learned knowledge are different, and therefore require different kinds of practice. Some example use-cases for the kinds of things one might formally study:
Learn a skill or tool which you will later use directly. Ex.: programming classes.
Learn the gears of a system, so you can later tackle problems involving the system which are unlike any you’ve seen before. Ex.: physiology classes for doctors.
Learn how to think about a system at a high level, e.g. enough to do Fermi estimates or identify key bottlenecks relevant to some design problem. Ex.: intro-level fluid mechanics.
Uncover unknown unknowns, like pitfalls which you wouldn’t have thought to check for, tools you wouldn’t have known existed, or problems you didn’t know were tractable/intractable. Ex.: intro-level statistics, or any course covering NP-completeness.
Learn jargon, common assumptions, and other concepts needed to effectively interface to some field. Ex.: much of law school.
Learn enough to distinguish experts from non-experts in a field. Ex.: programming or physiology, for people who don’t intend to be programmers/doctors but do need to distinguish good work from quackery in these fields.
These different use-cases suggest different strategies for study, and different degrees of investment. Some require in-depth practice (like skills/tools), others just require a quick first pass (like unknown unknowns), and some can be done with a quick pass if you have the right general background knowledge but require more effort otherwise (like Fermi estimates).
What kind of exercises might we want for some of these use-cases? Some possible patterns for flashcard-style practice:
Include some open-ended, babble-style questions. For instance, rather than “What is X useful for?”, something like “Come up with an application for X which is qualitatively different from anything you’ve seen before”. (I’ve found that particular exercise very useful—for instance, trying to apply coherence theorems to financial markets led directly to the subagents post.)
Include some pull-style questions, i.e. questions in which you have to realize that X is relevant. For instance “Here’s a problem in layman’s terms; what keywords should you google?” or “Here’s a system, what equations govern it?”. These are how problems will show up in real life.
Questions of the form “which of these are not relevant?” or “given <situation>, which of these causes can we rule out?” are probably useful for training gearsy understanding, and reflect how the models are used in the real world.
Debugging-style questions, i.e. “system X has weird malfunction Y, what’s likely going on, and what test should we try next?”. This is another one which reflects how gearsy models are used in the real world.
For unknown unknowns, questions like “Here’s a solution to problem X; what’s wrong with it?”. (Also relevant for distinguishing experts from non-experts.)
For jargon and the like, maybe copy some sentences or abstracts from actual papers, and then translate them into layman’s terms or otherwise say what they mean.
Similarly, a useful exercise is to read an abstract and then explain why it’s significant/interesting (assuming that it is, in fact, significant/interesting). This would mean connecting it to the broader problems or applications to which the research is relevant.
For recognizing experts, I’d recommend exercises like “Suppose you want to find someone who can help with problem X, what do you google for?”.
Cautionary note: I have never made heavy use of exercises-intended-as-exercises (including flashcards), other than course assignments. I brainstormed these exercises to mimic the kinds of things I naturally ended up doing in the process of pursuing the various hard problems we talked about earlier. (This was part of a conversation with AllAmericanBreakfast, where we talked about exercises specifically.) I find it likely that explicitly using these sorts of exercises would build similar skills, faster.
Summary
Problems-we-basically-understand are usually solved by executing specialized strategies which are already known to usually work. Problems-we-basically-don’t-understand are exactly those for which such strategies fail. Because the specialized techniques fail, we have to fall back on more general-purpose methods and models. To specialize in problems-we-basically-don’t-understand, specialize in skills and knowledge which generalize well.
To learn the sort of skills and knowledge which are likely to generalize well to new, poorly-understood problems, it’s useful to have a fairly-wide variety of problems which you basically don’t understand or know how to solve. Then, prioritize techniques and models which seem relevant to multiple such problems. The problems also provide natural applications in which to test new techniques, and in particular to test the crucial skill of recognizing (without prompting) situations-in-the-wild in which the technique applies.
This sort of practice differs from the exercises often seen in classes, which tend to focus more on reliable execution of fixed strategies. Such exercises make sense for problems-we-basically-understand, since reliable execution of a known strategy is the main way we solve such problems. But learned skills and knowledge have different use-cases for problems-we-basically-don’t-understand, and these use-cases suggest different kinds of exercises. For instance, take a theorem, and try to find a system to apply it to which is qualitatively different from anything you’ve seen before. Or, try to translate the abstract of a paper into layman’s terms.
I’ve ended up doing things like this in the pursuit of a variety of problems in the wild, and I find it likely that explicit exercises could build similar skills faster.