Teaching the Unteachable

Previously in series: Unteachable Excellence
Followup to: Artificial Addition

The literary industry that I called “excellence pornography” isn’t very good at what it does. But it is failing at a very important job. When you consider the net benefit to civilization of Warren Buffett’s superstar skills, versus the less glamorous but more communicable trick of “reinvest wealth to create more wealth”—there’s hardly any comparison. You can see how much it would matter, if you could figure out how to communicate just one more skill that used to be a secret sauce. Not the pornographic promise of consuming the entire soul of a superstar. Just figuring out how to reliably teach one more thing, even if it wasn’t everything...

What makes a success hard to duplicate?

Naked statistical chance is always incommunicable. No matter what you say about your historical luck, you can’t teach someone else to have it. The arts of seizing opportunity, and exposing yourself to positive randomness, are commonly underestimated; I’ve seen people stopped in their tracks by “bad luck” that a Silicon Valley entrepreneur would drive over like a steamroller flattening speed bumps… Even so, there is still an element of genuine chance left over.

Einstein’s superstardom depended on genetics that gave him the potential to learn his skills. If a skill relies on having that much brainpower, you can’t teach it to most people… Though if the potential is one-in-a-million, then six thousand Einsteins around the world would be an improvement. (And if we’re going to be really creative, who says genes are incommunicable? It just takes more advanced technology than a blackboard, that’s all.)

So when we factor out the genuinely unteachable, what’s left? Where can you push the border? What is it that might be possible to teach (albeit perhaps with great difficulty) and isn’t being taught?

I was once told that half of Nobel laureates were the students of other Nobel laureates. This source seems to assert 155 out of 503, or about 31 percent. (Interestingly, the same source says that the number of Nobel laureates with Nobel “grandparents” (teachers of teachers) is just 60.) Even after discounting for cherry-picking of students and political pull, this suggests to me that you can learn things by apprenticeship (close supervision, free-form discussion, ongoing error correction over a long period of time) that no Nobel laureate has yet succeeded in putting into any of their many books.

What is it that the students of Nobel laureates learn, but can’t put into words?

This subject holds a fascination for me, because of how it delves into the meta, the source behind, the gap between the output and the generator. We can explain Einstein’s General Relativity to students, but we can’t make them Einstein. (If you look at it from the right angle, the whole trick of human intelligence is just an incommunicable insight that humans have and can’t explain to a computer.)

The amount of wordless intelligence in our work tends to be underestimated because the words themselves are so much easier to introspect on. But when I’m paying attention, I can see how much of my searchpower takes place in fast flashes of perception that tell me what’s important, which thought to think next.

When I met my apprentice Marcello he was already better at mathematical proof than myself, certainly much faster. He’d competed at the national level—but in competitions like that you get told which problems are important. (And also in competitions, you instantly hand in the problem when you’re done, and rush on to the next one; without looking over your proof to see if you can simplify it, see it at a glance, learn something more.) But the really critical thing I was trying to teach him—testing to see if it could even be taught at all—was this sense of which AI problems led somewhere. “You can pedal as well as I can,” I said to him early on when he asked how he was doing, “but I’m still doing ninety percent of the steering.” And it was a constant, tremendous struggle to put anything into words about why I thought that we hadn’t yet found the really important insight that was lurking somewhere in a problem, and so we were going to discard Marcello’s current proof and reformulate the problem and try again from another angle, to see if this time we would really understand something.

We go through our life events, and our brain uses an opaque algorithm to grind the experiences to grist, and outputs yet another opaque neural net of circuitry: the procedural skill, the source of wordless intuitions that you know so fast you can’t see yourself knowing them. “The zeroth step”, I called it, the step in reasoning that comes before the first step and goes by so quickly that you don’t realize it’s there.

I pride myself on being good at putting things into words, at being able to introspect on the momentary flashes and see their pattern and trend, even if I can’t print out the circuitry that is their source. But when I tried to communicate my cutting edge, the borderline where I advanced my knowledge—then my words were defeated, and I was left working with Marcello on problem after problem, hoping his brain would pick up that unspoken rhythm of the steering: Turn left, turn right; this is probably worth pursuing, this is not; this seems like a valuable insight, this is just a black box around our ignorance.

I’d expected it to go like that; I’d never had the delusion that the most important parts of thought would be easy to put in words. If it were that simple we really would have had Artificial Intelligence in the 1970s.

Civilization gets by on teaching the output of the generator without teaching the generator. Einstein output his various discoveries, and then the generated knowledge is verbal enough to be passed on to students in university. When another Einstein is needed, civilization just holds its breath and hopes.

But if these wordless skills are the product of experience—then why not communicate the experiences? Or if fiction isn’t good enough, and it probably isn’t even close, then why not duplicate the experiences—put people through the same events?

(1) Superstars may not know what their critical experiences were.

(2) The critical experiences may be difficult to duplicate—for example, everyone already knows the answer to Special Relativity, and now we can’t train people by giving them the problem of Special Relativity. Just knowing that it has something to do with space and time shifting around, is already too much of a spoiler. The really important part of the problem is the one where you stare at a blank sheet of paper until drops of blood form on your forehead, trying to figure out what to think next. The skills of genius are rare, I’ve suggested, because there is not enough opportunity to practice them.

(3) There may be luck or genetic talent involved in your brain hitting on the right thing to learn, that is, finding a solution of high quality in the space of wordless procedural skills. Even if we put you through the same experiences, there are components of true chance and genetic talent left over in having your brain learn the same wordless skill.

But I think there’s still reason to go on trying to describe the indescribable and teach the unteachable.

Consider the transition in gambling skill associated with the invention of probability theory a few centuries back. There’s still a leftover art to poker, wordless skills that poker superstars can only partially describe in words. But go back far enough, and no one would have any idea how to calculate the odds of rolling three dice and coming up with all ones. And maybe an experienced enough gambler would have a wordless intuition that some things were likelier than others, but they couldn’t have put it into words—couldn’t have told anyone else what they’d learned about the chances; except, maybe, through a long process of watching over an apprentice’s shoulder and supervising their bets.
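To make concrete what "calculating the odds" looks like once the skill has become publishable, here is a minimal sketch in Python. It assumes fair six-sided dice and simply enumerates every possible roll; the helper name `probability` and the comparison of sums 9 and 10 are my own illustrations, not anything a historical gambler wrote down.

```python
from fractions import Fraction
from itertools import product


def probability(event, dice=3, sides=6):
    """Exact probability that `event` holds for one roll of `dice` fair dice.

    `event` is any function that takes a tuple of faces and returns a bool.
    (Hypothetical helper, written for this illustration.)
    """
    outcomes = list(product(range(1, sides + 1), repeat=dice))
    favorable = sum(1 for roll in outcomes if event(roll))
    return Fraction(favorable, len(outcomes))


# The case from the text: three dice, all showing one.
all_ones = probability(lambda roll: all(face == 1 for face in roll))

# The kind of small difference an experienced gambler might sense without
# being able to state it: a total of 10 is slightly likelier than a total of 9.
total_nine = probability(lambda roll: sum(roll) == 9)
total_ten = probability(lambda roll: sum(roll) == 10)

print(all_ones)    # 1/216
print(total_nine)  # 25/216
print(total_ten)   # 1/8
```

Once the counting argument exists, anyone who can follow it gets the same answer without years of watching bets; that is the transition from apprenticeable to publishable in miniature.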

The more we learn about a domain, and the more we systematically observe the stars at work, and the more we learn about the human mind in general, the more we can hope for new skills to make the transition from unteachable to apprenticeable to publishable.

And you can hope to trailblaze certain paths, even if you can’t set down the whole path in words. Even if you yourself got somewhere through luck (including genetic luck), you can hope to diminish the role of luck on future occasions:

(A) Warning against blind alleys that delayed you is one obvious sort of help.

(B) If you lay down a set of thoughts that are the product of wordless skills, someone reading through the set of thoughts may find their brain picking up the rhythm, making the leap to the unspoken thing behind; and this might require less luck than the events that led to your own original acquisition of those wordless skills.

(C) There are good attractors in the solution-space—clustered sub-solutions which make it easier to reach other solutions in the same attractor. Then—even if some of the thoughts can’t be put into words, and even if it took a lot of luck to wander into the attractor the first time through—describing everything that can be put into words, may be enough to anchor the attractor.

(D) Some important experiences are duplicable: for example, you can advise people what areas to study, what books to read.

(E) And finally, the simple advance of science may just describe a domain better, so that you realize what it is you know, and are suddenly able to communicate it outright.

And of course the punchline is that this is the transition I hope to see in certain aspects of human rationality—skills which have been up until now unteachable, or only passed down from master to apprentice. We’ve learned a lot about the domain in the past few decades, and I think it’s time to take another shot at systematizing it.

I aspire to diminish the role of luck and talent in producing rationalists of a higher grade.