Please Understand

In which a case is made for worrying about the AI Prompt Box.

Preamble

Technology serves to abstract away nonessential aspects of creative activities, giving us more direct access to their conceptual cores. Few audio engineers pine for the days of flaky reel-to-reel tape machines that unspool at the worst moments; few graphic designers long to swap their MacBooks for bulky old photostat rigs; few mathematicians grieve for the slide rule or the log table.

Yet domain understanding survived those leaps to digital abstraction. Music producers working entirely ‘in the box’ still know and/or intuit dynamics, frequency equalisation, melody and harmony. Photoshop natives still know and/or intuit colour theory, visual communication, the rules of composition. Recent mathematics and physics graduates experience the beauty of Euler’s identity, its vast arms linking trigonometry to arithmetic to analysis to the complex plane, just as vividly as their predecessors did a century ago. Indeed, with the time these modern creatives save by not having to re-ravel 1/4″ tape, wrestle with Zipatone and fixative or pore over columns of logarithms (to say nothing of their access to new tools), they can elevate their understanding of their fields’ fundamentals[1].

The GenAI Prompt Box declares itself the asymptote of this march to abstraction: a starry empyrean of ‘pure’, unfettered creative actualisation, in every medium, on the instant, where all pesky concrete details are swept away and the yellow brick road to Perfect Self-Expression is illuminated like a kaleidoscopic runway.

Here are some problems with that dream.

The Worse Angels of Our Nature

Consider the normal distribution of intellectual curiosity in the human population.

The tail at the far right of the curve is the minority whose genius and drivenness guarantee that they will seek and find whatever high-hanging epistemic fruit it was their destiny to pluck, no matter how alluring the paths of less resistance on offer.

On the left is the opposing minority whom no amount of goading could make curious. They want simple, earthly pleasures; they are not prepared to invest near-term time towards a long-term goal. They don’t care how things work. (And that’s fine.)

The majority of us are in the middle, on the bell: we who are naturally curious and eager to understand the world, but also fallible and time-poor and often money-poor. On a bad or short or busy day, we just don’t have the mental wherewithal to engage with the universe’s underpinnings. But if we happen to read a good pop-sci/cultural history book on a good day, and the author blesses us with a (temporary, imperfect) understanding of some heady concept like Special Relativity or convolution or cubism, our brains sparkle with delight. We are rewarded for our effort, and might make one again.

For us willing-of-spirit but weak-of-flesh, the multimodal Prompt Box is a seductive, decadent thing. Silver-tongued, it croons an invitation to use it, abuse it, at no cognitive cost, with no strings. It will ‘write’ and ‘record’ a song for us (which we can then claim as ours), making zero demands on our lyrical or music-theoretical acumen. It will ‘paint’ a picture with content and style of our choosing (but its doing), letting us ignore such trifles as vanishing point, brush technique, tonal balance. It will write copyright-free code for a fast Fourier transform in n dimensions, shielding us from the tedium of discretisation, debugging, complex mathematics and linear algebra.

This will feel liberating for a while, giving us bell-dwellers what seem to be new, godlike powers: to bypass the nuisance of technical mastery, to wire the inchoate creative urge directly to its fulfilment. But each time we reach for the Prompt Box — where in the AI-less counterfactual we stood a nonzero chance of getting our thumbs out and at least trying to understand the concepts behind what we want — we will have a) lost another opportunity to understand, b) received a superficially positive, encouraging reward for our decision to let the machine do the understanding for us, increasing the probability of our doing so again.

The curiosity distribution will drift — maybe only slightly, but inevitably — toward the incurious.

After years or generations of this, our endogenous ability to understand will atrophy.
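The feedback loop in this section can be caricatured in a few lines of Python. This is only a toy sketch, not a model of real behaviour: the function name `simulate_reliance`, the starting probability and the `reward` nudge are all invented for illustration, and the update rule is a mean-field approximation (it tracks the expected probability rather than simulating individual choices). It assumes that each use of the Prompt Box nudges us a little further toward using it again, and that doing the work ourselves leaves the habit unchanged.

```python
def simulate_reliance(p0: float, reward: float, steps: int) -> list[float]:
    """Expected probability of reaching for the Prompt Box at each step.

    p0     -- hypothetical starting probability of delegating (0..1)
    reward -- hypothetical size of the nudge each delegation gives (0..1)
    steps  -- number of decisions to simulate
    """
    if not (0.0 <= p0 <= 1.0 and 0.0 <= reward <= 1.0):
        raise ValueError("p0 and reward must lie in [0, 1]")
    history = [p0]
    p = p0
    for _ in range(steps):
        # With probability p we delegate and are nudged toward delegating again;
        # with probability 1 - p we do the work ourselves and p stays put.
        # Averaging over both cases gives the expected update below.
        p = p + p * reward * (1.0 - p)
        history.append(p)
    return history


if __name__ == "__main__":
    trajectory = simulate_reliance(p0=0.2, reward=0.05, steps=200)
    print(f"start: {trajectory[0]:.3f}, end: {trajectory[-1]:.3f}")
```

Under these assumptions the probability can only ratchet upward, because nothing in the toy rewards the effort of understanding. Adding such a term would be the obvious way to argue with the sketch.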

Lingua Franca

Language, for all its capacity for beauty and meaning, is a spectacularly low-bandwidth channel[2].

Even we, who invented it and use it to communicate intersubjectively, regularly feel language’s limitations. When we pause or stammer because we can’t quite find the words to express an idea shining clearly in our minds; when we struggle to express verbally our love (or hate) for a work of art or a person because no words are sharp or big or hot or fast or loud enough to do the job; when misunderstandings lead to violence or death.

We even use language to talk to ourselves, most of the time. But the famous flow state, coveted by artists[3] and scientists and meditators of every stripe, lets us dispense with the bottleneck of translation between verbal language and the brain’s native encoding. In the flow, time feels irrelevant. Thinking and acting feel frictionless; superconductive; efficient[4].

It’s not crazy to suspect that deep understanding and deep creativity mostly or only occur in this ‘superconducting’ phase; the non-flow, language-first state might be prohibitively slow.

It’s therefore not crazy to worry that a human/AI collaboration in which all intra-system communication is by natural language might fundamentally lack the access to flow that a purely human system has[5].

Baby With Bathwater

It is astonishing that current AI, which in its descendants’ shadow will seem laughably primitive, can already shoulder cognitive burdens that until recently required significant human effort.

From afar, this looks like a clear net good. But up close, its trajectory is crashing through a checkpoint beyond which something essential will be leached from us.

Abstracting is understanding. To see how a particular thing generalises to a broader concept is to understand at least some of that concept. And this understanding is cumulative, iterative. It begets more of itself.

Constraints are a kind of paradox here. On one hand, to understand a conceptual model — its dynamics, its rules, its quirks, its character — is to be constrained by the model being thus and not otherwise. But one who has real insight into a system, constraints and all, commands a far bigger, richer search space than one who doesn’t. (The list of scientific and cultural breakthroughs from people with no relevant training or practice is short.)

Via the ‘unconstrained’ Prompt Box, we are offloading not just the tedious concrete instantiation of interesting concepts, but also the instructive process of abstracting those instantiations into those concepts, and therefore understanding itself, to machines whose methods of abstraction are almost totally opaque to us.

It’s too much delegation. We’re going from hunger to sausage without touring the factory, and the sausage courier compliments us on our excellent butchery. Is that a system for improving sausages?

Enter the Steel Men

“Humans have been outsourcing understanding to other humans since the dawn of civilisation, and expressing in natural language what they want from the rented expertise. A Madison Avenue executive with a great campaign idea farms its execution out to a design team, and they communicate in English. Is GenAI any different?”

It is. Humans have been outsourcing their understanding to other humans who understand, and who iterate upon that understanding. The amount of understanding in the world[6] doesn’t fall in this scenario, as it would if the advertising exec enlisted a GenAI to create their campaign; it is merely distributed. And humans regularly augment natural language with other communication channels: architectural plans, equations, body language, chemical formulae[2].

“How do you know that a world of fully machine-offloaded understanding is bad?”

I don’t, but history offers plenty of circumstantial evidence. Strong correlations between obscurantism and misery (Mao’s Cultural Revolution, Stalin’s purges, the Cambodian genocide, McCarthyism, QAnon, ISIS, the Taliban, the Inquisition, the Third Reich, etc.) suggest that a drop in understanding has catastrophic consequences.

Even if GenAI really understands the concepts whose concrete instantiations it abstracts away from us (and many experts doubt this), it cannot share that understanding back with us. The iterative chain is broken.

“Maybe, freed from the shackles of having to learn how what we know about now works, we’ll be able to learn how wild new things work. Things you and I can’t even imagine.”

And then slothful/distractible human nature will kick in (see the “Worse Angels” section above), and we’ll get the machines to abstract away understanding of the wild new things as well. Also, in the absence of human understanding and with the wild new things being unimaginable, who will discover them?

“In the preamble, you give examples of insight surviving the radical abstracting-away of some of its concrete prerequisites by technology, and even of this abstracting-away freeing up more resources for further understanding. Why are you now saying that in the limit, abstracting-away goes in the opposite direction?”

The Prompt Box has already ferried us past the checkpoint that reverses these dynamics, beyond which offloaded abstraction no longer facilitates new understanding. Typing “a beautiful pastoral nighttime scene, in the style of Cézanne, with robot farmers tending to giant cow/spider hybrids, three neon moons and a wooden space station in the sky, ultra detailed”, and getting a convincing depiction back, gives zero insight into art or art history or painting or arachnology or mammalogy or agriculture or robotics or hybridisation or fluorescence or astronomy or orbital mechanics or carpentry or computer graphics or generative artificial intelligence.

Conclusion

A curious human begins the journey to understanding by randomly perturbing an existing concrete system. The relationship between the perturbations and their consequences reveals regularities: the human has discovered an abstraction. Informed by this abstraction, the human builds a new concrete system and perturbs it, this time a little less randomly. It performs a little better.
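For readers who think in code, the loop just described can be sketched as a toy. Everything here is an assumption made for illustration: `hidden_system` stands in for some concrete system whose workings the learner cannot see, `explore` is the curious human, and `bias` is a crude stand-in for the abstraction, namely a running sense of which kind of perturbation has tended to help.

```python
import random


def hidden_system(x: float) -> float:
    """Hypothetical concrete system; the learner only sees its outputs."""
    return -(x - 3.0) ** 2  # performs best at x = 3, where the score is 0


def explore(start: float, rounds: int, seed: int = 0) -> list[float]:
    """Perturb, observe, abstract, repeat. Returns the best score after each round."""
    rng = random.Random(seed)
    x, score = start, hidden_system(start)
    bias = 0.0  # learned regularity: the direction that has tended to help
    history = [score]
    for _ in range(rounds):
        # Perturb the current system: partly at random, partly guided by
        # whatever regularity has been learned so far.
        step = rng.gauss(bias, 1.0)
        candidate = x + step
        result = hidden_system(candidate)
        if result > score:
            # The perturbation helped: keep the new system and lean
            # further in that direction next time.
            x, score = candidate, result
            bias = 0.5 * bias + 0.5 * step
        else:
            # It did not help: hold the belief a little less firmly.
            bias = 0.5 * bias
        history.append(score)
    return history


if __name__ == "__main__":
    scores = explore(start=-5.0, rounds=50)
    print(f"first score: {scores[0]:.2f}, last score: {scores[-1]:.2f}")
```

The point of the sketch is the coupling: each concrete trial updates the abstraction, and the abstraction shapes the next concrete trial. Handing either half to a black box cuts the loop.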

There is constant interaction and feedback between the concrete and abstract levels. That is what unites and defines understanding, insight, play, imagination, iterative improvement, wonder, science, art, and basically all the Good Stuff humans can offer the world, sparsely distributed among the other, mostly horrible stuff we’re currently doing to it.

By indiscriminately abstracting everything away, the Prompt Box will push us away from this tightly coupled, looping system that is the source and hallmark of ingenuity.

We might want to push back.

  1. While leaving plenty of nonjudgemental space for purists and nostalgics to play around with vinyl records, film photography, sextants and Letraset if they please.

  2. This is a problem specific to models with textual front-ends; it could be resolved by a multimodal front-end and/or grim, neurally invasive means.

  3. Even writers, strangely.

  4. It is speculative and not universally agreed that flow state is less verbal than normal brain states, but the distinction in Gold and Ciorciari’s paper between the explicit/linguistic and implicit/flow processes strongly suggests it.

  5. The topology of an LLM front-end collaboration is flat, its time steps discrete:

     Human types text; machine renders media.
     Human types text; machine refines media.
     Human types text; machine refines media.

     We know little about how the brain does ideation, but it’s probably continuous in time and it probably mobilises feedback loops that make it an altogether less linear affair than the stilted conversation between carbon Promptsmith and silicon magus.

  6. If you think LLMs have or will develop (or their successors will have or will develop) a real, recursive, symbolic world-model, preface ‘world’ with ‘human’.