Deliberative Cognitive Algorithms as Scaffolding

As rationalists, we are interested in finding systematic techniques that boost our effective intelligence. Because we tend to be mathematical thinkers, we usually look for precise algorithmic solutions, such as applying the laws of Bayesian probability to compute optimal inferences when this is feasible. I will call this the “rationalist project” because it seems like a reasonable candidate for the title (not because it is the only such project, or necessarily the primary one). There are other natural ways to increase intelligence (neuroscientific or pharmaceutical tricks), but at least for me these are not as intellectually satisfying, and they also seem unlikely to yield large percentage or qualitative gains for most people. Pursuing our project, we often borrow from other (academic) fields, most notably artificial intelligence, cognitive science, and economics. None of these is really suitable.

Artificial intelligence studies how to build intelligent artifacts. In the early days, this had something to do with studying logic, heuristics, and abstractions, which are tools that humans can and did apply (though I am not so sure GOFAI increased our ability to apply them ourselves, except through loose analogies). Nowadays, the goals of A.I. research have started to come apart from the goals of the rationalist project. The art of distributing training across GPUs, choosing architectures that fully utilize them, constructing stable gradients, etc., has no direct relationship to the project. Building a superintelligence does not necessarily require the same skills as bootstrapping yourself into one without access to your own hardware. I think this divergence of goals is sometimes ignored on LessWrong, because A.I. seems more interesting and important the more it succeeds. But in many ways, the shifts in focus that drove its success also make it less central to the project.

Cognitive science is much closer to the project by its nature. However, as a science, it focuses primarily on how people do think, and not on how they should. The cognitive science research that does address how people should think tends to be a little less algorithmic and rigorous than work in A.I., at least in my limited experience. One exception is this excellent paper from Josh Tenenbaum, which I will return to in this sequence: https://cocosci.princeton.edu/tom/papers/OneAndDone.pdf

Bayesian decision theory is the central mathematical framework of rationalist thought, and one of many useful tools adopted from economics. However, it is a mathematical specification of optimal decision making, not an algorithm for optimal decision making. Jaynes contributed much to the foundations of the project, but even his goals in Probability Theory as Logic were (explicitly!) not the goals of the project. He imagines ideal reasoning as carried out by an ideal “robot” reasoner unconstrained by computational limits. Economists do, of course, consider models of decision making under limited computation. However, even this does not precisely describe our situation as human reasoners: because much of our processing is unconscious, we would not be capable of adopting the optimal algorithm for a decision problem even if we knew it! Instead we must adopt the best algorithm that we can actually run, given access to the parts of our minds that we don’t consciously control.
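
To make the specification/algorithm distinction concrete, here is a minimal toy sketch (entirely my own; the hypotheses, utilities, and the act-on-one-sample heuristic are illustrative assumptions, the last loosely echoing the sampling strategies studied in the paper linked above). The specification says: compute the exact posterior and maximize expected utility. A bounded reasoner might instead draw a single sample and act as if it were the truth.

```python
# Toy contrast between the Bayesian *specification* of a decision problem
# and an *algorithm* a bounded reasoner could actually run. Illustrative only.

import random

# A two-hypothesis world: the coin is either fair or biased toward heads.
PRIOR = {"fair": 0.5, "biased": 0.5}
HEADS_PROB = {"fair": 0.5, "biased": 0.8}  # P(heads | hypothesis)

# Utility of each bet under each hypothesis.
UTILITY = {
    "fair":   {"bet_heads": 0.0, "bet_tails": 0.0},
    "biased": {"bet_heads": 1.0, "bet_tails": -1.0},
}

def posterior(heads, flips):
    """The specification: exact Bayesian posterior over hypotheses."""
    unnorm = {h: PRIOR[h] * HEADS_PROB[h] ** heads * (1 - HEADS_PROB[h]) ** (flips - heads)
              for h in PRIOR}
    z = sum(unnorm.values())
    return {h: p / z for h, p in unnorm.items()}

def optimal_action(post):
    """The specification: maximize exact posterior expected utility."""
    actions = UTILITY["fair"].keys()
    return max(actions, key=lambda a: sum(post[h] * UTILITY[h][a] for h in post))

def one_sample_action(post):
    """A bounded algorithm: sample one hypothesis and act as if it were true.
    (In a truly bounded setting the sample would come from some cheap internal
    sampler rather than an exactly computed posterior.)"""
    h = random.choices(list(post), weights=list(post.values()))[0]
    return max(UTILITY[h], key=UTILITY[h].get)

post = posterior(heads=7, flips=10)
print(optimal_action(post))     # what the ideal "robot" computes
print(one_sample_action(post))  # what a resource-limited reasoner might do
```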

The project seeks deliberative cognitive algorithms: algorithms that we can learn and consciously execute. These algorithms execute in an environment with access to certain fixed resources: our senses are very good feature extractors, and our unconscious minds are very good pattern recognizers. The optimal deliberative cognitive algorithms would take advantage of these resources; but the internal workings of those resources may not be relevant (that is, they may be “screened off” by their teleology and limitations). Constructing such DCAs with black-box access to powerful unconscious modules is the goal of the rationalist project.

There is already a term for such DCAs: scaffolding.

This term was originally introduced in Scaffolded LLMs as natural language computers. I don’t really endorse the frame of that post, and I am interested in scaffolding various kinds of modules, not restricted to LLMs.

In complexity theory, we might call these “algorithms with oracle access.” So I will tend to say that DCAs have oracle access to modules.
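
As a cartoon of what such a scaffold might look like (the interface and the generate/judge split are my own illustrative assumptions, not a claim about how minds or any particular system are organized), the deliberative part is an explicit, consciously executable loop, and the modules are opaque callables it can only query:

```python
# Cartoon of a deliberative cognitive algorithm (DCA) with oracle access to
# black-box modules. The module interface here is purely illustrative.

from typing import Callable

Generator = Callable[[str], str]    # black box: proposes a candidate answer to a prompt
Evaluator = Callable[[str], float]  # black box: returns a gut-level plausibility score

def deliberate(problem: str, generate: Generator, judge: Evaluator, rounds: int = 3) -> str:
    """A consciously executable scaffold: ask the generator for candidates,
    ask the judge to score them, keep the best. The explicit loop is the DCA;
    everything clever happens inside the oracles."""
    best, best_score = "", float("-inf")
    for _ in range(rounds):
        candidate = generate(f"Propose a solution to: {problem}")
        score = judge(candidate)
        if score > best_score:
            best, best_score = candidate, score
    return best

# Trivial stand-in modules, just to show the calling pattern; a real scaffold
# would wire in an LLM or, in the human case, one's own unconscious faculties.
answers = iter(["guess A", "guess B", "guess C"])
scores = {"guess A": 0.2, "guess B": 0.9, "guess C": 0.5}
print(deliberate("a toy problem",
                 generate=lambda prompt: next(answers),
                 judge=lambda candidate: scores[candidate]))
```

The only point of the cartoon is that the scaffold sees the modules purely through their input/output behavior, which is exactly the sense in which their internals are screened off.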

Stated explicitly: scaffolding research advances the rationalist project because it tells us how best to consciously take advantage of our unconscious abilities.

I do not claim that scaffolding is the best way to achieve A.G.I. This is probably false. It is probably best for higher-level executive functions to interweave their computations closely with the modules they use, perhaps even to the extent that there is no obvious separation (though in the brain, I understand from Capabilities and alignment of LLM cognitive architectures that there is considerable functional separation). The fastest way to build an A.G.I. is probably to train the whole thing end to end in some fashion, so that it can take the best possible advantage of all available synergies between tasks. Scaffolding generative models to create agents resembles the long tradition of neuro-symbolic A.I. that never really works; I think it is based on the fantasy that if we don’t know how to build capability X into system M, we can fake it by adding capability X on top of system M. This is consistently a kludge, and I don’t know of any significant progress arising from it.

Indeed, the interweaving of deliberate executive functions with unconscious functions probably takes place in human minds as well. It’s possible that these are so inextricable that humans can’t effectively wield scaffolding methods, but I suspect not.

I want to explore the possibility that scaffolding is the right frame for the rationalist project. Over the course of this sequence, I will explore the theoretical and experimental sides of this thesis, attempting to establish scaffolding on firmer mathematical and scientific ground. Though this is not my primary goal, I will also explore the implications for alignment (at the risk of being caught in an affective death spiral, I believe scaffolding is relevant both to running our minds more effectively with black-box access to our unconscious resources AND perhaps to building safe agents with black-box access to un-agentic parts).

Let’s get started!