In the 70s and 80s, Kahneman and Tversky did a bunch of pioneering research on heuristics and biases in human thought. Then, in *Thinking, Fast and Slow*, Kahneman divided human cognition into System 1 and System 2 - basically, System 1 applies quick heuristics which are prone to biases, and System 2 does the slow, effortful thinking.

But what does System 2 actually add to the theory in terms of explanatory power? Consider an alternative version of *Thinking, Fast and Slow* in which Kahneman wrote something like “Here are the conditions in which humans use this mode of reasoning I’m calling System 1, which is fast and approximate and effortless and uses heuristics and demonstrates biases which can be detected in certain ways. The rest of the time, I have no idea what’s going on, except that it doesn’t display the traits that would qualify it as System 1 inference.” In what ways would this be less informative than his actual claims?

As I recall, Kahneman is somewhat careful to avoid presenting S1/S2 as part of a dual process theory, and in doing so naturally cuts off some of the chance to turn around and use S2 causally upstream of the things he describes. I think you are correctly seeing that Kahneman is very careful in how he writes, such that S1/S2 are not gears in his model so much as post hoc patterns: convenient labels for what are, in his model, isolated behaviors that share certain traits, without his having to propose a unifying causal mechanism.

Nonetheless, I think we can identify S2 roughly with the neocortex and S1 roughly with the rest of the brain, and understand S1/S2 behaviors as those primarily driven by activity in those parts of the brain. Kahneman is just careful, in my recollection, to avoid saying things like that because there’s no hard proof for it, just inference.

For me there are two key components: the transition of a task from an S2 task to an S1 task through repetition and hypothesizing/internalising heuristics, and the use of S1 subtasks to solve more difficult S2 tasks. As an example, consider how mathematical operations move from being S2 to S1 in human learning processes.

Consider a child that can count up and down on the integers, i.e. given an integer, they can apply the increment function to get the next integer, or the decrement function to get the previous one. This is an S1 task, where the result of the operation is taken as “just-so”. At that moment addition is still an S2 task for them, and one they solve through repeated application of S1 subtasks: one approach to solving A+B is to repeatedly increment A and decrement B until B=0, at which point the incremented A is the answer.
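To make the counting procedure concrete, here is a minimal sketch in Python. The function names (`increment`, `decrement`, `add_by_counting`) are my own labels for illustration, and I'm assuming B is a non-negative integer, since the loop as described would never terminate otherwise.

```python
def increment(n: int) -> int:
    """S1 primitive: the next integer."""
    return n + 1


def decrement(n: int) -> int:
    """S1 primitive: the previous integer."""
    return n - 1


def add_by_counting(a: int, b: int) -> int:
    """Compute a + b using only the two counting primitives.

    Assumes b >= 0 (an assumption of this sketch, not of the comment above).
    """
    if b < 0:
        raise ValueError("this sketch only handles non-negative b")
    while b != 0:
        a = increment(a)  # count up once on the running total
        b = decrement(b)  # count down once on what is left to add
    return a


print(add_by_counting(3, 4))  # 7
```

The "S2" part is the loop and its bookkeeping; every individual step inside it is something the child already does without effort.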

With enough practice, the child learns the basic rules of addition, and it becomes so deeply ingrained that addition is now an S1 task. Multiplication, however, is still S2 to them, but might be solved like this: to multiply A and B, start with C=0, then decrement A every time you add B to C. Once A reaches 0, C holds the product of the original A and B. Through enough repetition, they internalise this algorithm (or learn many examples of it by rote) and multiplication might be an S1 task now.
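A rough Python sketch of that multiplication procedure, with addition now treated as a built-in primitive (`+`) to reflect that it has become S1. The function name is hypothetical, and I'm assuming A is a non-negative integer so the countdown terminates.

```python
def multiply_by_repeated_addition(a: int, b: int) -> int:
    """Compute a * b by adding b to an accumulator a times.

    Assumes a >= 0 (an assumption of this sketch).
    """
    if a < 0:
        raise ValueError("this sketch only handles non-negative a")
    c = 0
    remaining = a  # count down on a copy so the original a is untouched
    while remaining != 0:
        c = c + b       # addition is a single effortless step at this stage
        remaining -= 1
    return c


print(multiply_by_repeated_addition(3, 4))  # 12
```

Note that the loop counts down on a copy of A, so the accumulator ends up holding the product of the original inputs.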

By now you can hopefully see where I’m going—exponentiation is the analogous S2 task on the next level up, and there’s an algorithm a learner might perform to decompose it into a sequence of S1 tasks. (Of course, outside the realm of mathematics S2 tasks may be much fuzzier, e.g. puzzling over ethical dilemmas.)
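The exponentiation algorithm isn't spelled out above, so the following is just one plausible version, built by analogy with the multiplication procedure: repeated multiplication, with `*` standing in for multiplication-as-S1. The function name and the restriction to non-negative integer exponents are assumptions of this sketch.

```python
def power_by_repeated_multiplication(base: int, exponent: int) -> int:
    """Compute base ** exponent by multiplying an accumulator by base
    exponent times.

    Assumes exponent >= 0 (an assumption of this sketch).
    """
    if exponent < 0:
        raise ValueError("this sketch only handles non-negative exponents")
    result = 1  # the empty product, so anything to the power 0 gives 1
    remaining = exponent
    while remaining != 0:
        result = result * base  # multiplication is the S1 step at this level
        remaining -= 1
    return result


print(power_by_repeated_multiplication(2, 5))  # 32
```

The shape is the same at every level: a slow, explicit loop whose individual steps are operations that were themselves slow loops one level down.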

The interesting thing about this (to me) is that the transition from S2 task to S1 task is the critical time where systematic errors and biases may be introduced. I see this as analogous to how a neural net can underfit/overfit training data, depending on the heuristics that are learned. In this analogy, training a neural network turns a difficult S2 task into an S1, black-box-esque input/output mapping. This can provide rapid “intuitive” results for us in the same way as S1 human thinking does, but it is similarly error-prone.