i’ve been thinking about what recursively improving intelligence actually is. it’s been helpful for me to see it as having three parts:
substrate—the thing which can store improvements
environment—the thing which gives feedback
learning loop—the mechanism for converting feedback into improvements
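to make the framing concrete, here’s a toy sketch (the setup, fitting a single number to a target, is mine and purely illustrative):

```python
substrate = {"w": 0.0}              # substrate: the thing that stores improvements

def environment(w):                 # environment: the thing that gives feedback
    target = 3.0                    # a hidden "truth" the system is pushed toward
    return target - w               # feedback: how far off we currently are

def learning_loop(steps=100, lr=0.1):
    # learning loop: converts feedback into stored improvements
    for _ in range(steps):
        error = environment(substrate["w"])
        substrate["w"] += lr * error
    return substrate["w"]

print(learning_loop())              # converges toward 3.0
```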
you can see LLM pre-training in this light. the substrate is the neural net, the environment is internet text, the learning loop is gradient descent (shaped by the transformer architecture).
you can see LLM post-training in this light. the substrate is the neural net (the outer layers in particular), the environment is often reasoning traces/text, the learning loop is again gradient descent.
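pre- and post-training both instantiate the same loop. a minimal pytorch sketch, with random (prev_token, next_token) pairs standing in for real text; nothing here is a real training recipe:

```python
import torch
import torch.nn as nn

vocab, dim = 100, 32

# substrate: the network's weights, where improvements accumulate
model = nn.Sequential(nn.Embedding(vocab, dim), nn.Linear(dim, vocab))
opt = torch.optim.SGD(model.parameters(), lr=0.1)
loss_fn = nn.CrossEntropyLoss()

# environment: a fixed, static batch standing in for internet text
pairs = torch.randint(0, vocab, (64, 2))  # (prev_token, next_token)

# learning loop: gradient descent on next-token prediction
for step in range(100):
    logits = model(pairs[:, 0])
    loss = loss_fn(logits, pairs[:, 1])
    opt.zero_grad()
    loss.backward()
    opt.step()
```

the point of the sketch: the environment is a dataset that never talks back, which is what makes this the static case.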
you can see LLM-based assistants in this light. the substrate is the conversation context/messages, the environment is the user’s responses, the learning loop is the way the LLM-core converts user responses into conversation progress. the substrate here (message history) is obviously less rich than in LLM training (a big neural net), the learning loop (LLM responses to user messages) is more inconsistent/weaker, and the environment (user messages) may be of similar quality to internet text (though it depends on the aim).
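the same three parts in the assistant case, where llm() is a hypothetical stand-in for any chat-completion call, not a particular API:

```python
def llm(messages):
    # placeholder: a real system would call a model here
    return f"(model reply to: {messages[-1]['content']})"

messages = []                       # substrate: the conversation context
while True:
    user_msg = input("> ")          # environment: the user's responses
    if not user_msg:
        break
    messages.append({"role": "user", "content": user_msg})
    reply = llm(messages)           # learning loop: the LLM-core converting
    messages.append({"role": "assistant", "content": reply})  # feedback into progress
    print(reply)
```

note that everything "learned" lives only in messages and evaporates when the conversation ends, which is the thin-substrate point.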
LLM pre- and post-training are so far the two main recursively improving intelligences we’ve seen make great strides. but neither has a live rather than static environment; i.e. both are supervised learning.
if we want recursive digital intelligence in live environments, we need a rich substrate which can update live and tastefully as it explores. this suggests to me that the path toward ever-more-capable digital intelligence will come from something as rich as a neural net updating live in response to interacting with people. research in sparse NN online learning therefore seems interesting as a capabilities path here, even if nascent.
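in miniature, live updating might look like this, with an invented interaction stream standing in for real users (one weight update per interaction, no fixed dataset; the linear model and made-up feedback are mine, purely illustrative):

```python
import numpy as np

rng = np.random.default_rng(0)
w = np.zeros(4)                         # substrate: weights that update live

def interaction_stream():               # environment: live, not a static corpus
    while True:
        x = rng.normal(size=4)
        # a hidden "user preference" the system only sees through feedback
        yield x, float(x @ np.array([1.0, -2.0, 0.5, 3.0]))

# learning loop: one online update per interaction as it arrives
for t, (x, feedback) in enumerate(interaction_stream()):
    pred = w @ x
    w += 0.05 * (feedback - pred) * x
    if t >= 200:
        break

print(w)  # approaches the hidden preference vector
```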
just spelling out my thought path on this fine monday evening. i realize it may already be well trodden.