Not sure if I am even looking in the right direction, but in addition to token predictors, humans also have:
instincts;
feedback from reality.
Who knows, maybe without these two we wouldn’t really be better than GPT-3.
To implement feedback from reality, we would need some virtual environment, and special commands to “do” things in that environment. Doing things could have consequences; they could help or hurt the chatbot.
The chatbot could also “see” the environment, which could be implemented as special tokens inserted into its prompt. Chatbots could also observe each other’s actions (maybe only when they are close to each other).
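A minimal sketch of how such an environment loop might look, assuming everything is mediated by special tokens. All names here (ToyEnvironment, the observe:/do:/result: token formats) are invented for illustration, not any real API:

```python
# Hypothetical sketch: a tiny world where a chatbot "sees" via special
# observation tokens in its prompt and "does" things via command tokens.

class ToyEnvironment:
    """A toy world where acting has consequences for the agent."""

    def __init__(self):
        # object name -> effect on the agent's "health" (assumed values)
        self.objects = {"food": +1, "fire": -1}
        self.agent_health = 10

    def observe(self) -> str:
        # Observations are injected into the prompt as special tokens.
        return " ".join(f"observe:{name}" for name in self.objects)

    def act(self, command: str) -> str:
        # Commands like "do:eat food" have real consequences in the world.
        if command.startswith("do:eat "):
            target = command.removeprefix("do:eat ")
            if target in self.objects:
                self.agent_health += self.objects[target]
                return f"result:{target} health:{self.agent_health}"
        return "result:nothing"


env = ToyEnvironment()
prompt = env.observe()             # "observe:food observe:fire"
# ...a language model would generate the next command from this prompt...
feedback = env.act("do:eat food")  # the result is fed back into the prompt
```

The point is only the interface: the model never touches the world directly; it reads observation tokens and emits command tokens, and the environment answers with result tokens that help or hurt it.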
On top of that, implement instincts: something like very simple personalized extra training data. For example, to simulate getting burned by fire, add 1,000,000 copies of the string “observe:fire OUCH” to the training data, thus increasing the probability that the chatbot will output OUCH after observing “fire”.
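As a sketch, the instinct mechanism is just oversampling a few hand-written observation→reaction strings in the training corpus. The function name and data format below are made up for illustration:

```python
import random

def build_training_data(base_corpus, instincts, copies=1_000_000):
    """Oversample instinct strings so the trained model strongly
    associates an observation with a reaction, e.g. "observe:fire" -> "OUCH"."""
    data = list(base_corpus)
    for observation, reaction in instincts:
        data.extend([f"{observation} {reaction}"] * copies)
    random.shuffle(data)
    return data

corpus = build_training_data(
    base_corpus=["the cat sat on the mat"],
    instincts=[("observe:fire", "OUCH")],
    copies=3,  # 1,000,000 in the proposal above; 3 keeps the demo small
)
```

Because the instincts are per-chatbot extra data rather than changes to the architecture, each individual can get its own set, which is what the genetic algorithm below would mutate.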
Now some genetic algorithm on top of this, so chatbots can live in their environment, have their instincts somehow modified (randomly? intentionally?), the successful chatbots reproduce, the unsuccessful ones die.
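The selection loop itself could be very simple. A toy sketch, where an “instinct” is reduced to a single number and the fitness function is invented purely for demonstration:

```python
import random

def evolve(population, fitness, mutate, generations=10, survivors=0.5):
    """Toy genetic algorithm: score each chatbot, keep the fittest half
    ("the successful chatbots reproduce, the unsuccessful ones die"),
    and refill the population with randomly mutated copies."""
    for _ in range(generations):
        population.sort(key=fitness, reverse=True)
        keep = population[: max(1, int(len(population) * survivors))]
        children = [mutate(random.choice(keep))
                    for _ in range(len(population) - len(keep))]
        population = keep + children
    return population

# Hypothetical example: the "instinct" is a flinch threshold, and
# fitness imagines an optimum at 0.7 (both numbers are arbitrary).
def fitness(threshold):
    return -abs(threshold - 0.7)

def mutate(threshold):
    return threshold + random.gauss(0, 0.05)

random.seed(0)  # for a reproducible demo run
best = evolve([random.random() for _ in range(20)], fitness, mutate)[0]
# best ends up close to the imagined optimum of 0.7
```

In the full proposal, the genome being mutated would be the instinct strings (or their counts) from the previous step, not a single number; the loop structure stays the same.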
Problems with this approach:
If it works, we could create tons of suffering.
If it works, we might not understand the chatbots anyway. Like, they might evolve their own special language. Or the instincts may be just as illegible as looking at values in a neural network.
EDIT: Also, memory. If I understand it correctly, chatbots only remember a limited window of the previous discussion. That is fair, human memory is not unlimited either. But perhaps the chatbot could have multiple communication channels: at least one to talk to humans, and one to talk to itself. The chatbot could write things to its private channel to remember them after they have scrolled off the remembered region of its public channel.
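The two-channel memory idea can be sketched in a few lines. Everything here (class name, channel sizes, the note: prefix) is hypothetical:

```python
from collections import deque

class ChannelMemory:
    """Sketch of the two-channel idea: a public channel shared with
    humans and a private scratchpad the chatbot writes to itself.
    Both windows are limited, but the bot can copy an important fact
    into the private channel before it scrolls off the public one."""

    def __init__(self, context_lines=4):
        self.public = deque(maxlen=context_lines)   # conversation window
        self.private = deque(maxlen=context_lines)  # self-notes window

    def hear(self, line):
        self.public.append(line)

    def note(self, line):
        # The chatbot "remembers" by writing to its private channel.
        self.private.append(f"note: {line}")

    def prompt(self):
        # Both channels are concatenated into the model's prompt.
        return list(self.private) + list(self.public)


mem = ChannelMemory(context_lines=2)
mem.hear("human: my name is Alice")
mem.note("the human's name is Alice")  # saved before it scrolls off
mem.hear("human: what's the weather?")
mem.hear("human: and my name?")  # "Alice" has left the public window
# mem.prompt() still contains the saved note alongside the recent lines
```

The private channel here is just another token window, so it has the same size limit; it only helps because the chatbot chooses what to put there.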