The implication for AI/AGI is that humans will never create human-similar AI: everything we build will be far ahead in some areas and far behind in others.
Is this not a mere supervised learning problem? You’re saying, for some problem domain D, you want to predict the probability distribution of actions a Real Human would emit when given a particular input sample.
This is essentially what a GPT does: given the same input text a human was looking at, it predicts what they will type next.
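A toy sketch of that next-token objective, using bigram counts in place of a transformer (the corpus and tokens here are purely illustrative, and a real GPT conditions on a long context rather than one previous token):

```python
from collections import Counter, defaultdict

# Toy next-token predictor: estimate P(next token | previous token)
# by counting bigrams in a tiny corpus. The objective is the same as
# a GPT's: output a probability distribution over what a human would
# type next, given what came before.
corpus = "the cat sat on the mat the cat ran".split()

bigrams = defaultdict(Counter)
for prev, nxt in zip(corpus, corpus[1:]):
    bigrams[prev][nxt] += 1

def next_token_dist(prev):
    counts = bigrams[prev]
    total = sum(counts.values())
    return {tok: c / total for tok, c in counts.items()}

print(next_token_dist("the"))  # → {'cat': 0.666..., 'mat': 0.333...}
```

Scaling this up means replacing the count table with a learned model, but the supervised target (the distribution of what the human actually emitted next) is unchanged.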
We can extend this to video and audio: first translate video of humans into joint coordinates, and the sounds they emit into phonemes, then run the same prediction as above.
We would expect this method to yield an AI system that approximates the average human from the sample set we trained on. The system will be multimodal: able to speak, run robotics, and emit text.
After that, we train with reinforcement learning, and that feedback can clear out mistakes, so the GPT system becomes less and less likely to emit "next tokens" that the consensus of human knowledge considers wrong. And the system never tires, and the hardware never miscalculates.
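One way to picture that feedback step is a multiplicative-weights update on the next-token distribution: tokens graded wrong get exponentially downweighted. This is a sketch under assumed reward values, not RLHF as actually implemented (which applies policy-gradient updates to model weights):

```python
import math

# Hypothetical graded feedback on three candidate answers:
# +1 for correct, -1 for wrong, 0 for ungraded.
dist = {"paris": 0.5, "london": 0.3, "rome": 0.2}
reward = {"paris": 1.0, "london": -1.0, "rome": 0.0}
lr = 1.0  # feedback strength

# Exponentiated-reward (multiplicative-weights) update, then renormalize.
unnorm = {t: p * math.exp(lr * reward[t]) for t, p in dist.items()}
z = sum(unnorm.values())
new_dist = {t: w / z for t, w in unnorm.items()}

assert new_dist["paris"] > dist["paris"]    # correct answer upweighted
assert new_dist["london"] < dist["london"]  # wrong answer downweighted
```

Repeated over many graded samples, mass drains away from consensus-wrong continuations, which is the "clearing out mistakes" effect described above.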
And we can then use machine-based RL: have robots attempt tasks in sim and IRL, and autonomously grade how well each task was done. Have the machine attempt to use software plugins, with RL feedback on errors and successful tool usage. Because the machinery can learn at a larger scale, with more time to learn than a human lifetime allows, it will soon exceed human performance.
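The attempt-and-grade loop can be sketched as a minimal search over a policy parameter, where the "sim" is a toy one-dimensional task and the grader is a scoring function. Everything here (the task, the grader, the single-parameter policy) is an illustrative stand-in for a real robotics stack:

```python
import random

random.seed(0)

def grade(attempt):
    # Autonomous grader: closer to the (hypothetical) target pose is better.
    target = 0.7
    return -abs(attempt - target)

# Policy = one scalar; real policies are neural networks.
policy = 0.0
best_score = grade(policy)

for _ in range(1000):            # the machine never tires
    candidate = policy + random.gauss(0, 0.1)  # try a variation in sim
    score = grade(candidate)
    if score > best_score:       # keep improvements, discard failures
        policy, best_score = candidate, score

assert abs(policy - 0.7) < 0.1   # converges near the graded optimum
```

The point is structural: as long as the grader is automatic, the loop needs no human in it and can run for far more trials than a human lifetime permits.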
A system like this also has more breadth than any single living human.
But I think you can see how, if you wanted to, you could probably find a solution based on the above that emulates the observable outputs of a single typical human.