See “Brain complexity is easy to overstate” section here.
Sure, but I still think it’s probably way more complex than LLMs even if we’re just looking at the parts key for AGI performance (in particular, the parts which learn from scratch). And my guess would be that performance is greatly degraded if you only take as much complexity as the core LLM learning algorithm.
Let’s imagine installing an imitation learning module in Alice’s brain that makes her reflexively say X in context Y upon hearing Bob say it. I think I’d expect that module to hinder her learning and understanding, not accelerate it, right?
This isn’t really what I’m imagining, nor do I think this is how LLMs work in many cases. In particular, LLMs can transfer from training on random GitHub repos to being better in all kinds of different contexts. I think humans can do something similar, but have much worse memory.
I think in the case of humans and LLMs, this is substantially subconscious/non-explicit, so I don’t think this is well described as having a shoulder Bob.
Also, I would say that humans do learn from imitation! (You can call it prediction, but it doesn’t matter what you call it as long as it implies that data from humans makes things scale more continuously through the human range.) I just think that you can do better at this than humans based on the LLM case, mostly because humans aren’t exposed to as much data.
Also, I think the question is “can you somehow make use of imitation data”, not “can the brain’s learning algorithm immediately make use of imitation”?
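To make the “somehow make use of imitation data” point concrete, here is a minimal sketch in PyTorch (my own illustration, not anything from this discussion; the toy model, sizes, and random “human tokens” are all stand-ins) of the standard way LLMs consume imitation data: next-token prediction, i.e. cross-entropy against what humans actually wrote.

```python
# Sketch only: a stand-in "model" and random token ids, purely to show the shape
# of the training signal. Real LLMs replace both with a transformer and real text.
import torch
import torch.nn as nn

vocab_size, d_model = 1000, 64
model = nn.Sequential(                  # stand-in for a real transformer
    nn.Embedding(vocab_size, d_model),
    nn.Linear(d_model, vocab_size),
)

# "Imitation data": a batch of token ids produced by humans (e.g. GitHub code).
human_tokens = torch.randint(0, vocab_size, (8, 128))   # (batch, seq_len)
inputs, targets = human_tokens[:, :-1], human_tokens[:, 1:]

logits = model(inputs)                                  # (batch, seq_len-1, vocab)

# The model is trained to predict (imitate) the humans' actual next tokens.
loss = nn.functional.cross_entropy(
    logits.reshape(-1, vocab_size), targets.reshape(-1)
)
loss.backward()
```

Note that nothing here is an explicit “reflexively say X in context Y” module; imitation enters only through the prediction loss.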
In my mind, the (imperfect!) analogy here would be (LLMs, new paradigm) ↔ (previous Go engines, AlphaGo and successors).
Notably this analogy implies LLMs will be able to automate substantial fractions of human work prior to a new paradigm which (over the course of a year or two and using vast computational resources) beats the best humans. This is very different from the “brain in a basement” model IMO. I get that you think the analogy is imperfect (and I agree), but it seems worth noting that the analogy you’re drawing suggests something very different from what you expect to happen.
Is there a list somewhere? A paper I could read? (Or is it all proprietary?)
It’s substantially proprietary, but you could consider looking at the DeepSeek-V3 paper. We don’t actually have a great understanding of the quantity and nature of algorithmic improvement after GPT-3. It would be useful for someone to do a more up-to-date review based on the best available evidence.
I’m not sure that complexity is protecting us. On the one hand, there is just ~1 MB of bases coding for the brain (and less for the connectome; source: https://xkcd.com/1605/), but that doesn’t mean we can read it, and it may take a long time to reverse engineer.
On the other hand, our existing LLM systems are already much more complex than that: likely more than a GB of source code for the modern compute-center servers that run LLMs. And here the relationship between the code and the result is better understood and can be iterated on much faster. We may not need to reverse engineer the brain; experimentation may be sufficient.
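As a back-of-envelope check on the comparison above (using the rough figures asserted in this comment, ~1 MB vs. ~1 GB, which are estimates rather than measurements):

```python
# Rough comparison only; both figures come from the comment above, not from data.
brain_spec_bytes = 1e6   # ~1 MB of bases "coding for the brain"
llm_stack_bytes = 1e9    # ~1 GB of source code for a modern LLM serving stack

ratio = llm_stack_bytes / brain_spec_bytes
print(f"The engineered stack is ~{ratio:.0f}x larger as a description, "
      "but far easier to read, modify, and iterate on.")
```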