Tim is simply neglecting the obvious brute force solution for achieving brain-like capabilities. Here's yet another startup, and I'm not saying this approach will commercially succeed, but: [singularity hub]
The linked article covers a startup called Cerebras, which has gotten a 'wafer scale engine' at least running in demos. This is where an entire silicon wafer is made into one large chip.
Enough of these, connected by hollow core optical fiber, would be what you need to hit that 10^21 ops/sec threshold.
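To make that explicit, here's the napkin math, assuming (and this is purely my round-number guess, not a published Cerebras figure) that each wafer-scale engine is good for about 10^15 ops/sec:

```python
# Napkin estimate: wafer-scale engines needed to hit the 10^21 ops/sec
# threshold. OPS_PER_WAFER is an assumed round number, not a spec-sheet
# figure -- swap in a real one if you have it.
BRAIN_THRESHOLD_OPS = 1e21   # the 10^21 target from the argument
OPS_PER_WAFER = 1e15         # ASSUMPTION: order-of-magnitude guess per wafer

wafers = BRAIN_THRESHOLD_OPS / OPS_PER_WAFER
print(f"wafers needed: {wafers:.0e}")  # -> 1e+06
```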
Also note that AI systems get a bunch of advantages that humans don't have. Each system is immortal and is always doing its best. Human beings make mistakes on even simple tasks at high error rates; we do not "do our best" consistently 24/7/365. So what does it mean to achieve human-like performance? Did you mean average performance, or the performance of the best human alive, well rested?
Do you want broad spectrum capabilities or just the objects in ImageNet? Because, again, it's harder than it sounds for a human to do better.
AI systems in applications like autonomous cars get to learn from the experiences of their peers in a way that is not biased. Think about how biased the information you get from your peers is; for one thing, humans tend to only tell each other about successes, which can cause you to overestimate your chance of success in a risky venture like a startup.
A peer autonomous vehicle, meanwhile, can report the (novel situation, true outcome) pair in an unbiased way to a cloud farm that updates the learning for the whole fleet. And that is work each individual car doesn't have to do; no vehicle needs to learn on its own.
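A minimal sketch of that fleet-learning loop; all the names here (Report, CloudFarm, train_policy) are hypothetical, not any vendor's actual API:

```python
from dataclasses import dataclass, field

@dataclass
class Report:
    """One (novel situation, true outcome) pair from a single vehicle."""
    situation: dict
    outcome: dict

def train_policy(reports):
    """Stub for a big offline training job over the pooled experience."""
    return {"policy_version": len(reports)}

@dataclass
class CloudFarm:
    reports: list = field(default_factory=list)

    def ingest(self, report: Report) -> None:
        # Every report is kept, success or failure: no survivorship bias.
        self.reports.append(report)

    def retrain_and_push(self, fleet) -> None:
        new_policy = train_policy(self.reports)
        for vehicle in fleet:
            vehicle.policy = new_policy  # the fleet learns; no car learns alone

@dataclass
class Vehicle:
    policy: dict = field(default_factory=dict)

farm = CloudFarm()
fleet = [Vehicle() for _ in range(3)]
farm.ingest(Report({"icy_offramp": True}, {"crash": False}))
farm.retrain_and_push(fleet)
print(fleet[0].policy)  # -> {'policy_version': 1}
```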
In fact, here's another flaw in Tim's reasoning. He's assuming we must have an AI system that learns in real time like a human does. This is not true, and humans don't learn in real time either; it's why we need 16-20 years of education to be useful.
Each AI system used in the field can give answers to questions in real time while recording the cases where its prediction error was high, for learning later. This is sort of how OpenAI's current algorithms already do it, though I am neglecting details.
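As a toy version of that serve-now, learn-later pattern, assuming a model that reports its own error estimate (the interface and threshold are my assumptions):

```python
ERROR_THRESHOLD = 0.2  # ASSUMPTION: cutoff for "this prediction was bad"
replay_buffer = []     # high-error cases saved for offline training later

def serve(model, x, true_outcome=None):
    """Answer in real time; record the case if the model did poorly."""
    prediction, error_estimate = model(x)
    if error_estimate > ERROR_THRESHOLD:
        replay_buffer.append((x, true_outcome))  # record now, learn offline
    return prediction  # the real-time answer goes out either way

def toy_model(x):
    # Hypothetical model returning (answer, self-estimated error).
    return x * 2, 0.5

print(serve(toy_model, 3, true_outcome=6))  # -> 6, and the case is logged
```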
For a useful AI system in the field, therefore, you need only a tiny fraction of all the neurons a human uses; most are never going to contribute to any single task you might do as a human. And if a rare edge case shows up that needs more capability than a pared down, 'sparse' system used in a real application has, you would have the field AI system pause its robotics and query a larger version of itself for the answer.
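Here's that pause-and-escalate idea as a sketch; the models, threshold, and robot interface are all stand-ins I made up:

```python
EDGE_CASE_THRESHOLD = 0.3  # ASSUMPTION: when the sparse model is too unsure

def small_model(obs):
    # Pared-down 'sparse' field model: cheap answer + self-estimated error.
    return "steer_left", (0.9 if obs.get("weird") else 0.05)

def big_model_rpc(obs):
    # Stand-in for a remote query to a much larger version of the model.
    return "stop_and_wait"

def act(obs, robot):
    answer, error_estimate = small_model(obs)
    if error_estimate > EDGE_CASE_THRESHOLD:
        robot["paused"] = True        # pause the robotics while uncertain
        answer = big_model_rpc(obs)   # escalate the rare edge case
        robot["paused"] = False
    return answer

print(act({"weird": True}, {}))  # -> 'stop_and_wait'
```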
The more I type, the more I realize how bullshit everything in this argument was. And there are efforts to make silicon chips with tradeoffs more like the human brain's. If you think you need power efficiency and breadth of capabilities more than accuracy, you can just do this. [an article on a startup that has built analog computers for neural network convolution.]
So for Tim to be correct, he needs to take into account a 'best effort' example of a large array of analog silicon processors filling a whole warehouse, and show that even that cannot hit the computational requirement.
That startup is at about 300 TOPS for a single chip, which is 3×10^14 ops/sec; call it 10^14 for a quick napkin estimate. It's a startup making some of the first analog computers used in decades, so let's assume there's at least a power of 10 of "easy gains" left if this became a mature commercial technology. That gets us to 10^15 per chip.
10^21 / 10^15 = 10^6, i.e. 1 million chips in a warehouse. Go to a 'chiplet' architecture to cram them into fewer packages, at 10 chips per package, and you're down to 100,000 packages.
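Same arithmetic in code, so the exponent bookkeeping is explicit:

```python
# The warehouse napkin math from above, spelled out.
target_ops = 1e21
per_chip = 1e14 * 10   # ~300 TOPS rounded down to 1e14, plus 10x "easy gains"
chips = target_ops / per_chip   # -> 1e6 chips
packages = chips / 10           # chiplets: 10 chips per package
print(f"{chips:.0e} chips, {packages:.0e} packages")  # 1e+06 chips, 1e+05 packages
```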
For comparison, the current number 1 supercomputer, Fugaku, has 158,976 48-core CPUs, so assembling hardware at that scale has already been done.
Cheap and easy if you had to do this next week? No, but it sounds like, with enough resources available, you could solve the problem even if we never get another improvement in silicon.