Please let me know if you suspect I’ve over-interpreted that validation.
Slightly? My view is more like:
For AIs to be superhuman AI researchers, they probably need to match humans at most underlying/fundamental cognitive tasks, including reasonably sample-efficient learning. (Or at least learning which is competitive with humans given the AIs’ structural advantages.)
This means they can probably learn how to do arbitrary things pretty quickly and easily.
I think non-ML/software-engineering expertise (that you can’t quickly learn on the job) is basically never important in building more generally capable AI systems, aside from maybe various things related to acquiring data from humans. (But IMO this won’t ultimately be needed.)
Does an SAR have to be superhuman at creative writing, so that it can push forward creative writing capabilities in future models?
Do human ML researchers have to be superhuman at creative writing to push forward creative writing capabilities? I don’t particularly think so. Data might need to come from somewhere, but in the vision case, there are plenty of approaches which don’t require AIs with superhuman vision.
In the creative writing case, it’s a bit messy because the domain is intrinsically subjective. I nonetheless think you could make an AI which is superhuman at creative writing, without the builders themselves having a good understanding of creative writing, using just the (vast, vast) quantity of data we already have on the internet.
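To make that mechanism concrete, here’s a toy sketch (entirely synthetic: hypothetical linear “quality” features and simulated raters, not any lab’s actual pipeline) of the standard trick: raters only have to *compare* two outputs, a Bradley-Terry reward model is fit to those comparisons, and best-of-n selection then pushes output quality above the typical sample:

```python
# Toy sketch: above-typical outputs from merely-comparative human judgments.
# Assumptions (all hypothetical): each "sample" is a feature vector, and
# true quality is a hidden linear function of the features.
import numpy as np

rng = np.random.default_rng(0)
dim = 8
true_w = rng.normal(size=dim)  # hidden "true quality" direction

def sample_features(n):
    return rng.normal(size=(n, dim))

def human_preference(a, b):
    # A noisy rater prefers a over b with probability sigmoid(q(a) - q(b)).
    p = 1.0 / (1.0 + np.exp(-(a @ true_w - b @ true_w)))
    return rng.random() < p

# Fit a Bradley-Terry reward model w by SGD on pairwise comparisons.
w = np.zeros(dim)
lr = 0.1
for _ in range(2000):
    a, b = sample_features(2)
    y = 1.0 if human_preference(a, b) else 0.0
    p = 1.0 / (1.0 + np.exp(-((a - b) @ w)))
    w += lr * (y - p) * (a - b)  # logistic-loss gradient step

# Best-of-n: draw n candidates, keep the one the reward model likes best.
candidates = sample_features(64)
best = candidates[np.argmax(candidates @ w)]
print("mean true quality of candidates:", (candidates @ true_w).mean())
print("true quality of selected sample:", best @ true_w)
```

The selected sample scores well above the average candidate on the hidden quality axis, even though the raters never produced or scored a superhuman example; they only compared pairs.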
Thanks.
I’m now very strongly feeling the need to explore the question of what sorts of activities go into creating better models, what sorts of expertise are needed, and how that might change as things move forward. Which unfortunately I know ~nothing about, so I’ll have to find some folks who are willing to let me pick their brains...
I think this is a good question. I’d love to hear what people with experience building frontier models have to say about it.
Meanwhile, my first pass at decomposing “activities that go into creating better models” into some distinct components that might be relevant in this discussion:
1. Core algorithmic R&D: choose research questions, design & execute experiments, interpret findings
2. ML engineering: build & maintain distributed training setup, along with the infra and dev ops that go along with a complex software system (see the first sketch at the end of this comment)
3. Data acquisition and curation: collect, filter, clean datasets; hire humans to produce/QA; generate synthetic data (see the second sketch at the end)
4. Safety research and evaluation: red-teaming, interpretability, safety-specific evals, AI-assisted oversight, etc.
5. External productization: product UX and design, UX-driven performance optimization, legal compliance and policy, marketing, and much more.
6. Physical compute infrastructure: GPU procurement, data center building and management, power procurement, likely various physical logistics.
(I wonder what’s missing from this?)
Eli suggested above that we should bracket the issue of data. And I think it’s also reasonable to set aside 4 and 5 if we’re trying to think about how quickly a lab could iterate internally.
If we do that, we’re left with 1, 2, and 6. I think 1 and 2 are covered even by a fairly narrow definition of “superhuman (AI researcher + coder)”. I’m uncertain what to make of 6, besides having a generalized “it’s probably messier and more complicated than I think” kind of feeling about it.
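As promised above, two deliberately minimal sketches to make items 2 and 3 slightly more concrete. For item 2, the skeleton of a data-parallel training loop, assuming PyTorch’s DistributedDataParallel; a real frontier stack layers tensor/pipeline parallelism, fault tolerance, checkpointing, and telemetry on top of something like this:

```python
# Minimal data-parallel training skeleton (illustrative, not a frontier stack).
# Launch with: torchrun --nproc_per_node=4 train.py
import torch
import torch.distributed as dist
from torch.nn.parallel import DistributedDataParallel as DDP

def main():
    # torchrun sets RANK / WORLD_SIZE / MASTER_ADDR for us.
    dist.init_process_group("gloo")       # or "nccl" on GPUs
    model = DDP(torch.nn.Linear(32, 1))   # stand-in for a real network
    opt = torch.optim.SGD(model.parameters(), lr=1e-2)
    for step in range(100):
        x = torch.randn(16, 32)           # stand-in for a sharded data loader
        loss = model(x).pow(2).mean()
        opt.zero_grad()
        loss.backward()                   # DDP all-reduces gradients here
        opt.step()
        if step % 20 == 0 and dist.get_rank() == 0:
            print(f"step {step}: loss {loss.item():.4f}")
    dist.destroy_process_group()

if __name__ == "__main__":
    main()
```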
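And for item 3, the “collect, filter, clean” step: exact dedup plus a crude quality heuristic. (Real pipelines use fuzzy dedup like MinHash, learned quality classifiers, and far more careful cleaning; this is just the shape of the thing.)

```python
# Minimal curation pass: exact dedup + crude quality filters.
import hashlib

def curate(documents, min_words=50, max_symbol_ratio=0.3):
    seen = set()
    for doc in documents:
        text = doc.strip()
        digest = hashlib.sha256(text.encode("utf-8")).hexdigest()
        if digest in seen:                  # exact duplicate: drop
            continue
        seen.add(digest)
        if len(text.split()) < min_words:   # too short to be useful
            continue
        symbols = sum(1 for c in text if not (c.isalnum() or c.isspace()))
        if symbols / max(len(text), 1) > max_symbol_ratio:  # likely markup/junk
            continue
        yield text

# Usage: clean_docs = list(curate(raw_docs)) for any iterable of strings.
```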