AI alignment researcher, ML engineer. Master's in Neuroscience.
I believe that cheap and broadly competent AGI is attainable and will be built soon; this leads me to timelines of around 2024-2027. Here’s an interview I gave recently about my current research agenda. I think the best path forward to alignment is safe, contained testing on models designed from the ground up for alignability and trained on censored data (simulations with no mention of humans or computer technology). I think current mainstream ML technology is close to a threshold of competence beyond which it will be capable of recursive self-improvement, and that this automated process will mine neuroscience for insights and quickly become far more effective and efficient. It would be quite bad for humanity if this happened in an uncontrolled, uncensored, un-sandboxed situation, so I am trying to warn the world about this possibility.
See my prediction markets here:
I also think that current AI models pose misuse risks, which may well worsen as models become more capable, and that this could result in catastrophic suffering if we fail to regulate them.
I now work for SecureBio on AI-Evals.
Relevant quote:
“There is a powerful effect to making a goal into someone’s full-time job: it becomes their identity. Safety engineering became its own subdiscipline, and these engineers saw it as their professional duty to reduce injury rates. They bristled at the suggestion that accidents were largely unavoidable, coming to suspect the opposite: that almost all accidents were avoidable, given the right tools, environment, and training.” https://www.lesswrong.com/posts/DQKgYhEYP86PLW7tZ/how-factories-were-made-safe
About 15 years ago, before I’d started professionally studying and doing machine learning research and development, my timeline had most of its probability mass around 60-90 years out. This was based on my neuroscience studies and my thinking about how long it would take to build a sufficiently accurate emulation of the human brain to be functional. About 8 years ago, while studying machine learning full time, the release of AlphaGo prompted me to carefully rethink my position; I realized there were a fair number of sensible shortcuts off my longer figure, and I updated to more like 40-60 years. About 3 years ago, GPT-2 gave me another reason to rethink with my then-fuller understanding of the field, and I updated to 15-30 years. In the past couple of years, with the repeated successes of various explorations of the scaling laws, the apparent willingness of the global community to rapidly scale up investment in compute, and yet further knowledge of the field, I updated to putting 80% of my probability mass on more like 2-15 years. I’d put most of that in the 6-12 year range, but I wouldn’t be shocked if things turned out to be easier than expected and something really took off next year.
One of the things that makes me think the BioAnchors estimate is a bit too far in the future is that I know from neuroscience that a human can have a sufficient set of brain functions to count as a general intelligence, by a fairly reasonable standard, with significant chunks of their brain dead or missing. They won’t be in great shape, since they’ll be missing some capabilities, but if what they’re missing is non-critical they can still function well enough to be a minimal GI. Plenty well enough to be scary if they were a self-improving, self-replicating, agentic AI.
So anyway, yeah, I’ve been scared for a while now. The latest news has only reinforced my belief that we are in a short-timeline world; it hasn’t surprised me. I’m glad to see more people getting on board with my point of view.