Here is a paper from January 2022 on arXiv that details the sort of generalization-hop we’re seeing models doing.
So I guess any case in which you don’t have the data ready at hand would still be more difficult.
I’m not so sure... Another interesting/alarming thing is how these models are “grokking” concepts in a way that lets them generalize.
Agreed that superhuman intelligence seems like the kind of thing that could be a very powerful lever. What gets me is that we don’t seem to know how orthogonal or non-orthogonal intelligence and empathy are to one another.[1] If we were capable of creating a superhumanly intelligent AI and we were able to give it superhuman empathy, I might be inclined to trust ceding a large amount of power and control to that system (or set of systems, whatever). But a sociopathic superhuman intelligence? Definitely not ceding power over to that system.
The question for me then becomes: how confident are we that we are not creating dangerously sociopathic AI?
- ^
If I were to take a stab, I would say they are almost entirely orthogonal, as we have perfectly intelligent yet sociopathic humans walking around today who lack any sort of empathy. Giving any of these people superhuman ability and control would seem like an obviously terrible idea to me.
It’s interesting to think how this fits in with other notions like the Simulators idea.
This nicely sums up the feeling of “excitement” I get when thinking about this sort of thing too. It’s more of an anxiety really. I agree it’s very similar to the feeling of watching COVID-19 unfold in the early weeks of February and March 2020. It’s also similar to the feeling I got when Trump was elected, or on Jan 6. It’s just this feeling that something big and momentous is happening or is about to happen. That all of a sudden something totally consequential that you hadn’t previously considered is coming to pass. It’s a weird, doom-y sort of feeling.
This is kind of an “oh doy!” approach and I love it. It’s so straightforward and yet works so well for being such an obtuse, first-pass implementation. (Though I agree that there’s a high likelihood jailbreaks for this would be found quickly.) Still. This is wild.
Gary Marcus talked about this in his newsletter today and reached sort of the opposite conclusion: that a deal like this might indicate that OpenAI leadership are bailing out now while the getting is good because the long-term prospects for what they have are not looking great. Then again, he would say that, wouldn’t he?
Either way this is indeed a big deal (if true), whatever comes of it!
I’m interested in why you would think that writing “Superintelligence” would require less GI than full self-driving from NY to SF. The former seems like a pretty narrow task compared to the latter.
If it truly raises your hackles then maybe it’s worth sharing with at least one or two people who are working in safety research directly? Spreading it by ones and twos amongst people who would use the information for good (as it were) doesn’t seem too dangerous to me.
EY gets mentioned in a recent newsletter in The Atlantic from writer Derek Thompson.
Another article in The Atlantic today explicitly mentions the existential risk of AI and currently sits as the 9th most popular article on the website.
Some advanced intelligence that takes over doesn’t have to be directed toward human suffering for s-risk to happen. It could just happen as a byproduct of whatever unimaginable things the advanced intelligence might want/do as it goes about its own business completely heedless of us. In those cases we’re suffering in the same way that some nameless species in some niche of the world is suffering because humans, unaware that the species even exists, are encroaching on and destroying its natural domain in the course of just going about our own comparatively unimaginable business.
In addition, there is a front-page article on The Economist website today about AI accelerationism among the big tech firms.
A particularly interesting figure from the article (unrelated to the Overton Window):
In 2022, amid a tech-led stockmarket crunch, the big five poured $223bn into research and development (R&D), up from $109bn in 2019 (see chart 1). That was on top of $161bn in capital expenditure, a figure that had also doubled in three years. All told, this was equivalent to 26% of their combined annual revenues last year, up from 16% in 2015.
Thanks for the helpful clarification!
On the front page of the NY Times today (and this week):
I heavily endorse the tone and message of this post!
I also have a sense of optimism coming from society’s endogenous response to threats like these, especially with respect to how the public response to COVID shifted from Feb → Mar 2020 and how the last 6 months have gone for the public response to AI and AI safety (or even just the 4-5 months since ChatGPT was released in November). We could also look at the shift in response to climate change over the past 5-10 years.
Humanity does seem to have a knack for figuring things out just in the nick of time. Can’t say it’s something I’m glad to be relying on for optimism in this moment, but it has worked out in the past...
“Sorcerer’s Apprentice” from Fantasia as an analogy for alignment
Totally! Just figured for many like myself this might have been the first/only exposure to the tale.
COVID and climate change are actually easy problems
I’m not sure I’d agree with that at all. Also, how are you calculating what cost is “necessary” for problems like COVID/climate change vs. incurred because of a “less-than-perfect” response? How are we even determining what the “perfect” response would be? We have no way of measuring the counterfactual damage from some other response to COVID; we can only (approximately) measure the damage that has happened due to our actual response.
For those reasons alone I don’t make the same generalization you do about predicting the approximate range of damage from these types of problems.
To me the generalization to be made is simply this: the larger an exogenous threat looms in the public consciousness, the larger the societal response to that threat becomes. And the larger the societal response to exogenous threats, the more likely we are to find some way of overcoming them: whether by hard work, miracle, chance, or whatever.
And I think there’s a steady case to be made that the exogenous x-risk from AI is starting to loom larger and larger in the public consciousness: The Overton Window widens: Examples of AI risk in the media.
If I might take a crack at summarizing, it seems that you’ve realized the full scope of both the inner alignment and outer alignment problems. It’s pretty scary indeed. I think your insight that we don’t even need full AGI for these kinds of things to become large social problems is spot on, and as an industry insider I’m super glad you’re seeing that now!
One other thing that I think is worth contemplating is how much computational irreducibility would actually come into play with respect to still having real-world power and impact. I don’t think you would need to get anywhere near perfect simulation in order to begin to have extremely good predictive power over the world. We’re already seeing this in graphics and physics modeling. We’re starting to be able to do virtual wind tunnel simulations that yield faster and better results than physical wind tunnel tests,[1] and I think we’ll continue to find this to be the case. So presumably an advanced AGI would be able to create even better simulations that still work just as well in the real world, which would obviate much of the need for actual physical experimentation. Though I’m curious what others think here!
See here. Link to paper here.