I think peak intelligence (peak capability to reach a goal) will not be limited by the amount of compute, raw data, or algorithmic capability to process the data well, but by the finite amount of reality that's relevant to achieving that goal. If one wants to take over the world, the way internet infrastructure works is relevant. The exact diameters of all the stones in the Rhine river are not, and neither is the number of red dwarfs in the universe. If we're lucky, the amount of reality that turns out to be relevant for taking over the world is not too far beyond what humanity can already collectively process. I can see this as a way for the world to be saved by default (though I don't think it's very likely). I do think this makes an ever-expanding giant pile of compute an unlikely outcome (and some other kind of ever-expanding AI-led force a lot more likely).
otto.barten
AI Regulation May Be More Important Than AI Alignment For Existential Safety
Should we postpone AGI until we reach safety?
Why Uncontrollable AI Looks More Likely Than Ever
Announcing #AISummitTalks featuring Professor Stuart Russell and many others
Paper Summary: The Effectiveness of AI Existential Risk Communication to the American and Dutch Public
What Failure Looks Like is not an existential risk (and alignment is not the solution)
[Crosspost] An AI Pause Is Humanity’s Best Bet For Preventing Extinction (TIME)
[Question] Looking for non-AI people to work on AGI risks
@Roman_Yampolskiy and I published a piece on AI xrisk in a Chinese academic newspaper: http://www.cssn.cn/skgz/bwyc/202303/t20230306_5601326.shtml
We were approached after our piece in Time and asked to write for them (we also gave quotes to another provincial newspaper). I have the impression (I've also lived and worked in China) that leading Chinese decision makers and intellectuals (or perhaps their children) read Western news sources such as Time, the NYTimes, the Economist, etc. AI xrisk is currently probably mostly unknown in China, and people who stumble upon it might have trouble believing it (as they have in the West). But if/when we have a real conversation about AI xrisk in the West, I think the information will seep into China as well, and I'm somewhat hopeful that, if this happens, it could prepare China for cooperation to reduce xrisk. In the end, no one wants to die.
Curious about your takes though, I’m of course not Chinese. Thanks for the write-up!
I have kind of a strong opinion in favor of policy intervention because I don’t think it’s optional. I think it’s necessary. My main argument is as follows:
I think we have two options to reduce AI extinction risk:
1) Fix it technically and ethically (I'll call the combination of both the 'tech fix'). Don't delay.
2) Delay until we can work out option 1. After the delay, AGI development may or may not proceed, depending mainly on the outcome of option 1.
If option 1 does not work, which has a reasonable chance of happening (it hasn't worked so far, and we're not necessarily close to a safe solution), I think option 2 is our only chance to reduce AI x-risk to acceptable levels. However, AI academics and corporations are both strongly opposed to option 2. It would therefore take a force at least as powerful as those two groups combined to still pursue this option. The only such force I can think of is a popular movement. Lobbying and think tanks may help, but corporations will be better funded, so the public interest is not likely to prevail. Wonkery could be promising as well. I'm happy to be convinced of further alternatives.
If the tech fix works, I'm all for it. But currently, I think the risks are far too big, and it may not work at all. Therefore I think it makes sense to apply the precautionary principle here and start with policy interventions, until it can be demonstrated that the x-risk from AGI has fallen to an acceptable level. As a nice side effect, this should dramatically increase AI safety funding, since corporations would suddenly have an incentive to fund safety first in order to be allowed to build AGI.
I'm aware that this is a strong minority opinion on LW, since:
1) Many people here have an affinity with futurism and would welcome an AGI revolution
2) Many people have backgrounds in AI academia, and/or AI corporations, which both have incentives to continue working on AGI
3) It could be wrong, of course. :) I'm open to arguments that would change the above line of thinking.
So I’m not expecting a host of upvotes, but as rationalists, I’m sure you appreciate the value of dissent as a way to move towards a careful and balanced opinion. I do at least. :)
I think it’s a great idea to think about what you call goalcraft.
I see this problem as similar to the age-old problem of controlling power. I don't think ethical systems such as utilitarianism are a great place to start. Any academic ethical model is just an attempt to summarize what people actually care about in a complex world. Taking such a model and coupling it to an all-powerful ASI seems like a highway to dystopia.
(Later edit: also, an academic ethical model is irreversible once implemented. A static goal can never be reversed, since reversal would never bring the current goal closer. If an ASI is aligned to someone's (anyone's) preferences, however, the whole ASI could be turned off if they want it to be, making the ASI reversible in principle. I think ASI reversibility (being able to switch it off in case we turn out not to like it) should be mandatory, and therefore we should align to human preferences rather than to an abstract philosophical framework such as utilitarianism.)
I think letting the random programmer who happened to build the ASI, or their no less random CEO or shareholders, determine what happens to the world is an equally terrible idea. They wouldn't need the rest of humanity for anything anymore, making the fates of >99% of us extremely uncertain, even in a world of abundance.
What I would be slightly more positive about is aggregating human preferences (I think 'preferences' is a more accurate term than the more abstract, less well-defined 'values'). I've heard two interesting examples; there are no doubt many more options. The first is simple: query ChatGPT. Even this relatively simple model is not terrible at aggregating human preferences. Although a host of issues remain, I think using a future, no doubt much better, AI for preference aggregation is not the worst option (and a lot better than the two mentioned above).

The second option is democracy, our time-tested method of aggregating human preferences to control power. For example, one could imagine an AI control council consisting of elected human representatives at the UN level, or perhaps a council of representative world leaders. I know there is a lot of skepticism among rationalists about how well democracy functions, but it is one of the very few time-tested aggregation methods we have, and we should not discard it lightly for something less tested. An alternative is some kind of unelected autocrat (e/autocrat?), but apart from this not being my personal favorite, note that (in contrast to historical autocrats) such a person would also in no way need the rest of humanity anymore, making our fates uncertain.
Although AI and democratic preference aggregation are the two options I'm least negative about, I generally think we are not ready to control an ASI. One of the worst issues I see is negative externalities that only become clear later on; climate change, for example, can be seen as a negative externality of the steam/petrol engine. Also, I'm not sure a democratically controlled ASI would necessarily block follow-up unaligned ASIs (assuming blocking them is possible at all). To be existentially safe, we would need a system that does at least that.
I think it is very likely that ASI, even if controlled in the least bad way, will cause huge externalities leading to a dystopia, environmental disasters, etc. Therefore I agree with Nathan above: “I expect we will need to traverse multiple decades of powerful AIs of varying degrees of generality which are under human control first. Not because it will be impossible to create goal-pursuing ASI, but because we won’t be sure we know how to do so safely, and it would be a dangerously hard to reverse decision to create such. Thus, there will need to be strict worldwide enforcement (with the help of narrow AI systems) preventing the rise of any ASI.”
About terminology: it seems to me that preference aggregation, outer alignment, and goalcraft mean similar things, as do inner alignment, aimability, and control. I'd vote for using 'preference aggregation' and 'control'.
Finally, I strongly disagree with calling diversity, inclusion, and equity “even more frightening” than someone who’s advocating human extinction. I’m sad on a personal level that people at LW, an otherwise important source of discourse, seem to mostly support statements like this. I do not.
I strongly agree with Section 1. Even if we had aligned superintelligence, how would we make sure no one runs an unaligned superintelligence? A pivotal act? If so, which one? Or does defense trump offense? If so, why? Or are we still going to regulate heavily? If so, wouldn't the same regulation be able to stop superintelligence altogether?
Would love to see an argument landing at 1% p(doom) or lower, even if alignment were easy.
[Crosspost] Organizing a debate with experts and MPs to raise AI xrisk awareness: a possible blueprint
I don't disagree. But I do think people who dismiss the pivotal act should come up with an alternative plan that they believe is more likely to work, because the problem remains: how can we make sure that no one ever builds an unaligned superintelligence? My alternative plan is regulation.
[Crosspost] Unveiling the American Public Opinion on AI Moratorium and Government Intervention: The Impact of Media Exposure
You could pick corporations as an example of coordinated humans, but also e.g. Genghis Khan's hordes, which did actually take over. If you do want to pick corporations, look e.g. at the East India companies, which also took over parts of the world.
(4): I think regulation deserves much more thought than this. I don't think you can defend the claim that regulation has a 0% probability of working. It really depends on how many people are how scared, and that's something we could quite possibly change if we actually tried (LW and EA haven't tried).
In terms of implementation: I agree that software/research regulation might not work, but hardware regulation seems much more robust to me, and data regulation might also be an option. As a lower bound: globally ban hardware development beyond 1990 levels and confiscate the remaining hardware. It's not fun, but I think it would work, given political support. If we stay multiple orders of magnitude below the brain's compute, I don't think any researcher could come up with an algorithm that much better than evolution's (they didn't from the 1960s through the 1990s).
There is probably something much smarter and less economically damaging out there that would also be robust. Research into the least damaging but still robust regulation options is long overdue.
First, I don’t propose ‘no AGI development’. If companies can create safe and beneficial AGIs (burden of proof is on them), I see no reason to stop them. On the contrary, I think it might be great! As I wrote in my post, this could e.g. increase economic growth, cure disease, etc. I’m just saying that I think that existential risk reduction, as opposed to creating economic value, will not (primarily) originate from alignment, but from regulation.
Second, the regulation that I think has the biggest chance of keeping us existentially safe will need to be implemented with or without aligned AGI. Even with aligned AGI (barring a pivotal act), there would be an abundance of unsafe actors who could run AGI without safety measures (possibly by mistake). That is why the labs themselves propose regulation to keep almost everyone but themselves from building such AGI. The regulation required in either case is almost exactly the same.
Third, I’m really not as negative as you are about what it would take to implement such regulation. I think we’ll keep our democracies, our freedom of expression, our planet, everyone we love, and we’ll be able to go anywhere we like. Some industries and researchers will not be able to do some things they would have liked to do because of regulation. But that’s not at all uncommon. And of course, we won’t have AGI as long as it isn’t safe. But I think that’s a good thing.
As co-author of one of the pieces mentioned, I'd say it's really great to see the AGI xrisk message going mainstream. It isn't happening nearly fast enough, though. Some (Hawking, Bostrom, Musk) have already spoken out on the topic for close to a decade, and so far that hasn't been enough to change common understanding. Those, such as myself, who hope that some form of coordination could save us should give it all they have to make this go faster. Additionally, those who think regulation could work should develop robust regulation proposals, which are currently lacking, and those who can should work on international coordination, which is also lacking.
A lot of work remains to be done. But the good news is that the window of opportunity is opening, and a lot of people who currently aren't working on this could be. This could be a path to victory.