Couldn’t agree more. Variants of this strategy get proposed often.
If you are a proponent of this strategy, I’m curious whether you know of any examples in history where humanity purposefully and successfully steered towards a significantly less competitive [economically, militarily,...] technology that was nonetheless safer.
It’s not about building less useful technology; that’s not what Abram or Ryan are talking about (I assume). The field of alignment has always been about strongly superhuman agents. You can have tech that is useful and also safe to use; there’s no direct contradiction here.
Maybe one weak-ish historical analogy is explosives? Some explosives are unstable, and will easily explode by accident. Some are extremely stable, and can only be set off by a detonator. Early in the industrial chemistry tech tree, you only have access to one or two ways to make explosives. If you’re desperate, you use these whether or not they are stable, because the risk-usefulness tradeoff is worth it. A bunch of your soldiers will die, and your weapons caches will be easier to destroy, but that’s a cost you might be willing to pay. As your industrial chemistry tech advances, you invent many different types of explosive, and among these choices you find ones that are both stable and effective, because obviously this is better in every way.
Maybe another is medications? As medications advanced, as we gained choice and specificity in medications, we could choose medications that were both effective and low in side-effects. Before that, there was often a choice, and the correct choice was often to not use the medicine unless you were literally dying.
In both these examples, sometimes the safety-usefulness tradeoff was worth it, sometimes not. Presumably, in both cases, people often made the choice not to use unsafe explosives or unsafe medicine, because the risk wasn’t worth it.
As it is with these technologies, so it is with AGI. There are a bunch of future paradigms for building AGI. The first one we stumble into isn’t looking like one where we can precisely specify what it wants. But if we were able to keep experimenting and understanding and iterating after the first AGI, and we gradually developed dozens of ways of building AGI, then I’m confident we could find one that is just as intelligent and also could have its goals precisely specified.
My two examples above don’t quite answer your question, because “humanity” didn’t steer away from using them, just individual people at particular times. For examples where all or large sections of humanity steered away from using an extremely useful tech whose risks purportedly outweighed benefits: Project Plowshare, nuclear power in some countries, GMO food in some countries, viral bioweapons (as far as I know), eugenics, stem cell research, cloning. Also {CFCs, asbestos, leaded petrol, CO2 to some extent, radium, cocaine, heroin} after the negative externalities were well known.
I guess my point is that safety-usefulness tradeoffs are everywhere, and tech development choices that take into account risks are made all the time. To me, this makes your question utterly confused. Building technology that actually does what you want (which is to be safe and useful) is just standard practice. This is what everyone does, all the time, because obviously safety is one of the design requirements of whatever you’re building.
The main difference between the above technologies and AGI is that AGI is a trapdoor. The cost of messing up AGI is that you lose any chance to try again. AGI shares with some of the above technologies an epistemic problem: for many of them it isn’t clear in advance, to most people, how much risk there actually is, and therefore whether the tradeoff is worth it.
After writing this, it occurred to me that maybe by “competitive” you meant “earlier in the tech tree”? I interpreted it in my comment as a synonym of “useful” in a sense that excluded safe-to-use.
“I’m curious whether you know of any examples in history where humanity purposefully and successfully steered towards a significantly less competitive [economically, militarily,...] technology that was nonetheless safer.”
This sounds much like a lot of the history of environmentalism and safety regulations? As in, there’s a long history of [corporations selling X, using a net-harmful technology], then governments regulating. Often this happens after the technology is sold, but sometimes before it’s completely popular around the world.
I’d expect that there’s similarly a lot of history of early product areas where some people realize that [popular trajectory X] will likely be bad and get regulated away, so they help further [safer version Y].
Going back to the previous quote:
“steer the paradigm away from AI agents + modern generative AI paradigm to something else which is safer”
I agree it’s tough, but would expect some startups to exist in this space. Arguably there are already several claiming to be focusing on “Safe” AI. I’m not sure if people here would consider this technically part of the “modern generative AI paradigm” or not, but I’d imagine these groups would be taking some different avenues, using clear technical innovations.
There are worlds where the dangerous forms have disadvantages later on—for example, they are harder to control/oversee, or they get regulated. In those worlds, I’d expect there should/could be some efforts waiting to take advantage of that situation.
I feel confused by how broad this is, i.e., “any example in history.” Governments regulate technology for the purpose of safety all the time. Almost every product you use and consume has been regulated to adhere to safety standards, hence making them less competitive (i.e., they could be cheaper and perhaps better according to some if they didn’t have to adhere to them). I’m assuming that you believe this route is unlikely to work, but it seems to me that this has some burden of explanation which hasn’t yet been made. I.e., I don’t think the only relevant question here is whether it’s competitive enough such that AI labs would adopt it naturally, but also whether governments would be willing to make that cost/benefit tradeoff in the name of safety (which requires eg believing in the risks enough, believing this would help, actually having the viable substitute in time, etc.). But that feels like a different question to me from “has humanity ever managed to make a technology less competitive but safer,” where the answer is clearly yes.
My comment was a little ambiguous. What I meant was human society purposely and differentially researching and developing a technology X instead of Y, where Y has a public (global) harm Z but a private benefit, and X is based on a different design principle than Y, is slightly less competitive, but is still able to replace Y.
A good example would be the development of renewable energy to replace fossil fuels to prevent climate change.
The new tech (fusion, fission, solar, wind) is based on fundamentally different principles than the old tech (oil and gas).
Let’s zoom in:
Fusion would be an example, but it is perpetually thirty years away. Fission works but wasn’t purposely developed to fight climate change. Wind is not competitive without large subsidies and most likely never will be.
Solar is at least competitive with fossil fuels to a limited extent [though because of load balancing it may not be able to replace fossil fuels completely], was purposely developed out of environmental concerns, and would be the best example.
I think my main question mark here is: solar energy is still a promise. It hasn’t even begun to make a dent in total energy consumption (a quick Perplexity search reveals only 2 percent of global energy is solar-generated). Despite the hype, it is not clear climate change will be solved by solar energy.
Moreover, the real question is to what degree the development of competitive solar energy was the result of purposeful policy. People like to believe that tech development subsidies have a large counterfactual impact, but imho this needs to be explicitly proved; my prior is that the effect is probably small compared to the overall general development of technology and economic incentives that are not downstream of subsidies / government policy.
Let me contrast this with two different approaches to solving a problem Z (climate change).
Deploy existing competitive technology (fission)
Solve the problem directly (geo-engineering)
It seems to me that, in general, these two approaches have a far better track record of counterfactually Actually Solving the Problem.
“Moreover, the real question is to what degree the development of competitive solar energy was the result of purposeful policy. People like to believe that tech development subsidies have a large counterfactual impact, but imho this needs to be explicitly proved; my prior is that the effect is probably small compared to the overall general development of technology and economic incentives that are not downstream of subsidies / government policy.”
But we don’t need to speculate about that in the case of AI! We know roughly how much money we’ll need for a given size of AI experiment (eg, a training run). The question is one of raising the money to do it. With a strong enough safety case vs the competition, it might be possible.
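(As a rough illustration of the kind of estimate I mean, here is a minimal back-of-envelope sketch in Python. The ~6·params·tokens FLOPs approximation is a common rule of thumb for dense transformers, and all the hardware and price numbers below are assumptions for illustration, not figures from this thread.)

```python
# Back-of-envelope training-cost estimate (illustrative assumptions throughout).

def training_cost_usd(params, tokens,
                      flops_per_gpu_per_s=3e14,  # assumed peak throughput per GPU
                      utilization=0.4,           # assumed fraction of peak actually achieved
                      gpu_hour_price=2.0):       # assumed $/GPU-hour
    """Rough dollar cost of one training run, using the ~6*N*D FLOPs rule of thumb."""
    total_flops = 6 * params * tokens
    gpu_seconds = total_flops / (flops_per_gpu_per_s * utilization)
    gpu_hours = gpu_seconds / 3600
    return gpu_hours * gpu_hour_price

# Hypothetical example: a 70B-parameter model trained on 1.4T tokens.
print(f"${training_cost_usd(70e9, 1.4e12):,.0f}")  # roughly a few million dollars
```

The specific numbers don’t matter; the point is only that the cost of an experiment of a given size is roughly calculable in advance, so the fundraising target is knowable.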
I’m curious if you think there are any better routes; i.e., setting aside the possibility of researching safer AI technology & working towards its adoption, what overall strategy would you suggest for AI safety?