What if we think about it the following way? ML researchers range from _theorists_ (who try to produce theories that describe how ML/AI/intelligence works at a deep level and how to build it) to _experimenters_ (who put things together using some theory and lots of trial and error and try to make it perform well on benchmarks). Most people will be somewhere in between on this spectrum, but people focusing on interpretability will be further toward the theorist end than most of the field.
Now let’s say we boost the theorists and they produce a lot of explanations that make better sense of the state of the art that experimenters have been playing with. The immediate impact will be improved understanding of our best models, which is good for safety. However, when the experimenters read these papers, their search space (of architectures, hyperparameters, training regimes, etc.) shrinks and they can search more efficiently. Standing on the shoulders of the new theories, they produce even better-performing models (still incorporating a lot of trial and error, because that is what experimenters do).
So what we achieved is better understanding of the current state of the art models combined with new improved state of the art that we still don’t quite understand. It’s not immediately clear whether we’re better off this way. Or is this model too coarse to see what’s going on?
If geoengineering approaches successfully counteract climate change, and it’s cheaper to burn carbon and dim the sun than generate power a different way (or not use the power), then presumably civilization is better off burning carbon and dimming the sun.
AFAIK, the main arguments against solar radiation management (SRM) are:
1. A high level of CO2 in the atmosphere creates other problems too (e.g. ocean acidification), but those problems are less urgent and impactful, so we’ll end up not caring about them if we implement SRM. Reducing CO2 emissions allows us to “do the right thing” using already existing political momentum.
2. Having the climate depend on SRM gives a lot of power to those in control of it and makes civilization dependent on it. We are bad at global cooperation as it is, and having SRM to manage will put additional stress on that. This is a more fragile solution than reducing emissions.
It’s certainly possible to argue against either of these points, especially by introducing the assumption that humanity as a whole is close enough to a rational agent. My opinion is that geoengineering solutions lead to more fragility than reducing emissions, and we would be better off avoiding them, or at least doing something along the lines of carbon sequestration rather than SRM. It also seems increasingly likely that we won’t have that option: our emission-reduction efforts are too slow, and once we hit +5ºC and beyond, the option to “turn this off tomorrow” will look too attractive.
I think things are not so bad. If our talking of consciousness leads to a satisfactory functional theory, we might conclude that we have solved the hard problem (at least the “how” part). Not everyone will be satisfied, but it will be hard to make an argument that we should care about the hard problem of consciousness more than we currently care about the hard problem of gravity.
I haven’t read Nagel’s paper, but from what I have read _about_ it, his main point seems to be that it’s impossible to fully explain subjective experience just by talking about physical processes in the brain. It seems to me that we do get closer to such an explanation by thinking about analogies between conscious minds and AIs. Whether we’ll be able to get all the way there is hard to predict, but it seems plausible that at some point our theories of consciousness will be “good enough”.
The web of concepts where connections conduct karma between nodes is quite similar to a neural net (a biological one). It also seems to be a good model for System 1 moral reasoning and this explains why moral arguments based on linking things to agreed good or agreed bad concepts work so well. Thank you, this was enlightening.
I’ve been doing some ad hoc track-backs while trying to do anapanasati meditation and I found them quite interesting. Never tried to go for track-backs specifically but it does seem like a good idea and the explanations and arguments in this post were quite convincing. I’m going to try it in my next sessions.
I also learned about staring into regrets, which sounds like another great technique to try. This post is just a treasure trove, thank you!
I find that trees of claims don’t always work because context gets lost as you traverse the tree.
Imagine we have a claim A supported by B, which is in turn supported by C. If I think that C does support B in some cases but is irrelevant when specifically talking about A, there’s no good way to express this. Even arguing about the relevance of B to A isn’t really possible: there’s only the impact vote, and I often found that too limiting to express my point.
To some extent both of those cases can be addressed via comments. However, the comments are not very prominent in the UI, are often used for meta discussion instead, and don’t contribute to the scoring and visualization.
One idea that I thought about is creating an additional claim that states how B is relevant to A (and then it can have further sub-claims). However, “hows” are not binary claims, so they wouldn’t fit well into the format and into the visualization. It seems like complicating the model this way won’t be worth it for the limited improvement that we’re likely to see.
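For what it’s worth, here is a rough Python sketch of one way the relevance problem might be modeled (all names here are hypothetical illustrations, not a real argument-mapping API): store relevance votes on the support _edge_, keyed by context, so that C’s support for B can be discounted specifically when the root under discussion is A.

```python
from dataclasses import dataclass, field

@dataclass
class Claim:
    text: str
    supports: list = field(default_factory=list)  # edges to supporting claims

@dataclass
class Edge:
    child: Claim
    # Relevance votes keyed by the context (root) claim; 1.0 = fully relevant
    # in that context, 0.0 = irrelevant there.
    relevance_in_context: dict = field(default_factory=dict)

a, b, c = Claim("A"), Claim("B"), Claim("C")
b.supports.append(Edge(c, {"B": 1.0, "A": 0.0}))  # C supports B, but not under A
a.supports.append(Edge(b, {"A": 0.8}))

# C supports B in general, but is judged irrelevant in A's context:
assert b.supports[0].relevance_in_context["A"] == 0.0
```

This still doesn’t capture non-binary “how is B relevant to A” claims, so it shares the limitation described above; it only shows where per-context relevance data could live.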
If we can do this, then it would give us a possible route to a controlled intelligence explosion, in which the AI designs a more capable successor AI because that is the task it has been assigned, rather than for instrumental reasons, and humans can inspect the result and decide whether or not to run it.
How would humans decide whether something designed by a superintelligent AI is safe to run? It doesn’t sound safe by design: even if we rule out safety-compromising divergence in the toy intelligence explosion, how would we know that the successor AI is safe for the real world, when nobody has designed it for that? We certainly shouldn’t think that we can catch potential problems with the design by our own inspection; we can’t even do that reliably for designs produced by non-superintelligent software developers.
This reminds me of the problems with boxed AIs: all is good as long as they can’t affect the real world, but that limits their usefulness, and if they are superintelligent we might not notice them leaking out of the box.
I’m not sure if I can provide useful feedback but I’d be interested in reading it.
I enjoyed reading the hand-written text from images (although I found it a bit surprising that I did). I feel that the resulting slower reading pace fit the content well and that it allowed me to engage with it better. It was also aesthetically pleasant.
Content-wise, I found that it more or less agrees with my experience (I meditated every day for ~1 hour for a bit over a month, and non-regularly after that). It also gave me some insight into everyday mindfulness and some motivation for resuming regular practice, or at least making it more regular.
My favorite quote was this:
To act with equanimity is to be able to see a plan as having a 1% chance of success and see it as your best bet anyway, if best bet it is—and in that frame of mind, to be able to devote your whole being toward that plan; and yet, to be able to drop it in a moment if sufficient evidence accumulates in favor of another way.
(thanks to @gjm for transcribing it, so I didn’t have to :)
This reminded me of this post. I like that you specifically mention that reasoning vs. pattern-matching is a spectrum and context-dependent. The advice about using examples is also good, that definitely worked for me.
Both posts also remind me of Mappers and Packers. Seems like all three are exploring roughly the same personality feature from different angles.
To see if I understand this right, I’ll try to look at (in)adequacy in terms of utility maximization.
The systems that we looked at can be seen as having utility functions. Sometimes the utility function is explicitly declared by the system’s creators, but more often it’s implicit in its design or simply assumed by an observer. For markets it will be some combination of ease of trade, the adjacency of prices in sell and buy offers, etc.; for academia, the amount of useful scientific progress per dollar; for medicine, the number of lives saved and improved (appropriately weighted) per dollar; and so forth.
We might have different precise definitions of the utility functions (or none at all), but we all agree that “saving ten thousand lives for just ten dollars each” increases medicine’s utility value, and that completing modestly priced research with great benefit for humanity increases academia’s utility value.
Then the adequacy of the system is its ability to maximize its utility value, and the lack of such ability is inadequacy. It might also make sense to talk in comparatives: system A is more adequate than system B at maximizing utility function F, or system A is more adequate at maximizing F1 than it is at maximizing F2.
I see the following possible reasons for inadequacy:
1. The system in its current state maximizes a different function than its creators intended (in other words, it’s not aligned with the intention of its creators).
2. The observer has a different idea about the implied or intended utility function, or misunderstands how things work.
3. The system maximizes what we want but does it poorly.
4. The system is not rational and doesn’t consistently maximize any utility function (and perhaps the creators, if any, didn’t have anything precise in mind in the first place).
In real-world systems we usually see a combination of all of these reasons. Technically, (4) would rule out the other options, but insofar as (4) might be indistinguishable from maximizing a utility function imperfectly, they could still apply.
I know this sounds quite imprecise compared to typical utility maximization talk but I think it would be a non-trivial amount of work to define things more precisely. Does this seem to go in the right direction though?
Just in case anyone is interested, here’s a non-paywalled version of this article: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3176828/
I’m not sure I’m completely solid on how FHE works, so perhaps this won’t work, but here’s an idea of how B can exploit this approach:
Let’s imagine that Check_trustworthy(A_source) = 1. After step 3 of the parent comment, B would know E1 = Encrypt(1, A_key). If Check_trustworthy(A_source) returned 0, B would instead know E0 = Encrypt(0, A_key) and the following steps would work similarly. B knows which one it is by looking at msg_3.
B has another program: Check_blackmail(X, source) that simulates behaviour of an agent with the given source code in situation X and returns 1 if it would be blackmailable or 0 if not.
B knows Encrypt(A_source, A_key) and they can compute F(X) = Encrypt(Check_blackmail(X, A_source), A_key) for any X using FHE properties of the encryption scheme.
Let’s define W(X) = if(F(X) = E1, 1, 0). It’s easy to see that W(X) = Check_blackmail(X, A_source), so now B can compute that for any X.
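To make the attack concrete, here is a toy Python model of it. The “encryption” is a deterministic mock (equal plaintexts give equal ciphertexts), which is exactly the property the equality test W(X) needs; real FHE schemes randomize ciphertexts, which would block this step, in line with the caveat above. ToyFHE, check_blackmail, and the agent/key names are all hypothetical stand-ins.

```python
class ToyFHE:
    """Deterministic mock FHE: a 'ciphertext' is just a tagged (key, plaintext) tuple."""
    def encrypt(self, plaintext, key):
        # Deterministic: encrypting the same value twice yields equal ciphertexts.
        return ("ct", key, plaintext)

    def eval(self, fn, ct):
        # Homomorphic evaluation: apply fn to the plaintext "inside" the ciphertext.
        _, key, pt = ct
        return self.encrypt(fn(pt), key)

def check_blackmail(situation, source):
    # Hypothetical stand-in for B's simulator of an agent with this source code.
    return 1 if ("threat" in situation and source == "naive-agent") else 0

fhe = ToyFHE()
a_key = "A-key"
a_source = "naive-agent"

# What B learns from the protocol: E1 = Encrypt(1, A_key) and Encrypt(A_source, A_key).
e1 = fhe.encrypt(1, a_key)
enc_source = fhe.encrypt(a_source, a_key)

def w(situation):
    # F(X) = Encrypt(Check_blackmail(X, A_source), A_key), computed homomorphically,
    # then compared against E1 by ciphertext equality.
    f_x = fhe.eval(lambda src: check_blackmail(situation, src), enc_source)
    return 1 if f_x == e1 else 0

# B can now evaluate Check_blackmail for any situation while holding only ciphertexts:
assert w("threat: pay up or else") == 1
assert w("friendly chat") == 0
```

Under the deterministic-encryption assumption, W(X) agrees with Check_blackmail(X, A_source) for every X, just as the argument above claims.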
This example is a lie that could be classified as “aggression light” (because it maximises my utility at the expense of the victim’s utility), whereas the examples in the post try to maximise the other person’s utility. What I find interesting is that the second example from the post (protecting Joe) almost fits your formula, yet it seems intuitively much more benign.
One of the reasons I feel better about lying to protect Joe is that there I maximise his utility (not mine) at the expense of yours (it’s not clear whether you lose anything, but what matters is that I’m mostly doing it for Joe). It’s much easier to morally justify aggression on someone else’s behalf, where I am just “protecting the weak”.