That’s not quite how I’d put it. What I actually think is more like: “A lot of optimisation went into making human society, and there were structural forces pushing that towards finding good solutions. We shouldn’t be surprised if our attempts to figure out how to handle AI end up converging on things we ‘already discovered’, and we might be able to save a lot of bother by skipping ahead.” This is kind of what I was trying to get at with the ‘utilitarians rediscovering honour’ point.
A potential crux is that I think the structural reasons why current society is good by our values fundamentally disappear with AI, and critically, whether or not the AI is corrigible doesn’t matter here.
Specifically, the reasons are a combination of humans being necessary to run the economy + better economic performance requiring that citizens get most of their selfish wants + democracy turning out to be better than autocracy for economies and wars.
AI threatens all three. As humans become less necessary, you no longer have to give them what they want in order for the economy to do well. And democracy in the AI era will start to perform worse relative to autocracy, because UBI + almost everyone being out of a job means perpetual unrest similar to 2020 America, which will worsen efficiency/economic growth dramatically compared to machine autocracies that don’t have to deal with unrest.
And critically, it doesn’t matter whether or not the AI is corrigible or value-aligned. It only matters that, due to AI, the incentives to give humans what they want are fundamentally weaker, meaning that some amount of intrinsic value-alignment is necessary if humans are to survive with anything like a decent life.
Another useful section for people to notice says that even if the prior is doing most of the work, the marginal sample efficiency of AIs is also very bad, because they need 100x more marginal data than humans:
Even if it were the case that we can explain away the trillions of tokens required to pretrain a base model as catching up to evolution, it doesn’t explain why the marginal capabilities take so much data—once you have been educated, you don’t need 100 different professors to learn a new programming language, but the AIs (even once pretrained) do.