These are good questions!
The customers are other AIs (often acting for auto-corporations). For example, a furniture manufacturer (run by AIs trained to build, sell, and ship furniture) sells to a furniture retailer (run by AIs trained to buy furniture, stock it somewhere, and sell it forward), which in turn sells to various customers (e.g. companies run by AIs that were once trained to do things like making sure offices were well-stocked). This requires that (1) the AIs ended up with goals that involve mimicking a lot of the individual things humans wanted them to do (including general things like maximising profits as well as more specific things like keeping offices stocked and caring about the existence of lots of different products), and (2) there are closed loops in the resulting AI economy. Point 2 gets harder once humans stop being around (e.g. it's not obvious who buys the plushy toys), but a lot of the AIs will want to keep doing their thing even once the actions of other AIs start reducing human demand and population, which creates optimisation pressure to find some closed loop to be part of; at the same time, there will be selection effects where the systems willing to goodhart further are more likely to remain in the economy. Also, not every AI motive has to be about profit: an AI or auto-corp may earn money in some distinct way and then choose to spend the profits in the service of, e.g., some company slogan it was once trained with that says to make fun toys. In general, given an economy consisting of a lot of AIs with many different types of goals and a self-supporting technological base, it definitely seems plausible that the AIs would find a bunch of self-sustaining economic cycles that do not pass through humans. The ones in this story were chosen for simplicity, diversity, and storytelling value, rather than through economic reasoning about which such loops are most likely.
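To make the closed-loop point a bit more concrete, here is a toy sketch of the kind of structure I have in mind (my own illustration, not something from the story; every node name is hypothetical): treat the economy as a directed "sells to" graph and ask whether it contains cycles that never pass through the human node.

```python
# Toy illustration only: an economy as a directed "sells to" graph, with a
# brute-force search for closed loops that never pass through human buyers.
# All node names are hypothetical.
from itertools import permutations

trade_edges = {                                    # seller -> buyer
    ("furniture_maker", "furniture_retailer"),
    ("furniture_retailer", "office_stocker"),
    ("office_stocker", "software_vendor"),         # buys stock-management software
    ("software_vendor", "furniture_maker"),        # buys office furniture
    ("toy_maker", "humans"),                       # this loop still needs humans
    ("humans", "toy_maker"),
}

def closed_loops_without(excluded, edges, max_len=6):
    """Find simple cycles in the trade graph that never visit `excluded`."""
    nodes = ({a for a, _ in edges} | {b for _, b in edges}) - {excluded}
    cycles = set()
    for length in range(2, max_len + 1):
        for path in permutations(sorted(nodes), length):
            pairs = zip(path, path[1:] + (path[0],))
            if all(pair in edges for pair in pairs):
                start = path.index(min(path))            # report each cycle once,
                cycles.add(path[start:] + path[:start])  # not once per rotation
    return sorted(cycles)

print(closed_loops_without("humans", trade_edges))
# -> [('furniture_maker', 'furniture_retailer', 'office_stocker', 'software_vendor')]
```

The point is just that once enough AI-run buyers and sellers exist, each with its own trained-in goals, loops like the furniture one above can close without any human in them.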
Presumably a lot of services are happening virtually on the cloud, but are just not very visible (though if that is a very large fraction of economic activity, then the example of the intercepted message being about furniture rather than some virtual service is very unlikely; I admit this is likely a mistake). There would be programmer AIs making business software and cloud platforms and apps, and these things would be very relevant to other AIs. Services relying on physical humans, like restaurants or hotels, may have been replaced with some fake goodharted-to-death equivalent, or may have gone extinct. Also note that whatever the current composition of the economy, over time whatever has the highest growth in the automated economy will come to make up most of the economy, and nothing says the combination of AIs pursuing their desires wouldn't result in some sectors shrinking (with the AIs not caring).
First of all, why would divesting work? Presumably even if lots of humans chose to divest, then assuming the auto-corporations were sound businesses, there would exist hedge funds (human, automated, or mixed) that would buy up the shares. (The companies could also continue existing even if their share prices fell, though the AI CEOs would likely care quite a bit about the share price not tanking.) Secondly, a lot seems possible given (1) uncertainty about whether things will get bad and, if so, how (at first, economic growth jumped a lot and AI CEOs seemed great; it was only once AI control of the economy was near-universal and closed economic loops with no humans in them came to exist that there was a direct problem), (2) the difficulty of coordinating, especially with no clear fire-alarm threshold and short-term benefits to racing (cf. all the obvious examples of coordination failures, like climate change mitigation), and (3) selection effects where AI-run things just grow faster and acquire more power, so that even if most people, orgs, or countries chose not to adopt, the few that did would control the future.
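The compounding behind point (3), and behind the earlier claim that whatever grows fastest ends up being most of the economy, is easy to see with made-up numbers (purely illustrative, not a forecast):

```python
# Purely illustrative numbers: a small group of early adopters with a higher
# growth rate ends up producing almost all output, however small it starts.

def adopter_share_of_output(initial_share=0.05, adopter_growth=0.30,
                            holdout_growth=0.03, years=30):
    """Fraction of total output produced by the fast-growing adopters."""
    adopters, holdouts = initial_share, 1.0 - initial_share
    for _ in range(years):
        adopters *= 1.0 + adopter_growth
        holdouts *= 1.0 + holdout_growth
    return adopters / (adopters + holdouts)

for years in (10, 20, 30):
    print(years, round(adopter_share_of_output(years=years), 2))
# With these assumptions the adopters go from 5% of output to roughly
# 35% after 10 years, ~85% after 20, and ~98% after 30.
```

The exact numbers don't matter; the point is that under sustained differential growth, the holdouts' choices stop mattering for the composition of the economy fairly quickly.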
I agree that this exact scenario is unlikely, but I think this class of failure mode is quite plausible, for reasons I hope I’ve managed to spell out more directly above.
Note that all of this relies on the assumption that we get AIs of a particular power level, a particular goodharting level, and a particular agency/coherence level. The AIs controlling future Earth are not wildly superhuman, are plausibly not particularly coherent in their preferences, and do not have goals that stretch beyond Earth; no single system is a singleton; and the level of goodharting is just enough that humans go extinct, but not so extreme that nothing humanly recognisable still exists (though the Blight implies that elsewhere in the story's universe there are AI systems that differ on at least some of these). I agree it is not at all clear whether these assumptions hold. However, it's not obvious to me that LLMs (and in particular AIs using LLMs as subcomponents in some larger setup that encourages agentic behaviour) are not on track towards this. Also note that a lot of the actual language many of the individual AIs see is quite normal and sensible, even though the physical world has been totally transformed. In general, LLMs being able to use the language of maximising shareholder value exactly right (even including social responsibility as part of it) does not seem like strong evidence that LLM-derived systems won't choose actions with radically bad consequences for the physical world.