There are three things to address here: (1) that it can’t update or improve itself, (2) that doing so will lead to godlike power, and (3) whether such power would be malevolent.
Of 1, it does that now. Last year, I started to get a bit nervous noticing the synergy between converging AI fields. In other words, Technology X (e.g. Stable Diffusion) could be used to improve the function of Technology Y (e.g. Tesla self-driving) for an increasingly large pool of X and Y. This is one of the early warning signs that you are about to enter a paradigm shift or a geometric progression of discovery. Suddenly, people saying AGI was 50 years away started to sound laughable to me. If it is possible on silicon transistors, it is happening in the next two years. Here is an experiment from last week testing the self-reflection and self-improvement (loosely, “self-training,” though not quite there) of GPT-4; a sketch of what such a loop looks like follows.
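To be concrete about what “self-reflection” means in those experiments: the model is asked to critique its own failed output, then retries with that critique in its context. Below is a minimal sketch of such a loop; `ask_model`, `solve_with_reflection`, `check`, and the prompts are hypothetical stand-ins of my own, not the actual experimental setup, but the attempt–critique–retry structure is the important part.

```python
from typing import Callable

def ask_model(prompt: str) -> str:
    """Placeholder for a single call to a large language model.

    Hypothetical stand-in: wire this to whatever chat-completion API
    you actually use. Nothing below depends on a specific vendor.
    """
    raise NotImplementedError

def solve_with_reflection(task: str,
                          check: Callable[[str], bool],
                          max_rounds: int = 3) -> str:
    """Attempt a task, then let the model critique and revise its own output."""
    answer = ask_model(f"Solve the following task:\n{task}")
    for _ in range(max_rounds):
        if check(answer):  # an external signal, e.g. unit tests for generated code
            return answer
        # Step 1: the model reflects on its own failure ...
        critique = ask_model(
            f"Task:\n{task}\n\nYour previous answer:\n{answer}\n\n"
            "It failed. Explain what went wrong and how to fix it."
        )
        # Step 2: ... and uses that reflection to produce a revised attempt.
        answer = ask_model(
            f"Task:\n{task}\n\nFailed answer:\n{answer}\n\n"
            f"Reflection:\n{critique}\n\nProvide a corrected answer."
        )
    return answer
```

The unsettling part is that no human sits between attempts: the quality of attempt N+1 is driven entirely by the model’s own assessment of attempt N.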
Of 2, there is some merit to the argument that “superintelligence” will not be vastly more capable because of hard universal limits like causality. That said, we don’t know how regular intelligence works, much less how much more super a superintelligence would or could be. If we are saved from AI, it will be these computational and informational speed limits of physics that saved us, out of sheer dumb luck, not because of anything we broadly understood as a limit to intelligence proper. Given the observational nature of the universe (i.e., quantum mechanics), for all we know, the simple act of being able to observe things faster could mean that a superintelligence would have higher speed limits than our chemical-reaction brains could ever hope to reach. The not knowing is what makes people alarmist, because a lot of incredibly important things are still very, very unknown …
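To make “computational speed limits of physics” slightly less hand-wavy, here is one commonly cited bound I’m reasonably confident of, Bremermann’s limit, which follows from mass–energy equivalence and the quantum time–energy bound. Treat it as a back-of-the-envelope sketch, not a rigorous argument:

```latex
% Bremermann's limit: the maximum rate at which any self-contained
% kilogram of matter can process information.
R_{\max} \;=\; \frac{m c^{2}}{h}
\;\approx\; \frac{(1\,\mathrm{kg})\,(3.0\times10^{8}\,\mathrm{m/s})^{2}}
                 {6.63\times10^{-34}\,\mathrm{J\,s}}
\;\approx\; 1.4\times10^{50}\ \text{bits per second}
```

The point being: yes, a hard ceiling exists, but it sits dozens of orders of magnitude above anything brains or data centers do today, so “physics will save us” is cold comfort.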
Of 3, on principle, I refuse to believe that stirring the entire contents of Twitter and Reddit and 4chan into a cake mix makes for a tasty cake. We often refer to such places as “sewers,” and oddly, I don’t recall eating many tasty things made with raw sewage as a main ingredient. No, I don’t really have a research paper here; it weirdly seems like the thing that least requires new and urgent research, given everything else.
In December 2022, awash in recent AI achievements, it concerned me how synergistic the technology had become over the previous couple of years. Essentially: AI-type-X (e.g. Stable Diffusion) can help improve AI-type-Y (e.g. Tesla self-driving) across many, many pairs of X and Y. And now, not even four months later, we have papers released on GPT-4’s ability to self-reflect and self-improve. Given how badly human minds are known to predict geometric progressions, I have started to feel like we are already past the AI singularity “event horizon.” Even slamming on the brakes now doesn’t seem like it would do much to stop our fall into this abyss (not to mention how misaligned the incentives of Microsoft, Tesla, and Google are with pulling the train’s brake). My imaginary “event horizon” was always self-improvement, given that transistorized neurons would run so much faster than chemical ones (rough arithmetic below). Well, here we are. We’ve seen dozens of emergent properties in AI over the past year, and barely anyone knows that these models have learned to use tools, to read text on billboards in images, and more … without explicit training to do so. It has learned how to learn, and yet we are broadening the scope of our experiments instead of shaking these people by the shoulders and asking, “What in the hell are you thinking, man!?”
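For the record, the “transistorized neurons are faster” premise is easy to quantify with rough, widely cited numbers: biological neurons spike at most a few hundred times per second, while transistors switch on the order of a billion times per second.

```latex
% Order-of-magnitude speed gap between silicon and biological neurons.
\frac{f_{\text{transistor}}}{f_{\text{neuron}}}
\;\approx\; \frac{10^{9}\,\mathrm{Hz}}{10^{2}\,\mathrm{Hz}}
\;=\; 10^{7}
```

Roughly ten million subjective steps for every one of ours is why self-improvement on silicon, not raw capability, was always my event horizon.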