ML enthusiasts sometimes refer to autonomy dismissively, as something that will be solved incidentally by scaling up current models
I’m definitely in this camp.
One common “trick” you can use to defeat current chatbots is to ask “what day is today”. Since chatbots are pretty much all using static models, they will get this wrong every time.
But the point is, it isn’t hard to make a chatbot that knows what day it is. Nor is it hard to make a chatbot that reads the news every morning. The hard part is making an AI that is truly intelligent. Adding autonomy is then a trivial and obvious modification.
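As a minimal sketch of what I mean (the prompt format here is my own illustration, not any particular vendor’s API):

```python
import datetime

def build_prompt(user_message: str) -> str:
    # Inject today's date at request time; the model weights stay frozen.
    today = datetime.date.today().strftime("%A, %B %d, %Y")
    system = f"You are a helpful assistant. Today's date is {today}."
    return f"{system}\n\nUser: {user_message}\nAssistant:"

print(build_prompt("What day is today?"))
```

The “freshness” lives entirely in the prompt, not in the weights.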
This reminds me a bit of Scott Aaronson’s post about “Toaster-Enhanced Turing Machines”. It’s true that there are things Turing-complete languages cannot compute (toasting bread, say). But adding these features doesn’t fundamentally change the system in any significant way.
It’ll be interesting to see Imagen fine-tuned on LAION-Aesthetics.
Agree with almost all of your points.
The goal of writing this post was “this is a slight improvement on IIT”, not “I expect normal people to understand/agree with this particular definition of consciousness”.
But the vast majority of automation doesn’t seem to be militarily relevant. Even if you assume some sort of feedback loop where militarily insubstantial automation leads to better military automation, world powers already hold the trump card of nukes to deter wars of aggression against them.
I think you’re underestimating the usefulness of non-military tech for military purposes. As a point of comparison, the pre-WWII US had a massive economy (with very little of it dedicated to the military), but this still proved to be a decisive advantage.
Or, as Admiral Yamamoto said:
“Anyone who has seen the auto factories in Detroit and the oil fields in Texas knows that Japan lacks the national power for a naval race with America.”
A country that has 100 million drones delivering “Mediterranean wraps” is also going to have a huge advantage when it comes to building drones for other purposes.
Nuclear weapons are also only a trump card as long as their use remains unthinkable. In a war with actual use of tactical nuclear weapons, you’re going to want to be on the side that has the advantage in terms of missile defense, precision strikes, dominating the infosphere, etc.
But unlike the last big wave of automation, when missing out meant getting conquered, this time the penalty for missing out seems insubstantial.
This claim has been empirically refuted in Armenia and Ukraine. Missing out on drones DOES mean getting conquered.
I disagree that they are all that interesting: a lot of TASes don’t look like “amazing skilled performance that brings you to tears to watch” but more like “the player stands in place twitching for 32.1 seconds and then teleports to the YOU WIN screen”.
I fully concede that a Paperclip Maximizer is way less interesting if there turns out to be some kind of false vacuum that allows you to just turn the universe into a densely tiled space filled with paperclips expanding at the speed of light.
It would be cool to make a classification of games where perfect play is interesting (the Busy Beaver Game, Mao, Calvinball) vs. games where it is boring (Tic-Tac-Toe, Checkers). I suspect that since Go is merely EXPTIME-complete (not Turing-complete), it falls in the second category. But it’s possible that, e.g., optimal Go play involves a mixed-strategy Nash equilibrium drawing on an infinite set of strategies with ever-decreasing probability.
Problem left for the reader: prove the existence of a game which is not Turing-complete but where optimal play requires an infinite set of strategies, such that no computable algorithm outputs all of these strategies.
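One way to pin the problem down (the formalization below is my own guess at the intended statement; the definitions of “game” and “optimal” here are assumptions, not part of the original):

```latex
\textbf{Problem (tentative formalization).} Exhibit a game $G$ with strategy
set $S$ such that:
\begin{enumerate}
  \item deciding the outcome of $G$ is \emph{not} Turing-complete (say, it
        lies in some decidable class such as EXPTIME);
  \item every optimal (e.g.\ Nash-equilibrium) play of $G$ draws on an
        infinite subset $S^* \subseteq S$; and
  \item $S^*$ is not computably enumerable: no Turing machine $M$ satisfies
        $\{M(1), M(2), \ldots\} = S^*$.
\end{enumerate}
```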
“the idea of A and D being in an eternal stasis is improbable”
I did cheat in the story by giving D a head start (so it could eternally outrun A by fleeing at 0.99c). However, in general this depends on how common intelligent life is elsewhere in the universe. If the majority of A’s future light-cone is filled with non-paperclipping intelligent beings (and there is no false-vacuum or similar “hack”), then I think A has to remain intelligent.
So, in the domains where we can approach perfection, the idea that there will always be large amounts of diversity and interesting behaviors does not seem to be doing well.
I suspect that a paperclip maximizer would look less like perfect Go play and more like a TAS speedrun of Mario. Different people have different ideas of what counts as interesting, but I personally find TASes fun to watch.
The much longer version of this argument is here.
We are already connected to machines (via keyboards and monitors). The question is how a higher bandwidth interface will help in mitigating risks from huge, opaque neural networks.
I think the idea is something along the lines of:
1. Build a high-bandwidth interface between the human brain and a computer.
2. Figure out how to simulate a single cortical column.
3. Give human beings a million extra cortical columns to make us really smart.
This isn’t something you could do with a keyboard and monitor.
But, as stated, I’m not super-optimistic this will result in a sane, super-intelligent human being. I merely think that it is physically possible to do this before/around the same time as the Singularity.
Contra #4: nope. Landauer’s principle implies that reversible computation costs nothing (until you want to read the result, which then costs next to nothing times the size of the result you want to read, irrespective of the size of the computation proper). Present-day computers are obviously very far from this limit, but you can’t assume “computronium” is too.
Reading the results isn’t the only time you erase bits. Any time you use an “IF” statement, you have to either erase the branch that you don’t care about or double the size of your program in memory.
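Here’s a toy illustration of that trade-off (the reversible_if helper and the bit accounting are my own construction, just to make the cost concrete):

```python
import math

# Landauer's principle: erasing one bit dissipates at least k*T*ln(2) joules.
K_BOLTZMANN = 1.380649e-23  # J/K
TEMPERATURE = 300.0         # kelvin (room temperature)

def reversible_if(cond: bool, a: int, b: int):
    """A reversible select: returns the chosen value PLUS everything needed
    to run the step backwards (the condition and the unchosen value).
    Nothing is erased, so intermediate state keeps accumulating."""
    chosen, unchosen = (a, b) if cond else (b, a)
    return chosen, (cond, unchosen)  # this tuple is "garbage" we must keep

garbage_bits = 0
x = 1
for step in range(1_000_000):
    x, _garbage = reversible_if(step % 2 == 0, x + 1, (x * 2) % 997)
    garbage_bits += 1 + 64  # 1 condition bit + one 64-bit unchosen value

# Reclaiming that memory means erasing the garbage, which costs energy:
energy = garbage_bits * K_BOLTZMANN * TEMPERATURE * math.log(2)
print(f"garbage bits accumulated: {garbage_bits}")
print(f"minimum energy to erase them at 300 K: {energy:.3e} J")
```

Either you pay the Landauer cost to erase the garbage, or your memory footprint grows with every branch you take.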
This seems backwards to me. If you prove a cryptographic protocol works, using some assumptions, then the only way it can fail is if the assumptions fail. It’s not that a system using RSA is 100% secure; someone could peek in your window and see the messages after decryption. But it’s surely more secure than some random nonsense code with no proofs about it, like people “encoding” data into base 16.
The context isn’t “system with formal proof” vs. “system I just thought of 10 seconds ago” but “system with formal proof” vs. “system without formal proof but extensively tested by cryptographers in real-world settings”. Think One-Time-Pad vs. AES. In theory the One-Time-Pad is perfectly information-theoretically secure, but in practice AES is much better.
Obviously “system with formal proof and extensively tested/demonstrated to work in real-world settings” would be even better. And if anyone ever proves P≠NP, AES will presumably enter this category.
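For concreteness, here is a minimal One-Time-Pad sketch (the standard textbook construction, not code from this thread). The secrecy proof is unconditional, but every assumption it rests on is exactly where real deployments fail:

```python
import os

def otp_encrypt(message: bytes) -> tuple[bytes, bytes]:
    # The key must be truly random, as long as the message, and never reused.
    key = os.urandom(len(message))
    ciphertext = bytes(m ^ k for m, k in zip(message, key))
    return ciphertext, key

def otp_decrypt(ciphertext: bytes, key: bytes) -> bytes:
    return bytes(c ^ k for c, k in zip(ciphertext, key))

msg = b"attack at dawn"
ct, key = otp_encrypt(msg)
assert otp_decrypt(ct, key) == msg
```

Perfect randomness, message-length keys, no reuse, and a secure channel for the key itself: fail any one of these and the proof says nothing, which is why battle-tested AES wins in practice.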
I will try to do a longer write-up sometime, but in a Bureaucracy of AIs, no individual AI is actually super-human (just as Google collectively knows more than any human being but no individual at Google is super-human).
It stays aligned because there is always a “human in the loop”; in fact, the whole organization simply competes to produce plans which are then approved by human reviewers (under some sort of futarchy-style political system). Importantly, some of the AIs compete by creating plans, and other AIs compete by explaining to humans how dangerous those plans are (a toy sketch of this loop follows below).
All of the individual AIs in the Bureaucracy have very strict controls on things like: their source code, their training data, the amount of time they are allowed to run, how much compute they have access to, when and how they communicate with the outside world and with each other. They are very much not allowed to alter their own source code (except after extensive review by the outside humans who govern the system).
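A toy schematic of the propose/critique/approve loop described above (entirely my own sketch, not a spec from the post):

```python
from dataclasses import dataclass, field

@dataclass
class Plan:
    text: str
    critiques: list[str] = field(default_factory=list)

def proposer_ai(task: str) -> Plan:
    # Proposer AIs compete to produce plans for a given task.
    return Plan(text=f"Plan for: {task}")

def critic_ai(plan: Plan) -> None:
    # Critic AIs compete to surface the dangers in each plan.
    plan.critiques.append(f"Possible risk in '{plan.text}': ...")

def human_review(plan: Plan) -> bool:
    # The human in the loop sees both the plan and every critique,
    # and nothing executes without explicit approval.
    print(plan.text)
    for c in plan.critiques:
        print("  critique:", c)
    return input("approve? [y/N] ").strip().lower() == "y"

plan = proposer_ai("reduce datacenter cooling costs")
critic_ai(plan)
if human_review(plan):
    print("plan approved for execution")
```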
Regarding DAOs, I think they are an excellent breeding ground for developing robust bureaucracies: between pseudonymous contributors and a reputation for hacking, building on the blockchain is about as close as we currently have to simulating a world filled with less-than-friendly AIs. If we can’t even create a DAO that robustly achieves its owners’ goals on the blockchain, I would be less optimistic that we can build one that obeys human values out of non-aligned (or weakly aligned) AIs.
Also, I think the idea of Non-Agentic AI deserves a place on the list. I understand EY’s arguments for why it will not work (“non-wet water”), but I think it is quite popular.
I will add a section on Tool AIs (Non-Agentic AI).
These all sound like really important questions that we should be dedicating a ton of effort and resources to researching. Especially since there is a 50% chance we will discover immortality this century and a 30% chance we will do so before discovering AGI.
There’s Humans Consulting Humans, but my understanding is that this is meant as a toy model, not as a serious approach to Friendly AI.
On the one hand, your definition of “cool and interesting” may be different from mine, so it’s entirely possible I would find a paperclip maximizer cool but you wouldn’t. As a mathematician I find a lot of things interesting that most people hate (this is basically a description of all of math).
On the other hand, I really don’t buy many of the arguments in “value is fragile”. For example:
“And you might be able to see how the vast majority of possible expected utility maximizers, would only engage in just so much efficient exploration, and spend most of its time exploiting the best alternative found so far, over and over and over.”
I simply disagree with this claim. The coast guard and fruit flies both use Lévy flights because they are mathematically optimal. Boredom isn’t some special feature of human beings; it is an approximation to the best possible algorithm for solving the exploration problem. Super-intelligent AI will have a better approximation, and therefore better boredom.
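A quick sketch of the kind of search I mean (my own illustration; the power-law exponent and step counts are arbitrary): step lengths drawn from a heavy-tailed distribution occasionally produce huge jumps, which is what makes Lévy flights effective when targets are sparse.

```python
import math
import random

def levy_step(alpha: float = 1.5, min_step: float = 1.0) -> float:
    """Pareto-distributed step length: P(step > s) ~ s^(-alpha)."""
    u = 1.0 - random.random()  # uniform in (0, 1], avoids division by zero
    return min_step / u ** (1.0 / alpha)

def walk(n_steps: int, step_fn) -> float:
    """2-D random walk with the given step-length sampler; returns the
    final distance from the origin."""
    x = y = 0.0
    for _ in range(n_steps):
        angle = random.uniform(0.0, 2.0 * math.pi)
        r = step_fn()
        x += r * math.cos(angle)
        y += r * math.sin(angle)
    return math.hypot(x, y)

random.seed(0)
print("Levy flight displacement:", walk(10_000, levy_step))
print("Fixed-step displacement: ", walk(10_000, lambda: 1.0))
```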
EY seems to also be worried that super-intelligent AI might not have qualia, but my understanding of his theory of consciousness is that “has qualia” is synonymous with “reasons about coalitions of coalitions”, so I’m not sure how an agent can be good at that and not have qualia.
The most defensible version of “paperclip maximizers are boring” would be something like this video. But unlike MMOs, I don’t think there is a single “meta” that solves the universe (even if all you care about is paperclips). Take a look at this list of undecidable problems and consider whether any of them might possibly be relevant to filling the universe with paperclips. If they are, then an optimal paperclip maximizer has an infinite set of interesting math problems to solve in its future.
I’ll just go ahead and change it to “Aligned By Definition” which is different and still seems to get the point across.
Is there a better / more commonly used phrase for “AI is just naturally aligned”? Yours sounds like what I’ve been calling Trial and Error, which has also been called “winging it”.