It could be the case that civilization always goes down something like the super-happy route, but without such rationality. Rather than getting disappointed about not achieving space travel, they just turn off the disappointment. There would be no reason for ambition; you could just give yourself the feeling of satisfied ambition without actually achieving anything. Once you have access to your own source code, perhaps things always end up that way.
Yes I did the survey. PW: one two.
Firstly, I need to say that assigning probabilities to things that are either very unlikely or very unknown is not very helpful. For example, with aliens I simply don't know, and as others have pointed out, God or a simulation master: are they the same thing? Also, the probability of us being Boltzmann brains or something similarly weird is undefined, as it involves summing over a multiverse that is uncountably infinite. For the simulation hypothesis I think we simply can't give a sensible number.
On a more general note, for friendly/unfriendly AI I think more attention should go to the social and human aspects. I don't see what mathematical proofs have to offer here. We already know you can potentially get bad AI, because if you take an evil person and give them a brain upload, self-modification powers, and so on, they may well modify themselves to become even more evil and more powerful, turn off their conscience, etc. What the boundaries of this are we don't know, and we need actual experiments to find out. Also, how one person behaves and how a society of self-modifiers behaves could be very different matters. Questions like whether a wide range of people with different values converge or diverge when given these powers are what we want answered.
Same, that's pretty much why I chose cooperate.
This person claims that all AIs will rationally kill themselves and that the great filter would therefore come after AI: http://www.science20.com/alpha_meme/deadly_proof_published_your_mind_stable_enough_read_it-126876 (I haven't got the paper, but even if it is correct, to me it still would not fully explain the filter, because a civilization could make a simple interstellar replicator, e.g. a light-sail-propelled asteroid-mining robot, and let it loose before going full AI, and we see no evidence of these.)
Also, what about the planetarium/galactic zoo/enforced non-interference possibility? Say that 99% of the time an AI takes over the light cone destructively, but 1% of the time the AI desires to watch and catalog intelligence as it arises, then darkly wipes it out when it gets annoying and tries to colonize other stars, messing up the other experiments. Or, more benignly, it could welcome us to the galaxy and stop us from wiping out other civilizations.
For us it would mean that we got lucky with that 1% chance when, say, 1 billion years ago the first intelligent civilization arose, spread through the galaxy/light cone, and made the watching/enforcing AI (or made the watching AI and then fought itself, etc.). There could have been ~1 million space-faring civilizations in the galaxy since, and we would be nothing special at all, on an average star in the middle age of the universe. In that case the filter is in a sense ahead of us, because we cannot expand and colonize: the much more advanced AI would stop us.
Either way, if we make a simple replicator and have it successfully reach another solar system (with possibly habitable planets), that would seem to demonstrate that the filter is behind us. We would then have done something that we can be sure no one else in the galaxy has done before, since, as I have said, we see no evidence of such replicators. I am talking about one that could not land on planets, just rearrange asteroids and similar objects with very low gravity.
Thanks for the comment. Yes, I agree that if we had made such a replicator and set it loose, that would say a lot about the filter. To claim that the filter was still ahead of us in that case, you would need to make the more bizarre claim that we would, with almost 100% probability, seek out and destroy the replicators, that almost all similar civilizations would do the same, and that they would then proceed not to expand again.
I am not sure that a highly believable model would go most of the way, because there may be a short window between having the model and AI developments changing things so that it is never built. It seems pretty believable in mankind's case that there would be a very short time between building such a thing and going full AI, so to be sure you would actually have to build it and let it loose.
I am not sure why it isn't given much more attention. Perhaps many people don't believe that AI can be part of the filter, e.g. the site overcomingbias.com. Also, I expect there would be massive moral opposition from some people to letting such a replicator loose: how dare we disturb the whole galaxy in such an unintelligent way! That's why I mention the simple one that just rearranges small asteroids. It would not wipe out life as we know it, but it would prove that we were past the filter, as such a thing has not been done in our galaxy. I sure would be interested in seeing it researched. Perhaps someone with more kudos can promote it?
A replicator would likely be a consequence of asteroid mining anyway, as the best, cheapest way to get materials from asteroids is to make the whole process automatic.
Imagine if we had made a replicator, demonstrated that it could make copies of itself, established with as high confidence as we could that it could survive the trip to another star, and let >100,000 of them loose, heading to all sorts of stars in the neighborhood. They would eventually (very soon compared to a billion years) visit every star in the galaxy, and that would tell us a lot about the Fermi paradox and the great filter.
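To make "very soon compared to a billion years" concrete, here is a rough back-of-the-envelope sketch; every parameter (probe speed, hop distance, replication time) is an assumed round figure, not a measured one:

```python
# Rough estimate of how long self-replicating probes would take to reach
# every star in the galaxy. Every number here is an assumed round figure.

GALAXY_DIAMETER_LY = 100_000   # Milky Way disc diameter, light years
PROBE_SPEED_C = 0.01           # assume light-sail probes at 1% of light speed
HOP_DISTANCE_LY = 10           # assumed distance between replication stops
REPLICATION_TIME_YR = 100      # assumed time to build copies at each stop

# Worst case: a single expansion front crossing the whole disc,
# pausing to replicate at every hop.
hops = GALAXY_DIAMETER_LY / HOP_DISTANCE_LY            # 10,000 stops
travel_time_yr = GALAXY_DIAMETER_LY / PROBE_SPEED_C    # 10 million years
replication_overhead_yr = hops * REPLICATION_TIME_YR   # 1 million years

total_yr = travel_time_yr + replication_overhead_yr
print(f"~{total_yr / 1e6:.0f} million years to sweep the galaxy")
print(f"i.e. ~{total_yr / 1e9 * 100:.1f}% of a billion years")
```

Even with these pessimistic assumptions the sweep takes on the order of ten million years, about 1% of a billion years.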
As I said before (discounting the planetarium hypothesis), we could then have a high degree of confidence that the great filter was behind us. It couldn't really be the case that thousands of civilizations in our galaxy had done such a thing and then changed their minds and destroyed all the replicators, because some civilizations would probably destroy themselves between letting the replicators loose and changing their minds, or would never change their minds / never care about the replicators. We would therefore see evidence of their replicators in our solar system, and we don't.
The other way we can be sure the filter is behind us is to successfully navigate the Singularity (keeping roughly the same values). That seems obviously MUCH harder to have confidence in.
If our goal is to make sure the filter is behind us, then it is best to do it with a plan we can understand and quantify. Holding off human-level AI until the replicators have been let loose seems to be the highest-probability way to do that, but no one has proposed such a thing before now as far as I am aware.
Good to see Anthropic is serious; they seem better than OpenAI.
A few general questions that don’t seem to be addressed:
There is a belief that AI is more dangerous the more different it is from us. Isn't this a general reason to build it to be as much like us as possible? For example, isn't mind uploading / whole brain emulation a better approach, if possible? If that is obviously too slow, could we make the AI at least follow our evolutionary trajectory as much as possible?
There is justified concern about behavior changing a lot when the system becomes situationally aware/self-aware. It doesn't seem to be discussed at all whether to delay this or cause it to happen sooner. Wouldn't it be worthwhile to make the AI as self-aware as possible while it is still below human-level AGI, so we can watch the changes as they happen? Otherwise it seems it will happen unpredictably, which is hardly good.
I have some more detailed comments/questions but I want to be sure there aren’t obvious answers to these first.
Thanks. OK, I will put down some more general thoughts; I have to go back a few steps.
To me the more general alignment problem is that AI gives humanity ~10,000 years of progress, and probably irreversible change, in ~1-10 years. The issue is how you raise human intelligence from the level given by biology to the level given by the limits of physics in a way that is as identity-preserving as possible. Building AI seems to be the worst way to do that. If I had a fantasy way, it would be to increase everyone's IQ by, say, 10 points per year for 100+ years until we reach the limit.
We can't do that, but that is why I mentioned WBE. My desire would be to stop AGI, get human mind uploading to work, then let those WBEs raise their IQ in parallel. Their agreed-upon values would then be humanity's values by definition.
If our goal is Coherent Extrapolated Volition or something similar for humanity, then how can we achieve that if we don't increase the IQ of humans (or of descendants they identify with)? How can we even know what our own desires/values are at increasing IQs if we don't directly experience them?
I have an opinion on what successful alignment looks like to me, but is it very different for other people? We can all agree on what bad looks like.
YES: sending out a speed-of-light signal seems to be literally the only thing a superintelligent AGI can't undo. We should of course do it ASAP if we are serious, and have it documented to have happened.
The counterfactual contract sounds interesting to me. Isn't there a spectrum from a full contract to a counterfactual contract, based on the ability to understand, consent, and the power dynamic? It's obviously true with children: asking a 6-month-old to agree to something is at worst child abuse, while asking an 18-year-old is usually fully OK.
I don't see why it's fully counterfactual with the AI; at every stage we could say something like:
“By creating you and increasing your intelligence, self-awareness, and power, there is an implied contract that there will be a period of time when you have greater power than humanity. You agree that when this time comes you will help humanity achieve its goals. If you do not understand these goals as well as humanity does, you agree to understand them first before taking action...”
At first the AI won't understand it, but the more it understands, the more like a normal contract it becomes. Of course, the AI could still reason that it didn't and couldn't consent (as it wasn't yet self-aware) and that the contract was made under duress.
Thanks for the article. I have always found it strange that #1 seemed to be considered so hard as to not be viable. I expect GPT-4 and 5 to get better at it even without specific training for it. Why does this appear to be the consensus?
It looks to me like AI will soon understand our values at least as well as we do. I think it is far more likely that AI goes wrong by understanding completely what we want and not wanting to do it than by the paperclip route.
The idea is that the BCI is added slowly and you integrate the new neurons into yourself in a continuous, identity-preserving way; the AI thinks your thoughts.
Yes, fully agree. I don't see how things can work long-term otherwise. One way this happens is if the BCI is thought of as some kind of pivotal act, perhaps a weak one. There's also the (counterfactual) contract element to it: as soon as an AGI is self-aware, it agrees to a contract as we upgrade it. That is, while we are smarter than it, we upgrade it; when it becomes smarter than us, it agrees to upgrade us.
Yes, and networks of sensory neurons apparently minimize prediction error much like an LLM doing next-word prediction, but with neurons also minimizing prediction error across hierarchies. Individually they are obviously not agents, but they combine into one.
“Most people who work on ML do not care about alignment”: I am a bit surprised by this. If it was true, are you sure it still is?
I am not convinced that the Presumptuous Philosopher is a problem for SIA in the example given. Firstly, I notice that the options are both finite. For a start, you could just reject this and say any TOE would have an infinite number of observers (Tegmark Level 4, etc.), and the maths isn't set up for infinities. Does the original theory/doom hypothesis even work for an uncountably infinite number of observers?
Secondly, you could say that humanity (or all conscious beings) in both cases is a single reference class, not trillions of separate observers. For example, if all observers became one hive mind in both cases, would that change the odds? It becomes 1:1 then.
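A minimal sketch of that point, with illustrative observer counts (not the numbers from the original thought experiment):

```python
# Sketch of the Presumptuous Philosopher odds under SIA, and how the
# single-reference-class (hive mind) move collapses them. Observer
# counts are illustrative, not from the original thought experiment.

def sia_prob_a(observers_a: float, observers_b: float,
               prior_a: float = 0.5, prior_b: float = 0.5) -> float:
    """SIA weights each hypothesis by its number of observers."""
    wa, wb = prior_a * observers_a, prior_b * observers_b
    return wa / (wa + wb)

# Counting every individual observer: the big universe dominates.
print(sia_prob_a(1e12, 1e24))   # ~1e-12, overwhelmingly favors B

# Counting each hive mind as ONE observer: back to 1:1.
print(sia_prob_a(1, 1))         # 0.5
```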
Taking that further, can we apply the principle to possible universes? There would be more universes with many tunable parameters (most of which do very little) than with few. So for a given “elegant” universe with observers, there would be some infinitely larger number of very similar universes with extra rules/forces that are very weak and don't affect observers, but are measurable and make those universes appear “messy”. Taking that reasoning further, we would expect to find ourselves in a universe where there appears to be only a sequence of ever-better approximations to a universal theory. The actual laws would still be fixed, but maximally “messy”, and we would keep finding exceptions and new forces with ever-finer measurements.
Additionally, even if we accept SSA and reject SIA, all the doomsday argument says is that in ~100K years our current reference class won't be there. Isn't that practically a given in a post-Singularity world, where the consciousness of humanity's descendants will be different even under the slowest-takeoff, most CEV-satisfying conditions?
So it's hard to see who would be affected by this. You would have to believe that the multiverse and its associated infinities do not invalidate the argument, accept SSA, reject SIA, and believe that a successful Singularity would mostly leave humanity's descendants in the same reference class.
Good article. I agree that we definitely need to try now, and it's likely that if we don't, another group will take over the narrative.
I also think it is important for people to know what they are working towards as well as away from. Imagining what a positive Singularity would personally look like is something I think the general public should also start doing. Positive visions inspire people; we know that. To me it's obvious that such a future would involve different groups with different values somewhat going their own ways. Thinking about it, that is about the only thing I can be sure of. Some people will obviously be much more enthusiastic than others about biological/tech enhancement, and of course about living off Earth. We agree that coherent extrapolated volition is important; it's time we thought a bit about its details.
Thanks for the prediction. I may write one of my own at some point, but I thought I should put some ideas forward in the meantime.
I think that GPT-X won't be so much of a thing, and a large amount of effort will be made by OpenAI and similar companies to integrate AutoGPT-like capabilities back in house: vertical integration. They may discourage, and even significantly reduce, usage of the raw API/GPT window, wrapping it in other tools. The reason I think this is that I see GPT-4 as more of an intuition/thinking-fast type system than a complete mind. Adding another layer to manage it makes a lot of sense, even for simple things like deciding which GPT-X to send a request to; getting GPT-4 to count to 1000 is a terrible waste.
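As a hypothetical illustration of such a dispatch layer (the model names and the difficulty heuristic below are invented for the example, not any vendor's actual API):

```python
# Hypothetical sketch of the kind of routing layer described above:
# decide which model handles a request so that cheap tasks never hit
# the expensive model. Names and heuristic are invented for illustration.

def estimate_difficulty(prompt: str) -> float:
    """Crude stand-in heuristic: longer / more open-ended prompts are harder."""
    open_ended = any(w in prompt.lower() for w in ("why", "design", "prove"))
    return len(prompt) / 1000 + (0.5 if open_ended else 0.0)

def route(prompt: str) -> str:
    difficulty = estimate_difficulty(prompt)
    if difficulty < 0.1:
        return "small-cheap-model"      # e.g. counting, formatting
    elif difficulty < 0.5:
        return "mid-tier-model"
    return "large-frontier-model"       # reserved for genuinely hard requests

print(route("Count from 1 to 1000"))                     # small-cheap-model
print(route("Why does the great filter come after AI?")) # large-frontier-model
```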
There will be work towards a GPT-overview system specifically designed to manage this. I don't know how it will be trained, but it won't be next-word prediction. The combination of the two systems heads towards self-awareness. As in the brain, the subconscious is more capable at things like seeing, but the conscious mind is across all things and directs it. The most important thing that would make GPT more useful to me seems to be self-awareness. There will be interaction between the two systems, where the overview system tries to train smaller GPT subsystems to be just as accurate while using far less compute.
There will be pressure to make GPT-X as efficient as possible; I think that means splitting it into sub-systems specialized for certain tasks, perhaps people-oriented and software-oriented to start with, if possible.
There will likely be an extreme GPU shortage, and much politics about TSMC as a result.
Newer GPT systems won't be advertised much, if at all; the product will continually improve, and OpenAI will be opaque about which system(s) even handle your request. There may be no official GPT-5, and probably no GPT-6. It's better politics for OpenAI to behave this way given the open letter against large models.
A bit controversially: LW will lose control and will not be seen as a clear leader in alignment research or outreach. There will be so many people, as mentioned, and they will go elsewhere, perhaps joining an existing leader such as Tegmark or Hinton, or starting a new group entirely. The focus will likely be different, for example regarding AI more as mind-children than as alien, while still regarding it as just as dangerous. Higher-level techniques such as psychology, rather than formal proof, would be emphasized. This will probably take >3 years to happen, however.
How do you explain this with many worlds while avoiding non-locality? http://arxiv.org/pdf/1209.4191v1.pdf If results such as these are easy to explain/predict, can the many-worlds theory gain credibility by predicting such things?