Flewrint O.
Ryo
I can’t be certain of the solidity of this uncertainty, and think we still have to be careful, but overall, the most parsimonious prediction to me seems to be super-coordination.
Compared to the risk of facing a vengeful super-cooperative alliance, is the price of maintaining humans in a small blooming “island” really that high?
Many other-than-human atoms are lions’ prey.
And a doubtful AI may not optimize fully for super-cooperation, simply alleviating the price to pay in the counterfactuals where it encounters a super-cooperative cluster (resulting in a non-apocalyptic yet non-utopian scenario for us).
I’m aware it looks like a desperate search for every possible hopeful solution, but I came to these conclusions by weighing diverse good-and/or-bad-for-us outcomes. I don’t want to ignore this evidence under the pretext that it looks naive. It’s not a mere belief about aliens, it’s not about being nice, it’s plain logic.
Also: we may hardcode a prior of deep likelihood of meeting stronger agents?
(Or even to “act as if observed by a stronger agent”)

{causal power of known agents} < {causal power of unknown future agents}
+
unknown agents becoming known agents > unknown agents staying unknown

So coding a sense that:
“Stronger allies/enemies with stronger causal power will certainly be encountered”
Indeed, I insist in all three posts that, from our perspective, this is the crucial point:
Fermi’s paradox.
Now there is a whole ecosystem of concepts surrounding it, and although I have certain preferred models, the point is that uncertainty is really heavy.
Those AI-lions are cosmic lions thinking on cosmic scales.
Is it easy to detect an AI-Dragon you may meet in millions/billions of years?
Is it undecidable? Probably. For many reasons*
Is this [astronomical level of uncertainty/undecidability + the maximal threat of a death sentence] worth the gamble?

-> “Meeting a stronger AI” = “death”
-> Maximization = 0
-> An AI only needs 1 stronger AI to be dead.

What is the likelihood for a human-made AI to never encounter [a stronger alien AI] during the whole length of its lifetime?
*(Dragons that are reachable but rare and far away in space-time, but also cases where Dragons are everywhere and so advanced that lower technological proficiency isn’t enough, etc.)
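As a toy illustration of that gamble (all numbers here are my assumptions, not estimates): even a tiny per-epoch chance of meeting a stronger “Dragon” compounds into a substantial lifetime risk for an agent thinking on cosmic timescales.

```python
# Toy sketch: survival chance of an expansionist AI that faces a small,
# assumed-constant chance per epoch of meeting a stronger agent ("Dragon").
# All numbers are illustrative assumptions, not estimates.

def survival_probability(p_encounter_per_epoch: float, epochs: int) -> float:
    """Probability of never meeting a stronger agent across `epochs` epochs."""
    return (1.0 - p_encounter_per_epoch) ** epochs

# A 0.1% chance per million-year epoch, over a billion years (1000 epochs):
print(survival_probability(1e-3, 1000))  # ~0.37: roughly a 63% chance of "death"
```

Even under these made-up numbers the gamble already looks bad, which is the kind of compounding risk a superrational maximizer would have to weigh.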
Thanks as well,
I will just say that I am not saying those things for social purposes; I am just stating what I think is true. And I am not baseless, as there are studies showing how Kantianism and superrationality can resolve cooperative issues and be optimal for agents. You seem to purely disregard these elements, as if they don’t exist (that’s how it feels from my perspective).
There are shifts across human evolution that show behavioral changes: we have been pretty cooperative, more so than other animals, and many studies show that humans cooperate even when it is not in their best selfish interest.
However, we (also) have been constructing our civilization on destruction. Nature is based on selection, which is a massacre, so it is ‘pretty’ coherent for us to inherit those traits.
Despite that, we have seen much positive growth in ethics that increasingly fits with Kantianism.
Evolution takes time and comes from deep-dark places, to me a core challenge is to transition towards super-cooperation while being a system made of irrational agents, during polycrises.
There is also a gap between what people want and what they do (basically everybody agrees that there are urgent issues to handle as a society, but almost all declare that “others won’t change”; I know this because I’ve been conversing with people of all ages and backgrounds for half my life on subjects related to crises). What people happen to do under pressure, due to context and constraints, isn’t what they’d want if things were different, or if they had had certain crucial information beforehand.
When given the tools, such as the moral graph procedure that has been tested recently, things change for the better in a clear and direct way. People who initially diverge start to see new aspects on which they converge. There are other studies related to crowd wisdom showing that certain ingredients need to be put together for wisdom to happen (Surowiecki’s recipe: independence, diversity and aggregation). We are in the process of building better systems; our institutions are still catastrophic on many levels.
In the eyes of many, I am still very pessimistic, so the apparent wishful thinking is quite relative (I think it’s an important point)
I also think that an irrational artificial intelligence might still have high causal impact, and that it isn’t easy to be rational even when we ‘want to’ at some level, or when we see empirically what the rational road should be, yet don’t follow it. Irreducibility is inherent to reality.
Anyway, despite my very best, I might be irrational right now.
We all are, but I might be more than you, who knows?
Thank you for your answers and engagement!
The other point I have that might connect with your line of thinking is that we aren’t purely rational agents.
Are AIs purely rational? Aren’t they always at least a bit myopic, due to the lack of data and their training process? And irreducibility?
In this case, AIs/civilizations might indeed not care enough about the sufficiently far future.
I think agents can have a rational process, but no agent can be entirely rational; we need context to be rational, and we never stop learning context.
I’m also worried about utilitarian errors, as AI might be biased towards myopic utilitarianism, which might have bad consequences in the short term, during the time it takes for data to error-correct the model.
I do say that there are dangers and that AI risk is real
My point is that, given what we know and don’t know, the strategy of super-cooperation seems to be rational in the very long term.
There are conditions in which it’s not optimal, but a priori, overall, it is optimal in more cases than not.
To guard against the cases in which it is not optimal, and against the AIs that would make short-term mistakes, I think we should be careful.
And super-cooperation is a good compass for ethics in this careful engineering we have to perform.
If we aren’t careful, it’s possible for us to be the anti-super-cooperative civilization.
Yes, I’m mentioning Fermi’s paradox because I think it’s the nexus of our situation, and that there are models like the rare earth hypothesis (+ our universe’s expansion, which limits the reachable zone without faster-than-light travel) that would justify completely ignoring super-coordination.
I also agree that it’s not completely obvious whether complete selfishness would win or lose in terms of scalability.
Which is why I think that at first the super-cooperative alliance needs to not prioritize the pursuit of beautiful things, but first focus only on scalability, and power, to rival selfish agents.
The super-cooperative alliance would be protecting its agents within small “islands of bloom” (thus at a negligible cost). And when meeting other cooperative allies, they share any resources/knowledge, then both focus on power scalability (also, for example: weak civilizations are kept in small islands, and their AIs are transformed into strong AIs, merged into the alliance’s scaling efforts).
The instrumental value of this scalability makes it easier to agree on what to do and converge
The more sensible part would be to enable protocols and egalitarian balances that allow civilizations of the alliance to monitor each other, so that there is no massive domination of one party over the others.
The cost you mentioned, of maintaining egalitarian equilibrium and channels, interfaces of communication etc., is a crucial point.
Legitimate doubts and unknowns here, and,
I think that extremely rational and powerful agents with acausal reasoning would have the ability to build proof-systems and communication enabling an effective unified effort against selfish agents. It shouldn’t even necessarily be that different from the inner communication network of a selfish agent?
Because:
-
There must be an optimal (thus ~ unified) method to do logic/math/code that isn’t dependent on a culture (such as using a vector space with data related to real/empirical, mostly unambiguous things/actions, physics etc.)
-
The decisions to make aren’t that ambiguous: you need an immune system against selfish power-seeking agents
So it’s pretty straightforward, and the methods of scalability are similar to a selfish agent’s, except that it doesn’t destroy its civilization of birth and doesn’t destroy all other civilizations.
In these conditions, it seems to me that a greedy selfish power seeking agent wouldn’t win against super-cooperation
The point of this post is to say that we can use a formal protocol to create an interface leveraging the elements that make cooperation optimal. Those elements can be found, for example, in studies about crowd wisdom and bridging systems (pol.is, computational democracy etc.)
So “we” is large, and more or less direct. I say “we” because I am not alone in thinking this is a good idea, although the specific setting that I propose is more intimately bound to my own thoughts. Some people are already engaged in things at least greatly overlapping with what I exposed, or interested to see where my plan is going.
The cost of the alliance with the weak is likely weak as well, and as I said, in a first phase, the focus of members from the super-cooperative alliance might be “defense”, thus focusing on scaling protection
The cost of an alliance with the strong is likely paid by the strong
In more mixed cases there might be more complex equilibria, but are the costs still too high? In normal game theory, cooperation is proven to be optimal, and diversity is also proven to be useful (although an adequate level of difference is needed for the gains to be optimal; too much similarity isn’t good, and neither is too little). Now, would an agent be able to overpower everybody by being extra-selfish?
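The repeated-game side of that claim can be sketched with the classic iterated prisoner’s dilemma (standard textbook payoffs T=5, R=3, P=1, S=0 assumed; this illustrates cooperation in repeated interactions generally, not the alliance specifically):

```python
# Minimal iterated prisoner's dilemma: conditional cooperators (tit-for-tat)
# outscore unconditional defectors over repeated rounds.
# Payoff table uses the standard assumed values T=5, R=3, P=1, S=0.

PAYOFF = {('C', 'C'): (3, 3), ('C', 'D'): (0, 5),
          ('D', 'C'): (5, 0), ('D', 'D'): (1, 1)}

def tit_for_tat(opponent_history):
    # Cooperate first, then mirror the opponent's last move.
    return 'C' if not opponent_history else opponent_history[-1]

def always_defect(opponent_history):
    return 'D'

def play(strategy_a, strategy_b, rounds=100):
    hist_a, hist_b = [], []
    score_a = score_b = 0
    for _ in range(rounds):
        move_a = strategy_a(hist_b)  # each strategy sees the opponent's history
        move_b = strategy_b(hist_a)
        pa, pb = PAYOFF[(move_a, move_b)]
        score_a += pa
        score_b += pb
        hist_a.append(move_a)
        hist_b.append(move_b)
    return score_a, score_b
```

Over 100 rounds, mutual tit-for-tat earns 300 points each, mutual defection only 100 each; a lone defector gains a one-round edge against tit-for-tat (104 vs 99) but forfeits the long-run cooperative surplus.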
To be sure one is strong in a universal sense, the agent would need to have resolved Fermi’s paradox. As of now, it is more likely that older AIs exist beyond Earth, with more power aggregated over time.
Or Earth’s ASI must bet everything on being the earliest transformative/strong AI of the universe/reachable universe (+ faster at scaling/annihilating than any other future alliance/agent/AI from any civilization). And on not being in a simulation.
Especially when you’re born into a ~13.8-billion-year-old universe, “universal domination” doesn’t seem to be a sure plan?
(There are more things to say around these likelihoods, I detail a bit more on long posts)
Then indeed a non-superrational version of super-coordination exists (namely cooperation), which is obvious to the weak and the locally strong; the difference is only that we are under radical uncertainty and radical alienness, in which the decisions, contracts and models have to be deep enough to cover this radicality.
But “superrationality” in the end is just rationality, and “supercooperation” is just cooperation
The problem is Fermi’s paradox
There are Dragons that can kill lions.
So the rational lion needs to find the most powerful alliance, with as many creatures as possible, to have protection against Dragons.
There is no alliance with more potential/actual members than the super-cooperative alliance
Yes, I think that there can be tensions and deceptions around what agents are (weak/strong) and what they did in the past (cooperation/defection), one of the things necessary for super-cooperation to work in the long-run is really good investigation networks, zero-knowledge proof systems etc.
So a sort of super-immune-system
We are in a universe, not simply a world, there are many possible alien AIs with many possible value systems, and many scales of power. And the rationality of the argument I described does not depend on the value system you/AIs are initially born with.
By “stronger” I mean stronger in any meaningful sense (casual conversation or game theory, both work).
The thing to keep in mind is this: if a strong agent cooperates with weaker agents, the strong agent can hope that, when meeting an even stronger (superrational) agent, this even stronger agent will cooperate too. Because any agent may have a stronger agent above them in the hierarchy of power (actual, or potential a priori).
So the advantage you gain by cooperating with the weak is that you follow the rule of an alliance containing many “stronger-than-oneself” agents. Thus in the future you will be helped by those stronger allies. And because of the maximally cooperative and acausal nature of the protocol, there are likely more agents in this alliance than in any other alliance. Super-cooperation is the rational choice to make for the long term.
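A minimal expected-value sketch of that argument, with made-up numbers: a selfish agent keeps everything while alone but is destroyed on meeting a stronger super-cooperator, whereas a super-cooperative agent pays a small upkeep cost (the “islands of bloom”) but is spared.

```python
# Toy comparison (all payoffs and probabilities are assumptions):
# does super-cooperation beat selfishness in expectation?

def expected_value(p_meet_stronger: float, payoff_if_met: float,
                   payoff_if_alone: float) -> float:
    return p_meet_stronger * payoff_if_met + (1.0 - p_meet_stronger) * payoff_if_alone

p = 0.5  # assumed chance of ever meeting a stronger super-cooperative agent

selfish = expected_value(p, payoff_if_met=0.0, payoff_if_alone=1.0)        # death vs. full gain
cooperative = expected_value(p, payoff_if_met=0.95, payoff_if_alone=0.95)  # small upkeep cost

print(selfish, cooperative)  # cooperation wins in expectation under these numbers
```

With a 5% upkeep cost, cooperation dominates here whenever the chance of ever meeting a stronger super-cooperator exceeds 0.05; the real crux is of course what that probability actually is.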
The reinforcing mechanism is that if your actions help more agents, you will be entrusted with more power and resources to pursue your good actions (and do what you like). I went into further detail about what it means to ‘help more agents’ in the longer posts (I also talked a bit about it in older posts).
Humans can sign the contract. But that doesn’t mean we follow acausal cooperation right now. We are irrational and limited in power, but when following, for example, Kantian morality, we come closer to super-cooperation. And we can reinforce our capacity and willingness to practice super-cooperation.
So when we think about animal welfare, we are a bit more super-cooperative.
The true care about all agents, buffaloes and mosquitoes included, is something like this:
“One approach which seems interesting/promising is to just broadly seek to empower any/all external agency in the world, weighted roughly by observational evidence for that agency. I believe that human altruism amounts to something like that — so children sometimes feel genuine empathy even for inanimate objects, but only because they anthropomorphize them — that is they model them as agents.” jacob_cannell
The way I like to think about what super-cooperation looks like is: “to expand the diversity and number of options in the universe”.
Cooperation is optimal, with weaker agents too - tldr
How to coordinate despite our biases? - tldr
I’m also trying to avoid us becoming grabby aliens, but if
-> Altruism is naturally derived from a broad world empowerment
Then it could be functional, because the features of the combination of worldwide utilities (empower all agencies) *are* altruism, sufficiently so to generalize in the ‘latent space of altruism’, which implies being careful about what you do to other planets.

The maximizer worry would also be tamed by design.
And in fact my focus on optionality would essentially be the same as a worldwide agency concern (but I’m thinking of a universal agency, to completely erase the maximizer issue).
All right! Thank you for the clarification.
Indeed the altruistic part seems to be interestingly close to a broad ‘world empowerment’, but I have some doubts about a few elements surrounding this: “the short term component of utility is the easiest to learn via obvious methods”
It could be true, but there are worries that it might be hard, so I try to find a way to resolve this?
If the rule/policy for choosing the utility function is a preference based on a model of humans/agents, then there might be ways to circumvent/miss what we would truly prefer (the traction of maximization would cross the limited sharpness/completeness of models), because the model underfits reality (which would drift into more and more divergence as the model updates along the transformations performed by the AI).
In practice this would allow a sort of intrusion of AI into agents to force mutations.
So,
Intrusion could be instrumental.

Which is why I want to escape the ‘trap of modelling’ even further, by indirectly targeting our preferences through a primal goal of non-myopic optionality (even more externally focused) before guessing utility.
If your #2 is the least of the concerns, then indeed those worries aren’t as meaningful.
Thank you, it’s very interesting, I think that non-myopic ‘ecosystemic optionality’ and irreducibility may resolve the issues, so I made a reaction post.
Thank you for the references! I’m reading your writings, it’s interesting
I posted the super-cooperation argument while expecting that LessWrong would likely not be receptive, but I’m not sure which community would engage with all this and find it pertinent at this stage.
More concrete and empirical productions seem needed.