Basically agreed, with the extra point that sometimes you can play your way “out” of a high resource context too by exploiting too hard (i.e. killing the goose that lays the golden eggs to get an extra meal). So attuning to what part of reality you are actually in is important.
Your solar punk initiatives sound very cool! To be clear I also try not to act according to what ‘certain minds’ would think, but I guess I was just trying to address what I saw as a perspective mismatch.
Regarding a better way: some time ago I came to the conclusion that (to use the language of the Moloch blog post) Moloch and Elua are the same. They are, if you would like, what happens when the different bits of evolution (variation, selection, replication) get weighted differently. So Elua encourages variation and replication with weak selection, and Moloch encourages selection and variation with weak replication. Note that replication is not just copying. Both in biology and in culture, replication is also about reducing the complexity of software after a major feature push, repairing harm after vicious competition leads to loads of bloodshed and damage, refining the message of a book between editing passes and print runs, and nature’s way of error correction between generations by regressing to the mean. This makes the final branch what is usually called conservatism—strong replication and selection, but weak variation.
Furthermore, which parameters are dominant is dependent on the context the system is operating in. So Moloch is locally dominant when resources are scarce, and vice versa for Elua. For minds which operate using world models, this means that we can play the Moloch-game or the Elua-game based on which state of mind we are in! (This is my best steelman of whatever the hell an “abundance mindset” is supposed to be)
Of course, whatever strategy we choose will need to actually be effective in reality. But in reality we all know people who acted like they were in a zero-sum game when they were not, ruining everything for everyone; we also know people who gave even when there was little to give, and so enabled the collective as a whole to get itself out of the local minimum—more pie for everyone. (Even if you don’t, you are a beneficiary of their actions.) This suggests that there are lenses which are effective at interfacing with reality but do not promote Moloch-thinking or quick-optimisation-thinking.
I have since gone in search of such lenses. The Moloch-lens is easy to find: it’s called the prisoner’s dilemma. It conforms to the ideas we have about short-term gain and hard-nosed geopolitical and interpersonal realism, and it fits them so well that if you reverse the payouts people will call you unrealistic and biased. There is, however, also an Elua-lens or Elua-game that we can find. So far my incomplete understanding of its logic is something like:
The world is vast and complicated. Really complicated. Like, OOMs more complicated than any agent in the world (not least because the world also contains that agent’s complexity).
Executing plans in the world often requires taking many actions in sequence before a payoff can be identified (if any).
Thus, the world is exponentially complicated (the size of the action space is $a^n$, where $a$ is the number of actions you can take each second and $n$ is the number of seconds until payoff, both of which can be very, very large for ambitions the size of the Apollo program). This means that exploration is super dominant over exploitation in terms of total future payoff. For any measly local optimum you can find, there’s almost certainly a bigger cheese somewhere else to find.
The problem, of course, is that exploration is hard and time-intensive. This is why cooperation is dominant in the Elua-game: cooperation parallelises exploration, leading to a much, much faster time-to-payoff. If you can cooperate so thoroughly that you basically become a superorganism, you get immensely fast ways to improve your odds. This is the intuition behind why teaming up is good (two heads are better than one, etc.).
The last piece of the puzzle is “why not kill the others and use their materials to build more computronium?” If the power law behind compute investments holds, making a computer system more powerful uses exponentially more resources than the gains it provides. This means that the odds of any system fully understanding the world using a particular frame or world-model are ~0, even if it turns galaxy after galaxy into compute nodes.
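To put a crude number on that last claim (this is my own back-of-the-envelope, assuming “fully understanding the world” means something like being able to evaluate the space of possible plans, and using the standard ~$10^{80}$ atoms-in-the-observable-universe estimate):

$$
a^{n}\big|_{a=10,\ n=100} = 10^{100} \;\gg\; 10^{80} \approx \text{atoms in the observable universe},
$$

so even an idealised computer spending one operation per atom per plan could not enumerate a fairly modest plan space, and logarithmic returns on extra compute only widen the gap.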
OTOH, cooperation and preserving other ways of seeing the world allow you to cover more of the search space (it’s the difference between sampling an image by checking random pixels versus starting from the bottom-left corner and uncovering the image pixel by pixel). This means that cooperation gives you way more compute “per gram” than the alternative, while also being way less taxing for other reasons. First, you don’t have to spend executive capacity directing your subordinate compute units, avoiding the curse of dimensionality that happens when you have too many nodes to control top-down simultaneously (cf. this quote: “Intelligence, Asman explained, is bounded by power laws: each volume of computing requires an exponentially vaster volume of connections.”). Having someone else who can take care of themselves and just give you the relevant facts is actually really good. Second, you avoid turning your own weaknesses into single points of failure.
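To make the exploration-parallelisation point above a bit more tangible, here is a minimal toy sketch (the numbers and the uniform-random search model are invented for illustration; it simply treats exploring as blindly sampling a sparse space of plans and letting explorers pool their findings):

```python
# Toy illustration of "cooperation parallelises exploration".
# Assumption (invented for this sketch): exploring the world is like sampling
# plans at random from a huge space in which only a tiny fraction pay off,
# i.e. each sampled plan succeeds with probability roughly 1 / a**n.
import random

def rounds_to_payoff(num_explorers: int, success_prob: float, rng: random.Random) -> int:
    """Rounds until any one of the cooperating explorers finds a payoff.

    Each round every explorer tries one plan; explorers share results, so the
    group stops as soon as anyone succeeds.
    """
    rounds = 0
    while True:
        rounds += 1
        if any(rng.random() < success_prob for _ in range(num_explorers)):
            return rounds

def average_rounds(num_explorers: int, success_prob: float, trials: int = 1000) -> float:
    rng = random.Random(0)
    return sum(rounds_to_payoff(num_explorers, success_prob, rng) for _ in range(trials)) / trials

if __name__ == "__main__":
    p = 1e-3  # pretend 1 plan in 1,000 pays off; real action spaces are far sparser
    for k in (1, 10, 100):
        print(f"{k:>3} explorer(s): ~{average_rounds(k, p):.1f} rounds to payoff")
    # Prints roughly 1000, 100, and 10: time-to-payoff shrinks about linearly
    # with the number of explorers who pool what they learn.
```

This only captures the dumbest form of cooperation (pooling independent random draws); explorers who also tell each other which regions are dead ends would do better still.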
I’ll stop here, except for a final note that cooperation is not just coordination (dictatorships are coordinated but have very few of the benefits I mention above). I also wrote more about the Moloch-lens and what it does to people in this comment. Hope this helps!
I don’t think it is “so bad”, mostly because my definition of rationality is wider than most people’s. I’m pretty sure system 1 is a form of knowledge and reasoning as well, and one that we ignore to our detriment. Communicating honestly and effectively is what I try to go for.
As far as I can tell, we are currently living in a glorious space opera future… for eukaryotes. From their early origins in a hydrothermal vent the eukaryotes have spread across the galaxy (earth), forming civilisations of unbelievable complexity and titanic power (animals and humans). Megastructures (houses) and gigastructures (cities) now exist where entire orders of eukaryotes live out their lives in peace, never feeling the threat of drought or lack of ATP. Generation ships called “planes”, “trains”, and “boats” carry generations from gigastructure to gigastructure at timescales that dwarf eukaryote understanding. Massive galactic empires make peace and war, destroying trillions with weapons of unfathomable power, while science gets ever closer to understanding the fundamental origins of life and the means by which it operates. Recently the empires have even launched expeditions beyond the known universe, seeking to harvest ever more gigantic sources of power to fuel glorious eukaryote replication throughout the multiverse. To an average eukaryote these wonders are beyond their comprehension, but in a very real way they are still making it happen.
To be clear, I agree with you, but I suspect that to a certain kind of mind pursuing d/acc and satisficing and governance puts you in the realm of the luddites and the social conservatives. “There is no good or evil, there is only power/optimisation and those too weak to take it.”
I happen to believe there is another way, but Moloch provides for his own.
Thanks for the clarity; it is helpful. Although of course a Turing machine simulating unbounded concurrent Turing machines would be slower than the machines running independently.
A weird thing I have been thinking about recently is the idea that “a computer can only do one thing at once at the top layer of abstraction”, where “one thing” is loosely “run one program, one algorithm, one task/optimisation loop, etc.” The basis of this idea is as follows:
Suppose you had a Turing machine. For it to run two programs simultaneously (rather than just one followed by the other), it would need two heads. Furthermore, the heads must have separate state and transition tables. At which point you actually have two Turing machines reading and writing to the same tape. You can tell because, for them to share information with each other (suppose one of them is running a database and the other a webpage that displays information from the database), they actually have to write to the tape, creating a sequential lock between sending and receiving. This is not a problem a single Turing machine would ever have.
By this logic, of course, a laptop/phone is actually many computers running at once, which is pretty accurate given my understanding of multithreading issues. And even then it actually is quite close to “one thing” at the topmost level—there’s usually one active app or screen (it’s possible this is more to deal with the limitations of human users than the device itself). Same with humans: we can multitask, but only by task switching.
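Here is a minimal sketch of the shared-tape point, using Python threads rather than actual Turing machine heads (the database/webpage names and the timing are invented for the example): each “machine” runs freely on its own, but the moment they communicate through the shared tape they have to take turns.

```python
# Two "machines" sharing one tape: private work is concurrent, but any
# communication through the tape is forced into a sequence by the lock.
import threading
import time

tape = {}                     # the shared tape both machines read and write
tape_lock = threading.Lock()  # sending and receiving must take turns here

def database() -> None:
    """Writes fresh values onto the shared tape."""
    for i in range(5):
        with tape_lock:       # no one can read while we write
            tape["latest"] = i
        time.sleep(0.01)      # private work needs no coordination

def webpage() -> None:
    """Reads whatever the database last wrote and 'displays' it."""
    for _ in range(5):
        with tape_lock:       # no one can write while we read
            print("webpage sees:", tape.get("latest"))
        time.sleep(0.01)

if __name__ == "__main__":
    machines = [threading.Thread(target=database), threading.Thread(target=webpage)]
    for m in machines:
        m.start()
    for m in machines:
        m.join()
```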
No worries, thanks for elaborating
I would then like to know which is which (a DM is okay if you feel that would be somewhat controversial; it’s also alright if you want to keep your opinions to yourself)
the biggest issue with most capacity-building work is that it actively undermines the things that other capacity-building work has been highly successful at, so that ultimately some capacity building work is predictably extremely good for the world, and some is predictably extremely bad for the world
Wait what does this mean? Is there some kind of dichotomy I’m not aware of?
Some additional hurdles: “I think your ontology is not well adapted for this issue” sounds a lot like “I think you are wrong”, and possibly also “I think you are stupid”. Ontologies are tied into value sets very deeply, and so attempts to excavate the assumptions behind ontologies often resemble socratic interrogations. The result (when done without sufficient emotional openness and kindness) is a deeply uncomfortable experience that feels like someone trying to metaphysically trip you up and then disassemble you.
Ted Chiang, not Greg Egan.
For what it’s worth, I think people on LW generally overvalue “this feels like a Greg Egan story” and undervalue “this feels like a Ted Chiang story” when discussing alignment. Good outcomes cannot be effectively formalised without a lot of sensitivity and nuance and care taken to examine your own emotional/social/historical background.
Don’t know much about nonparametric statistics. But at least with unsupervised learning I think the clear rejoinder is “the choice was devolved to the data and algorithm level”. In other words, you made the choice of framework when you as a scientist decided some data was worth training on and some wasn’t, and that the algorithm of your choice was the correct one for that dataset. And your choice of “what is worth training on” can be basically “whatever we have/whatever performs best” (as in the case of just dumping tons and tons of video into unsupervised visual feature learning)! But then that constitutes an argument that you think that “whatever we have/whatever performs best” is enough to inform a general reasoning system or a “fully objective analytical framework”. And just because you (the scientist) didn’t actually consciously reason through what choices you were making for each piece of data or each architectural feature does not mean that a choice was not being made.
“Love is the most powerful force in the world.”
As a kind of baseline, possibly at every new model release you could ask an agent to “design, execute, and write up a novel AI safety experiment, with the intention that it is publishable work for LessWrong as a first-time poster”. That gives you some baseline as to how much effort people have/have not put in?
imagine an alternate treaty that said when a human plants a flag, takes a photo, and brings back a regolith sample to a nation’s capitol, that nation gets the territory for 100km around the flag
So what I’m hearing is that it’s time to play high stakes interstellar capture the flag
I have a little stored thought which sometimes triggers, and it reads:
“If you find yourself being forced to choose between two or more extremely bad options that involve burning your values, your resources, or your life, the truth is that you lost around three moves ago and are living out the equivalent of a forced mate in chess. You’ve already lost, so stop playing and find a better game to spend time on if at all possible.”
Now is as good a time as any to describe my model for a solution to the phenomenon that you describe. It seems that we’re being “attacked” by a rogue attractor state (what you and Land call Pythia). It can be roughly described as “a sequence of arguments that, once internalised, make certain beliefs or actions seem like the only reasonable ones.” The arguments consist of a sequence of frames around ideas like optimisation, power-seeking, instrumental convergence, and machine intelligence. The actions and beliefs they incentivise include the following: that capabilities racing is the only thing we can do, that fear/awe of superintelligence is natural, that ASI emergence is inevitable, that it is reasonable to sacrifice other values for the sake of having a stake in ASI development, that the importance of AI is total and all-encompassing, etc.
I would analogise this “package” to a particular solution to a system of linear equations, since it is in fact a compact solution to a lot of questions one might rightfully ask about current society. Anyone who has internalised the payload and asks questions like “how do we guard against x-risk? how do we cure cancer? how do we live forever? how do we solve our massive coordination issues? how do we stop the inevitable rise of [bad people of your choice go here]?” is supplied a kind of universal answer whose minimal form is something like “to [do the thing we want], we must solve intelligence, and then use intelligence to solve everything else.”
The hitch is that at this point some people get scared about unleashing uncontrollable runaway intelligence optimisation on the world. So they start talking about, for example, AI safety. Except the payload is still active for most of them, so their thoughts end up being shaped like “to [ensure that the world is safe from powerful AI], we must… solve intelligence, and then use intelligence to solve everything else.” Which leads to such conclusions as “to save the world from the racing AI labs, I must start an AI lab to join the race.” All the safety-focused justification is then backfilled in, which is why changing the RSP is fine but stopping racing is not.
I actually wrote about an early version of this weird phenomenon here, but now I think I understand it better. If you are given a really good cognitive hammer, you have a really strong incentive to cast everything into the nail category. If you are given a reality warping cognitive black hole shaped like a hammer, the casting is no longer voluntary or even fully conscious.
To be clear, I don’t consider the discovery and propagation of this solution package inevitable. It may be that Land et al. truly succeeded in summoning a demon from a bad future whose presence is a hyperstitional curse. But at the same time, the package is here now. What I want to do is find another solution to that set of linear equations. My current idea is that Moloch and Elua are in fact two sides of the same coin. Both are descriptions of evolutionary dynamics, but with different hyperparameters on the relative strength of replication, selection, and variation. Moloch is what happens when selection and replication overpower variation, and Elua is the other way around. Pythia is just a special case of Moloch with regard to developing AI—which suggests that there is some other way to develop AI that isn’t mired in domination/optimisation/racing/total annihilation of the not-good (which, as any guru will tell you, also involves total annihilation of yourself, and likely the good that you sought to protect). Operationalising this idea will require a lot of theory work and hard thinking; if anyone’s interested, give me a shout.
this is important because it opens up some positive sum trades: capabilities people really hate it when their release dates get set back by e.g safety reviews (which is unfortunate), but they mind a lot less if you’re taking 5% of compute (which is a huge amount of absolute compute, and you should value a lot!), and they mind even less if you have an entire team doing good work somewhere out of their way (which you should also value a lot!).
If the safety people never actually slow down or materially dent the model release schedule (which in your model is what triggers conflict), what you are describing is not a sequence of positive-sum trades between the capabilities and safety teams. Rather, it’s a model of how companies can devote relatively small amounts of resources to throw up a safety-coated PR shield around their core capabilities research effort. IRL we’ve also seen safety teams get dissolved after staking claims to funding that could conceivably dent the release schedule (the 20% for superalignment being a good example).
This is very timely research and overcomes a lot of my past objections to scheming detection research. Great find, although it’s pretty grim news to see in the wild.