While I like a lot of Hanson’s grabby alien model, I do not buy the inference that since humans appeared early in cosmological history, that implies that the cosmic commons are taken quickly and so a lower bound on how often grabby aliens appear. I think that is neglecting the possibility that the early universe is inherently more conducive to creating life, so most life is created early, but these lifeforms may be very far apart.
Eliezer is very explicit and repeats many times in that essay, including in the very segment you quote, that his code of meta-honesty does in fact compel you to never lie in a meta-honesty discussion. The first 4 paragraphs of your comment are not elaborating with what Eliezer really meant, they are disagreeing with him. Reasonable disagreements too, in my opinion, but conflating them with Eliezer’s proposal is corrosive to the norms that allows people to propose and test new norms.
I had trouble making the connection between the first two paragraphs and the rest. Are you introducing what you mean by an “alarm” and then giving a specific proposal for an alarm afterwards? Is there significance in how the example alarms are in response to specific words being misleading?
Writing suggestion: Expand the acronym “ELK” early in the piece. I looked at the title and my first question was what ELK is, I quickly skimmed the piece and wasn’t able to find out until I clicked on the link to the ELK document. I now see it’s also expanded in the tag list, which I normally don’t examine. I haven’t read the article more closely than a skim.
On further thought I want to walk back a bit:
I confess my comment was motivated by seeing something where it looked like I could make a quick “gotcha” point, which is a bad way to converse.
Reading the original comment more carefully, I’m seeing how I disagree with it. It says (emphasis mine)
in practice the problems of infinite ethics are more likely to be solved at the level of maths, as opposed on the level of ethics and thinking about what this means for actual decisions.
I highly doubt this problem will be solved purely on the level of math, and expect it will involve more work on the level of ethics than on the level of foundations of mathematics. However, I think taking an overly realist view on the conventions mathematicians have chosen for dealing with infinities is an impediment to thinking about these issues, and studying alternative foundations is helpful to ward against that. The problems of infinite ethics, especially for uncountable infinities, seem to especially rely on such realism. I do expect a solution to such issues, to the extent it is mathematical at all, could be formalized in ZFC. The central thing I liked about the comment is the call to rethink the relationship of math and mathematical infinity to reality, and that doesn’t necessary require changing our foundations, just changing our attitude towards them.
If the only alternative you can conceive of for ZFC is removing the axiom of choice then you are proving Jan_Kulveit’s point.
I was reading the story for the first quotation entitled “The discovery of x-risk from AGI”, and I noticed something around quotation that doesn’t make sense to me and I’m curious if anyone can tell what Eliezer Yudkowsky was thinking. As referenced in a previous version of this post, after the quoted scene highest Keeper commits suicide. Discussing the impact of this, EY writes,
And in dath ilan you would not set up an incentive where a leader needed to commit true suicide and destroy her own brain in order to get her political proposal taken seriously. That would be trading off a sacred thing against an unsacred thing. It would mean that only true-suicidal people became leaders. It would be terrible terrible system design.So if anybody did deliberately destroy their own brain in attempt to increase their credibility—then obviously, the only sensible response would be to ignore that, so as not create hideous system incentives. Any sensible person would reason out that sensible response, expect it, and not try the true-suicide tactic.
And in dath ilan you would not set up an incentive where a leader needed to commit true suicide and destroy her own brain in order to get her political proposal taken seriously. That would be trading off a sacred thing against an unsacred thing. It would mean that only true-suicidal people became leaders. It would be terrible terrible system design.
So if anybody did deliberately destroy their own brain in attempt to increase their credibility—then obviously, the only sensible response would be to ignore that, so as not create hideous system incentives. Any sensible person would reason out that sensible response, expect it, and not try the true-suicide tactic.
The second paragraph is clearly a reference to acausal decision theory, people making a decision because how they anticipate others react to expecting that this is how they make the decision rather than the direct consequences of the decision. I’m not sure if it really makes sense, a self-indulgent reminder that nobody has knows any systematic method for producing prescriptions from acausal decision theories in cases where purportedly they differs from causal decision theory in everyday life. Still, it’s fiction, I can suspend my disbelief.
The confusing thing is that in the story the actual result of the suicide is exactly what this passage says shouldn’t be the result. It convinces the Representatives to take the proposal more seriously and implement it. This passage is just used to illustrate how shocking the suicide was, no additional considerations are described why for the reasoning is incorrect in those circumstances. So it looks like the Representatives are explicitly violating the Algorithm which supposedly underlies the entire dath ilan civilization and is taught to every child at least in broad strokes, in spite of being the second-highest ranked governing body of dath ilan.
Really all I need is that a strategy that takes n bits to specify will be performed by 1 in ∼2n of all random strategies. Maybe a random strategy consists of a bunch of random motions that cancel each other out, and in 1 in ∼2n of strategies in between these random motions are directed actions that add up to performing this n-bit strategy. Maybe 1 in ∼2n strategies start off by typing this strategy to another computer and end with shutting yourself off, so that in the remaining bits of the strategy will be ignored. A prefix-free encoding is basically like the latter situation except ignoring the bits after a certain point is built into the encoding rather than being an outcome of the agent’s interaction with the environment.
How do you make spoiler tags?
A neat thought experiment! At the end of it all, you no longer need to exchange fruit, you can just keep the fruit in place and exchange the identity of the people instead.
Thanks too for responding. I hope our conversation will be productive.
A crucial notion that plays into many of your objections is the distinction between “inner intelligence” and “outer intelligence” of an object (terms derived from “inner vs. outer optimizer”). Inner intelligence is the intelligence the object has in itself as an agent, determined through its behavior in response to novel situation, and outer intelligence is the intelligence that it requires to create this object, and is determined through the ingenuity of its design. I understand your “AI hypothesis” to mean that any solution to the control problem must have inner intelligence. My response is claiming that while solving the control problem may require a lot of outer intelligence, I think it only a requires a small amount of inner intelligence. This is because it seems like the environment in Conway’s Game of Life with random dense initial conditions is very low variety and requires a small number of strategies to handle. (Although just as I’m open-minded about intelligent life somehow arising in this environment, it’s possible that there are patterns much frequent than abiogenesis that make the environment much more variegated.)
Matter and energy and also approximately homogeneously distributed in our own physical universe, yet building a small device that expands its influence over time and eventually rearranges the cosmos into a non-trivial pattern would seem to require something like an AI.
The universe is only homogeneous at the largest scales, at smaller scales it is highly inhomogeneities in highly diverse ways like stars and planets and raindrops. The value of our intelligence comes from being able to deal with the extreme diversity of intermediate-scale structures. Meanwhile, at the computationally tractable scale in CGOL, dense random initial conditions do not produce intermediate-scale structures between the random small-scale sparks and ashes and the homgeneous large-scale. That said, conditional on life being rare in the universe, I expect that the control problem for our universe requires lower-than-human inner intelligence.
You mention the difficulty of “building a small device that...”, but that is talking about outer intelligence. Your AI hypothesis states that, however such a device can or cannot be built, the device itself must be an AI. That’s where I disagree.
Now it could actually be that in our own physical universe it is also possible to build not-very-intelligent machines that begin small but eventually rearrange the cosmos. In this case I am personally more interested in the nature of these machines than in “intelligent machines”, because the reason I am interested in intelligence in the first place is due to its capacity to influence the future in a directed way, and if there are simpler avenues to influence in the future in a directed way then I’d rather spend my energy investigating those avenues than investigating AI. But I don’t think it’s possible to influence the future in a directed way in our own physical universe without being intelligent.
Again, the distinction between inner and outer intelligence is crucial. In a pure mathematical sense of existence there exist arrangements of matter that solve the control problem for our universe, but for that to be relevant for our future there has also has to be a natural process that creates these arrangements of matter at a non-negligible rate. If the arrangement requires a high outer intelligence then this process must be intelligent. (For this discussion, I’m considering natural selection to be a form of intelligent design.) So intelligence is still highly relevant for influencing the future. Machines that are mathematically possible cannot practically be created are not “simpler avenues to influence in the future”.
“to solve the control problem in an environment full of intelligence only requires marginally more intelligence at best”
What do you mean by this?
Sorry. I meant that the solution to the control problem need only be marginally more intelligent than the intelligent beings in its environment. The difference in intelligence between a controller in an intelligent environment and a controller in a unintelligent environment may be substantial. I realize the phrasing you quote is unclear.
In chess, one player can systematically beat another if the first is ~300 ELO rating points higher, but I’m considering that as a marginal difference in skill on the scale from zero-strategy to perfect play. If our environment is creating the equivalent of a 2000 ELO intelligence, and the solution to the control problem has 2300 ELO, then the specification of the environment contributed 2000 ELO of intelligence, and the specification of the control problem only contributed an extra 300 ELO. In other words, open-world control problems need not be an efficient way of specifying intelligence.
But if one entity reliably outcompetes another entity, then on what basis do you say that this other entity is the more intelligent one?
On the basis of distinguishing narrow intelligence from general intelligence. A solution to the control problem is guaranteed to outcompete other entities in force or manipulation, but it might be worse at most other tasks. The sort of thing I had in mind for “NP-hard problems in military strategy” would be “this particular pattern of gliders is particularly good at penetrating a defensive barrier, and the only way to find this pattern is through a brute force search”. Knowing this can the controller a decisive advantage at military conflicts without making it any better at any other tasks, and can permit the controller to have lower general intelligence while still dominating.
Thanks. I also found an invite link in a recent reddit post about this discussion (was that by you?).
While I appreciate the analogy between our real universe and simpler physics-like mathematical models like the game of life, assuming intelligence doesn’t arise elsewhere in your configuration, this control problem does not seem substantially different or more AI-like from any other engineering problems. After all, there are plenty of other problems that involve leveraging a narrow form of control on a predicable physical system to achieve a more refined control, ex. building a rocket that hits a specific target. The structure that arises from a randomly initialized pattern in Life should be homogeneous in a statistical sense a so highly predictable. I expect almost all of it should stabilize to debris of stable periodic patterns. It’s not clear whether it’s possible to manipulate or clear the debris in controlled ways, but if it is possible, then a single strategy will work for the entire grid. It may take a great deal of intelligence to come up with such a strategy, but once such a strategy is found it can be hard-coded into the initial Life pattern, without any need for an “inner optimizer”. The easiest-to-design solution may involve computer-like patterns, with the pattern keeping track of state involved in debris-clearing and each part tracking its location to determine its role in making the final smiley pattern, but I don’t see any need for any AI-like patterns beyond that. On the other hand, if there are inherent limits in the ability to manipulate debris then no amount of reflection by our starting pattern is going to fix that.
That is assuming intelligence doesn’t arise in the random starting pattern. If it does, our starting configuration would to overpower every other intelligence that arises and tries to control the space, and this would reasonably require it to be intelligent itself. But if this is the case then the evolution of the random pattern already encodes the concept of intelligence in a much simpler way then this control problem. To predict the structures that would arise from a random initial configuration the idea of intelligence would naturalistic come up. Meanwhile, to solve the control problem in an environment full of intelligence only requires marginally more intelligence at best, and compared to the no-control prediction problem the control problem adds off some complexity for not very much increase in intelligence. Indeed, the solution to the control problem may even be less intelligent than the structures it competes against, and make up for that with hard-coded solutions to NP-hard problems in military strategy.
On a different note, I’m flattered to see a reference in the comments to some of my own thoughts on working through debris in the Game of Life. It was surprising to see interest in that resurge, and especially surprising to see that interest come from people in AI alignment.
Thanks for linking to my post! I checked the other link, on Discord, and for some reason it’s not working.
Do you know of any source that gives the same explanations in text instead of video?
Edit: Never mind, the course has links to “Lecture PDF” that seem to summarize them. For the first lecture the summary is undetailed and I couldn’t make sense of it without watching the videos, but they appear to get more detailed later on.
I don’t like the fact that the preview doesn’t disappear when I stop hovering. I find the preview visually jarring enough that I would prefer to spend most of my reading time without a spurious preview window. At the very least, there should be a way to manually close the preview. Otherwise I would want to avoid hovering over any links and to refresh when I do, which is a bad reading experience.
My main point of disagreement is the way you characterize these judgements as feelings. With minor quibbles I agree with your paragraph after substituting “it feels” with “I think”. In your article you distinguish between abstract intellectual understanding which may believe that there is no self in some sense and some sort of lower-level perception of the self which has a much harder time accepting this; I don’t follow what you’re pointing to in the latter.
To be clear, I do acknowledge to experience mental phenomena that are about myself in some sense, such as a proprioceptive distinction between my body and other objects in my mental spatial model, an introspective ability to track my thoughts and feelings, and a sense of the role I play in my community that I am expected to adhere to. However, the form of these pieces of mental content is wildly different, and it is only through an abstract mental categorization that I recognize them as all about the same thing. Moreover, I believe these senses are imperfect but broadly accurate, so I don’t know what it is that you’re saying is an illusion.
Crossposted on my blog:
Lightspeed delays lead to multiple technological singularities.
By Yudkowsky’s classification, I’m assuming the Accelerating Change Singularity: As technology gets better, the characteristic timescale at which technological progress is made becomes shorter, so that the time until this reaches physical limits is short from the perspective of our timescale. At a short enough timescale the lightspeed limit becomes important: When information cannot traverse the diameter of civilization in the time until singularity further progress must be made independently in different regions. The subjective time from then may still be large, and without communication the different regions can develop different interest and, after their singularities, compete. As the characteristic timescale becomes shorter the independent regions split further.