Thanks for writing this up. Do you think massage would materially help with this type of issue?
I’ve been able to help a few people (including myself) with chronic neck/shoulder pain by getting people to utilize their rhomboids rather than their trapezius for the purpose of holding their shoulders back. The rhomboids have a significant mechanical advantage for that purpose. Most people can’t even intentionally activate their rhomboids; they have no kinesthetic awareness of even possessing them. Wondered if you had a response to this, within the framework of the “main muscles of movement”.
My examples of subagents appearing to mysteriously answer questions were meant to suggest that there are subtle things that IFS explains/predicts, which aren’t automatically explained in other models. Examples of phenomena that contradict the IFS model would be even more useful, though I’m failing to think of what those would look like.
I’m still not sure what it would mean for humans to actually have subagents, versus to just behave exactly as if they have subagents. I don’t know what empirical finding would distinguish between those two theories.
There are some interesting things that crop up during IFS sessions that I think require explanation.
For example, I find it surprising that you can ask the Part a verbal question, and that Part will answer in English, and the answer it gives can often be startling, and true. The whole process feels qualitatively different from just “asking yourself” that same question. It also feels qualitatively different from constructing fictional characters and asking them questions.
I also find that taking an IFS approach, in contrast to a pure Focusing approach, results in much more dramatic and noticeable internal/emotional shifts. The IFS framework is accessing internal levers that Focusing alone isn’t.
One thing I wanted to show with my toy model, but didn’t really succeed at, was that arranging an agent architecture where certain functions belong to the “subagents” rather than the “agent” can be more elegant or parsimonious or strictly simpler. Philosophically, I would have preferred to write the code without using any for loops, because I’m pretty sure human brains never do anything that looks like a for loop. Rather, all of the subagents are running constantly, in parallel, and doing something more like message-passing according to their individual needs. The “agent” doesn’t check each subagent, sequentially, for its state; the subagents proactively inject their states into the global workspace when a certain threshold is met. This is almost certainly how the brain works, regardless of whether you wish to use the word “subagent” or “neural submodule” or what exactly. In this light, at least algorithmically, it would seem that the submodules do qualify as agents, in most senses of the word.
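To make the message-passing idea concrete, here is a minimal sketch. It is not the toy model itself. The names (subagent, agent, THRESHOLD, the three drives and their rates) are all made up for illustration, and asyncio only interleaves the subagents cooperatively rather than running them truly in parallel. The point is just that the agent never polls anyone; it only reacts to whatever lands in the shared workspace.

```python
import asyncio

THRESHOLD = 1.0  # assumed firing threshold, shared by every subagent


async def subagent(name, rate, workspace, ticks):
    """Hypothetical subagent: builds up urgency on its own and
    posts to the workspace only when it crosses THRESHOLD."""
    urgency = 0.0
    while ticks > 0:
        urgency += rate
        if urgency >= THRESHOLD:
            await workspace.put((name, urgency))  # proactive injection
            urgency = 0.0  # the need has been voiced, so it resets
        ticks -= 1
        await asyncio.sleep(0)  # yield so the other subagents can run


async def agent(workspace, log):
    """The agent never checks subagents; it handles what shows up."""
    while True:
        name, _urgency = await workspace.get()
        log.append(name)
        workspace.task_done()


async def run(rates, ticks=10):
    workspace = asyncio.Queue()  # stand-in for the global workspace
    log = []
    listener = asyncio.create_task(agent(workspace, log))
    await asyncio.gather(
        *(subagent(name, rate, workspace, ticks) for name, rate in rates.items())
    )
    await workspace.join()  # wait until every message has been handled
    listener.cancel()
    return log


log = asyncio.run(run({"hunger": 0.5, "fear": 0.25, "boredom": 0.05}))
print(log)
```

With these made-up rates, hunger reaches the workspace most often, fear less often, and boredom never crosses the threshold within ten ticks, so the agent never hears from it at all.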
Unfortunately there are many prominent examples of Enlightened/Awakened/Integrated individuals who act like destructive fools and ruin their lives and reputations, often through patterns of abusive behavior. When this happens over and over, I don’t think it can be written off as “oh those people weren’t actually Enlightened.” Rather, I think there’s something in the bootstrapping dynamics of tinkering with your own psyche that predictably (sometimes) leads in this direction.
My own informed guess as to how this happens is something like this: imagine your worst impulse arising, and imagine that you’ve been so careful to take every part of yourself seriously that you take that impulse seriously rather than automatically swatting it away with the usual superegoic separate shard of self; imagine that your normal visceral aversion to following through on that terrible impulse is totally neutralized, toothless. Perhaps you see the impulse arise and you understand intellectually that it’s Bad but somehow its Badness is no longer compelling to you. I don’t know. I’m just putting together the pieces of what certain human disasters have said.
Anyway, I don’t actually think you’re wrong to think integration is an important goal. The problem is that integration is mostly neutral. You can integrate in directions that are holistically bad for you and those around you, maybe even worse than if you never attempted it in the first place.
The podcasting network that I own and co-run has hit some major internal milestones recently. It’s extremely gratifying to see four years of work begin to pay off. I’m continually amazed at the progress we’ve made, and proud of the community we’ve built.
Regarding the comment about Christiano, I was just referring to your quote in the last paragraph, and it seems like I misunderstood the context. Whoops.
Regarding the idea of a singleton, I mainly remember the arguments from Bostrom’s Superintelligence book and can’t quote directly. He summarizes some of the arguments here.
You made a lot of points, so I’ll be relatively brief in addressing each of them. (Taking at face value your assertion that your main goal is to start a discussion.)
1. It’s interesting to consider what it would mean for an Oracle AI to be good enough to answer extremely technical questions requiring reasoning about not-yet-invented technology, yet still “not powerful enough for our needs”. It seems like if we have something that we’re calling an Oracle AI in the first place, it’s already pretty good. In which case, it was getting to that point that was hard, not whatever comes next.
2. If you actually could make an Oracle that isn’t secretly an Agent, then sure, leveraging a True Oracle AI would help us figure out the general coordination problem, and any other problem. That seems to be glossing over the fact that building an Oracle that isn’t secretly an Agent isn’t actually something we know how to go about doing. Solving the “make-an-AI-that-is-actually-an-Oracle-and-not-secretly-an-Agent Problem” seems just as hard as all the other problems.
3. I … sure hope somebody is taking seriously the idea of a dictator AI running CEV, because I don’t see anything other than that as a stable (“final”) equilibrium. There are good arguments that a singleton is the only really stable outcome. All other circumstances will be transitory, on the way to that singleton. Even if we all get Neuralink implants tapping into our own private Oracles, how long does that status quo last? There is no reason for the answer to be “forever”, or even “an especially long time”, when the capabilities of an unconstrained Agent AI will essentially always surpass those of an Oracle-human synthesis.
4. If the Oracle isn’t allowed to do anything other than change pixels on the screen, then of course it will do nothing at all, because it needs to be able to change the voltages in its transistors, and the local EM field around the monitor, and the synaptic firings of the person reading the monitor as they react to the text … Bright lines are things that exist in the map, not the territory.
5. I’m emotionally sympathetic to the notion that we should be pursuing Oracle AI as an option because the notion of a genie is naturally simple and makes us feel empowered, relative to the other options. But I think the reason why e.g. Christiano dismisses Oracle AI is that it’s not a concept that really coheres beyond the level of verbal arguments. Start thinking about how to build the architecture of an Oracle at the level of algorithms and/or physics and the verbal arguments fall apart. At least, that’s what I’ve found, as somebody who originally really wanted this to work out.
To be clear, I didn’t mean to say that I think AGI should be evolved. The analogy to breeding was merely to point out that you can notice a basically correct trick for manipulating a complex system without being able to prove that the trick works a priori and without understanding the mechanism by which it works. You notice the regularity on the level of pure conceptual thought, something closer to philosophy than math. Then you prove it afterward. As far as I’m aware, this is indeed how most truly novel discoveries are made.
You’ve forced me to consider, though, that if you know all the math, you’re probably going to be much better and faster at spotting those hidden flaws. It may not take great mathematical knowledge to come up with a new and useful insight, but it may indeed require math knowledge to prove that the insight is correct, or to prove that it only applies in some specific cases, or to show that, hey, it wasn’t actually that great after all.
I’m going to burn some social capital on asking a stupid question, because it’s something that’s been bothering me for a long time. The question is, why do we think we know that it’s necessary to understand a lot of mathematics to productively engage in FAI research?
My first line of skepticism can perhaps be communicated with a simplified analogy: It’s 10,000 BC and two people are watching a handful of wild sheep grazing. The first person wonders out loud if it would be possible to somehow teach the sheep to be more docile.
The second person scoffs, and explains that they know everything there is to know about training animals, and it’s not in the disposition of sheep to be docile. They go on to elaborate all the known strategies for training dogs, and how none of them can really change the underlying temperament of the animal.
The first person has observed that certain personality traits seem to pass on from parent to child and from dog to puppy. In a flash of insight they conceive of the idea of intentional breeding.
They cannot powerfully articulate this insight at the level of genetics or breeding rules. They don’t even know for a fact that sheep can be bred to be more docile. But nonetheless, in a flash, in something like one second of cognitive experience they’ve gone from not-knowing to knowing this important secret.
End of analogy. The point being: it is obviously possible to have true insights without having the full descriptive apparatus needed to precisely articulate and/or prove the truth of the insight. In fact I have a suspicion that most true, important insight comes in the form of new understandings that are not well-expressed by existing paradigms, and eventually necessitate a new communication idiom to express the new insight. Einstein invented Einstein notation not just because it’s succinct, but because it visually rearranges the information to emphasize what’s actually important in the new concept he was communicating and working with.
So maybe my steelman of “why learn all this math” is something like “because it gives you the language that will help you construct/adapt the new language which will be required to express the breakthrough insight.” But that doesn’t actually seem like it would be important in being able to come up with that insight in the first place.
I will admit I feel a note of anxiety at the thought that people are looking at this list of “prerequisites” and thinking, wow, I’m never going to be useful in thinking about FAI. Thinking that because they don’t know what Cantor’s Diagonalization is and don’t have the time to learn it, their brainpower can’t be productively applied to the problem. Whereas, in contrast, I will be shocked if the key, breakthrough insight that makes FAI possible is something that requires understanding Cantor’s Diagonalization to grasp. In fact, I will be shocked if the key, breakthrough insight can’t be expressed almost completely in 2-5 sentences of jargon-free natural language.
I have spent a lot of words here trying to point at the reason for my uncertainty that “learn all of mathematics” is a prerequisite for FAI research, and my concerns with what I perceive to be the unproven assumption that the pathway to the solution necessarily lies in mastering all these existing techniques. It seems likely that there is an answer here that will make me feel dumb, but if there is, it’s not one that I’ve seen articulated clearly despite being around for a while.
Thanks for writing this up; it helps to read somebody else’s take on this interview.
My thought after listening to this talk is that it’s even worse (“worse” from an AI Risk perspective) than Hawkins implies because the brain relies on one or more weird kludges that we could probably easily improve upon once we figured out what those kludges are doing and why they work.
For example, let’s say we figured out that some particular portion of a brain structure or some aspect of a cortical column is doing what we recognize as Kalman filtering, uncertainty quantification, or even just correlation. Once we recognize that, we can potentially write our next AIs so that they just do that explicitly instead of needing to laboriously simulate those procedures using huge numbers of artificial neurons.
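To give a sense of how compact the explicit version can be, here is a sketch of a scalar Kalman filter for a slowly drifting hidden value. This is the textbook one-dimensional form, not a claim about what any brain structure actually computes. The function name and parameters (kalman_1d, process_var, measurement_var, x0, p0) are my own choices for illustration.

```python
def kalman_1d(measurements, process_var, measurement_var, x0=0.0, p0=1.0):
    """Scalar Kalman filter with a random-walk state model.

    x is the running estimate and p is its variance (how unsure we are).
    """
    x, p = x0, p0
    estimates = []
    for z in measurements:
        p = p + process_var            # predict: uncertainty grows over time
        k = p / (p + measurement_var)  # gain: how much to trust the new reading
        x = x + k * (z - x)            # update: move toward the measurement
        p = (1 - k) * p                # uncertainty shrinks after the update
        estimates.append(x)
    return estimates


# Noise-free readings of a constant value, starting from a wrong guess of 0.
estimates = kalman_1d([1.0, 1.0, 1.0, 1.0], process_var=0.0, measurement_var=1.0)
print(estimates)
```

The whole procedure is four lines of arithmetic per observation, which is the contrast I have in mind with simulating the same computation through a large number of artificial neurons.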
I have no idea what to make of this quote from Hawkins, which jumped out at me when I was listening and which you also pulled out:
“Real neurons in the brain are time-based prediction engines, and there’s no concept of this at all” in ANNs; “I don’t think you can build intelligence without them”.
We’ve had neural network architectures with a time component for many many years. It’s extremely common. We actually have very sophisticated versions of them that intrinsically incorporate concepts like short-term memory. I wonder if he somehow doesn’t know this, or if he just misspoke, or if I’m misunderstanding what he means.
Looks like all of the “games”-oriented predictions that were supposed to happen in the first 25 years have already happened within 3.
edit: Misread the charts. It’s more like the predictions within the first ~10 years have already been accomplished, plus or minus a few.
Perhaps tautology is a better word than sophistry. Of course turning usable energy into unusable forms is a fundamental feature of life; it’s a fundamental feature of everything to which the laws of thermodynamics apply. It’d be equally meaningless to say that using up useful energy is a fundamental property of stars, and that the purpose of stars is to waste energy. It’s just something that stars do, because of the way the universe is set up. It’s a descriptive observation. It’s only predictive insofar as you would predict that life will probably only continue to exist where there are energy gradients.
The part about wasting energy seems quite silly. The universe has a fixed amount of mass-energy, so presumably when he talks about wasting energy, what he means is taking advantage of energy gradients. Energy gradients will always and everywhere eventually wind down toward entropy on their own without help, so life isn’t even doing anything novel here. It’s not like the sun stops radiating out energy if life isn’t there to absorb photons.
The observation that life takes advantage of concentrated pockets of energy and thus this is the “purpose” of life is just sophistry. It deserves to be taken about as seriously as George Carlin’s joke that humans were created because Mother Nature wanted plastic and didn’t know how to make it.
To point one, if I feel an excitement and eagerness about the thing, and if I expect I would feel sad if the thing were suddenly taken away, then I can be pretty sure that it’s important to me. But (and this relates to point two) it’s hard to care about the same thing for weeks or months or years at a time with the same intensity. Some projects of mine have oscillated between providing deep meaning and being a major drag, depending on contingent factors. This might manifest as a sense of ugh arising around certain facets of the activity. Usually the ugh goes away eventually. Sometimes it doesn’t, and you either accept that the unpleasantness is part and parcel of the fun, or you decide it’s not worth it.
As far as I can tell, meaning is a feeling, something like a passive sense that you’re on the right track. The feeling is generated when you are working on something that you personally enjoy and care about, and when you are socializing sufficiently often with people you enjoy and care about. “Friends and hobbies are the meaning of life” is how I might phrase it.
Note that the activity that you spend your time on could be collecting all the stars in Mario64, as long as you actually care about completing the task. However, you tend to find it harder to care about things that don’t involve winning status or helping people, especially as you get older.
I think some people get themselves into psychological trouble by deciding that all of the things that they enjoy aren’t “important” and interacting with people they care about is a “distraction”. They paint themselves into a corner where the only thing they allow themselves to consider doing is something for which they feel no emotional attraction. They feel like they should enjoy it because they’ve decided it’s important, but they don’t, and then they feel guilty about that. The solution to this is to recognize the kind of animal you are and try to feed the needs that you have rather than the ones you wish you had.
I’m interested as well. As someone trying to grow the Denver rationality community, I want to be aware of failure modes.
The idea of AI alignment is based on the idea that there is a finite, stable set of data about a person, which could be used to predict one’s choices, and which is actually morally good. The reasoning behind this is that if it is not true, then learning is impossible, useless, or will not converge.
Is it true that these assumptions are required for AI alignment?
I don’t think it would be impossible to build an AI that is sufficiently aligned to know that, at pretty much any given moment, I don’t want to be spontaneously injured, or be accused of doing something that will reliably cause all my peers to hate me, or for a loved one to die. There’s quite a broad list of “easy” specific “alignment questions” that virtually 100% of humans will agree on in virtually 100% of circumstances. We could do worse than just building a partially-aligned AI that just makes sure we avoid fates worse than death, individually and collectively.
On the other hand, I agree completely that coupling the concepts of “AI alignment” and “optimization” seems pretty fraught. I’ve wondered if the “optimal” environment for the human animal might be a re-creation of the Pleistocene, except with, y’know, immortality, and carefully managed, exciting-but-not-harrowing levels of resource scarcity.
You may already know this, but almost all YouTube videos will have an automatically generated transcript. Click “...” to the bottom right of the video panel and click “Open transcript” on the pulldown. YouTube’s automatic speech transcription is very good.