To point one, if I feel an excitement and eagerness about the thing, and if I expect I would feel sad if the thing were suddenly taken away, then I can be pretty sure that it’s important to me. But — and this relates to point two — it’s hard to care about the same thing for weeks or months or years at a time with the same intensity. Some projects of mine have oscillated between providing deep meaning and being a major drag, depending on contingent factors. This might manifest as a sense of ugh arising around certain facets of the activity. Usually the ugh goes away eventually. Sometimes it doesn’t, and you either accept that the unpleasantness is part and parcel with the fun, or you decide it’s not worth it.
As far as I can tell, meaning is a feeling, something like a passive sense that you’re on the right track. The feeling is generated when you are working on something that you personally enjoy and care about, and when you are socializing sufficiently often with people you enjoy and care about. “Friends and hobbies are the meaning of life” is how I might phrase it.
Note that the activity that you spend your time on could be collecting all the stars in Mario64, as long as you actually care about completing the task. However, you tend to find it harder to care about things that don’t involve winning status or helping people, especially as you get older.
I think some people get themselves into psychological trouble by deciding that all of the things they enjoy aren’t “important” and that interacting with people they care about is a “distraction”. They paint themselves into a corner where the only thing they allow themselves to consider doing is something for which they feel no emotional attraction. They feel like they should enjoy it because they’ve decided it’s important, but they don’t, and then they feel guilty about that. The solution to this is to recognize the kind of animal you are and try to feed the needs you actually have rather than the ones you wish you had.
I’m interested as well. As someone trying to grow the Denver rationality community, I want to be aware of failure modes.
The idea of AI alignment rests on the assumption that there is a finite, stable set of data about a person, which could be used to predict their choices, and which is actually morally good. The reasoning behind these assumptions is that if they do not hold, then learning is impossible, useless, or will not converge.
Is it true that these assumptions are required for AI alignment?
I don’t think it would be impossible to build an AI that is sufficiently aligned to know that, at pretty much any given moment, I don’t want to be spontaneously injured, or be accused of doing something that will reliably cause all my peers to hate me, or for a loved one to die. There’s quite a broad list of “easy” specific “alignment questions” that virtually 100% of humans will agree on in virtually 100% of circumstances. We could do worse than just building a partially aligned AI that makes sure we avoid fates worse than death, individually and collectively.
On the other hand, I agree completely that coupling the concepts of “AI alignment” and “optimization” seems pretty fraught. I’ve wondered if the “optimal” environment for the human animal might be a re-creation of the Pleistocene, except with, y’know, immortality, and carefully managed, exciting-but-not-harrowing levels of resource scarcity.
You may already know this, but almost all YouTube videos have an automatically generated transcript. Click the “...” at the bottom right of the video panel and select “Open transcript” from the pulldown. YouTube’s automatic speech transcription is very good.
This exceeded my expectations. You kept it short and to the point, and the description of the technique was very clear. I look forward to more episodes.
Have you—or anyone, really—put much thought into the implications of these ideas for AI alignment?
If it’s true that modeling humans at the level of constitutive subagents renders a more accurate description of human behavior, then any true solution to the alignment problem will need to respect this internal incoherence in humans.
This is potentially a very positive development, I think, because it suggests that a human can be modeled as a collection of relatively simple subagent utility functions, which interact and compete in complex but predictable ways. This sounds closer to a gears-level portrayal of what is happening inside a human, in contrast to descriptions of humans as having a single convoluted and impossible-to-pin-down utility function.
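As a toy sketch of what I mean (entirely my own illustration; all the subagent names and numbers are made up), each subagent can have a trivially simple utility function, and the apparent complexity comes from how their context-dependent salience weights the competition:

```python
# Toy illustration, not a claim about actual cognition: a "person" modeled as a
# few simple subagent utility functions competing over one action choice.
actions = ["work on the paper", "call a friend", "scroll the internet"]

subagent_utilities = {
    "achiever":    {"work on the paper": 0.9, "call a friend": 0.2, "scroll the internet": 0.1},
    "socializer":  {"work on the paper": 0.1, "call a friend": 0.9, "scroll the internet": 0.3},
    "rest-seeker": {"work on the paper": 0.1, "call a friend": 0.3, "scroll the internet": 0.8},
}

# Context determines how loudly each subagent gets to "vote" right now.
salience = {"achiever": 0.5, "socializer": 0.3, "rest-seeker": 0.2}

def chosen_action():
    scores = {a: sum(salience[name] * utils[a] for name, utils in subagent_utilities.items())
              for a in actions}
    return max(scores, key=scores.get)

print(chosen_action())  # each piece is legible; the interaction does the interesting work
```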
I don’t know if you’re at all familiar with Mark Lippman’s Folding material and his ontology for mental phenomenology. My attempt to summarize his framework of mental phenomena is as follows: there are belief-like objects (expectations, tacit or explicit, complex or simple), goal-like objects (desirable states or settings or contexts), affordances (context-activated representations of the current potential action space) and intention-like objects (plans coordinating immediate felt intentions, via affordances, toward goal-states). All cognition is “generated” by the actions and interactions of these fundamental units, which I infer must be something like neurologically fundamental. Fish and maybe even worms probably have something like beliefs, goals, affordances and intentions. Ours are just bigger, more layered, more nested and more interconnected.
The reason I bring this up is that Folding was a bit of a kick in the head to my view on subagents. Instead of seeing subagents as fundamental, I now see them as expressions of latent goal-like and belief-like objects, with the brain implementing some kind of passive program that pursues goals and avoids anticipated suffering, even if you’re not aware you have these goals or these expectations. In other words, the sense of there being a subagent is your brain running a background program that activates and acts upon the implications of these more fundamental yet hidden goals and beliefs.
None of this is at all in contradiction to anything in your Sequence. It’s more like a slightly different framing, where a “Protector Subagent” is reduced to an expression of a belief-like object via a self-protective background process. It all adds up to the same thing, pretty much, but it might be more gears-level. Or maybe not.
Could you elaborate on how you’re using the word “symmetrical” here?
The best I can do after thinking about it for a bit is compute every possible combination of units under 200 supply, multiply that by the possible positions of those units in space, multiply that by the possible combinations of buildings on the map and their potential locations in space, multiply that by the possible combinations of upgrades, multiply that by the amount of resources in all available mineral/vespene sources … I can already spot a few oversimplifications in what I just wrote, and I can think of even more things that need to be accounted for. The shields/hitpoints/energy of every unit. Combinatorially gigantic.
Just the number of potential positions of a single unit on the map is already huge.
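To make “combinatorially gigantic” a bit more concrete, here is a back-of-envelope sketch; the map dimensions, per-tile granularity, and unit count are numbers I’m assuming purely for illustration:

```python
# Rough lower bound on the SC2 state space from unit positions alone.
# Assumed for illustration: a 128x128-tile map, ~100 distinguishable sub-positions
# per tile, and up to 200 units on the field. Everything else (unit types, buildings,
# upgrades, HP/shields/energy, resources) is ignored, so this badly undercounts.
import math

positions_per_unit = 128 * 128 * 100   # ~1.6 million placements for one unit
max_units = 200

# Treating unit placements as independent is itself an approximation,
# but it's fine for getting the order of magnitude:
log10_states = max_units * math.log10(positions_per_unit)
print(f"~10^{log10_states:.0f} configurations from positions alone")
# Go's state space is usually quoted around 10^170; this dwarfs it before we've
# counted anything except where the units are standing.
```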
But AlphaStar doesn’t really explore much of this space. It finds out pretty quickly that there’s really no reason to explore the parts of the space that include building random buildings in weird map locations. It explores and optimizes around the parts of the state space that look reasonably close to human play, because that was its starting point, and it’s not going to find superior strategies randomly, not without a lot of optimization in isolation.
That’s one thing I would love to see, actually: a version of the agent trained purely on self-play, without a basis in human replays. Does it ever discover proxy plays or other esoteric cheese without a starting point provided by the human replays?
Before now, it wasn’t immediately obvious that SC2 is a game that can be played superhumanly well without anything that looks like long-term planning or counterfactual reasoning. The way humans play it relies on a combination of past experience, narrow skills, and “what-if” mental simulation of the opponent. Building a superhuman SC2 agent out of nothing more than LSTM units indicates that you can completely do away with planning, even when the action space is very large, even when the state space is VERY large, even when the possibilities are combinatorially enormous. Yes, humans can get good at SC2 with much less than 200 years of time played (although those humans are usually studying the replays of other masters to bootstrap their understanding) but I think it’s worthwhile to focus on the inverse of this observation: that a sophisticated problem domain which looks like it ought to require planning and model-based counterfactual reasoning actually requires no such thing. What other problem domains seem like they ought to require planning and counterfactual reasoning, but can probably be conquered with nothing more advanced than a deep LSTM network?
(I haven’t seen anyone bother to compute an estimate of the size of the state-space of SC2 relative to, for example, Go or Chess, and I’m not sure if there’s even a coherent way to go about it.)
The freedom to speculate wildly is what makes this topic so fun.
My mental model would say that you have a particular pattern-recognition module that classifies objects as “chair”, along with a weight for how well the current instance matches the category. An object can be a prototypical perfect Platonic chair, or an almost-chair, or maybe a chair if you flip it over, or not a chair at all.
When you look at a chair, this pattern-recognition module immediately classifies it, and then brings online another module, which makes available all the relevant physical affordances and the linguistic and logical implications of a chair being present in your environment. Recognizing something as a chair feels identical to recognizing something as a thing-in-which-I-can-sit. Similarly, you don’t have to puzzle out the implications of a tiger walking into the room right now. The fear response will coincide with the recognition of the tiger.
When you try to introspect on chairness, what you’re doing is tossing imagined sense percepts at yourself and observing the responses of the chairness-detecting module. This allows you to generate an abstract representation of your own chairness classifier. But this abstract representation is absolutely not the same thing as the chairness classifier, any more than your abstract cogitation about what the “+” operator does is the same thing as the mental operation of adding two numbers together.
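A toy way to picture the distinction (my own illustration; nothing here is meant as a model of real neural machinery): the classifier is an opaque function you can only query, and the abstract representation is a separate, lossy summary you build by probing it with imagined examples:

```python
# Toy illustration: an opaque "chairness" module vs. the abstract model you
# build of it by introspection. The feature names and weights are invented.
def chairness(percept: dict) -> float:
    # Stand-in for the pattern-recognition module: graded match from 0 to 1.
    score = 0.0
    score += 0.5 if percept.get("can_sit_on") else 0.0
    score += 0.3 if percept.get("has_legs") else 0.0
    score += 0.2 if percept.get("has_back") else 0.0
    return score

# "Introspection": tossing imagined sense percepts at the module and recording responses.
imagined = {
    "office chair":        {"can_sit_on": True,  "has_legs": True,  "has_back": True},
    "stool":               {"can_sit_on": True,  "has_legs": True,  "has_back": False},
    "fallen log":          {"can_sit_on": True,  "has_legs": False, "has_back": False},
    "painting of a chair": {"can_sit_on": False, "has_legs": False, "has_back": False},
}
abstract_model = {name: chairness(p) for name, p in imagined.items()}
print(abstract_model)  # a summary of the classifier's behavior, not the classifier itself
```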
I think a lot of confusion about the nature of human thinking stems from the inability to internally distinguish between the abstracted symbol for a mental phenomenon and the mental phenomenon itself. This dovetails with IFS in an interesting way, in that it can be difficult to distinguish between thinking about a particular Part in the abstract, and actually getting into contact with that Part in a way that causes it to shift.
I’m not sure why you say that the unconscious modules communicating with each other would necessarily contradict the idea of us being conscious of exactly the stuff that’s in the workspace, but I tend to agree that considering the contents of our consciousness and the contents of the workspace to be strictly isomorphic seems to be too strong.
I may be simply misunderstanding something. My sense is that when you open the fridge to get a yogurt and your brain shouts “HOW DID CYPHER GET INTO THE MATRIX TO MEET SMITH WITHOUT SOMEONE TO HELP HIM PLUG IN?”, this is a kind of thought that arises from some process meticulously checking your epistemic state for logical inconsistencies (rather esoteric and complex ones), and it seems to come from nowhere. Doesn’t this imply that some submodules of your brain are thinking abstractly and logically about The Matrix completely outside of your conscious awareness? If so, then this either implies that the subconscious processing of individual submodules can be very complex and abstract without needing to share information with other submodules, or that information sharing between submodules can occur without you being consciously aware of it.
A third possibility would be that you were actually consciously thinking about The Matrix in a kind of inattentive, distracted way, and it only seems like the thought came out of nowhere. This would be far from the most shocking example of the brain simply lying to you about its operations.
The most obvious example of this kind of thing is the “flash of insight” that we all experience from time to time, where a complex, multi-part solution to a problem intrudes on our awareness as if from nowhere. This seems to be a clear case of the unconscious working on this problem in the background, identifying its solution as a valid one still in the background, and injecting the fully-formed idea into awareness with high salience.
It’s a bit like the phenomenon of being able to pick out your own name from a babble of crowded conversation, except applied to the unconscious activity of the mind. This, however, implies that much complex inter-agent communication and abstract problem solving is happening subconsciously. And this seems to contradict the view that only very simple conceptual packages are passed through to the Global Workspace, and that we must necessarily be conscious of our own abstract problem solving.
My own perceptions during meditation (and during normal life) would suggest that the subconscious/unconscious is doing very complex and abstract “thinking” without my being aware of its workings, and intermittently injecting bits and pieces of its ruminations into awareness based on something like an expectation that the gestalt self might want to act on that information.
This seems contrary to the view that “what we are aware/conscious of” is isomorphic to “the Global Workspace”. It seems that subconscious modules are chattering away amongst themselves almost constantly, using channels that are either inaccessible to consciousness or severely muted.
I really look forward to this Sequence.
I would suspect the failure of most social movements is overdetermined. Social movements by default are designed to change the status quo, and the status quo tends to be stable and intrinsically resistant to change. Social movements are often ideologically originated and may be aimed at achieving something practically impossible.
Another phrasing might be that most social movements fail because a sober analysis would have shown that there was never any realistic possibility for most social movements to succeed, even if they had more resources, smarter people and better planning.
I’ve improved most dramatically at writing by getting very specific feedback from people who are clearly better than me. I consider myself lucky to have had a small handful of teachers and professors willing to put in the time to critique my writing at the sentence and word level.
Recently I had a work of fiction of mine minutely critiqued by a professional author and experienced a similar sense of “leveling up”. For example, I’ve thought for years that I understood what “show don’t tell” means. But my gracious editor in this case was able to point out multiple instances in my story where I was “telling” when I could be “showing”. Once he pointed these out, I understood on a deeper level what to pay attention to.
One interesting thing about getting feedback on writing is that someone who is truly better than you can usually provide suggestions that you immediately recognize as correct. You may think your writing is fine, even great, but you’ll recognize true improvements as being obviously correct and better, once pointed out. The process of becoming better at writing is the accumulation of such individual corrections.
My daughter is just starting to learn subtraction. She was very frustrated by it, and if I verbally asked “What’s seven minus five?” she was about 50% likely to give the right answer. I asked her a sequence of simple subtraction problems and she consistently performed at about that level. In the course of our back and forth I switched my phrasing to the form “You have seven apples and you take away five, how many are left?” and she immediately started answering the questions 100% correctly, and very rapidly too. Experimentally I switched back to the prior form and she started getting them wrong again. It was apparent to me that simply phrasing the problem in terms of concrete objects was activating something like visualization, which made the problems easy, and phrasing it as abstract numbers was failing to flip that switch. So as you say, for trickier arithmetic problems, it may be the case that which mental circuits are “activated automatically” determines the first answer you arrive at, and you can exploit that effect with edge cases like this.
It seems so obvious to me that the benefits of preschool would wear off after a short number of years that I feel like I must be missing something. How could it be otherwise, given the current system? This is all completely setting aside the developmental limitations of small children.
Let’s take two kids, Jamie and Alex. Pretend that there are no developmental limitations on children’s brains and that they can be taught to read equally well at ages 3, 4, 5, and 6.
Alex starts preschool at age 3 and they can read at a 1st grade level by the time they enter Kindergarten.
Jamie does not do any preschool and cannot read at all when they enter Kindergarten.
By the end of Kindergarten, Alex can read slightly better than 1st grade level, but not a lot better, since the curriculum hasn’t been challenging. It’s basically been a rehash of what they already can do. Jamie can read at the expected grade level by the end of Kindergarten.
By the end of first grade, no accommodations have been made for the fact that Alex is a slightly advanced reader. Both kids are given essentially the same pool of books to read. Alex has not skipped a grade or been put into some secret fast-track program for kids who went to preschool, because no such program exists. So by the end of first grade, they can read about equally well. Maybe Alex reads slightly better, but since no real pressure is being put on this advantage that would cause it to compound rather than diminish, it naturally diminishes until both students are at the same level.
Acting as though anything else would happen doesn’t make sense to me. It’s not like each year a child spends in school exerts some kind of Education Force on their brain which accrues generalized scholastic ability. Kindergartners are taught kindergarten level math and reading skills; kids entering kindergarten who already possess these skills only benefit until the other kids catch up.
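Here is the same toy story as a few lines of code, with invented growth numbers, just to make the convergence dynamic explicit:

```python
# Toy model (all numbers invented): reading level grows toward the ceiling the
# curriculum actually teaches to, so an early head start decays instead of compounding.
grade_ceiling = {"K": 1.0, "1st": 2.0, "2nd": 3.0}   # level targeted each year

def year_of_school(level, ceiling, catch_up_rate=0.8):
    # Instruction only pulls you up toward the ceiling; nothing pushes you past it.
    return level + catch_up_rate * max(0.0, ceiling - level)

alex, jamie = 1.0, 0.0   # Alex enters kindergarten at a 1st-grade level, Jamie at zero
for grade, ceiling in grade_ceiling.items():
    alex = year_of_school(alex, ceiling)
    jamie = year_of_school(jamie, ceiling)
    print(f"end of {grade}: Alex {alex:.2f}, Jamie {jamie:.2f}")
# The gap shrinks every year because the curriculum never targets Alex's actual level.
```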
So IMO the problem isn’t that preschool “doesn’t do anything”. The problem is that the system as it stands doesn’t actually utilize the potential advantage of preschool. We are pretty far away from a system that would do so; such a system would need to tailor the specific educational content to the specific child.
My four year old can read pretty well and can write well enough that you can puzzle out what he’s trying to communicate. But there is no expectation that he’s going to skip kindergarten because of this. So in what sense could this ever be a long-term academic advantage?
“Real” stoicism seems to demand total relinquishment of all attachments, to almost exactly the same degree that “real” Buddhism does. I think this is a pathological thing to want.
Yes, it’s psychologically beneficial to be less upset about being stuck in traffic. When you’re already stuck in traffic and can’t do anything about it, your choice to not be upset about it is simply a choice to avoid needless suffering.
One might argue that it’s even better to let yourself be really annoyed by being stuck in traffic, and then permit your annoyance to motivate you to take actions to avoid being stuck in traffic in the future.
The sort of person who would legitimately not care if their child died would also have to be different from me in a number of other very important ways in order to be a reasonably consistent agent. For example, if a stoic claims to be emotionally indifferent between “child death” and “child flourishing”, then what actually motivates them? Why do anything, why make any choice? At least Buddhist thought is honest about this, and admits that the only truly consistent solution is a purely monastic life of meditation and the aggressive pursuit of non-existence. Stoicism, as far as I can tell, refuses to bite the bullet on the conclusions that follow from its premises.
1) I think a lot of people think they’re stoic when in actuality they’ve just never had anything bad happen to them. Modern life offers relatively few opportunities to test stoicism, and by default, everyone fails such tests without truly significant preparation.
2) Stoicism is actually a huge drag.
With regard to whatever objects give you delight, are useful, or are deeply loved, remember to tell yourself of what general nature they are, beginning from the most insignificant things. If, for example, you are fond of a specific ceramic cup, remind yourself that it is only ceramic cups in general of which you are fond. Then, if it breaks, you will not be disturbed. If you kiss your child, or your wife, say that you only kiss things which are human, and thus you will not be disturbed if either of them dies.
- Epictetus, The Enchiridion
Who wants to live like this? I want to be disturbed if a loved one dies.