Trying to nudge others seems like an attempt to route around the problem rather than solve it. It seems like you tried pretty hard to integrate the substantive points in my “Effective Altruism is self-recommending” post, and even with pretty extensive active engagement, your estimate is that you only retained a very superficial summary. I don’t see how any compression tech for communication at scale can compete with what an engaged reader like you should be able to do for themselves while taking that kind of initiative.
We know this problem has been solved in the past in some domains—you can’t do a thing like the Apollo project or build working hospitals where cardiovascular surgery is regularly successful based on a series of atomic five-word commands; some sort of recursive general grammar is required, and at least some of the participants need to share detailed models.
One way this could be compatible with your observation is that people have somewhat recently gotten worse at this sort of skill; another is that credit-assignment is an unusually difficult domain to do this in. My recent blog posts have argued that at least the latter is true.
In the former case (lost literacy), we should be able to reconstruct older modes of coordination. In the latter (politics has always been hard to think clearly about), we should at least internally be able to learn from each other by learning to apply cognitive architectures we use in domains where we find this sort of thing comparatively easy.
I think I may have communicated somewhat poorly by phrasing this in terms of 5 words, rather than 5 chunks, and will try to write a new post sometime that presents a more formal theory of what’s going on.
I mentioned in the comments of the previous post:
Coordinated actions can’t take up more bandwidth than someone’s working memory (which is something like 7 chunks, and if you’re using all 7 chunks then you don’t have any spare chunks to handle weird edge cases).
A lot of coordination (and communication) is about reducing the chunk-size of actions. This is why jargon is useful, and why habits and training are useful (as well as checklists and forms and bureaucracy): they can condense an otherwise unworkably long instruction into something people can manage.
And:
“Go to the store” is four words. But “go” actually means “Stand up. Walk to the door. Open the door. Walk to your car. Open your car door. Get inside. Take the key out of your pocket. Put the key in the ignition slot...” etc. (Which are in turn actually broken into smaller steps like “lift your front leg up while adjusting your weight forward”.)
But you are capable of taking all of that and chunking it as the concept “go somewhere” (as well as the meta-concept of “go to the place whichever way is most convenient, which might be walking or biking or taking a bus”), although if you have to use a form of transport you are less familiar with, remembering how to do it might take up a lot of working memory slots, leaving you liable to forget other parts of your plan.
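To make the nesting picture concrete, here is a toy sketch in code (the tree structure and the capacity number are illustrative assumptions, not a model of actual cognition):

```python
CAPACITY = 7  # rough working-memory limit, in chunks (an assumed constant)

def fits_in_working_memory(action):
    """An action fits if no chunk, at its own level, has more than
    CAPACITY direct sub-actions. Nesting is free: a well-learned chunk
    collapses to a single slot ('drive' is one chunk to an adult)."""
    if isinstance(action, str):        # a primitive: already one chunk
        return True
    _name, subactions = action
    return (len(subactions) <= CAPACITY
            and all(fits_in_working_memory(s) for s in subactions))

# "Go to the store", decomposed one level down. Each familiar
# sub-action stays a single chunk, even though each would expand
# into many steps for a novice.
go_to_store = ("go to the store",
               ["stand up", "walk to the door", "open the door",
                ("drive to the store",
                 ["get in the car", "start the engine",
                  "navigate", "park"])])

print(fits_in_working_memory(go_to_store))  # True: no level exceeds 7
```

The point of the sketch is that an unfamiliar mode of transport replaces one well-learned chunk with a subtree you have to track explicitly, which is exactly what eats your spare slots.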
I do in fact expect that the Apollo project worked via finding ways to cache things into manageable chunks, even for the people who kept the whole project in their head.
Chunks can be nested, and chunks can include subtle neural-network-weights that are part of your background experience and aren’t quite explicit knowledge. It can be very hard to communicate subtle nuances as part of the chunks if you don’t have access to high-volume and preferably in-person communication.
I’d be interested in figuring out how to operationalize this as a bet and check how the project actually worked. What I have heard (epistemic status: heard it from some guy on the internet) is that actually, most people on the project did not have all the pieces in their head, and the only people who did were the pilots.
My guess is that the pilots had a model of how to *use* and *repair* all the pieces of the ship, but couldn’t have built it themselves.
My guess is that “the people who actually designed and assembled the thing” had a model of how all the pieces fit together, but not as deep a model of how and when to use it, and may have only understood the inputs and outputs of each piece.
And meanwhile, while I’m not quite sure how to operationalize the bet, I would bet maybe $50 (conditional on us finding a good operationalization) that the number of people who had the full model or anything like it was quite small. (“You Have About Five Words” doesn’t claim you can’t have more than 5 words of nuance; it claims that you can’t coordinate large groups of people around anything that depends on more than 5 words of nuance. I bet there were fewer than 100 people, and probably closer to 10, who had anything like a full model of everything going on.)
and will try to write a new post sometime that presents a more formal theory of what’s going on
I think I’m unclear on how this constrains anticipations, and in particular it seems like there’s substantial ambiguity as to what claim you’re making, such that it could be any of these:
You can’t communicate recursive structures or models with more than five total chunks via mass media such as writing.
You can’t get humans to act (or in particular to take initiative) based on such models, so you’re limited to direct commands when coordinating actions.
There exist such people, but they’re very few and stretched between very different projects and there’s nothing we can do about that.
??? Something else ???
I think there are two different anticipation-constraining-claims, similar but not quite what you said there:
Working Memory Learning Hypothesis – people can learn complex or recursive concepts, but each chunk that they learn cannot be composed of more than 7 other chunks. You can learn a 49 chunk concept but first must distill it into seven 7-chunk-concepts, learn each one, and then combine them together.
Coordination Nuance Hypothesis – there are limits to how nuanced a model you can coordinate around, at various scales of coordination. I’m not sure precisely what the limits are, but it seems quite clear that the more people you are coordinating the harder it is to get them to share a nuanced model or strategy. It’s easier to have a nuanced strategy with 10 people than 100, 1000, or 10,000.
I’m less confident of the Working Memory Learning Hypothesis (it’s an armchair inside view based on my understanding of how working memory works).
I’m fairly confident in the Coordination Nuance Hypothesis, which is based on observations about how people actually seem to coordinate at various scales and how much nuance they seem to preserve.
In both cases, there are tools available to improve your ability to learn (as an individual), disseminate information (as a communicator), and keep people organized (as a leader). But none of these tools change the fundamental equation, just the terms.
Anticipation Constraints:
The anticipation-constraint of the WMLH is “if you try to learn a concept that requires more than 7 chunks, you will fail. If a concept requires 12 chunks, you will not successfully learn it (or will learn a simplified bastardization of it) until you find a way to compress the 12 chunks into 7. If you have to do this yourself, it will take longer than if an educator has optimized it for you in advance.”
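A toy version of that arithmetic (the function and the clean powers of 7 are my illustrative assumptions, not part of the hypothesis itself):

```python
def layers_needed(total_chunks, capacity=7):
    """Toy WMLH arithmetic: how many rounds of 'learn sub-concepts,
    then compose them into one chunk' a concept needs, if you can
    only hold `capacity` chunks in mind at once."""
    layers, reach = 1, capacity
    while reach < total_chunks:   # each extra layer multiplies reach by 7
        layers += 1
        reach *= capacity
    return layers

print(layers_needed(7))    # 1 -- learn it directly
print(layers_needed(12))   # 2 -- first compress 12 chunks into <=7 composites
print(layers_needed(49))   # 2 -- seven 7-chunk concepts, then combine
print(layers_needed(343))  # 3 -- 7^3 needs a further layer of distillation
```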
The anticipation constraint of the CNH is that if you try to coordinate with 100 people of a given level of intelligence, the shared complexity of the plan that you are enacting will be lower than the complexity of the plan you could enact with 10 people. If you try to implement a more complex plan or orient around a more complex model, your organization will make mistakes due to distorted simplifications of the plan. And this gets worse as your organization scales.
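Here is one hypothetical way that could be operationalized (the garbling probability, the tree shape, and the chunk counts are all made-up parameters, not measurements):

```python
import random

def intact_fraction(chunks, depth, p=0.05, trials=20000):
    """Estimate the fraction of front-line people who receive every
    chunk of the plan un-garbled, `depth` re-explanations from the
    top, where each re-explanation down one level garbles each chunk
    independently with probability p."""
    intact = sum(
        all(random.random() > p for _ in range(depth * chunks))
        for _ in range(trials)
    )
    return intact / trials

# In a 10-ary management tree, depth 1 is roughly a 10-person team
# and depth 4 is roughly 10,000 front-line people:
print(intact_fraction(chunks=12, depth=1))  # ~0.54
print(intact_fraction(chunks=12, depth=4))  # ~0.08
print(intact_fraction(chunks=3,  depth=4))  # ~0.54 -- the 3-chunk plan survives
```

Under this toy model, holding fidelity constant while scaling from 10 to 10,000 people means cutting the plan from 12 chunks to about 3, which is the CNH in miniature.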
CNH is still ambiguous between “nuanced plan” and “nuanced model” here, and those seem extremely different to me.
I agree they are different, but I think that with a larger group you have a harder time with either of them, for roughly the same reasons and at roughly the same rate of increased difficulty.
The Working Memory Hypothesis says that Bell Labs is useful, in part, because whenever you need to combine multiple complicated concepts from different disciplines to invent a new concept…
instead of having to read a textbook that explains each of them one-particular-way (and, if it’s not your field, you’d need to get up to speed on the entire field in order to have any context at all), you can just walk down the hall and ask the guy who invented the concept “how does this work?” and have them explain it to you multiple times until they find a way to compress it down into 7 chunks, optimized for your current level of understanding.
A slightly more accurate anticipation of the CNH is:
people need to spend time learning a thing in order to coordinate around it. At the very least, the more time you need to spend getting people up to speed on a model, the less time they have to actually act on that model.
people have idiosyncratic learning styles, and are going to misinterpret some bits of your plan, and you won’t know in advance which ones. Dealing with this requires individual attention, noticing their mistakes and correcting them. Middle managers (and middle “educators”) can help to alleviate this, but every link in the chain reduces your control over what message gets distributed. If you need 10,000 people to all understand and act on the same plan/model, it needs to be simple or robust enough to survive 10,000 people misinterpreting it in slightly different ways (see the sketch after this passage).
This gets even worse if you need to change your plan over time in response to new information, since now people are getting it confused with the old plan, or they don’t agree with the new plan because they signed up for the old plan, and then you have to Do Politics to get them on board with the new plan.
At the very least, if you’ve coordinated perfectly, each time you change your plan you need to shift from “focusing on execution” to “focusing on getting people up to speed on the new model.”
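As a final back-of-the-envelope illustration of why 10,000 people are so hard to keep in sync through a plan change (the 99% figure is a made-up assumption, not data):

```python
# If each person independently ends up holding the intended (new) plan
# with probability q, the chance that *everyone* is acting on the same
# plan is q ** n, and it collapses fast:
q = 0.99  # even assuming 99% faithful uptake per person
for n in (10, 100, 1000, 10000):
    print(n, q ** n)
# 10     ~0.90    -> a small team mostly stays in sync
# 100    ~0.37
# 1000   ~4.3e-05
# 10000  ~2.2e-44 -> at scale, someone is always running the old plan
```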