I still don’t love the term “subagents”, despite everyone getting lots out of it, as well as personally agreeing with the intentional stance and the “alliances” you mention. I think my crux-net is something like
agents are strategic
fragments of our associative mental structures aren’t strategic except insofar as their output calls other game theoretic substructures or you are looking at something like the parliamentary moderator
if you think of these as agents, you will attribute false strategy to them and feel stuck more often, when in fact they are easily worked with if you think of their apparent strategy as “using highly simplistic native associations and reinforcements, albeit sometimes by pinging other fragments to do things outside their own purview, to accomplish their goal”
However, it does seem possible to me that the “calling other fragments” step does actually chain so far as to constitute real strategy and offer a useful level of abstraction for viewing such webs as subagents. I haven’t seen much evidence for this—does this framing make sense, and do you think it is clear there is something more like Turing-complete webs of strategy within subagents vs merely pseudostrategy? Wish I had a replacement word I liked better than subagent.
I don’t think most of the motivation is supposed to come in at the level of doing the final work that might win the award—I agree it seems like Nobel prizes, knighthoods, Hugo and Nebula, etc all aren’t being consciously thought about too much during the year or two beforehand.
“Making the industry seem relevant rather than encouraging behaviors” rings more true. The motivation seems to happen when younger people see that this is a thing society values. That node downstream of the award drives them through years of striving.
This is a good point. I was wondering why civic/public is much more functional in meatspace than cyber, whereas a lot of internet communities that seem good are more gated. I think this is because the civic/public is sort of superficial: the actual gatekeeping is done by all sorts of transaction costs and social barriers that one doesn’t normally notice (or that are deliberately obscured).
I actually had some similar alarm bells go off for conflation of concepts in the OP, especially because the post specifically gestures at one concept and doesn’t explain the different examples where it might come up.
However, on second thought I think I do like the concept this builds. To phrase it in your formal terms, I think it’s very useful to notice all the systems in which the Taylor series for f has b>0, ESPECIALLY when it’s comparably easy to control f via b∗x rather than just a.
In this light, you can view momentum, exponential growth, heavy-tails, etc., as all cases where a main component of controlling or predicting future x is by paying attention to the b∗x term, and I claim this is an important revelation to have at a variety of levels.
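To make the role of the b∗x term concrete, here is a minimal worked sketch. It assumes that f is the rate of change of x and that a and b are the first two Taylor coefficients of f around x = 0; that is my reading of the formal terms used in the thread, not something restated in this comment, and x_0 is just a label I am introducing for the starting value.

```latex
% Assumption: f is the rate of change of x, truncated after the linear term.
\frac{dx}{dt} = f(x) \approx a + b\,x

% Solving the linear equation with initial value x(0) = x_0 and b \neq 0:
x(t) = \left(x_0 + \frac{a}{b}\right) e^{b t} - \frac{a}{b}

% Limiting case b \to 0: only the constant term drives x.
x(t) = x_0 + a\,t
```

Under these assumptions, with b > 0 the current value of x feeds back into its own growth, so the exponential factor eventually dominates anything the constant term a contributes on its own.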
Perhaps more relevant to your actual crux, I also get shudders when people overload physics terms with other meanings, but before they were physics terms they were concepts for intuitive things. Given that we view the world through physical metaphors, I think it’s quite important for us to use the best-fitting words for concepts. Then we can remind people of the different variants when they run into conflationary trouble. If we start off by naming things with poor associations, we hold ourselves back more. If you have alternative names to “momentum” for this that you also think have good connotations, though, I’d love to hear them.
I expect certain changes in information flow to affect things somewhat. Anonymity on the internet allowed people to humorize their own laziness and patheticness without unmasking, which seems to have significantly increased common knowledge about lots of people being mentally unwell or otherwise bad at traditionally valued things like hard work. As this gets normalized I expect it to further erode adherence to mask-like values and promote the cluster of things like “be true to yourself” and “it’s ok to be depressed and seek help” and other MtG red/green over white. In fact, the selection effect of internet heroes being young, engaged in the gig economy, non-neurotypical, etc may create a sort of new value stratum if it doesn’t percolate further.
The social media bubble effect seems like it could also lead to a further divergence of values along various class/bubble lines as Vaniver mentioned was the case historically. This might be exacerbated on the economic axis if we keep seeing capital growth gaining relative to wages, though I don’t know much about that trend.
I am confused as to how the propositional consistency and observe function work together to prevent the trolling in the final step. Suppose I do try to find pairs of sentences such that I can show (A⇒Bi) and also ¬Bi to drive A down. Does this fail because you are postulating non-adversarial sampling, as ESRogs mentions? Or is there some other reason why propositional consistency is important here?
I’m not confident these are the right gears, and you might be asking for more refined gears than mine, but my working hypothesis is something like:
The umbrella concept of weirdness is about whether people can predict your actions, since this is extremely useful information to track for a social animal. Predictability and therefore weirdness are tracked on a variety of levels—you can be weird because of your sleep schedule, or weird because of your nervous tics and body language, or weird because you talk in a very normal manner about the impending alien rapture, or weird just because you’re a foreigner. The weirdness of an action registers as flags on various mental levels to help you predict when that person later might not do the canonical action, and it registers with a magnitude and some metadata to help you track their weird trait(s) for inner simming. To answer the question of how much disconformity is “enough” to be labeled weird, I have to hand-wave and say that typical people’s social neural nets just get very good at inferring what infractions correspond to how much likelihood of what level of difficulty coordinating with them. (If this is the meat of the question, I could say more later).
Unfortunately, “weird” has had some semantic drift since unpredictable often happens to correlate with “being a less valuable ally” in a variety of ways for systemic or intrinsic reasons. Two important subtypes of weird that this is evident in are 1) the people whom you talk about that are just kind of loners, and 2) the people who actually provide frequent disvalue. The loners are “weird” because they can and do take actions the group hasn’t decided on, which makes them harder to coordinate with and significantly less predictable. But this also correlates with them being weird in other ways, and so it is rightly seen as Bayesian evidence for other problems by their peers—and further, people who sometimes leave the group are just less valuable allies (for dependability, for gossip, etc). When I do focusing on the weirdness of loners, I can kind of pick out these distinct feelings (of which I think the third is most prominent), along with other more personal ones like “weird → unpredictable → higher likelihood of new ideas → valuable” and similar.
I think “weird” has mutated into a slur nowadays because of the subtype of those who provide disvalue and the ways that those traits correlate with weirdness (and why it’s hard to get gears on the different types of nonconformity). You certainly can have good weird, where someone is unpredictable but in ways that everyone repeatedly likes (though they are still tracked as “weird”, importantly). But since a large part of social coordination is being predictable, the people who have fine control over their many levels of dials often do work largely within predictable ranges, and only the best optimizers can escape the local optima and be correct without too much disvalue on the way—which means that most people who aren’t being predictable are doing so because of an inability. And since most people can hit the small range of highly valuable parameter space we call “normal”, that gets set as baseline value, so a vast proportion of other actions are negative. So people who have difficulties with certain dials will regularly cause disvalue in various ways, which means that the trait of “weird” is now correlated with bad actions.
After writing this out, I’m wondering whether I should have called “weird” specifically “negative unpredictability”, and called “positive unpredictability” something like “interesting”. The people I think of as least weird and those I think of as least interesting both end up as “boring”, in the sense of a very predictable wind-up doll. I think you can have separate tickers for both weirdness and interestingness, but often people will black-and-white it one way or the other (and indeed argue whether someone is “weird” or “interesting”). The needle-threading of getting people to follow you demands an entire toolbox of gears itself, but here are some heuristics on just pushing the scales a little further from bad unpredictability:
One good way is to use your unpredictable actions to help your peers, as in noticing others are hungry and striking out on your own to fix the problem, or hitting the sweet spot of high-level predictability and low-level unpredictability we call humor. Another, probably more important, way is to put a little extra effort into being extra predictable when you are around: prove you’re normal with small talk, say normal stuff about yourself, and forge social ties or commit to the group in other ways so they can know that you’ll (mostly) be there for them. Allies have to be dependable.
It seems important to be extremely clear about the criticism’s target, though. I agree overanalysis is a failure mode of certain rationalists, and statistically more so for those who comment more on LW and SSC (because of the selection effect specifically for those who nitpick). But rationality itself is not the target here, merely naive misapplication of it. The best rationalists tend to cut through the pedantry and focus on the important points, empirically.
Don’t know why the discrepancy, but it seems to me that a great deal of postrationality is littered with historical examples.
I also share your skepticism of clear psychological progression, but would point out plenty of times that people diverge in some ways but converge in more meta ones, e.g. divergence to liberal or conservative but convergence in political acumen, or e.g. divergence to minimalism or luxury but convergence to environmental modification.
As you say, the inner circle certainly may have reason to do non-obvious things. But while withholding information from people can be occasionally politically helpful, it seems usually best for the company to have the employees on the same page and working toward a goal they see reason for. Because of this, I would usually assume that seemingly poor decisions in upper management are the result of actual incompetence or a deceitful actor in the information flow on the way down.
I think people have already considered this, but the strategies converge. If someone else is going to make it first, you have only two possibilities: seize control by exerting a strategic advantage, or let them keep control but convince them to make it safe.
To do the former is very difficult, and the little bit of thinking that has been done about it has mostly exhausted the possibilities. To do the latter requires something like
1) giving them the tools to make it safe,
2) doing enough research to convince them to use your tools or fear catastrophe, and
3) opening communications with them.
So far, MIRI and other organizations are focusing on 1 and 2, whereas you’d expect them to primarily do 1 if they expected to get it first. We aren’t doing 3 with respect to China, but that is a step that isn’t easy at the moment and will probably get easier as time goes on.
I agree with this response; using first principles is a heuristic, and heuristics always have pros and cons. Just in terms of performance, the benefit is that you can re-assess assumptions, but the cost is that you ignore a great amount of information gathered by those before you. Depending on the value of this information, you should frequently seek it out, at least as a supplement to your derivation.
Sharing goals is definitely a tricky decision, as you note. I think it has even more subtlety than your proposed dichotomy, though.
Getting positive feedback just for proposing a goal takes away the positive future reward, but you still have reason to avoid the negative future reward of failing your commitments. Getting negative feedback early gives a positive future reward of showing people up, but this is little better than the future reward would have been anyways and comes hand in hand with an increased fear that your detractors will indeed be right.
Your point about avoiding early and undeserved praise is an important part of maintaining motivations, but I think a better solution would be something like a ring of friends that strongly support goal-achievement and stretch goals as virtuous and frequently check in with each other on incremental progress to incentivize goal-maintenance.