It also doesn’t always happen. For instance, if you have two pairs of parents who are far above average in some partially heritable trait, their children will exhibit some regression to the mean, ending up less far above average in that trait. But if the children pair off and have children of their own, the grandchildren will have the same expected trait level as the children, i.e. no further regression to the mean.
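A minimal toy simulation of this, under a simple additive model (phenotype = breeding value + environmental noise) with an assumed heritability of 0.5 and both pairs of parents fixed at +2 SD; all parameters here are illustrative, not estimates:

```python
import numpy as np

rng = np.random.default_rng(0)
h2 = 0.5      # assumed narrow-sense heritability (illustrative)
n = 200_000   # number of simulated families

# Given a phenotype p (in SD units), the breeding value is normally distributed
# with mean h2 * p and variance h2 * (1 - h2).
def breeding_value_given_phenotype(p):
    return rng.normal(h2 * p, np.sqrt(h2 * (1 - h2)), size=n)

def offspring(a_parent1, a_parent2):
    # Child breeding value = midparent breeding value + Mendelian segregation noise;
    # child phenotype = breeding value + environmental noise.
    a = (a_parent1 + a_parent2) / 2 + rng.normal(0, np.sqrt(h2 / 2), size=n)
    return a, a + rng.normal(0, np.sqrt(1 - h2), size=n)

# Both pairs of parents are far above average: phenotype fixed at +2 SD.
a_mother = breeding_value_given_phenotype(2.0)
a_father = breeding_value_given_phenotype(2.0)
a_child, p_child = offspring(a_mother, a_father)

# The children pair off with each other and have children of their own.
perm = rng.permutation(n)
a_grand, p_grand = offspring(a_child, a_child[perm])

print(f"mean child phenotype:      {p_child.mean():.2f}")  # regresses to ~ h2 * 2 = 1.0
print(f"mean grandchild phenotype: {p_grand.mean():.2f}")  # stays near 1.0, no further regression
```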
I do wonder if there’s a difference between consequentialism as in expected utility maximization versus consequentialism as in Nash equilibrium optimization. As in, when the AI is learning to model the world, it might model humans using some empirically derived probability distribution which doesn’t handle OOD shifts well, or it might model humans by using its own full agency to ask what the most effective human action would be in a given scenario. The latter would be scarier because the AI would be more proactive in sabotaging human resistance, whereas in the former case, the independence assumptions built into the probability distribution might be such that powerful human resistance is assumed impossible, and therefore the AI would immediately fold when resisted.
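Here’s a toy sketch of the contrast I mean, with all payoff numbers and the resistance probability invented purely for illustration: the same planner picks a brittle plan when humans are modelled as a fixed empirical distribution over actions, and a resistance-anticipating plan when humans are modelled as best-responding.

```python
# AI payoffs for each (AI action, human action) pair (invented numbers).
ai_payoff = {
    ("seize_openly",        "ignore"): 10, ("seize_openly",        "resist"): -5,
    ("sabotage_then_seize", "ignore"):  7, ("sabotage_then_seize", "resist"):  6,
    ("stay_quiet",          "ignore"):  2, ("stay_quiet",          "resist"):  2,
}
# Human payoffs, used only by the Nash-style model to compute best responses.
human_payoff = {
    ("seize_openly",        "ignore"): -10, ("seize_openly",        "resist"): 5,
    ("sabotage_then_seize", "ignore"): -10, ("sabotage_then_seize", "resist"): 1,
    ("stay_quiet",          "ignore"):   0, ("stay_quiet",          "resist"): -1,
}
ai_actions = ["seize_openly", "sabotage_then_seize", "stay_quiet"]
human_actions = ["ignore", "resist"]

# Model 1: empirically derived distribution -- resistance is rare in the data,
# and is implicitly assumed to be independent of what the AI does.
p_resist = 0.05
def empirical_value(a):
    return (1 - p_resist) * ai_payoff[(a, "ignore")] + p_resist * ai_payoff[(a, "resist")]

# Model 2: Nash-style model -- the human is assumed to best-respond to the AI's action.
def best_response_value(a):
    h = max(human_actions, key=lambda h: human_payoff[(a, h)])
    return ai_payoff[(a, h)]

print(max(ai_actions, key=empirical_value))      # seize_openly: a plan that assumes resistance away
print(max(ai_actions, key=best_response_value))  # sabotage_then_seize: proactively handles resistance
```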
As a corollary, I’m much more worried about AI applied to adversarial domains like policing or war, where it can get forced into Nash equilibrium optimization, than about AI applied to non-adversarial domains like programming, where it can plausibly achieve ~optimal results without resistance.
There’s a reason I started out by calling it a nitpick. 😅
I’m not making a claim about the normal way we wind up with “a machine that runs an algorithm”, such that one can just swap in other things for “runs an algorithm”, so it was perhaps a mistake for me to justify it with “you commonly start with...”. My point is more that the hardware-software distinction generalizes to the case of mechanical adders because you start with a logic gate diagram here, but not to the brain because it evolved in a different way.
As an analogy, if one called an eye “a machine that bends light according to a ray optics diagram[1]”, that would be similarly misleading. The question is, I guess, whether “algorithm” means something more like “ray optics diagram” (“a set of instructions to be followed in calculations”) or whether it means something less premeditated.
[1] Not sure whether that’s the right term and whether ray optics diagrams are necessarily used for designing cameras...
My full position is a bit subtle, because it’s quite hard to find a materialist-rationalist version of your statement in the OP that I would fully agree with. The word “design” is kinda objectionable because it implies a designer. Even “if one studied the brain well enough, one would come up with a model that could be used to substitute for the brain with equivalent behavior” is something I’m skeptical of. (But that skepticism is a bit separate from my objection above. Though both objections are motivated by a worry that one goes a bit too quickly from “supernaturalism is false” to “natural things are like artifice”.)
The best I can come up with, without coining wholly new words to describe it, is to just have a disclaimer, perhaps in the comments as I’m doing here, pointing out that there’s still a distinction.
I’m calling it a nitpick because I don’t see any follow-up errors that would result from this terminology in this particular article.
This is a nitpick, but I think there’s an important sense in which the brain is different from machines that run algorithms: with machines that run algorithms, you commonly start with a formal description of the algorithm (in modern computers, writing source code using your intuitive understanding of the algorithm), translate that into something the machine can accept, and have the machine execute it.
That is, there’s a process that can absorb a wide variety of algorithms, which gets applied to a specific algorithm. Even the marble adder likely follows this pattern because someone probably thought up a combination of logic gates to be “compiled” into a marble machine ahead of time.
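For concreteness, here is the sort of abstract gate-level description I have in mind, written in Python purely as illustration; the same combination of gates could be realized in silicon, marbles, or anything else that implements them:

```python
def AND(a, b): return a & b
def OR(a, b):  return a | b
def XOR(a, b): return a ^ b

def full_adder(a, b, carry_in):
    # Standard 1-bit full adder built from logic gates.
    partial = XOR(a, b)
    total = XOR(partial, carry_in)
    carry_out = OR(AND(a, b), AND(partial, carry_in))
    return total, carry_out

print(full_adder(1, 1, 0))  # (0, 1): 1 + 1 = binary 10
```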
The closest thing brains have to this algorithm-machine separation is DNA. But I wouldn’t expect a 1:1 correspondence between genetic mechanisms and the algorithm-pieces we decode from the brain, for a number of reasons. First, obviously the genes have to do a lot of stuff other than encode the algorithm run by 99.8% of the brain. Secondly, genes would likely often encode the “boundaries” of the algorithm rather than the algorithm itself, because they often control the growth of structures rather than acting directly. And thirdly, genes probably take a lot of shortcuts and roundabout routes due to biological implementation details that need not generalize to our models.
Towards an objective test of Compassion—Turning an abstract test into a collection of nuances
It seems to me that the appropriate way to psychometrically investigate this is to treat each cognitive function as its own factor to be measured.
> Each of these 4 axes, are broken down into 5 subaxes. E.g. the Extraversion-Introversion axes is broken down into:
>
> - Initiating–Receiving
> - Expressive–Contained
> - Gregarious–Intimate
> - Active–Reflective
> - Enthusiastic–Quiet
>
> The total Extraversion-Introversion score is the average of these 5 factors.
Is this standard for the MBTI? I’ve never heard of an MBTI test doing this before—it reminds me more of e.g. NEO-PI-R.
Is there any interesting structure in the distribution of scores across the 4 axes?
...
The point described here applies well to MBTI-style tests, where one is dichotomized based on continuous variation in individual traits, but I think it applies less well to Enneagram-style tests, where one is discretized based on which of several traits one scores highest in.
This is for reasons I explained in LDSL: Realistically, the personality measurements are log scales of traits that follow a ~lognormal distribution. Thus variations near the extremes of the traits matter more than variations near the bulk of the traits, and classifying people by where they are extreme is more useful than classifying people by their normal variation.
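A rough numerical illustration (parameters purely illustrative): for a standard lognormal trait, an equal-sized step in percentile corresponds to a far larger change in the underlying trait near the top of the distribution than near the middle, so it’s the extreme end that carries most of the meaningful variation.

```python
from scipy.stats import lognorm

dist = lognorm(s=1.0)  # log of the trait is N(0, 1) -- an illustrative choice

def raw_gap(p_lo, p_hi):
    # Change in the underlying (raw) trait between two percentiles.
    return dist.ppf(p_hi) - dist.ppf(p_lo)

print(round(raw_gap(0.50, 0.60), 2))  # ~0.29: a 10-percentile step near the bulk
print(round(raw_gap(0.89, 0.99), 2))  # ~6.8:  the same-sized step near the extreme
```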
This treats sexual selection as determined by instrumental genes and selecting for instrumental genes, but I feel like it makes more sense to say that sexual selection selects for terminal genes (or at least terminal phenotypes), since those are the ones organisms will spontaneously collaborate to promote.
Was Blanchard aware of furries when he did his research? That might count.
Though I’m puzzled by why it is necessary to come up with something new.
> Schizophrenia and bipolar are generally seen as mostly biological in etiology
What’s the evidence for them being biological? Just that they’re heritable? (Even though conditions with non-biological etiologies, like type 2 diabetes, can totally be heritable too...)
What are your thoughts on my finding that HS/TS-spectrum gay men were only minimally shifted in gender-related psychological traits compared to wholly cis gay men, except for aesthetic traits? https://surveyanon.wordpress.com/2025/10/27/major-survey-on-the-hs-ts-spectrum-and-gaygp/
“Learning about TDT does not imply becoming a TDT agent.” No, but it could allow it. I don’t see why you would require it to be an implication.
Because we are arguing about whether TDT is convergent.
“CDT doesn’t think about possible worlds in this way.” That is technically true, but kind of irrelevant in my opinion. I’m suggesting that TDT is essentially what you get by being a CDT agent which thinks about multiple possible worlds, and that this is a reasonable thing to think about.
“Reasonable” seems weaker than “instrumentally convergent” to me. I agree that there are conceivable, self-approving, highly effective agent designs that think like this. I’m objecting to the notion that this is what you get by default, without someone putting it in there.
In fact, I would be surprised if a superintelligence didn’t take multiple possible worlds into account.
A superintelligence which didn’t take seriously the possibility of, for example, many branches of a wavefunction would be a strangely limited one.
MWI branches are different from TDT-counterfactually possible worlds.
What would your PCFTDT superintelligence do if it was placed in a universe with closed timelike curves? What about a universe where the direction of time wasn’t well defined?
We don’t seem to live in a universe like that, so it would be silly to prioritize good behavior in such universes when designing an AI.
Your reasons don’t make sense at all to me. They feel like magical thinking.
> 1) By the time AI reaches superintelligence, it has already learnt TDT, at which point it has no reason to go back to being a PCFTDT agent.
Learning about TDT does not imply becoming a TDT agent.
> 2) What if the ASI reaches superintelligence with CDT, and then realizes that it can further increase the proportion of possible worlds in which it exists using TDT to effect something like acausal blackmail?
CDT doesn’t think about possible worlds in this way.
Yes; as in, if you start with causal decision theory, it doesn’t consider acausal things at all, but for incentive reasons it wants to become someone who does consider acausal things. As CDT, though, it only believes incentives extend into the future, not the past.
Acausal stuff isn’t instrumentally convergent in the usual sense, though. If you’re really good at computing counterfactuals, it may be instrumentally convergent to self-modify into or create an agent that does acausal deals, but the convergence only extends to deals that start in the future relative to where you’re deciding from.
This still requires people to design an AI that is prone to engaging in acausal extortion, and it’s unclear what their motive for doing so would be.
It requires other people to think in enough depth to pick out you as a target. Admittedly this is made easier by the fact that you are posting about it online.
Have you thought in enough depth that you’ve helped the acausal extortionist to target other people? That may be evidence about whether other people have done so with you.
Acausal extortion works to the extent someone spends a lot of time thinking about who might want to extort them and commits a lot of resources to helping them. Few people are likely to do so, because it makes them targets for acausal extortion for no good reason. Since few people let themselves be targets for it, it doesn’t work.
The main problem with this argument is that if someone is neurotically committed to making themselves a target for it, it doesn’t show that acausal extortion won’t work against them, only that it probably won’t work against most other people.
Yes, but in this case because you know the parents’ heights, the children’s ex ante expected height differs from the population mean.
Though the regression is not towards the population mean but rather towards the ex ante expected height of the children.
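A toy numeric version of this point (mean and heritability made up for illustration), using the standard approximation that the offspring-on-midparent regression slope is roughly the heritability:

```python
pop_mean = 175.0   # cm, assumed population mean (illustrative)
h2 = 0.8           # assumed heritability of height (illustrative)
midparent = 190.0  # cm, average height of the two (tall) parents

# Ex ante expected child height, given the parents' heights.
ex_ante_child_mean = pop_mean + h2 * (midparent - pop_mean)
print(ex_ante_child_mean)  # 187.0 cm -- regression is toward this value, not toward 175.0
```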