MichaelStJules

Karma: 586

MichaelStJules 16 Sep 2025 14:39 UTC
3 points
0
on: Shutdownable Agents through POST-Agency
Cool direction and results!
Can we prevent such an agent from having a preference to create agents that do resist shutdown?
EDIT: And if they’re going to create agents anyway, can we get them to actually make sure those agents don’t resist shutdown, too, rather than, say, being indifferent about whether those other agents resist shutdown?

MichaelStJules 24 Aug 2025 1:09 UTC
3 points
0
in reply to: Lukas Finnveden’s comment on: Winning isn’t enough
Would some version of this still work if you have imprecise credences about the signs (and magnitudes) of considerations you’ll come up with, rather than 50-50 (or some other ratio)?
Even if not 50-50 but precise, we could adjust the donation amounts to match the probabilities and maintain the expected amount donated at $1000.
But if the probabilities are imprecise, I don’t think we can (precisely) maintain the expected donation amounts. We could pick donation amounts such that $1000, <$1000 and >$1000 are all possible expected donation amounts, in our set of credences (representor).
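To make the arithmetic concrete, here is a minimal sketch. It assumes a simplified setup that is not spelled out in the thread: two hypothetical donation amounts, one paid if the consideration turns out positive and one if negative. The function names, the dollar amounts and the three-element representor are all illustrative assumptions.

```python
def expected_donation(p, amount_if_positive, amount_if_negative):
    """Expected donation when the 'positive' outcome has probability p."""
    return p * amount_if_positive + (1 - p) * amount_if_negative


def amount_for_target(p, amount_if_negative, target=1000.0):
    """Solve p * a + (1 - p) * amount_if_negative = target for a (needs p > 0)."""
    return (target - (1 - p) * amount_if_negative) / p


# Precise but not 50-50: rescale one amount so the expectation stays at $1000.
p = 0.6
adjusted = amount_for_target(p, amount_if_negative=500.0)
precise_expectation = expected_donation(p, adjusted, 500.0)

# Imprecise: a hypothetical representor of three credences, with fixed amounts.
# No single pair of amounts can hit $1000 under every credence in the set.
representor = [0.4, 0.5, 0.6]
expectations = [expected_donation(q, 1500.0, 500.0) for q in representor]
```

With the fixed amounts of $1500 and $500, the expected donation is 500 + 1000q for credence q, so across this representor it falls below, at and above $1000.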

MichaelStJules 11 Jan 2025 19:52 UTC
1 point
0
in reply to: Pablo’s comment on: Actualism, asymmetry and extinction
FWIW, users can at least highlight text in a post to disagree with.
What links here?
- Pablo's comment on Actualism, asymmetry and extinction by MichaelStJules (11 Jan 2025 23:03 UTC; 4 points)

MichaelStJules 9 Jan 2025 15:00 UTC
10 points
0
in reply to: Noosphere89’s comment on: Quick general thoughts on suffering and consciousness
Interesting! Graziano’s Attention Schema Theory is also basically the same: he proposes consciousness to be found in our models of our own attention, and that these models evolved to help control attention. To be clear, though, it’s not the mere fact of modelling or controlling attention, but that attention is modelled in a way that makes it seem mysterious or unphysical, and that’s what explains our intuitions about phenomenal consciousness.^[1]
Graziano thinks mammals, birds and reptiles are conscious, is 50-50 on octopuses and very skeptical of fish and arthropods.
1. ^
  Graziano, 2021:
  In the attention schema theory (AST), having an automatically constructed self-model that depicts you as containing consciousness makes you intuitively believe that you have consciousness. The reason why such a self-model evolved in the brains of complex animals is that it serves the useful role of modeling, and thus helping to control, the powerful and subtle process of attention, by which the brain seizes on and deeply processes information.
  Suppose the machine has a much richer model of attention. Somehow, attention is depicted by the model as a Moray eel darting around the world. Maybe the machine already had need for a depiction of Moray eels, and it coapted that model for monitoring its own attention. Now we plug in the speech engine. Does the machine claim to have consciousness? No. It claims to have an external Moray eel.
  Suppose the machine has no attention, and no attention schema either. But it does have a self-model, and the self-model richly depicts a subtle, powerful, nonphysical essence, with all the properties we humans attribute to consciousness. Now we plug in the speech engine. Does the machine claim to have consciousness? Yes. The machine knows only what it knows. It is constrained by its own internal information.
  AST does not posit that having an attention schema makes one conscious. Instead, first, having an automatic self-model that depicts you as containing consciousness makes you intuitively believe that you have consciousness. Second, the reason why such a self-model evolved in the brains of complex animals, is that it serves the useful role of modeling attention.

Some implications of radical empathy

MichaelStJules 7 Jan 2025 16:10 UTC

3 points

0 comments · 7 min read · LW link

Actualism, asymmetry and extinction

MichaelStJules 7 Jan 2025 16:02 UTC

7 points

4 comments · 9 min read · LW link

Really radical empathy

MichaelStJules 6 Jan 2025 17:46 UTC

19 points

0 comments · 10 min read · LW link

Better difference-making views

MichaelStJules 21 Dec 2024 18:27 UTC

9 points

0 comments · 14 min read · LW link

MichaelStJules 20 Dec 2024 22:35 UTC
2 points
0
in reply to: transhumanist_atom_understander’s comment on: Iron deficiencies are very bad and you should treat them
Also, oysters and mussels can have a decent amount of (presumably heme) iron, and they seem unlikely to be significantly sentient. Either way, your effects on wild arthropods are more important in your diet choices. I’m vegan except for bivalves.

MichaelStJules 14 Oct 2024 6:56 UTC
3 points
0
in reply to: Nathan Helm-Burger’s comment on: LLMs are likely not conscious
Since consciousness seems useful for all these different species, in a convergent-evolution pattern even across very different brain architectures (mammals vs birds), then I believe we should expect it to be useful in our homonid-simulator-trained model. If so, we should be able to measure this difference to a next-token-predictor trained on an equivalent number of tokens of a dataset of, for instance, math problems.
What do you mean by difference here? Increase in performance due to consciousness? Or differences in functions?
I’m not sure we could measure this difference. It seems very likely to me that consciousness evolved before, say, language and complex agency. But complex language and complex agency might not require consciousness, and may capture all of the benefits that would be captured by consciousness, so consciousness wouldn’t result in greater performance.
However, it could be that
1. humans do not consistently have complex language and complex agency, and humans with agency are fallible as agents, so consciousness in most humans is still useful to us as a species (or to our genes),
2. building complex language and complex agency on top of consciousness is the locally cheapest way to build them, so consciousness would still be useful to us, or
3. we reached a local maximum in terms of genetic fitness, or evolutionary pressures are too weak on us now, and it’s not really possible to evolve away consciousness while preserving complex language and complex agency. So consciousness isn’t useful to us, but can’t be practically gotten rid of without loss in fitness.
Some other possibilities:
1. The adaptive value of consciousness is really just to give us certain motivations, e.g. finding our internal processing mysterious, nonphysical or interesting makes it seem special to us, and this makes us
  1. value sensations for their own sake, so seek sensations and engage in sensory play, which may help us learn more about ourselves or the world (according to Nicholas Humphrey, as discussed here, here and here),
  2. value our lives more and work harder to prevent early death, and/or
  3. develop spiritual or moral beliefs and adaptive associated practices,
2. Consciousness is just the illusion of the phenomenality of what’s introspectively accessible to us. Furthermore, we might incorrectly believe in its phenomenality just because of the fact that much of the processing we have introspective access to is wired in and its causes are not introspectively accessible, but instead cognitively impenetrable. The full illusion could be a special case of humans incorrectly using supernatural explanations for unexplained but interesting and subjectively important or profound phenomena.
What links here?
- Michael St Jules 🔸's comment on Why I Think All The Species Of Significantly Debated Consciousness Are Conscious And Suffer Intensely by Bentham's Bulldog (EA Forum; 20 Nov 2024 18:09 UTC; 8 points)
- Michael St Jules 🔸's comment on Why I Think All The Species Of Significantly Debated Consciousness Are Conscious And Suffer Intensely by Bentham's Bulldog (EA Forum; 21 Nov 2024 1:04 UTC; 2 points)

MichaelStJules 16 Aug 2024 5:37 UTC
1 point
0
in reply to: Dagon’s comment on: Utilitarianism and the replaceability of desires and attachments
Sorry for the late response.

If people change their own preferences by repetition and practice, then they usually have a preference to do that. So it can be in their own best interests, for preferences they already have.

I could have a preference to change your preferences, and that could matter in the same way, but I don’t think I should say it’s in your best interests (at least not for the thought experiment in this post). It could be in my best interests, or for whatever other goal I have (possibly altruistic).

In my view, identity preservation is vague and degreed, a matter of how much you inherit from your past “self”, specifically how much of your memories and other dispositions.

Sequence overview: Welfare and moral weights

MichaelStJules 15 Aug 2024 4:22 UTC

7 points

0 comments · 1 min read · LW link

Pleasure and suffering are not conceptual opposites

MichaelStJules 11 Aug 2024 18:32 UTC

7 points

0 comments · 1 min read · LW link

MichaelStJules 6 Aug 2024 22:42 UTC
1 point
0
in reply to: Anthony DiGiovanni’s comment on: What are your cruxes for imprecise probabilities / decision rules?
Someone could fail to report a unique precise prior (and one that’s consistent with their other beliefs and priors across contexts) for any of the following reasons, which seem worth distinguishing:
1. There is no unique precise prior that can represent their state of knowledge.
2. There is a unique precise prior that represents their state of knowledge, but they don’t have or use it, even approximately.
3. There is a unique precise prior that represents their state of knowledge, but, in practice, they can only report (precise or imprecise) approximations of it. This is not just a matter of computing decimal places for a real number: which things go into the prior could also differ by approximation. Hypothetically, in the limit of resources spent on computing its values, the approximations would converge to this unique precise prior.
I’d be inclined to treat all three cases like imprecise probabilities, e.g. I wouldn’t permanently commit to a prior I wrote down to the exclusion of all other priors over the same events/possibilities.

Utilitarianism and the replaceability of desires and attachments

MichaelStJules 27 Jul 2024 1:57 UTC

5 points

2 comments · 12 min read · LW link

MichaelStJules 24 Jun 2024 2:50 UTC
2 points
0
in reply to: Gustav Alexandrie’s comment on: Appraising aggregativism and utilitarianism
Harsanyi’s theorem has also been generalized in various ways without the rationality axioms; see McCarthy et al., 2020 https://doi.org/10.1016/j.jmateco.2020.01.001. But it still assumes something similar to but weaker than the independence axiom, which in my view is hard to motivate separately.

MichaelStJules 22 Jun 2024 13:34 UTC
29 points
0
in reply to: johnswentworth’s comment on: johnswentworth’s Shortform
Why do you believe AMD and Google make better hardware than Nvidia?

MichaelStJules 19 Apr 2024 21:11 UTC
1 point
0
in reply to: Dacyn’s comment on: Should we maximize the Geometric Expectation of Utility?
If bounded below, you can just shift up to make it positive. But the geometric expected utility order is not preserved under shifts.
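A small numerical illustration of the second sentence, as a sketch: the two lotteries and the shift of 10 are made-up numbers chosen only to show a reversal, and `geometric_expectation` is an illustrative helper name, not something from the post.

```python
import math


def geometric_expectation(lottery):
    """Geometric expectation of a lottery given as (probability, utility) pairs.

    Assumes all utilities are strictly positive.
    """
    return math.exp(sum(p * math.log(u) for p, u in lottery))


def shift(lottery, c):
    """Add the constant c to every utility in the lottery."""
    return [(p, u + c) for p, u in lottery]


sure_thing = [(1.0, 4.0)]           # utility 4 for certain
gamble = [(0.5, 1.0), (0.5, 9.0)]   # 50-50 between utilities 1 and 9

before = (geometric_expectation(sure_thing), geometric_expectation(gamble))
after = (geometric_expectation(shift(sure_thing, 10.0)),
         geometric_expectation(shift(gamble, 10.0)))
```

Before the shift, the sure thing has the higher geometric expectation (4 versus 3). After adding 10 to every utility, the gamble does (about 14.46 versus 14), so the ranking flips. As the shift grows, the geometric ranking approaches the ordinary expected-value ranking, which favors the gamble here (5 versus 4).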

MichaelStJules 19 Apr 2024 21:08 UTC
1 point
0
on: Should we maximize the Geometric Expectation of Utility?
Violating the Continuity Axiom is bad because it allows you to be money pumped.
Violations of continuity aren’t really vulnerable to proper/standard money pumps. The author calls it “arbitrarily close to pure exploitation” but that’s not pure exploitation. It’s only really compelling if you assume a weaker version of continuity in the first place, but you can just deny that.
I think transitivity (+independence of irrelevant alternatives) and countable independence (or the countable sure-thing principle) are enough to avoid money pumps, and I expect they give a kind of expected utility maximization form (combining McCarthy et al., 2019 and Russell & Isaacs, 2021).
Against the requirement of completeness (or the specific money pump argument for it by Gustafsson in your link), see Thornley here.
To be clear, countable independence implies your utilities are “bounded” in a sense, but possibly lexicographic. See Russell & Isaacs, 2021.

MichaelStJules 17 Apr 2024 9:06 UTC
6 points
4
on: Mid-conditional love
Even if we instead assume that by ‘unconditional’, people mean something like ‘resilient to most conditions that might come up for a pair of humans’, my impression is that this is still too rare to warrant being the main point on the love-conditionality scale that we recognize.
I wouldn’t be surprised if this isn’t that rare for parents toward their children. Barring their children doing horrible things (which is rare), I’d guess most parents would love their children unconditionally, or at least claim to. Most would tolerate bad but not horrible behavior. And many will still love children who do horrible things. Partly this could be out of their sense of responsibility as a parent or attachment to the past.
I suspect such unconditional love between romantic partners and friends is rarer, though, and a concept of mid-conditional love like yours could be more useful there.
