cubefox

Karma: 1,856

cubefox May 27, 2025, 8:15 PM
4 points
0
in reply to: Mo Putera’s comment on: Mo Putera’s Shortform
The Parfit quote from the blog post is taken out of context. Here is the relevant section in Parfit’s essay:

(Each box represents a possible population, with the height of a box representing how good overall an individual life is in that population, and the width representing the size of the population. The area of a box is the sum total “goodness”/“welfare”/“utility” (e.g. well-being, satisfied preferences, etc.) in that population. The areas increase from A to Z, with Z being truncated here.)

Note that Parfit describes two different ways in which an individual life in Z could be barely worth living (emphasis added):

A life could be like this either because its ecstasies make its agonies seem just worth enduring, or because it is painless but drab.

Then he goes on to describe the second possibility (which is arguably unrealistic and much less likely than the first, and which contains the passage quoted by the blog author). The author of the blog post mistakenly ignores Parfit’s mention of the first possibility. After talking about the second, Parfit returns (indicated by “similarly”) to the first possibility:

Similarly, Z is the outcome in which there would be the greatest quantity of whatever makes life worth living.

The “greatest quantity” here can simply be determined by the weight of all the positive things in an individual life minus the weight of all the negative things. Even if the result is just barely positive for an individual, for a large enough population, the sum welfare of the “barely net positive” individual lives would outweigh the sum for a smaller population with much higher average welfare. Yet intuitively, we should not trade a perfect utopia with relatively small population (A) for a world that is barely worth living for everyone in a huge population (Z).

That’s the problem with total utilitarianism, which simply sums all the “utilities” of the individual lives to measure the overall “utility” of a population. Taking the average instead of the sum avoids the repugnant conclusion, but it leads to other highly counterintuitive conclusions, such as that a population of a million people suffering strongly is less bad than a population of just a single person suffering slightly more strongly, as the latter has a worse average. So arguably both total and average utilitarianism are incorrect, at least without strong modifications.
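To make the arithmetic concrete, here is a minimal sketch with made-up numbers. The welfare levels and population sizes are purely illustrative assumptions (they are not taken from Parfit), and every life within a population is assumed to have the same welfare:

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Population:
    welfare_per_life: int  # hypothetical welfare level, identical for everyone
    size: int              # number of individuals

    def total(self) -> int:
        # Total utilitarianism: sum of all individual welfare levels.
        return self.welfare_per_life * self.size

    def average(self) -> float:
        # Average utilitarianism: total welfare divided by population size.
        return self.total() / self.size

# Repugnant conclusion: small utopia A vs. huge, barely-worth-living Z.
A = Population(welfare_per_life=100, size=10_000_000_000)
Z = Population(welfare_per_life=1, size=10_000_000_000_000)
print(Z.total() > A.total())      # the sum ranks Z above A
print(A.average() > Z.average())  # the average ranks A above Z

# Counterexample to averaging: many people suffering vs. one person
# suffering slightly more.
many = Population(welfare_per_life=-90, size=1_000_000)
one = Population(welfare_per_life=-91, size=1)
print(many.average() > one.average())  # the average calls "many" less bad
print(many.total() < one.total())      # the sum calls "many" far worse
```

With these numbers the sum prefers Z to A, while the average prefers A but also rates the million sufferers as better off than the single one.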

(Personally I think a sufficiently developed version of person-affecting utilitarianism (an alternative to average and total utilitarianism) might well solve all these problems, though the issue is very difficult. See e.g. here.)

cubefox May 27, 2025, 9:00 AM
2 points
0
in reply to: TsviBT’s comment on: Wei Dai’s Shortform
There is quite a difference between “our understanding is still pre-Socratic” and “we haven’t said enough”. In general I think very few people here (not sure whether this applies to you) are familiar with the philosophical literature on topics in this area. For example, there is very little interest on LessWrong in normative ethics and the associated philosophical research, even though this is directly related to alignment: if you have an intent-aligned ASI (which is probably easier to achieve than shooting straight for value alignment), you probably need to know what ethics it should implement when asking it to create a fully value-aligned ASI.

Interestingly, the situation is quite different for the EA Forum, where there are regular high-quality posts, with reference to the academic literature, on solving issues in normative ethics like the repugnant conclusion, the procreation asymmetry and the status of person-affecting theories. Any satisfactory normative ethical theory needs to solve these problems, similar to how any satisfactory normative theory of epistemic rationality needs to solve the various epistemic paradoxes and related issues.

Again, I don’t know whether this applies to you, but most cases of “philosophy has made basically no progress on topic X” seem to come from people who have very little knowledge of the philosophical literature on topic X.

cubefox May 26, 2025, 9:49 PM
2 points
0
in reply to: TsviBT’s comment on: Wei Dai’s Shortform
Philosophers have discussed these under the term “desires”. I think there has been a lot of progress since the time of the pre-Socratics: Aristotle’s practical syllogism, Buridan’s donkey, Hume’s emphasis on the independence of beliefs and desires, Kant’s distinction between theoretical reason and practical reason, direction of fit, Richard Jeffrey’s utility theory (where utilities are degrees of desire), analysis of akrasia by various analytic philosophers, Nozick’s experience machine, and various others.

cubefox May 26, 2025, 5:47 PM
2 points
0
in reply to: MichaelDickens’s comment on: Neel Nanda’s Shortform
Oh, that’s disappointing. I once got rid of my craving for sweet drinks just by completely quitting drinks with sugar and sweeteners for a while. Unfortunately I have since had a relapse. It’s easy to get addicted again, especially when another drug is involved, as in energy drinks. The randomization (gamification?) approach may work better in some cases.

cubefox May 26, 2025, 12:35 PM
2 points
0
in reply to: TsviBT’s comment on: Wei Dai’s Shortform
Talk about “values” is very popular on LessWrong, but much less common in philosophy or natural language. I confess I don’t even know what you mean by “trying to understand what values are”. Can you make the problem statement more precise, perhaps without reference to “values”?

cubefox May 25, 2025, 11:01 AM
3 points
4
in reply to: Neel Nanda’s comment on: Neel Nanda’s Shortform
If you cut something out entirely, that’s hard at first, but basically free later, once you have become unaddicted. Just reducing consumption to a medium level probably doesn’t cause you to get unaddicted in this way, so it requires some degree of long-term willpower. I assume this is why alcoholics try to stay completely “dry”, not just reduce their consumption.

cubefox May 25, 2025, 10:51 AM
4 points
1
in reply to: Neel Nanda’s comment on: Neel Nanda’s Shortform
That sounds like an interesting trick. However:

I don’t want to spend the willpower required to cut it out entirely, or to agonise every time about whether something is really worth it

If you cut it out entirely, you get used to it, and no longer need a lot of willpower after a while. Though it’s probably less realistic to cut out sugar entirely than to quit some drug entirely.

cubefox May 24, 2025, 3:55 PM
−2 points
−4
in reply to: xpym’s comment on: The stakes of AI moral status

what should we do?

Figure out where we’re confused

Congratulations, you just reinvented philosophy. :)

cubefox May 24, 2025, 7:09 AM
4 points
0
in reply to: Said Achmiz’s comment on: The stakes of AI moral status

No, babies are very different from adults in this regard, inasmuch as adults can tell us that they are in pain, can describe the pain, etc.

This doesn’t look like a big difference to me. Moreover, adults may also be unable to speak due to various illnesses or disabilities.

For very dissimilar entities, like rocks, the probability of being conscious would fall back to some kind of prior, though I don’t know how such a prior could be justified. (Purely intuitively it seems clear that rocks being conscious is highly unlikely, but it isn’t obvious why.)

… really? You can’t think of any reasons for this belief? Just pure intuition, that’s all you’ve got to go on? Are you seriously making this claim?

Yes. At least I can’t think of any off the top of my head. Note that this prior is supposed to not incorporate the information that you are conscious yourself.

cubefox May 24, 2025, 4:58 AM
4 points
0
on: How load-bearing is KL divergence from a known-good base model in modern RL?

Should we expect the trend of RL models being less like outcome pumps and more like agents which execute unsurprising actions to hold?

Very recently, with RLVR (reinforcement learning from verifiable rewards) the trend seems to have reversed. See here or here.

cubefox May 23, 2025, 2:15 PM
8 points
3
in reply to: Said Achmiz’s comment on: The stakes of AI moral status
These citations would only include trivial data we know about anyway. E.g. “If you injure a baby, it cries (which seems pretty similar to what I do when in pain)”. Babies are hardly different from adults in this regard. So it makes little sense to demand “evidence” for babies being able to feel pain, but not for (other) adults. I think in all these cases I can infer other minds with an inference from analogy, i.e. from similarity to myself in known properties (behavior, brain) to similarity in unknown properties (consciousness). For very dissimilar entities, like rocks, the probability of being conscious would fall back to some kind of prior, though I don’t know how such a prior could be justified. (Purely intuitively it seems clear that rocks being conscious is highly unlikely, but it isn’t obvious why.)

cubefox May 22, 2025, 2:55 PM
10 points
3
in reply to: Said Achmiz’s comment on: The stakes of AI moral status
There is no “citation” showing that anyone but myself feels pain. It’s the “problem of other minds”. After all, anyone could be a p-zombie, not just babies, animals, AIs...

cubefox May 22, 2025, 11:31 AM
3 points
0
in reply to: E.G. Blee-Goldman’s comment on: E.G. Blee-Goldman’s Shortform
Is this evidence for the natural abstraction hypothesis @johnswentworth?

cubefox May 22, 2025, 12:59 AM
5 points
0
in reply to: E.G. Blee-Goldman’s comment on: E.G. Blee-Goldman’s Shortform
Thanks for this reference. It arguably means aliens don’t have alien ontologies. Previous related discussion.

cubefox May 22, 2025, 12:54 AM
3 points
0
in reply to: TAG’s comment on: Warty’s Shortform
The article is not about alignment (that’s a different article), it’s about a normative moral theory.

cubefox May 22, 2025, 12:11 AM
4 points
1
in reply to: Warty’s comment on: Warty’s Shortform
That’s a frequent misconception. In fact, Eliezer Yudkowsky is a moral realist.

cubefox May 21, 2025, 3:21 PM
11 points
2
in reply to: whestler’s comment on: Matthew Khoriaty’s Shortform
Yeah. It is probably even more important for the cover to look serious and “academically respectable” than for it to look maximally appealing to a broad audience. It shouldn’t give the impression of a science fiction novel or a sensationalist crackpot theory. An even more negative example of this kind (in my opinion) is the American cover of The Beginning of Infinity by David Deutsch.

cubefox May 21, 2025, 2:51 PM
2 points
0
in reply to: Matthew Barnett’s comment on: Winning the power to lose
Note that individual value differences (like personal differences in preferences/desires) do not imply a difference in moral priority. This is because moral priority, at least judging from a broadly utilitarian analysis of the term, derives from some kind of aggregate of preferences, not from an individual preference. Questions about moral priority can be reduced to the empirical question of what the individual preferences are, and/or to the conceptual question of what this ethical aggregation method is. People can come (or fail to come) to an agreement on both irrespective of what their preferences are.

cubefox May 21, 2025, 8:53 AM
7 points
4
in reply to: Vale’s comment on: Matthew Khoriaty’s Shortform
Not sure about the italics, but I like showing Earth this way from space. It drives home a sense of scale.

cubefox May 21, 2025, 8:34 AM
13 points
10
in reply to: ryan_greenblatt’s comment on: peterbarnett’s Shortform
Note, the video doesn’t show up for me.