Felix is essentially a Utility Monster: a thought experiment that’s been addressed here before. As that family of examples shows, happiness-maximization breaks down rather spectacularly when you start considering self- or other-modification or any seriously unusual agents. You can bite that bullet, if you want, but not many people here do; fortunately, there are a few other ways you can tackle this if you’re interested in a formalization of humanlike ethics. The “Value Stability and Aggregation” post linked above touches on the problem, for example, as does Eliezer’s Fun Theory sequence.
You don’t need any self-modifying or non-humanlike agents to run into problems related to “Torture vs. Dust Specks”, though; all you need is to be maximizing over the welfare of a lot of ordinary agents. 3^^^3 is an absurdly huge number and leads you to a correspondingly counterintuitive conclusion (one which, incidentally, I’d estimate has led to more angry debate than anything else on this site), but lesser versions of the same tradeoff are quite realistic; unless you start invoking sacred vs. profane values or otherwise define the problem away, it differs only in scale from the utilitarian calculations you make when, say, assigning chores.
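For concreteness, here’s a minimal sketch of Knuth’s up-arrow notation, which the ^^^ shorthand denotes (function name and structure are my own illustration):

```python
# Knuth's up-arrow notation: the post's 3^^^3 is 3↑↑↑3.
#   a↑b   = a**b                      (ordinary exponentiation)
#   a↑↑b  = a↑(a↑(...↑a))            (a power tower of b copies of a)
#   a↑↑↑b = a↑↑(a↑↑(...↑↑a))         and so on.

def up_arrow(a: int, b: int, arrows: int) -> int:
    """Compute a ↑^arrows b by the standard recursion."""
    if arrows == 1:
        return a ** b          # base case: one arrow is exponentiation
    if b == 1:
        return a               # a ↑^n 1 = a for any number of arrows
    # a ↑^n b = a ↑^(n-1) (a ↑^n (b-1))
    return up_arrow(a, up_arrow(a, b - 1, arrows), arrows - 1)

print(up_arrow(3, 2, 2))  # 3↑↑2 = 3^3 = 27
print(up_arrow(3, 3, 2))  # 3↑↑3 = 3^27 = 7625597484987
# 3↑↑4 = 3^7625597484987 already has trillions of digits, and
# 3↑↑↑3 = 3↑↑(3↑↑3) is a power tower of 7,625,597,484,987 threes --
# far beyond anything this function (or any computer) could evaluate.
```

Even the smallest cases show how fast the notation escapes intuition, which is the whole point of picking 3^^^3 for the thought experiment.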