Yeah, it's strange. I wouldn't be surprised if people attracted to LessWrong tend to have less robust theory of mind, like if we tend to be more autistic (I'm probably dipping my toes into the spectrum but haven't been diagnosed, and many of my closest friends tend to be autistic), which then leads to a theory-of-mind version of the breakfast question (which, to be clear, looks like it's a pretty racist meme, I've only seen it from that Know Your Meme page, and I think the ties to race are gross. The point I'm trying to make is not race-related at all), where if you ask "How would you feel if you were someone else?" people say "what do you mean? I'm not them, I'm me."
I also posted it on the EA forum, and it did a lot better there https://forum.effectivealtruism.org/posts/fcM7nshyCiKCadiGi/how-the-veil-of-ignorance-grounds-sentientism
a superintelligent AI would change its utility function to the simplest one, as described here. But I don’t see why it shouldn’t do that. What do you think about this?
I don't think a superintelligent AI would change its utility function as you describe; I think the constraints of its existing utility function would be way too ingrained, and it would not want to change it in those ways. While I think the idea you're putting forward makes sense and gets us closer to an "objective" morality, I think you're on the same path as Eliezer's "big mistake" of thinking that a superintelligent AI would just want to have an ideal ethics, which isn't a given (I think he talks about it somewhere in here https://www.readthesequences.com/Book-V-Mere-Goodness). For example, the current path of LLM AI is essentially just a conglomeration of human ethics based on what we've written and passed into the training data; it tends not to be more ethical than us, and in fact early AI bots that learned from people interacting with them could easily become very racist.
By the way, have I convinced you to accept total hedonistic utilitarianism?
Well, I already thought that suffering is what roots ethics at the most basic level, so in a sense yes, but I also think that we do better on that front using higher-level heuristics rather than trying to calculate everything out, so in that sense I don't think so?
Hey, just got around to reading your post after your comment on https://www.lesswrong.com/posts/JL3PvrfJXg7RD7bhr/how-the-veil-of-ignorance-grounds-sentientism, and I think we are very much trying to point to the same thing! Thanks for sharing!
“Works” how, exactly? For example, what are your actual answers to the specific questions I asked about that variant of the scenario?
The thought experiment as I execute it requires me to construct a model of other minds, human or not, that is more detailed than what I would normally think about, and to emotionally weight that model in order to get a deeper understanding of how important it is. To give an example, it's possible for me to think about torture and be very decoupled from it and shrug and think "that sucks for the people getting tortured", but if I think about it more carefully, and imagine my own mental state if I were about to be tortured, then the weight of how extremely fucked up it is becomes very crisp and clear.
Perhaps it was a mistake to use Rawls's VOI if it also implies other things that I didn't realize I was invoking, but the way I think of it, every sentient being is actually feeling the valence of everything they're feeling, and from an impartial perspective the true weight of that is no different from one's own valenced experiences. And if you know that some beings experience extreme negative valence, one strategy for getting a deeper understanding of how important that is, is to think about it as if you were going to experience that level of negative valence yourself. No incoherent beings of perfect emptiness required, just the ability to model other minds based on limited evidence, imagine how you would personally react to states across the spectrum of valence, and scale that according to the distribution of sentient beings in the real world.
And this works on pebblesorters too, although it’s more difficult since we can’t build a concrete model of them beyond what’s given in the story + maybe some assumptions if their neurobiology is at all similar to ours. If an “incorrect” pebble stack gives them negative valence of around the same level that the sound of nails on a chalkboard does for me, then that gives me a rough idea of how important it is to them (in the fictional world). If pebblesorters existed and that was the amount of negative valence caused by an “incorrect” stack, I wouldn’t mess up their stacks any more than I go around scratching chalkboards at people (while wearing earplugs so it doesn’t bother me).
To go back to the master/slave example, if the master truly thought he was about to become a slave, and everything that entails, I'm not convinced he would stick to his guns on how it's the right order of the universe. I'm sure some people would genuinely be fine with it, but I'm guessing if you actually had a mercenary trying to kidnap and enslave him, he'd start making excuses and trying to get out of it, much like the person claiming there's an invisible dragon in their garage has justifications for why you can't actually confirm it exists.
In other words, I'm trying to describe a way of making moral views pay rent on the question of what levels of negative valence in the world are acceptable. Neither my views, nor the thought experiment I thought I was talking about, depends on disembodied spirits.
Ok… it seems that you totally ignored the question that I asked, in favor of restating a summary of your argument. I guess I appreciate the summary, but it wasn’t actually necessary. The question was not rhetorical; I would like to see your answer to it.
I only see two questions in this line of conversation?
do you have a principled disagreement with all of the arguments for why nothing remotely like this is possible even in principle, or… are you not familiar with them?
I'm not familiar with the specific arguments you're referring to, but I don't think it's actually possible for disembodied minds to exist in the first place. So no, I don't have principled disagreements with those arguments; I have tentative agreement with them.
Another way to put it is that you are asking us (by extending what Rawls is asking us) to perform a mental operation that is something like “imagine that you could have been a chicken instead of a human”. When you ask a question like this, who are you talking to? It is obviously impossible for me—Said Achmiz, the specific person that I am, right now—to have turned out to be a chicken (or, indeed, anyone other than who I am). So you can’t be talking to me (Said Achmiz).
(bold added to highlight your question, which I’m answering) When I ask a question like that, I’m talking to you (or whoever else I’m talking to at the time).
The Metaethics Sequence (which contains a few posts that didn’t make it into R:AZ a.k.a. “The Sequences” as the term usually meant today) is what you’ll want to check out.
I'll check it out! And yeah, that's where I read the Sequences.
On Yudkowsky, keep in mind he’s written at least two (that I know of) detailed fictions imagining and exploring impossible/incoherent scenarios—HPMOR, and planecrash. I’ve read the former and am partway through reading the latter. If someone says “imagining yourself in a world with magic can help tune your rationality skills,” you certainly could dismiss that by saying that’s an impossible situation so the best you can do is not imagine it, and maybe your rationality skills are already at a level where the exercise would not provide any value. But at least for me, prompts like that and the veil of ignorance are useful for sharpening my thinking on rationality and ethics, respectively
I think you’ve got your meta-levels mixed up. For one thing, there isn’t any such thing as “meta-meta-ethics”; there’s just metaethics, and anything “more meta” than that is still metaethics. For another thing, “sentientism” is definitely object-level ethics; metaethics is about how to reason about ethics—definitionally it cannot include any ethical principles, which “sentientism” clearly is. This really seems like an attempt to sneak in object-level claims by labeling them as “meta-level” considerations.
Ah I see, yes I did have them mixed up. Thanks for the correction.
Yes, that claim is what makes it an intuition pump—but as I said, it doesn’t work, because the hypothetical scenario in the thought experiment has no bearing on any situation we could ever encounter in the real world, has no resemblance to any situation we could ever encounter in the real world, etc.
On the incoherence of the thought experiment, @neo’s comment explains it pretty well I thought. I will say that I think the thought experiment still works with imaginary minds, like the pebblesorters. If the pebblesorters actually exist and are sentient, then they are morally relevant.
But this isn’t just a case of “not exactly the same”. Nothing approximately like, or even remotely resembling, the hypothetical scenario, actually takes place.
What? In both the thought experiment and the real world, a great many beings are born into a world that gives rise to a variety of valenced experiences. In the thought experiment, you are tasked with determining whether you would be OK with being the one who finds themselves in any given one of those lives/experiences.
Like the rest of this paragraph, this is non-responsive to my comment, but I am curious: do you have a principled disagreement with all of the arguments for why nothing remotely like this is possible even in principle, or… are you not familiar with them? (Thomas Nagel’s being the most famous one, of course.)
You said that it is impossible for you to have turned out to be a chicken, and so I can’t be talking to you if I say “imagine that you could have been a chicken instead of a human”. I demonstrated how to imagine that very thing, implying that I could indeed be talking to you when I ask that. I agree that it is impossible for you to turn into a chicken, or for you to have been born a chicken instead of you. I disagree that it is impossible to imagine and make educated guesses on the internal mental states of a chicken.
This seems hard to square with the positions you take on all the stuff in your post…
I’m not following, sorry. Can you give an example of a position I take in the post that’s inconsistent with what I said there?
I think there’s some very deep confusion here… are you familiar with Eliezer’s writing on metaethics? (I don’t know whether that would necessarily resolve any of the relevant confusions or disagreements here, but it’s the first thing that comes to mind as a jumping-off point for untangling this.)
Maybe? I’ve read the sequences twice, one of those times poring over ~5 posts at a time as part of a book club, but maybe his writing on metaethics isn’t in there. I think we are likely talking past each other, but I’m not sure exactly where the crux is. @neo described what I’m trying to get at pretty well, and I don’t know how to do better, so maybe that can highlight a new avenue of discussion? I do appreciate you taking the time to go into this with me though!
Are you able to imagine things you will want in the future? But assuming the universe isn’t just a big 4d-block, that version of you doesn’t exist, so wouldn’t imagining that be incoherent? Why wouldn’t the unreality of that be a stumbling block?
This is indeed neither a universally compelling argument, nor is it possible to be an "ideal mind of perfect emptiness". Think of this post as more along the lines of asking "if I were much more impartial and viewed all sentient beings as morally relevant, how would I want the world to look, and what values would I have?". Some people would answer "I don't care about group X; if I were one of them I'd hope I get treated poorly like they do", and if they were being honest, this could not change their mind.
Thanks for such an in-depth response! I'll just jump right in. I haven't deeply proofread this, so please take it with a grain of salt.
The point of a moral/ethical framework of any sort—the point of ethics, generally—is to provide you with an answer to the question “what is the right thing for me to do”.
I’m not trying to frame the veil of ignorance (VOI) as a moral or ethical framework that answers that question. I’m arguing for the VOI as a meta-meta-ethical framework, which grounds the meta-ethical framework of sentientism, which can ground many different object-level frameworks that answer “what is the right thing for me to do”, as long as those object-level frameworks consider all sentient beings as morally relevant.
We exist as physical beings in a specific place, time, social and historical context, material conditions, etc. (And how could it be otherwise?) Our thoughts (beliefs, desires, preferences, personality, etc.) are the products of physical processes. _We_ are the products of physical processes. _That includes our preferences, and our moral intuitions, and our beliefs about morality._ These things don’t come from nowhere! They are the products of our specific neural makeup, which itself is the product of specific evolutionary circumstances, specific cultural circumstances, etc.
100% agree with you here.
"Imagine that you are a disembodied, impersonal spirit, existing in a void, having no identity, no desires, no interests, no personality, and no history (but you can think somehow)" is basically gibberish. It's a well-formed sentence and it seems to be saying something, but if you actually try to imagine this scenario, and follow the implications of what you've just been told, you run directly into several brick walls simultaneously. The whole thought experiment, and the argument that follows from it, is just the direst nonsense.
I agree that nobody can do literally that. I do think that doing your best at it will let you be a lot more impartial. Minor nitpick: the imagined disembodied spirit should have desires and interests in the thought experiment, at the very least the desire to not experience suffering when they're born.
So ethics can’t have anything to say about what you should do if you find yourself in this hypothetical situation of being behind the veil of ignorance
I agree; in the post I even point out that from behind the veil you could endorse other positions for outside the veil, such as being personally selfish even at others' expense. The point of the thought experiment is that thinking about it can help you refine your views on how you think you should act. The point is not to tell you what to do if you find yourself behind the veil of ignorance, which, as you say, is incoherent.
There isn’t any pre-existing thinking entity which gets embodied
I’m not following how this rules it out from being an analogy. My understanding of analogies is that they don’t need to be exactly the same for the relevant similarity to help transfer the understanding.
Another way to put it is that you are asking us (by extending what Rawls is asking us) to perform a mental operation that is something like “imagine that you could have been a chicken instead of a human”.
Well, yeah, that is almost exactly what I'm doing! Except generalized to all sentient beings :) I don't see why you would take so much issue with a question like that. There are many things we don't (and likely can't) know about chickens' internal experiences, but there's a lot of very important and useful ground that can be covered by asking that question, because there is a lot we can know to a high degree of confidence. If I were asked that, I would look at our understanding of chicken neurology, how they respond to different drugs (painkillers and pleasurable ones), and our understanding of evolutionary psychology and what kinds of mental patterns would lead to chickens behaving in the ways that they do. I could not give an exact answer, but if I were a chicken I'm almost certain I'd experience positive valence eating corn and fruit and bugs, and negative valence if I got hit or broke a bone or whatever, and that's just what I'm highly confident about. With enough time and thought, I'm sure I could discuss a wide range of experiences, with varying degrees of confidence about how I'd experience them as a chicken. Even though it would be impossible for me, writing this, to ever actually experience those things, it's still easy to take my understanding of the world and apply it in a thought experiment.
Now, the obvious question to ask is whether there’s anything you can do to convince me that the reasoning from behind the veil of ignorance should proceed as you say, and not as I (in this hypothetical scenario) say; or, is there nothing more you can say to me? And if the latter—what, if anything, does this prove about the original position argument? (At the very least, it would seem to highlight the fact that Rawls’s reasoning is shaky even granting the coherence of his hypothetical!)
This I resonate much more with. If someone would genuinely be happy with a coin flip deciding whether they’re master or slave, I don’t think there’s anything I could say to convince them against slavery.
It sure seems awfully convenient that when you posit these totally impersonal disembodied spirits, they turn out to have the moral beliefs of modern Westerners. Why should that be the case? Our own moral intuitions, again, don’t come from nowhere. What if we find ourselves behind the veil of ignorance, and all the disembodied spirits are like, “yeah, the slave should obey the master, etc.”?
I don't think they ought to have the moral beliefs of modern Westerners. I think I'm probably wrong or confused or misguided about a lot of moral questions; I think probably everyone is, modern Westerner or not. The slavery answer is only silly on the assumption that they don't want to be slaves; if they're equally fine with being a slave or a master, it wouldn't be very silly of them.
As Eliezer comments in [his writings about metaethics](https://www.readthesequences.com/Book-V-Mere-Goodness), human morality is just that—human. It’s the product of our evolutionary history, and it’s instantiated in our neural makeup. It doesn’t exist outside of human brains.
Once again, absolutely agree.
On the pebblesorters question, my interpretation of that story was that we humans do a mind-numbing number of things just as silly as the pebblesorters. To take just one example, music is just arranging patterns of sound waves in the air in the "right" way, which is no more or less silly than the "right" pebble stacks. Behind the human/chicken/pebblesorter/etc. veil, I would argue that all of us look extremely silly! From behind the veil, I likely wouldn't care all that much about fairness/justice, beyond how it might impact valence.
The bottom line is that Rawls’s argument is an [intuition pump](https://en.wikipedia.org/wiki/Intuition_pump), in Dennett’s derogatory sense of the term. It is designed to lead you to a conclusion, while obscuring the implications of the argument, and discouraging you from thinking about the details. Once looked at, those implications and details show clearly: the argument simply does not work.
I do in fact want to lead towards sentientism. Is it fair to use the derogatory sense if I'm quite clear and explicit about that? I already described this whole post as an intuition pump before the introduction; I just think that transparent intuition pumps are not just fine, but can be quite useful and good.
To go through the ones listed on Wikipedia:
In Anarchy, State, and Utopia (1974), Robert Nozick argues that, while the original position may be the just starting point, any inequalities derived from that distribution by means of free exchange are equally just, and that any re-distributive tax is an infringement on people’s liberty. He also argues that Rawls’s application of the maximin rule to the original position is risk aversion taken to its extreme, and is therefore unsuitable even to those behind the veil of ignorance.[16]
This is criticizing Rawls's proposed next steps after he saw the map laid out by the veil. I'm just pointing to the map and saying "this is a helpful tool for planning next steps, which will probably be different from the steps proposed by Rawls". I'd point out that this criticism would hold up better if everyone started with an equal claim to resources, but that's an entirely separate conversation.
In Liberalism and the Limits of Justice (1982),[17] Michael Sandel has criticized Rawls’s notion of a veil of ignorance, pointing out that it is impossible, for an individual, to completely prescind from beliefs and convictions (from the Me ultimately), as is required by Rawls’s thought experiment.
Well yeah of course it’s impossible to do it perfectly. It’s impossible for any of us to be ideal-reasoning agents, I guess rationalism is doomed. Sorry guys, pack up and go home.
In a 1987 empirical research study,[18] Frohlich, Oppenheimer, and Eavey showed that, in a simulated original position, undergraduates at American universities agreed upon a distributive principle that maximizes the average with a specified floor constraint (a minimum for the worst-off in any given distribution) over maximizing the floor or the average alone.
Makes sense, “evidence and reason” is critical to planning specific next steps even if you have a high level map.
In How to Make Good Decisions and Be Right All the Time (2008), Iain King argues that people in the original position should not be risk-averse, leading them to adopt the Help Principle (help someone if your help is worth more to them than it is to you) rather than maximin.[19]
Sure. Again, I’m not arguing for specific interpretations of the map, just saying it’s there and it’s helpful even if you don’t come to the same conclusions as others looking at a similar one. The help principle seems reasonable, as do other strategies like giving 10% of your income rather than selling all you have to give to the poor.
Philosopher and Law Professor Harold Anthony Lloyd argues that Rawls’s veil of ignorance is hardly hypothetical but instead dangerously real since individuals cannot know at any point in time the future either for themselves or for others (or in fact know all aspects of either their relevant past or present). Faced with the high stakes of such ignorance, careful egoism effectively becomes altruism by minimizing/sharing risk through social safety nets and other means such as insurance.[20]
😏😏😏
Hmm, good question. Coordinating with other time slices of your body is a very tough problem if you take empty individualism seriously (imo it is the closest to the truth of the three, but I'm not certain by any means). From the perspective of a given time slice, any experience besides the one they got is not going to be experienced by them, so why would they use their short time to get a spike in pleasure for a future time slice of the body they're in, rather than a smaller but more stable increase in pleasure for any other time slice, same body or not? If the duration of a time slice is measured in seconds, even walking to the fridge to get a candy bar is essentially "altruism" so that future time slices can enjoy it.
In terms of coordination for other goals, you can use current time slices to cultivate mental patterns in themselves that future ones are then more likely to practice, such as equanimity, accepting "good-enough" experiences, recognizing that your future slices aren't so different from others and using that as motivation for altruism, and even making bids with future time slices. If this time slice can't rely on future ones to enact its goals, future ones can't rely on even further future ones either, vastly limiting what's possible (if no time slice is willing to get off the couch for the benefit of other slices, that being will stay on the couch until it's unbearable not to). Check out George Ainslie's Breakdown of Will for a similar discussion of coordinating between time slices like that: https://www.semanticscholar.org/paper/Pr%C3%A9cis-of-Breakdown-of-Will-Ainslie/79af8cd50b5bd35e90769a23d9a231641400dce6
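To make the Ainslie connection a bit more concrete, here's a toy Python sketch (my own illustration with made-up numbers, not anything taken from the book) of the hyperbolic discounting model Breakdown of Will builds on: each time slice values a reward as amount / (1 + k × delay), which produces the preference reversals that make slices defect on each other, and bundling choices into a precedent is roughly what restores cooperation.

```python
# Toy sketch of hyperbolic discounting and Ainslie-style bundling.
# All numbers are made up for illustration.

def value(amount: float, delay: float, k: float = 1.0) -> float:
    """Hyperbolically discounted present value of a reward `delay` time units away."""
    return amount / (1 + k * delay)

SMALL, SMALL_TIME = 5.0, 10.0   # smaller reward, arrives at t=10 (stay on the couch)
LARGE, LARGE_TIME = 10.0, 14.0  # larger reward, arrives at t=14 (benefit to a later slice)

# 1) Preference reversal: viewed from far away the larger-later reward wins,
#    but close to the smaller reward the current slice prefers it instead.
for now in (0.0, 5.0, 8.0, 9.5):
    v_s = value(SMALL, SMALL_TIME - now)
    v_l = value(LARGE, LARGE_TIME - now)
    print(f"t={now:>4}: small={v_s:.2f} large={v_l:.2f} -> "
          f"{'larger-later' if v_l > v_s else 'smaller-sooner'}")

# 2) Bundling: at t=8 the single choice favors smaller-sooner, but if this
#    choice sets a precedent for the next several similar choices (spaced 14
#    units apart), the summed larger-later series wins.
now, period, n = 8.0, 14.0, 5
bundle_s = sum(value(SMALL, SMALL_TIME + period * i - now) for i in range(n))
bundle_l = sum(value(LARGE, LARGE_TIME + period * i - now) for i in range(n))
print(f"bundled over {n} choices: small={bundle_s:.2f} large={bundle_l:.2f}")
```

The single choice flips to smaller-sooner as the slice gets close to the reward, while the bundled sum still favors larger-later, which is roughly the "personal rules"/precedent dynamic Ainslie describes for getting time slices to cooperate.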
Other strategies that are way less helpful for aligning AI are more just making use of the fact that we evolved to not feel like time slices, probably because it makes it easier to coordinate between them. So there's a lot of mental infrastructure already in place for the task.
On the fear of different values, I think you need to figure out which values you actually care whether future slices hold, and make sure those are well grounded and can be easily re-derived; the ones that aren't that important, you just need to hold loosely and accept that your future selves might value something else, hoping that their new values are well founded. That's where cultivating mental patterns of strong epistemology comes in: you actually want your values to change for the better, but not for the worse.
I’ve added your post to my reading list! So far it’s a pretty reliable way for me to get future time slices to read something :)
Seconding @neo's question: which criticisms are you referring to? I'll grant you that I didn't look for criticisms of it beyond searching on LessWrong and looking at the Wikipedia page, but ultimately, my thesis is based on the general thrust of the veil of ignorance/original position and not the specifics of Rawls's work.
To chime in with a stronger example the cynical audience member from 1.4 could've used: religion. Religions are constantly morphing and evolving, and they've ranged in practices from live sacrifice (human and non-human) to sweeping while walking to avoid potentially hurting a bug. That sorta falls under moral philosophy, but I think all the other, non-moral, aspects of religion make the point more strongly. There's no way to determine that religion X is true and Y is false, there's no grounding in reality, and given that the strongest predictor of an individual's religion is which one they were born into, their discernment doesn't seem to be pulling much weight.
Now, you could say religion is a useful idea in terms of social cohesion, but I think that if AI convinces us of happy lies, that's not a great outcome (better than many possible ones, though).
It seems like the way we talk about the results of a coin flip would be a good start for how we’d talk about being cloned, although it’s rare for a coin flip to have such a massive impact on our life after that point
Related comic exploring this idea: https://existentialcomics.com/comic/1
I interpreted “retrain myself to perform only those steps over the course of 30 seconds” to mean that after training for n seconds/minutes/hours, he could solve an equivalent problem in 30 seconds (via the distilled steps). You seem to interpret it to mean that the training takes 30 seconds, and the length of time to solve the problem after training is unspecified.
I don’t know which it is, the wording seems ambiguous.
This also tracks with LSD, which is measured in micrograms rather than in milligrams like other psychedelics such as psilocybin and 2C-B.
I didn't know there was (going to be?) an epilogue to planecrash, but it didn't leave me nearly as thirsty for more as HPMOR did. With HPMOR, I wanted to see what everyone would do next, as they're still pretty young, whereas with planecrash, it felt like everything I was curious about was explored to my satisfaction. Sure, we don't get a lot of specifics on the new society (or societies) on Golarion, but that's pretty fine with me. It would be interesting to see maybe what The Future holds, or where the language guy ends up, but the former feels right as a mystery, while the latter seemed pretty well foreshadowed.