I like the flipping attempt—changing the framing around is a good way to triangulate intuitions. A few things could be strengthened:
1. The “alt” question is better than the 1st, but should include “and you won’t be reanimated otherwise”, or in other words: God kills you iff (you accept the deal and) being signed up for cryonics before you get significant new information will make the difference between you being re-animated or not.
2. Several reasons jump to my mind why I wouldn’t trade the rest of my years of life now for a 100% guarantee of n+1 years in the future: My family and friends will be sad now that I am not around, I will be sad later that they are not around, I know my quality of life now and I don’t know what it would be after reanimation, I am attempting effective altruism now and believe I will be far less effective in the future, etc.
So, maybe: “if being signed up for cryonics now makes the difference in getting you reanimated with at least as much total happiness in the future, God will replace you with an unconscious replica that will do exactly what you would have done.”
A different way to flip the question:
Let’s say you wanted to stay dead. Maybe being reanimated would pull you back down to earth, or you have some powerful enemies in the future—for whatever reason, it’s roughly as important to you that you stay dead as it currently is that you don’t. If you do nothing, you will be cryopreserved upon death. How much would you pay to avoid that?
I wasn’t one of them, but I find the idea that cryopreservation will become a new global tradition very soon overly optimistic, so a mental flag went up when it was characterized as a strong case.
I just tried criticizing my ingroup. Did my blood boil? No. My Scotsmen got truer. Every time I could identify a flawed behavior, it felt inappropriate to include those people in my “real ingroup”. Now, if I had a more objectively defined group based on voting record or religious belief or something, then maybe I’d be able to force my brain to keep them in my ingroup, but right now, my brain flips to “sure, I’m happy to criticize those people giving us a bad name. Look, I’m criticizing my ingroup!”
I tried 2 other experiments:
1. Think about criticisms toward my ingroup that do make me angry—maybe those are the ones hitting home.
Result: I found myself disagreeing with all of them. And my brain asked “what, am I supposed to like wrongheaded arguments just because they are against my group?”
2. Just go straight for the inner-est group I have: me.
Result: I was able to think of criticisms of myself; it didn’t make my blood boil, and writing them out wouldn’t either. I suspect that when I shrink the group to {me}, I may expect extra social points for criticizing myself, making it much more palatable.
So, my quick experiment suggests that, at least for someone without a clearly defined in-group, trying to criticize one’s ingroup can be more ‘slippery’ difficult than ‘grueling’ difficult.
I think it’s worth noting that I also have had times where I was impressed with your tact. The two examples that jump to mind are 1) a tweet where you gently questioned Nate Silver’s position that expressing probabilities as frequencies instead of percentages was net harmful, and 2) your “shut it all down” letter to NYT, especially the part where you talk about being positively surprised by the sanity of people outside the industry and the text about Nina losing a tooth. Both of those struck me as emotionally perceptive.
The thing I wonder every time this topic comes up is: why is this the question raised to our attention? Why aren’t we instead asking whether AlphaFold is conscious? Or DALL-E? I’d feel a lot less wary of confirmation bias here if people were as likely to believe that a GPT that output the raw token numbers was conscious as they are to believe it when those tokens are translated to text in their native language.
Also, I think it is worth separating the question of “can LLMs introspect” (have access to their internal state) from “are LLMs conscious”.
I’m curious how you’d see moral and long-term considerations playing into this. For instance:
1. Saving for retirement produces no experienced benefit for many years and will only ever complete a single investment cycle in a lifetime.
2. Donating to or working on global health, x-risk, etc. produces no experienced benefit ever in most cases.
Yet, in both cases, individuals seem capable of exercising willpower to do these activities for many years.
I can think of 3 models currently that could explain this:
1. They just are “dead willpower”, but your willpower system gets enough income from shorter term investments to allow it to continue to invest in things that will not pay out any time soon.
2. Your willpower system has “stock” that gives it value based on the prediction of experienced benefit that hasn’t been experienced yet.
3. You experience satisfaction from seeing the 401k numbers go up or feeling like a moral person and that is the payout your willpower gets.
How do you see moral and long-term considerations interacting with your toy model?
I do think that this is probably part of my misprediction—that I simply idealize others too much and don’t give enough credit to how inconsistent humans actually are. “Idealize” is probably just the Good version of “flatten”, with “demonize” being the Bad version, both of which are probably because it takes less neurons to model someone else that way.
I actually just recently had the displeasure of stumbling upon that subreddit, and it made me sad that people wanted to devote their energies to just being unkind without a goal. So I’m probably also not modeling how my own principle of avoiding offense unless helpful would erode over time. I’ve seen it happen to many public figures on twitter—it seems to be part of the system.
I like this perspective. I would agree that there is more to knowing and being known by others than simply Aumann Agreement on empirical fact. I also probably have a tendency to expect more explicit goal-seeking from others than myself.
I haven’t thought this through before, but I notice two things that affect how open I am. The first is whether the communication is private, has non-verbal cues, and happens within an existing relationship. So right now, I’m not writing this with a desired consequence in mind, but I am filtering some things out subconsciously—if we were in person talking right now, I might launch into a random anecdote, but while writing online I stay on a narrower path. The second is that I generally only start running my “consequentialist program” once I anticipate that someone may be upset by what I say. The anticipation of offense is what triggers me to think either “but it still needs to be said” or “saying this won’t help”. So maybe my implicit question was less “why does Eliezer not aim all his communication at his goals” and more “why doesn’t he seem to have the same guardrail I do about only causing offense if it will help”, which is a more subjective standard.
I accept your correction that I misquoted you. I paraphrased from memory and did miss real nuance. My bad.
Looking at the comment now, I do see that it has a score of −43 currently, and is the only negative-karma comment on the post. So maybe a more interesting question is why I (and presumably several others) interpreted it as an insult when the logical content of “Intelligence(having <30y timeline in 2025) > Intelligence(potted plant)” doesn’t contain any direct insult. My best guess is that people are running informal inference on “do they think of me as lower status”, and any comparison to a lower-intelligence entity is likely to trigger that. For instance, I actually find the thing you just said suggesting that I could have an LLM explain an LSAT-style question to me insulting, because it implies that you assign decent probability to my intelligence being lower than LLM or LSAT level. (Of course, I rank it less than “calling someone out publicly, even politely”, so I still feel vague social debt to you in this interaction.) I also anticipate that you might respond that you are justified in that assumption given that I seem not to have understood something an LLM could, and that that would only serve to increase the perceived status threat.
The “polite about the house burning” is something I have changed my mind about recently. I initially judged some of your stronger rhetoric as unhelpful because it didn’t help me personally, but I have seen enough people say otherwise that I now lean toward it being the right call. The remaining confusion I have is over the instances where you take extra time to either raise your own status or lower someone else’s instead of keeping discussion focused on the object level. Maybe that’s simply because, like me, you sometimes just react to things. Maybe, as someone else suggested, it’s some sort of punishment strategy. If it is actually intentionally aimed at some goal, I’d be curious to know.
I’m sorry to hear about your health/fatigue. That’s a very unfortunate turn of events, for everyone really. I think your overall contribution is quite positive, so I would certainly vote that you keep talking rather than stop! If I got a vote on the matter, I’d also vote that you leave status out of conversations and play to your strength of explaining complicated concepts in a way that is very intuitive for others. In fact, as much as I had high hopes for your research prospects, I never directly experienced any of that—the thing that has directly impressed me, (and if I’m honest, the only reason I assume you’d also be great at research) has been the way you make new insights accessible through your public writing. So, consider this my vote for more of that.
I suspect that some of my dissonance does result from an illusion of consistency and a failure to appreciate how multi-faceted people can really be. I naturally think of people as agents and not as a collection of different cognitive circuits. I’m not ready to assume that this explains all of the gap between my expectations and reality, but it’s probably part of it.
I think this is an important perspective, especially for understanding Eliezer, who places a high value on truth/honesty, often directly over consequentialist concerns.
While this explains true but unpleasant statements like “[Individual] has substantially decreased humanity’s odds of survival”, it doesn’t seem to explain statements like the potted plant one or other obviously-not-literally-true statements, unless one takes the position that full honesty also requires saying all the false and irrational things that pass through one’s head as well. (And even then, I’d expect to see an immediate follow-up of “that’s not true of course”).
I agree with this decision. You reference the comment in one of your answers. If it starts taking over, it should be removed, but can otherwise provide interesting meta-commentary.
I think this makes sense as a model of where he is coming from. As a strategy, my understanding of social dynamics is that “I told you so” makes it harder, not easier, for people to change their minds and agree with you going forward.
Not an answer to the question, but I think it’s worth noting that people asking for your opinion on EA may not be precise with what question they ask. For example, it’s plausible to me that someone could ask “has EA been helpful” when their use case for the info is something like “would a donation to EA now be +EV”, and not be conscious of the potential difference between the two questions.
I agree that we’ll make new puzzles that will be more rewarding. I don’t think that suffering need be involuntary for its elimination to be meaningful. If I am voluntarily parched and struggling to climb a mountain with a heavy pack (something for which I would plausibly reject ASI help), I would nevertheless feel appreciation if some passerby offered me a drink or lightened my load. Given a guarantee of safety from permanent harm, I think I’d plausibly volunteer to play a role in some game that involved some degree of suffering that could be alleviated.
[Question] Why does Eliezer make abrasive public comments?
there are also donation opportunities for influencing AI policy to advance AI safety which we think are substantially more effective than even the best 501c3 donation opportunities
Would you be willing to list these (or to DM me if there’s a reason to not list publicly)?
I began to write a long comment about how to possibly identify poverty-restoring forces, but I think we actually should take a step back and ask:
Why do we care about poverty in the first place?
”The utility function is not up for grabs”
Sure, but poverty seems like a rather complex idea to really be directly in our utility function, instead of instrumentally.
“Well we care about poverty because it causes suffering”
Ok. But why not just talk about reducing suffering then?
”Suffering can have multiple causes. It is helpful to focus on a single cause at a time to produce solutions”
Sure—but we just said we don’t know what causes it, so that’s not why. Why don’t we just talk about eliminating suffering?
”Because that would feel too...utilitarian. Too sterile. Cancer is unfortunate, but poverty is just wrong.”
And that’s exactly it, I think—we care about ‘poverty’ in particular because we care about justice. There is something worse about someone dying of a preventable disease. So poverty is not simply a state of resources or of hedonic experiences. It’s not even about the poor. Someone suffering from an unpreventable cause is unfortunate. They only become poor once others have the ability to help them and don’t. We also care about suffering for itself, but poverty is actually a moral defect we see in the other humans who don’t help.
Once we frame the discussion this way, it becomes easy to see why universal basic income might not fix human moral defects.*
*And even if we object that poverty is not just about the moral defect, but also about it indirectly causing suffering, it is still much easier to see why UBI might not prevent human moral defects from indirectly causing suffering.
Quote voice seems to “win” this exchange, but I think there are 3 things it is missing:
1. I can’t know someone else’s joy level with certainty, but despite quote voice accusing unquote voice of having problems taking joy in the real—I don’t hear the joy in quote voice (save for the last reply). Maybe QV is just using “joy in the real” as an applause light instead of actually practicing it.
2. “And you claim to be surprised by this?”—Lack of surprise may be a symptom of having a perfect model of the world, but more often it is a symptom of not actually predicting with your model. For mortals, surprise at the real state of things should be a common occurrence—it is akin to admitting fallibility. Perhaps more importantly, in this conversation, it seems to be shutting down curiosity.
3. Even after the call-out on “explain any possible human behavior”, QV continues to use “well it has to [work] somehow” to imply “my specific model of the world is correct”. If UQV were arguing for magic or theism, then these responses would make sense, but as is, they seem like a way to avoid admitting “I don’t know”.
Sure, happy to clarify:
1. The “before new info” means that it would feel unfair if you took the deal and then God was like “well, I gotta kill you because in 6 months they’re going to have a breakthrough and do the first successful human reanimation”. You’d be like “well, then I would just have signed up in 6 months when I found out. So unless I would have died in the next 6 months, you shouldn’t kill me”. Alternatively the gamble could be that God kills you if you wouldn’t have ended up signing up for cryonics by the time you die and it would have worked.
2. Well yeah, I’m assuming that the point of your analogy is to construct it so that the hypothetical decision you make tells you what your actual decision should be on cryonics. If it’s just a whimsical thought experiment then there’s no need to match everything up. If it is intended to mean that someone who would require more than the current cost of cryonics to take the deal should sign up for cryonics, then it does have to match up stuff, because it is entirely coherent for someone to, for example, neither want to be revived into a dystopia nor be killed immediately.
The unconscious replica is intended to keep the impact on others the same. No guilt about traumatizing your children, for example, because they would still grow up with 2 loving parents. So you’re just worried about whether you want money or life and not moral duties you might have to avoid being killed prematurely.
Yeah, you understood my example. It’s not particularly deep. It’s just that I find that many people have a pessimism bias, so I can feel myself thinking “cryonics probably won’t work” but if I imagine someone evil wants to revive and hurt me I think “but there’s a chance it would work...”. For the “depends how bad”, I think the 2 ways one can use the idea are a) set that it’s exactly as much worse than death as you expect reanimated life would be better than death, or b) just play with different severities and see if your gut estimate of the probability of revival changes.