Even assuming perfect selfishness, sometimes the best way to get what you want (X) is to coordinate to change the world in a way that makes X plentiful, rather than fighting over the rare Xs that exist now. In that way, your goals align with those of other people who want X.
Sweetgum
E.g. learning when you’re rationalizing, when you’re avoiding something, when you’re deluded, [...] when you’re really thinking about something else, etc.
It seems extremely unlikely that these things could be seen in fMRI data.
I think I got it. Right after the person buys X for $1, you offer to buy it off them for $2, but with a delay, so they keep X for another month before the sale goes through. After the month passes, they now value X at $3 so they are willing to pay $3 to buy it back from you, and you end up with +$1.
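To make the arithmetic concrete, here is a minimal sketch of one cycle of that trade. The function name and the cash-tracking variables are hypothetical; only the dollar amounts ($2 delayed offer, $3 later valuation) come from the description above.

```python
def money_pump_cycle(resale_offer=2, later_valuation=3):
    """Return (pumper_profit, owner_net_cash) for one cycle of the trade."""
    pumper_cash = 0
    owner_cash = 0

    # The delayed sale goes through after a month: the pumper pays the
    # agreed price and takes X.
    pumper_cash -= resale_offer
    owner_cash += resale_offer

    # By now the owner has rationalized the purchase and values X more
    # highly, so they pay that higher valuation to buy X back.
    pumper_cash += later_valuation
    owner_cash -= later_valuation

    return pumper_cash, owner_cash


print(money_pump_cycle())  # (1, -1)
```

The owner ends up holding X exactly as before, just $1 poorer, which is what makes it a pump rather than an ordinary trade.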
What happens if the parrots have their own ideas about who to breed with? Or the rejected parrots don’t want to be sterilised?
It’s worth noting that both of these things are basically already true, and don’t require great intelligence.
Autonomous lethal weapons (ALWs; we need a more eerie, memetic name)
There’s already a more eerie, memetic name. Slaughterbots.
Maybe something like “mundane-ist” would be better. The “realists” are people who think that AI is fundamentally “mundane” and that the safety concerns with AI are basically the same as safety concerns with any new technology (increases inequality by making the powerful more powerful, etc.) But of course “mundane-ist” isn’t a real word, which is a bit of a problem.
Can’t tell if sarcastic
Wild speculation ahead: Perhaps the aversion to this sort of rationalization is not wholly caused by the suboptimality of rationalization, but also by certain individualistic attitudes prevalent here. Maybe I, or Eliezer Yudkowsky, or others, just don’t want to be the sort of person whose preferences the world can bend to its will.
Yes, and another meaning of “rationalization” that people often talk about is inventing fake reasons for your own beliefs, which may also be practically rational in certain situations (certain false beliefs could be helpful to you) but it’s obviously a major crime against epistemic rationality.
I’m also not sure rationalizing your past personal decisions isn’t an instance of this; the phrase “I made the right choice” could be interpreted as meaning you believe you would have been less satisfied now if you had chosen differently, and if this isn’t true but you are trying to convince yourself it is in order to be happier, then that is also a major crime against epistemic rationality.
I wish this post had gone further into the specific money pump you would be vulnerable to if you rationalize your past choices. I can’t picture what money pump would be possible in this situation (but I believe you that one exists). Also, not describing the specific money pump reduces the salience of the concern (improperly, in my opinion). It’s one thing to talk abstractly about money pumps, and another to see right in front of you how your decision procedure endorses obviously absurd actions.
Like, as far as I’m concerned, I’m trans because I chose to be, because being the way I am seemed like a better and happier life to have than the alternative. Now sure, you could ask, “yeah but why did I think that? Why was I the kind of agent that would make that kind of choice? Why did I decide to believe that?”
Yes, this is a non-confused question with a real answer.
Well, because I decided to be the kind of agent that could decide what kind of agent I was. “Alright octavia but come on this can’t just recurse forever, there has to be an actual cause in biology” does there really?
In a literal/trivial sense, all human actions have a direct cause in the biology of the human brain and body. But you are probably using “biology” in a way that refers to “coarse” biological causes like hormone levels in utero, rather than individual connections between neurons, as well as excluding social causes. In that case, it’s at least logically possible that the answer to this question is no. It seems extremely unlikely that coarse biological factors play no role in determining whether someone is trans (I expect coarse biological factors to be at least somewhat involved in determining the variance in every relevant high-level trait of a person), but it’s very plausible that there is not one discrete cause to point to, or that most of the variance in gender identity is explained by social factors.
If a brain scan said I “wasn’t really trans” I would just say it was wrong, because I choose what I am, not some external force.
This seems like a red herring to me—as far as I know no transgender brain research is attempting to diagnose trans people by brain scan in a way that overrides their verbal reports and behavior, but rather to find correlates of those verbal reports and behavior in the brain. If we find a characteristic set of features in the brains of most trans people, but not all, it will then be a separate debate as to whether we should consider this newly discovered thing to be the true meaning of the word “transgender”, or whether we should just keep using the word the same way we used it before, to refer to a pattern of self-identity and behavior, and the “keep using it the same way we did before” side seems quite reasonable. Even now, many people understand the word “transgender” as an “umbrella term” that encompasses people who may not have the same underlying motivations.
Morphological freedom without metaphysical freedom of will is pointless.
If by “metaphysical freedom of will” you are referring to libertarian free will, then I have to disagree. Even if libertarian free will doesn’t exist (it doesn’t), it is still beneficial to me for society to allow me the option of changing my body. If you are confused about how the concept of “options” can exist without libertarian free will, that problem has already been solved in Possibility and Could-ness.
I’ve noticed people using formal logic/mathematical notation unnecessarily to make their arguments seem more “formal”: ∀x∈X(∃y∈Y|Q(x,y)), f:S→T, etc. Eliezer Yudkowsky even does this at some points in the original sequences. These symbols were pretty intimidating to me before I learned what they mean, and I imagine they would be confusing/intimidating to anyone without a mathematical background.
Though I’m a bit conflicted on this one because if the formal logic notation of a statement is shown alongside the English description, it could actually help people learn logic notation who wouldn’t have otherwise. But it shouldn’t be used as a replacement for the English description, especially for simple statements that can easily be expressed in natural language. It often feels like people are trying to signal intellectualism at the expense of accessibility.
What are you talking about then? It seems like you’re talking about probabilities as being the objective proportion of worlds something happens in, under some sort of multiverse theory, even if it’s not the Everett multiverse. And when you said “There won’t be any iff there is a 100.0000% probability of annihilation” you were replying to a comment talking about whether there will be any Everett branches where humans survive, so it was reasonable for me to think you were talking about Everett branches.
Bayesian probability (which is the kind Yudkowsky is using when he gives the probability of AI doom) is subjective, referring to one’s degree of belief in a proposition, and cannot be 0% or 100%. If you’re using probability to refer to the objective proportion of future Everett branches something occurs in, you are using it in a very different way than most, and probabilities in that system cannot be compared to Yudkowsky’s probabilities.
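One common way to see why a degree of belief should not be exactly 0% or 100% is that Bayes' rule can never move it afterwards. A minimal sketch (the function name and the likelihood values are illustrative assumptions, not anything from the discussion above):

```python
import math


def bayes_update(prior, likelihood_if_true, likelihood_if_false):
    """Posterior probability of a hypothesis after seeing one piece of evidence."""
    numerator = prior * likelihood_if_true
    evidence = numerator + (1 - prior) * likelihood_if_false
    return numerator / evidence


# An ordinary prior moves in response to evidence.
moderate = bayes_update(0.5, likelihood_if_true=0.9, likelihood_if_false=0.1)

# Priors of exactly 1 or 0 do not move, even when the evidence
# strongly points the other way.
certain = bayes_update(1.0, likelihood_if_true=0.01, likelihood_if_false=0.99)
impossible = bayes_update(0.0, likelihood_if_true=0.99, likelihood_if_false=0.01)

print(moderate, certain, impossible)
```

Here `certain` stays at 1.0 and `impossible` stays at 0.0 no matter what likelihoods are plugged in, as long as the evidence is not itself assigned probability zero.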
But that still requires us to have developed human brain-scanning technology within 5 years, right? That does not seem remotely plausible.
Indeed, it is instrumentally useful for instrumental rationalists to portray themselves as epistemic rationalists. And so this is a common pattern in human politics—“[insert political coalition] care only about themselves, while [insert political coalition] are merely trying to spread truth” is one of the great political cliches for a reason. And because believing one’s own lies can be instrumentally useful, falsely believing oneself to have a holy devotion to the truth is a not-uncommon delusion.
I try to dissuade myself of this delusion.
There’s a subtle paradox here. Can you spot it?
He is trying to dissuade himself of the premise[X] that he is committed to the truth over socially useful falsehoods. But that premise[X] is itself socially useful to believe, and he claims it’s false, so disbelieving it would show that he does sometimes value the truth over socially useful falsehoods, contradicting the point.
More specifically, there are three possibilities here:
X is broadly true. He’s just wrong about X, but his statement that X is false is not socially motivated.
X is usually false, but his statements about X are a special case for some reason.
X is false, but his statement that X is false doesn’t contradict this because denying X is actually the socially useful thing, rather than affirming X. Lesswrong might be the kind of place where denying X (saying that you are committed to spreading socially useful falsehoods over the truth) actually gets you social credit, because readers interpret affirming X as the thing that gets you social credit, so denying it is interpreted as a signal that you are committed to saying the taboo truth (not-X) over what is socially useful (X), the exact opposite of what was stated. If true, this would be quite ironic. This interpretation is self-refuting in multiple ways, both logically (for not-X to be a “taboo truth”, X has to be false, which already rules out the conclusion of this line of reasoning) and causally (if everyone uses this logic, the premise that affirming X is socially useful becomes false, because denying X becomes the socially useful thing.) But that doesn’t mean readers couldn’t actually be drawing this conclusion without noticing the problems.
It’s more akin to me writing down my thoughts and then rereading them to gather my ideas than the kind of loops I imagine our neurons might have.
In a sense, that is what is happening when you think in words. It’s called the phonological loop.
In these cases it can be helpful to imagine your current self in a bargaining game with your future selves, in a sort of prisoner’s dilemma. If your current self defects now, your future selves will be more prone to defecting as well. If you coordinate and resist temptation now, future resistance will be more likely. In other words, establishing a Schelling fence.
This is an interesting way of looking at it. To elaborate a bit, one day of working toward a long-term goal is essentially useless, so you will only do it if you believe that your future selves will as well. This is some of where the old “You need to believe in yourself to do it!” advice comes from. But there can be good reasons not to believe in yourself as well.
In the context of the iterated Prisoner’s Dilemma, it’s been investigated what the frequency of random errors (the decision to cooperate or defect being replaced with a random one in x% of instances) can go up to before cooperation breaks down. (I’ll try to find a citation for this later.) This seems similar, but not literally equivalent, to a question we might ask here: What frequency of random motivational lapses can be tolerated before the desire to work towards the goal at all breaks down?
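As a rough illustration of the kind of experiment I mean, here is a toy simulation. It is only a sketch under my own assumptions: both players use tit-for-tat (the setup above doesn't name a strategy), each move is independently replaced by a coin flip with probability `noise`, and the function name and defaults are hypothetical.

```python
import random


def mutual_cooperation_rate(noise, rounds=1000, seed=0):
    """Fraction of rounds in which two noisy tit-for-tat players both cooperate."""
    rng = random.Random(seed)
    # True means "cooperate". Both players open by cooperating.
    last_a, last_b = True, True
    mutual = 0
    for _ in range(rounds):
        # Tit-for-tat: each player copies the other's previous move.
        move_a, move_b = last_b, last_a
        # With probability `noise`, a move is replaced by a random one.
        if rng.random() < noise:
            move_a = rng.random() < 0.5
        if rng.random() < noise:
            move_b = rng.random() < 0.5
        mutual += move_a and move_b
        last_a, last_b = move_a, move_b
    return mutual / rounds


for noise in (0.0, 0.01, 0.05, 0.2):
    print(noise, mutual_cooperation_rate(noise))
```

With `noise=0.0` the players cooperate in every round; raising `noise` lets a single accidental defection echo back and forth, which is the breakdown the question is about. More forgiving strategies would tolerate more noise, so the numbers here say nothing general.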
Naturally, the goals that require the most trust are ones that see no benefit until the end, because they require you to trust that your future selves won’t permanently give up on the goal anywhere between now and the end to be worth working towards at all. But most long term goals aren’t really like this. They could be seen to fall on a spectrum between providing no benefit until a certain point and linear benefit the more they are worked towards with the “goal” point being arbitrary. (This is analogous to the concept of a learning curve.) Actions towards a goal may also provide an immediate benefit as well as progress toward the goal, which reduces the need to trust your future selves.
If you don’t trust your future selves very much, you can seek out “half-measure” actions that sacrifice some efficiency toward the goal for immediate benefits, but still contribute some progress toward the goal. You can to some extent set where they are along this spectrum, but you are also limited by the types of actions available to you.
Thanks, this is a great explanation and you changed my mind on this. This is probably the reason why most people have the intuition that legalizing these things makes things worse for everyone. There were many proposed explanations for that intuition in this thread, but none of the others made sense/seemed valid to me, so I was beginning to think the intuition was erroneous.
This seems like a potentially misleading description of the situation. It seems to say that the contents of working memory could always be described in one minute of natural language, but this is not implied (as I’m sure you know based on your reasoning in this post). A 630-digit number cannot be described in one minute of natural language. 2016 bits of memory and about 2016 bits of natural language per minute really means that if our working memory was perfectly optimized for storing natural language and only natural language, it could store about one minute of it.
(And on that note, how much natural language can the best memory athletes store in their working memory? One minute seems low to me. If they can actually store more, it would show that your bit estimate is too low.)
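For anyone who wants to check the bit arithmetic behind the 630-digit example, here is a quick sketch. The variable names are mine; the only inputs taken from the discussion are 630 digits and 2016 bits.

```python
import math

BITS_PER_DIGIT = math.log2(10)  # information in one uniformly random decimal digit

digits = 630
working_memory_bits = 2016

# Bits needed to store an arbitrary 630-digit number.
bits_needed = digits * BITS_PER_DIGIT

# Number of whole random digits that fit into 2016 bits.
digits_that_fit = math.floor(working_memory_bits / BITS_PER_DIGIT)

# Bits per digit implied by pairing 630 digits with 2016 bits.
implied_bits_per_digit = working_memory_bits / digits

print(round(bits_needed, 1), digits_that_fit, implied_bits_per_digit)
```

The two figures pair up at 3.2 bits per digit, slightly under log2(10) ≈ 3.32, so the estimate is approximate either way. None of this affects the main point: matching bit counts do not mean the contents can be spoken aloud in a minute.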