Philosophy PhD student. Interested in ethics, metaethics, AI, EA, disagreement/erisology. Former username Ikaxas
Vaughn Papenhausen
This is the first point at which I, at least, saw any indication that you thought Ben’s attempt to pass your ITT was anything less than completely accurate. If you thought his summary of your position wasn’t accurate, why didn’t you say so earlier? Your response to the comment of his that you linked gave no indication of that, and thus seemed to give the impression that you thought it was an accurate summary (if there are places where you stated that you thought the summary wasn’t accurate and I simply missed it, feel free to point this out). My understanding is that often, when person A writes up a summary of what they believe to be person B’s position, the purpose is to ensure that the two are on the same page (not in the sense of agreeing, but in the sense that A understands what B is claiming). Thus, I think person A often hopes that person B will either confirm that “yes, that’s a pretty accurate summary of my position,” or “well, parts of that are correct, but it differs from my actual position in ways 1, 2, and 3,” or “no, you’ve completely misunderstood what I’m trying to say. Actually, I was trying to say [summary of person B’s position].”
To be perfectly clear, an underlying premise of this is that communication is hard, and thus that two people can be talking past each other even if both are putting in what feels like a normal amount of effort to write clearly and to understand what the other is saying. This implies that if a disagreement persists, one of the first things to try is to slow down for a moment and get clear on what each person is actually saying, which requires putting in more than what feels like a normal amount of effort, because what feels like a normal amount of effort is often not enough to actually facilitate understanding. I’m getting a vibe that you disagree with this line of thought. Is that correct? If so, where exactly do you disagree?
If this is intended as a summary of the post, I’d say it doesn’t quite seem to capture what I was getting at. If I had to give my own one-paragraph summary, it would be this:
There’s a thing people (including me) sometimes do, where they (unreflectively) assume that the conclusions of motivated reasoning are always wrong, and dismiss them out of hand. That seems like a bad plan. Instead, try going into System II mode and reexamining conclusions you think might be the result of motivated reasoning, rather than immediately dismissing them. This isn’t to say that System II processes are completely immune to motivated reasoning (far from it), but “apply extra scrutiny” seems like a better strategy than “dismiss out of hand.”
Something that was in the background of the post, but I don’t think I adequately brought out, is that this habit of [automatically dismissing anything that seems like it might be the result of motivated reasoning] can lead to decision paralysis and pathological self-doubt. The point of this post is to somewhat correct for that. Perhaps it’s an overcorrection, but I don’t think it is.
The group version of this already exists, in a couple of different forms:
I originally had a longer comment, but I’m afraid of getting embroiled in this, so here’s a short-ish comment instead. Also, I recognize that there’s more interpretive labor I could do here, but I figure it’s better to say something non-optimal than to say nothing.
I’m guessing you don’t mean “harm should be avoided whenever possible” literally. Here’s why: if we take it literally, then it seems to imply that you should never say anything, since anything you say has some possibility of leading to a causal chain that produces harm. And I’m guessing you don’t want to say that. (Related is the discussion of the “paralysis argument” in this interview: https://80000hours.org/podcast/episodes/will-macaskill-paralysis-and-hinge-of-history/#the-paralysis-argument-01542)
I think this is part of what’s behind Christian’s comment. If we don’t want to be completely mute, then we are going to take some non-zero risk of harming someone sometime to some degree. So then the argument becomes about how much risk we should take. And if we’re already at roughly the optimal level of risk, then it’s not right to say that interlocutors should be more careful (to be clear, I am not claiming that we are at the optimal level of risk). So arguing that there’s always some risk isn’t enough to argue that interlocutors should be more careful—you also have to argue that the current norms don’t already prescribe the optimal level of risk, and that they permit us to take more risk than we should. There is no way to avoid the tradeoff here; the question is where the tradeoff should be made.
[EDIT: So while Stuart Anderson does indeed simply repeat the argument you (successfully) refute in the post, Christian, if I’m reading him right, is making a different argument, and saying that your original argument doesn’t get us all the way from “words can cause harm” to “interlocutors should be more careful with their words.”
You want to argue that interlocutors should be more careful with their words [EDIT: kithpendragon clarifies below that they aren’t aiming to do that, at least in this post]. You see some people (e.g. Stuart Anderson, and the people you allude to at the beginning) making the following sort of argument:
(1) Words can’t cause harm.
(2) Therefore, people don’t need to be careful with their words.
You successfully refute (1) in the post. But this doesn’t get us to “people do need to be careful with their words” since the following sort of argument is also available:
A. Words don’t have a high enough probability of causing enough harm to enough people that people need to be any more careful with them than they’re already being.
B. Therefore, people don’t need to be careful with their words (at least, not any more than they already are). [EDIT: list formatting]]
I think about this question a lot as well. Here are some pieces I’ve personally found particularly helpful in thinking about it:
Sharon Street, “Nothing ‘Really’ Matters, But That’s Not What Matters”: link
This is several levels deep in a conversation, but you might be able to read it on its own and get the gist, then go back and read some of the other stuff if you feel like it.
Nate Soares’ “Replacing Guilt” series: http://mindingourway.com/guilt/
Includes much more metaethics than it might sound like it should.
Especially relevant: “You don’t get to know what you’re fighting for”; everything in the “Drop your Obligations” section
(Book) Jonathan Haidt: The Happiness Hypothesis: link
Checks ancient wisdom about how to live against the modern psych literature.
Susan Wolf on meaning in life: link
There is a book version, but I haven’t read it yet
Search terms that might help if you want to look for what philosophers have said about this:
meaning in/of life
moral epistemology
metaethics
terminal/final goals/ends
Some philosophers with relevant work:
Derek Parfit
Sharon Street
Christine Korsgaard
Bernard Williams
There is obviously a ton of philosophical work on these sorts of things, but I only wanted to directly mention stuff I’ve actually read.
If I had to summarize my current views on this (still very much in flux, and not something I necessarily live up to), I might say something like this:
Pick final goals that will be good both for you and for others. As Susan Wolf says, meaning comes from where “subjective valuing meets objective value,” and as Jonathan Haidt says, “happiness comes from between.” Some of this should be projects that will be fun/fulfilling for you and also produce value for others; some of it should be relationships (friendships, family, romantic, etc.).

But be prepared for those goals to update. I like romeostevensit’s phrasing elsethread: “goals are lighthouses not final destinations.” As Nate Soares says, “you don’t get to know what you’re fighting for”: what you’re fighting for will change over time, and that’s not a bad thing. I can’t remember the source for this (probably either the Sequences or Replacing Guilt somewhere), but the human mind will often transform an instrumental goal into a terminal goal. Unlike Eliezer, I think this really does signal a blurriness in the boundary between instrumental and terminal goals. Even terminal goals can be evaluated in light of other goals we have (making them at least a little bit instrumental), and if an instrumental goal becomes ingrained enough, we may start to care about it for its own sake.

When picking goals, start from where you are, with what you already find yourself caring about, and go from there. The well-known metaphor of Neurath’s Boat goes like this: “We are like sailors who on the open sea must reconstruct their ship but are never able to start afresh from the bottom. Where a beam is taken away a new one must at once be put there, and for this the rest of the ship is used as support. In this way, by using the old beams and driftwood the ship can be shaped entirely anew, but only by gradual reconstruction.” See also Eliezer, “Created already in motion.” So start from what you already care about, and aim to have that evolve in a more consistent direction.

Aim for goals that will be good for both you and others, but take care of the essentials for yourself first (as Jordan Peterson says, “Set your house in perfect order before you criticize the world”). In order to help the world, you have to make yourself formidable (think virtue ethics). Furthermore, as Agnes Callard points out (link), the meaning in life can’t solely be to make others happy. The buck has to stop somewhere—“At some point, someone has to actually do the happying.” So again, look for places where you can do things that make you happy while also creating value for others.
I don’t claim this is at all airtight, or complete (as I said, still very much in flux), but it’s what I’ve come to after thinking about this for the last several years.
I hope they are buying 50+ books each; otherwise I don’t see how the book part is remotely worth it.
As a data point, I did not vote, but if there is a book, I will almost certainly be buying a copy of it if it is reasonably priced, i.e. a price similar to the first two volumes of R:A-Z ($6-8).
I see no indication in Ben’s post that he had the same estimate of the results of his efforts as I did.
This is exactly the problem that the ITT is trying to solve. Ben’s interpretation of what you said is Ben’s interpretation of what you said, whether he posts it or merely thinks it. If he merely thinks it, and then responds to you based on it, then he’ll be responding to a misunderstanding of what you actually said and the conversation won’t be productive. You’ll think he understood you, he’ll perhaps think he understood you, but he won’t have understood you, and the conversation will not go well because of it.
But if he writes it out, then you can see that he didn’t understand you, and help him understand what you actually meant before he tries to criticize something you didn’t even actually say. But this kind of thing only works if both people cooperate a little bit. (Okay, that’s a bit strong; I do think that the kind of thing Ben did has some benefit even though you didn’t respond to it. But a lot of the benefit comes from the back and forth.)
if one may spend hours on such a thing, and end up with such disappointing results, what’s the point?
Again, this is merely evidence that communication is harder than it seems. Ben not writing down his interpretation of you doesn’t magically make him understand you better. All it does is hide the fact that he didn’t understand you, and when that fact is hidden it can cause problems that seem to come from nowhere.
If the claim is “doing interpretive labor lets you understand your interlocutor, where a straightforward reading may lead you astray”
That’s not the claim at all. The claim is that the reading that seems straightforward to you may not be the reading that seems straightforward to Ben. So if Ben relies on what seems to him a “straightforward reading,” he may be relying on a wrong reading of what you said, because you wanted to communicate something different.
but the reality is “doing interpretive labor leaves you with the entirely erroneous impression that you’ve understood your interlocutor when in fact you haven’t, thus wasting your time not just for no benefit, but with a negative effect”, then, again—why do it?
I mean, yes, maybe Ben thought that after writing all that he understood what you were saying. But if he misunderstood, you have the power to correct that. And him putting forward the interpretation he thinks is correct gives you a jumping-off point for helping him to understand what you meant. Without that jumping-off point you would be shooting in the dark, throwing out different ways of rephrasing what you said until one stuck, or, worse (as I’ve said several times now), you wouldn’t realize he had misunderstood you at all.
sometimes there are just actual disagreements. I think maybe some folks in this conversation forget that, or don’t like to think about it, or… heck, I don’t know. I’m speculating here. But there’s a remarkable lack of acknowledgment, here, of the fact that sometimes someone is just wrong, and people are disagreeing with that person because he’s wrong, and they’re right.
Yes, but you can’t hash out the substantive disagreements until you’ve sorted out any misunderstandings first. That would be like arguing about the population size of Athens when one of you thinks you’re talking about Athens, Greece, and the other thinks you’re talking about Athens, Ohio.
I think there are actually two separate phenomena under discussion here, which look superficially similar, but actually don’t have much to do with each other.
First phenomenon
Alice: Would you help me fix my car muffler?
Bob: Sure.
Alice: That way you won’t have to listen to my car roaring like a jet engine every time I leave my house (since we’re neighbors and all).
Second phenomenon
Alice: Would you help me fix my car muffler?
Bob: Sure.
Alice: The noise sure does give me a headache, I want it fixed as soon as possible.
Bob: Ah, okay. I’m alright with cars, but not stellar, so how about I just pay for you to get it fixed at a garage instead? You can owe me one.
The first phenomenon seems bad for the reasons you describe in the great-grandparent comment. It also just seems strange from a linguistic perspective to keep trying to persuade someone to do something after they’ve already agreed to do it. Though if the order were reversed so that Alice gave her reason before Bob assented, it would still seem bad for the reasons you mention (because Alice’s reason isn’t all that good) but not linguistically odd.
The second phenomenon, on the other hand, seems like a good thing to me, and as far as I can tell it isn’t affected by the problems you mention. In particular, Alice giving extra reasons doesn’t absolve her of any debt she owes to Bob for the favor; in fact, in this particular scenario I would perceive her to owe Bob a greater debt if he pays to have her car fixed than if he helps her fix it (though I have no idea how universal this intuition would be, and am agnostic about whether it’s correct morally). It actually seems like Bob and Alice both benefit from Alice giving her reason (at least the way I’m imagining the extra details of the scenario): Alice gets her car fixed faster, and Bob gets to avoid spending a large amount of time fixing the car. As I’m imagining the scenario, Bob would have helped if he thought Alice was asking him, e.g., partially as an excuse to spend more time with him, because he also would have wanted that; but once it was revealed that Alice’s primary objective was to get the car fixed as fast as possible, Bob was able to save himself some time and (as I mentioned above) end up with Alice even more in his debt than she otherwise would have been. So they both benefited.
The distinction seems to be that in the first phenomenon, Alice mentions a reason why it would benefit Bob to help her fix her car, whereas in the second phenomenon, Alice mentions the underlying reason she wants the car fixed. I can see how Alice mentioning a reason Bob would want to help fix the car could shift the situation to an instance of your third case, but I don’t see how Alice mentioning the underlying reason she wants the car fixed could do so, since that doesn’t make it any more in Bob’s interest to help her (except insofar as fulfilling Alice’s preferences is part of Bob’s interest, but that’s an instance of your second case).
It seems the fact that these two phenomena are distinct has only been obliquely acknowledged elsewhere in this thread, so I wanted to make it more explicit. In particular, if I’m interpreting everyone correctly then most of what people have said in this thread has been in support of the second phenomenon, and most of your objections have been objections to the first phenomenon, so to a certain extent people seem to be talking past each other.
Also, you said in the parent comment that you object to what looks to me like the second phenomenon, but you didn’t give your reasons there. Nothing wrong with that, but if you’re willing I’d be interested in hearing those reasons, because I’m having trouble imagining what someone could object to about the second phenomenon. The only thing I can think of is this: if you know the “big-picture goal” behind someone’s request, perhaps that obligates you to put in more effort to help them towards that big-picture goal than if you only knew the contents of the immediate request, i.e. you have to put in time to think about whether there’s a better way to accomplish the big-picture goal, and if that way ends up being more effortful than the original ask, you still have to help with it, etc. That might be concerning in a similar way to your objection to the first phenomenon, if it’s true.
My model of gears to ascension, based on their first 2 posts, is that they’re not complaining about the length for their own sake, but rather for the sake of people that they link this post to who then bounce off because it looks too long. A basics post shouldn’t have the property that someone with zero context is likely to bounce off it, and I think gears to ascension is saying that the nominal length (reflected in the “43 minutes”) is likely to have the effect of making people who get linked to this post bounce off it, even though the length for practical purposes is much shorter.
The way I like to think about this is that the set of all possible thoughts is like a space that can be carved up into little territories and each of those territories marked with a word to give it a name.
Probably better to say something like “set of all possible concepts.” Words denote concepts; complete sentences denote thoughts.
I’m curious if you’re explicitly influenced by Quine for the final section, or if the resemblance is just coincidental.
Also, about that final section, you say that “words are grounded in our direct experience of what happens when we say a word.” While I was reading I kept wondering what you would say about the following alternative (though not mutually exclusive) hypothesis: “words are grounded in our experience of what happens when others say those words in our presence.” Why think the only thing that matters is what happens when we ourselves say a word?
I don’t know of a full guide, but here’s a sequence exploring applications for several CFAR techniques: https://www.lesswrong.com/sequences/qRxTKm7DAftSuTGvj
Said, I’m curious: have you ever procrastinated? If so, what is your internal experience like when you are procrastinating?
Thanks for the encouragement. I will try writing one and see how it goes.
thoughts [don’t] end up growing better than they would otherwise by being nurtured and midwifed? Thoughts grow better by being intelligently attacked.
I think both are true, depending on the stage of development the thought is at. If the thought is not very fleshed out yet, it grows better by being nurtured and midwifed (see e.g. here). If the thought is relatively mature, it grows best by being intelligently attacked. I predict Duncan will agree.
“no refill until appointment is on the books”
But Zvi’s friend had an appointment on the books? It was just that it was a couple weeks away.
Otherwise, thanks very much for commenting on this; it’s good to get a doctor’s perspective.
Thanks! Done
If transcripts end up not being provided, I would be willing to transcribe the video or part of the video, depending on how long it is (I’d probably be willing to transcribe up to about 2 hours of video, maybe more if it’s less effort than I expect, having never really tried it before).
Let me see if I’ve got your argument right:
(1) It seems likely that the world is a simulation (simulation argument)
(2) If (1) is true, then it’s most likely that I am the only conscious being in existence (presumably due to computational efficiency constraints on the simulation, and where “in existence” means “within this simulation”)
(3) If I am the only conscious being in existence, then it would be unethical for me to waste resources improving the lives of anyone but myself, because they are not conscious anyway, and it is most ethical for me to maximize the good in my own life.
(4) Therefore, it’s likely that the most ethical thing for me to do is to maximize the good in my own life (ethical egoism).
Is this right?
I had never considered this argument before; it’s a really interesting argument, and I think it has a lot of promise. I especially had never really thought about premise (2) or its implications for premise (3); I think that is a really forceful point.
I’m not yet fully convinced though; let me see if I can explain why.
First, I don’t think premise (2) is true. The simulation argument, at least as I tend to hear it presented, is based on the premise that future humans would want to run “ancestor simulations” in order to see how different versions of history would play out. If this is the case, it seems like first-person simulations wouldn’t really do them much good; they’d have to simulate everyone in order to get the value they’d want out of the simulations. To be clear, by “first-person simulation” I mean a simulation that renders only from the perspective of one person. It seems to me that, if you’re a physicalist about consciousness and the simulation is rendering everywhere, then all the people in the simulation would have to be conscious, because consciousness just is the execution of certain computations, and it would be necessary to run those computations in order to get an accurate simulation. This also means that, even in a first-person simulation, the people you were interacting with would be conscious as long as they were within your frame of awareness (otherwise the simulation couldn’t be accurate); it’s just that they would blink out of existence once they left your frame of awareness.
Second, correct me if I’m wrong, but it seems to me that premise (3) is actually assuming utilitarianism (or some other form of agent-neutral consequentialism), which simply reduces to egoism when there’s only one conscious agent in the world. So at bottom, the disagreement between yourself and effective altruists isn’t normative, it’s empirical (i.e. it’s not about your fundamental moral theory, but simply about which beings are conscious). This isn’t really a counterargument, more of an observation that you seem to have more in common with effective altruists than it may seem just on the basis that you call your position “ethical egoism.” If, counterfactually, there _were_ other conscious beings in the world, would you think that they also had moral worth?
Third, assuming that your answer to that question is “yes,” I think that it’s still often worth it to act altruistically, even on your theory, on the basis of expected utility maximization. Suppose you think that there’s only an extremely small chance that there are other conscious beings, say only .01%. Even so, if there are _enough_ lives at stake, even if you think they aren’t conscious it can be worth it to act as if they are, because it would be morally catastrophic if you did not and they turned out to be conscious after all. I think this turns out to be equivalent to valuing other lives (and others’ happiness, etc.) at a fraction of the value of your own equal to the probability you assign to their being conscious. So, if you assign a .01% chance that you’re wrong about this argument and other people are conscious after all, you should be willing to e.g. sacrifice your life to save 10,000 others. Or if you think it’s .001%, you should be willing to sacrifice your life to save 100,000 others.
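(To make the arithmetic behind that threshold explicit, here is a minimal expected-value sketch; the framing, the symbols p and N, and the equal-weighting assumption are mine, not something the original argument spells out:)

```latex
% p = probability that other people are conscious
% N = number of other lives saved by sacrificing your own
% Assumption (mine): a conscious stranger's life is weighted equally to your own.
\[
  \mathrm{EV}(\text{sacrifice}) \;=\; p \cdot N \;-\; 1,
  \qquad\text{so the sacrifice is worth it when } N > \tfrac{1}{p}.
\]
% With p = 0.0001 (.01%), the break-even point is N = 10{,}000;
% with p = 0.00001 (.001%), it is N = 100{,}000.
```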
Anyway, I’m curious to hear your thoughts on all of this. Thanks for the thought-provoking article!
Putting RamblinDash’s point another way: when Eliezer says “unlimited retries”, he’s not talking about a Groundhog Day-style reset. He’s just talking about the mundane thing where, when you’re trying to fix a car engine or something, you try one fix, and if it doesn’t start, you try another fix, and if it still doesn’t start, you try another fix, and so on. So the scenario Eliezer is imagining is this: we have 50 years. Year 1, we build an AI, and it kills 1 million people. We shut it off. Year 2, we fix the AI. We turn it back on, and it kills another million people. We shut it off, fix it, turn it back on. Etc., until it stops killing people when we turn it on. Eliezer is saying that if we had 50 years to do that, we could align an AI. The problem is that, in reality, the first time we turn it on, it doesn’t kill 1 million people; it kills everyone. We only get one try.
Am I the only one who, upon reading the title, pictured 5 people sitting behind OP all at the same time?