Previously “Lanrian” on here. Research analyst at Open Philanthropy. Views are my own.
Lukas Finnveden
>while if you’re a normal person reading this (haha, jk), you might think I’m awful hard on nerds (how can you say such mean things as that they don’t care what others think and are incapable of properly expressing themselves?)
Testing this sounds worth doing. Intuitively, I think it’s false. Caring too much about what other people think is in general a low status thing, while caring about the truth is a high status thing (if not particularly important).
There’s a cluster of people, including but not limited to Eliezer, Critch, and Nate, who (according to me) have what I internally call “trustworthy inside views,” another name for which might be the ability to reliably generate useful gears models, and act based on them. This is the thing they do instead of using modest epistemology; it’s the thing that allowed Eliezer to write HPMoR, among many other things. And what all of the people who seem to me to have this ability have in common is that they all have strong backgrounds in a technical subject like math, physics, or computer science (in addition to something else, this isn’t sufficient).
What makes you think this is the result of the technical background rather than a selection effect (where the kind of people who are good at thinking choose to read technical subjects)?
This question is also very important in the scenario where good, reflective humans don’t control the future. If a rogue AI takes control of the future and the best way to do work involves consciousness, we will have a universe with a lot of consciousness in it, but with no concern for their suffering.
Did you ever read any of those? I’d love to know if any were good.
I agree that lots of biases have their roots in social benefits, but I’m unsure whether they’re really here now “because we predict it’s socially helpful to be biased that way” or whether they’re here because it was socially helpful to be biased that way. Humans are adaptation executers, not fitness maximizers, so the question is whether we adapted to the ancestral environment by producing a mind that could predict what biases were useful, or by producing a mind with hardcoded biases. The answer is probably some combination of the two.
Insofar as I understand, you endorse betting on 1:2 odds regardless of whether you believe the probability is 1⁄3 or 1⁄2 (i.e., regardless of whether you have received lots of random information) because of functional decision theory.
But in the case where you receive lots of random information, you assign 1⁄3 probability to the coin ending up heads. If you then use FDT, it looks like there is a 2⁄3 probability that you will do the bet twice with the outcome tails, and a 1⁄3 probability that you will do the bet once with the outcome heads. Therefore, you should be willing to bet at 1:4 odds.
That seems strange, and will mean losing money on average. I can’t see how you would get the different probabilities depending on how much random information you receive and still make the same decision about bets.
I agree with that description of FDT. And looking at the experiment from the outside, betting at 1:2 odds is the algorithm that maximizes utility, since heads and tails have equal probabilities. But once you’re in the experiment, tails has twice the probability of heads (according to your updating procedure) and FDT cares twice as much about the worlds in which tails happens, thus recommending 1:4 odds.
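To make the arithmetic concrete, here is a minimal sketch of the two calculations. It assumes a Sleeping-Beauty-style setup where you bet on tails at every awakening, win 1 per bet if the coin is tails, and lose `stake` per bet if it is heads. The function names and the direction of the bet are my own assumptions for illustration, not something specified in the discussion.

```python
from fractions import Fraction


def expected_profit(stake, p_tails, bets_if_tails=2, bets_if_heads=1):
    """Expected total profit of betting on tails at every awakening.

    Each bet wins 1 if the coin is tails and loses `stake` if it is heads.
    """
    p_heads = 1 - p_tails
    return p_tails * bets_if_tails - p_heads * bets_if_heads * stake


def break_even_stake(p_tails, bets_if_tails=2, bets_if_heads=1):
    """Largest stake (per unit won) at which the betting policy breaks even."""
    p_heads = 1 - p_tails
    return (p_tails * bets_if_tails) / (p_heads * bets_if_heads)


# Outside view: heads and tails are equally likely, tails means two bets.
outside = break_even_stake(Fraction(1, 2))

# Inside view as described above: credence 2/3 in tails, and still
# counting both tails bets.
inside = break_even_stake(Fraction(2, 3))

# What the inside-view stake earns on average, judged from the outside.
loss = expected_profit(inside, Fraction(1, 2))

print(outside, inside, loss)
```

Under these assumptions the outside view breaks even at a stake of 2 (1:2 odds), the inside view at a stake of 4 (1:4 odds), and accepting the latter has a negative expected profit when scored with the fair-coin probabilities.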
I think the example with selfishness is wrong even on technical grounds. It’s pretty easy to construct examples where people will help even though they’ll suffer from it, and while you can construe weird reasons why even this would be selfish (like insane hyperbolic discounting), Occam’s razor says we should go with the simple explanation, i.e. people actually care about others. Nate’s post about it is good: http://mindingourway.com/the-stamp-collector/
We will use a prisoner’s dilemma where mutual cooperation produces utility 2, mutual defiction (sic) produces utility 0, and exploitation produces utility 3 for the exploiter and 0 for the exploited. Each player will also pay a penalty of ε times its depth.
Am I reading this correctly if I think that (cooperate, defect) would produce (0, 3) and (defect, defect) would produce (0, 0)? Is that an error? Because in other parts of the text it looks like (defect, defect) should be (1, 1). Also, (cooperate, defect) should be a Nash equilibrium if my interpretation is correct.
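The Nash-equilibrium question can be checked by brute force over pure strategies. This is only a sketch: it uses the payoffs as I read them from the quote, ignores the ε depth penalty, and counts weak equilibria (no player can strictly gain by deviating). The names `QUOTED`, `ALTERNATIVE`, and `pure_nash_equilibria` are made up for illustration.

```python
from itertools import product

ACTIONS = ("C", "D")

# Payoffs as (row player, column player), as read from the quoted text.
QUOTED = {
    ("C", "C"): (2, 2),
    ("C", "D"): (0, 3),
    ("D", "C"): (3, 0),
    ("D", "D"): (0, 0),
}

# Same game, but with mutual defection paying (1, 1).
ALTERNATIVE = dict(QUOTED)
ALTERNATIVE[("D", "D")] = (1, 1)


def pure_nash_equilibria(payoffs):
    """Return all pure-strategy profiles where no unilateral deviation strictly helps."""
    equilibria = []
    for row, col in product(ACTIONS, repeat=2):
        row_payoff, col_payoff = payoffs[(row, col)]
        # The row player compares against changing only their own action.
        row_ok = all(payoffs[(alt, col)][0] <= row_payoff for alt in ACTIONS)
        # The column player compares against changing only their own action.
        col_ok = all(payoffs[(row, alt)][1] <= col_payoff for alt in ACTIONS)
        if row_ok and col_ok:
            equilibria.append((row, col))
    return equilibria


print(pure_nash_equilibria(QUOTED))
print(pure_nash_equilibria(ALTERNATIVE))
```

With mutual defection at (0, 0), the exploited player is indifferent between cooperating and defecting, so (cooperate, defect) survives as a weak equilibrium. With (1, 1) it does not, and only mutual defection remains.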
Yes, it’s weird when you are motivated to force your future copy to do things
If you couple these probability theories with the right decision theories, this should never come up. FNC yields the correct answer if you use a decision theory that lets you decide for all your identical copies (but not the ones who have had different experiences), and SIA yields the correct answer if you assume that you can’t affect the choices of the rest of your copies.
Actually, I realise that you can get around this. If you use a decision theory that assumes that you are deciding for all identical copies of you, but that you can’t affect the choices of copies that have diverged from you in any way, math says you will always bet correctly.
My preferred way of doing anthropics while keeping probabilities around is to update your probabilities according to the chance that at least one of the decision making agents that your decision is logically linked to exists, and then prioritise the worlds where there are more of those agents by acknowledging that you’re making the decision for all of them. This yields the same (correct) conclusions as SIA when you’re only making decisions for yourself, and FNC when you’re making decisions for all of your identical copies, but it avoids the paradoxes brought up in this article and it allows you to take into account that you’re making decisions for all of your similar copies, which you want to have for Newcomb’s-problem-like situations.
However, I think it’s possible to construct even more contorted scenarios where conservation of expected evidence is violated for this as well. If there are 2 copies of you, a coin is flipped, and:
If it’s heads the copies are presented with two different choices.
If it’s tails the copies are presented with the same choice.
then you know that you will update towards heads when you’re presented with a choice after a minute, since heads makes it twice as likely that anyone would be presented with that specific choice. I don’t know if there’s any way around this. Maybe if you update your probabilities according to the chance that someone following your decision theory is around, rather than someone making your exact choice, or something like that?
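A minimal sketch of that update, under assumptions the scenario itself doesn’t pin down: there are `n_choices` equally likely possible choices, heads hands out two distinct ones, tails hands out a single shared one, and you update on “at least one copy is presented with this specific choice”. The function name and parameters are hypothetical.

```python
from fractions import Fraction


def posterior_heads(n_choices, prior_heads=Fraction(1, 2)):
    """Posterior on heads after updating on 'someone faces this specific choice'."""
    # Heads: two distinct choices are handed out, so any given choice
    # shows up with probability 2 / n_choices.
    likelihood_heads = Fraction(2, n_choices)
    # Tails: both copies get the same single choice.
    likelihood_tails = Fraction(1, n_choices)
    joint_heads = prior_heads * likelihood_heads
    joint_tails = (1 - prior_heads) * likelihood_tails
    return joint_heads / (joint_heads + joint_tails)


for n in (2, 10, 1000):
    print(n, posterior_heads(n))
```

Under these assumptions the posterior is 2⁄3 whichever choice you end up seeing and however many choices are possible, so the direction of the update is known in advance. That is the sense in which conservation of expected evidence is violated.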
SSA and SIA aren’t exactly untestable. They both make predictions, and can be evaluated according to them, e.g. SIA predicts larger universes. It could be said to predict an infinite universe with probability 1, insofar as it at all works with infinities.
The anthropic bits in their paper look like SSA, rather than SIA.
Sure, SIA assigns very high probability to us being in a simulation. That conclusion isn’t necessarily absurd, though I think anthropic decision theory (https://arxiv.org/abs/1110.6437) with aggregative ethics is a better way to think about it, and yields similar conclusions. Brian Tomasik has an excellent article about the implications: https://foundational-research.org/how-the-simulation-argument-dampens-future-fanaticism
I’m not sure. The simplest way that more copies of me could exist is that the universe is larger, which doesn’t imply any crazy actions, except possibly to bet that the universe is large/infinite. That isn’t a huge bullet to bite. From there you could probably get even more weight if you thought that copies of you were more densely distributed, or something like that, but I’m not sure what actions that would imply.
Speculation: The hypothesis that future civilisations spend all their resources simulating copies of you gets a large update. However, if you contrast it with the hypothesis that they simulate all possible humans, and your prior probability that they would simulate you is proportional to the number of possible humans (by some principle of indifference), the update is proportional to the prior and is thus overwhelmed by the fact that it seems more interesting to simulate all humans than to simulate one of them over and over again.
Do you have any ideas for weird hypotheses that imply some specific actions?
When you repeat this experiment a bunch of times, I think an SSA advocate can choose their reference class to include all iterations of the experiment. This will result in them assigning credences similar to SIA’s, since a randomly chosen awakening from all iterations of the experiment is likely to be one of the new copies. So the update towards SIA won’t be that strong.
This way of choosing the reference class lets SSA avoid a lot of unintuitive results. But it’s kind of a symmetric way of avoiding unintuitive results, in that it might work even if the theory is false.
(Which I think it is.)
Sure, there are lots of ways to break calculations. That’s true for any theory that’s trying to calculate expected value, though, so I can’t see how that’s particularly relevant for anthropics, unless we have reason to believe that any of these situations should warrant some special action. Using anthropic decision theory you’re not even updating your probabilities based on number of copies, so it really is only calculating expected value.
Participants scoring in the bottom quartile on our humor test (...) overestimated their percentile ranking
A less well-known finding of Dunning—Kruger is that the best performers will systematically underestimate how good they are, by about 15 percentile points.
Isn’t this exactly what you’d expect if people were good Bayesians receiving scarce evidence? Everyone starts out assuming that they’re in the middle, and as they find something easy or hard, they gradually update away from their prior. If they don’t have good information about how good other people are, they won’t update too much.
If you then look at the extremes, the very best and the very worst people, of course you’re going to see that they should extremify their beliefs. But if everyone followed that advice, you’d ruin the accuracy of the people more towards the middle, since they haven’t received enough evidence to distinguish themselves from the extremes.
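Here is a toy simulation of that argument. It is a sketch under assumptions I’m making up for illustration: true skill is standard normal, everyone observes their skill plus Gaussian noise, and a “good Bayesian” reports the posterior mean, which shrinks the noisy signal towards the population average. Taking the raw signal at face value is used as a crude stand-in for extremifying.

```python
import random
import statistics


def simulate(n=20_000, noise_sd=2.0, seed=0):
    """Return (skill, bayes_estimate, raw_signal) tuples sorted by true skill."""
    rng = random.Random(seed)
    # Posterior-mean weight on the signal for a N(0, 1) prior and N(0, noise_sd^2) noise.
    shrink = 1.0 / (1.0 + noise_sd ** 2)
    people = []
    for _ in range(n):
        skill = rng.gauss(0.0, 1.0)
        signal = skill + rng.gauss(0.0, noise_sd)
        people.append((skill, shrink * signal, signal))
    people.sort(key=lambda person: person[0])
    return people


def mean_error(group, idx):
    """Average of (estimate - true skill); positive means overestimation."""
    return statistics.fmean(p[idx] - p[0] for p in group)


def mse(group, idx):
    """Mean squared error of the estimate stored at position idx."""
    return statistics.fmean((p[idx] - p[0]) ** 2 for p in group)


people = simulate()
quarter = len(people) // 4
bottom, middle, top = people[:quarter], people[quarter:-quarter], people[-quarter:]

print("bottom quartile, Bayesian bias:", mean_error(bottom, 1))
print("top quartile, Bayesian bias:   ", mean_error(top, 1))
print("middle half, Bayesian MSE:     ", mse(middle, 1))
print("middle half, raw-signal MSE:   ", mse(middle, 2))
```

In this toy model the bottom quartile overestimates itself and the top quartile underestimates itself even though everyone is updating correctly, and dropping the shrinkage makes the middle of the distribution much less accurate.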
(Similarly, I’ve heard that people often overestimate their ability on easy tasks and underestimate their ability on difficult tasks, which is exactly what you’d expect if they had good epistemics but limited evidence. If task performance is a function of task difficulty and talent for a task, and the only thing you can observe is your performance, then believing that you’re good at tasks you do well at and bad at tasks you fail at is the correct thing to do. As a consequence, saying that people overestimate their driving ability doesn’t tell you that much about the quality of their epistemics, in isolation, because they might be following a strategy that optimises performance across all tasks.)
The finding that people at the bottom overestimate their position by 46 percentile points is somewhat more extreme than this naïve model would suggest. As you say, however, it’s easily explained when you take into account that your ability to judge your performance on a task is correlated with your performance on that task. Thus, the people at the bottom are just receiving noise, so on average they stick with their prior and judge that they’re about average.
Of course, just because some of the evidence is consistent with people having good epistemics doesn’t mean that they actually do have good epistemics. I haven’t read the original paper, but it seems like people at the bottom actually think that they’re a bit above average, which seems like a genuine failure, and I wouldn’t be surprised if there are more examples of such failures which we can learn to correct. The impostor syndrome also seems like a case where people predictably fail in fixable ways (since they’d do better by estimating that they’re of average ability, in their group, rather than even trying to update on evidence).
But I do think that people often are too quick to draw conclusions from looking at a specific subset of people estimating their performance on a specific task, without taking into account how well their strategy would do if they were better or worse, or were doing a different task. This post fixes some of those problems, by reminding us that everyone lowering the estimate of their performance would hurt the people at the top, but I’m not sure if it correctly takes into account how the people in the middle of the distribution would be affected.
(The counter-argument might be that people who know about Dunning-Kruger are likely to be at the top of any distribution they find themselves in, but this seems false to me. I’d expect a lot of people to know about Dunning-Kruger (though I may be in a bubble) and there are lots of tasks where ability doesn’t correlate a lot with knowing about Dunning-Kruger. Perhaps humor is an example of this.)
If there are better replacements in general, then you will be inclined to replace things more readily.
The social analog is that in a community where friends are more replaceable—for instance, because everyone is extremely well selected to be similar on important axes—it should be harder to be close to anyone, or to feel safe and accepted
I can come up with a countervailing effect here, as well. Revealing problems is a risk: you might get help and be in a more trusting friendship, or you might be dumped. If there are lots of good replacements around, then getting dumped matters less, since you can find someone else. This predicts that people in communities that gather similar people might expose their problems more often, despite being replaced a higher fraction of the time.
Another difference between cars and friends is that you’re going to get equally good use out of your car regardless of how you feel about it, but your friendship is going to be different if you can credibly signal that you won’t replace it (taking the selfish-rational-individual model to the extreme, you probably want to signal that you’d replace it if the friend started treating you worse, but that you wouldn’t leave it just because your friend revealed problems). In a close community, that signal might get worse if you repeatedly replace friends, which predicts that you’d be less likely to replace friends in closer communities.
No empirical evidence of any of this.
I was a participant this year with prior exposure, and I was already involved and would have kept being involved even if I hadn’t gone to ESPR. No idea whether this applies to other people. There is a pretty good community where I live, but I think I would have been involved even if that wasn’t the case. Couldn’t you run this on a survey, Owen, if the answer is important?
Edit: Actually, that depends on what you’re after. Independent of going to ESPR (and independent of a close community) I would have devoted a significant amount of resources (e.g. time and money) to EA. However, ESPR (and definitely my close community) might end up increasing my interaction with other EAs/rationalists.