While not comprehensively covered, GiveWell mentions this in a few places. The second point here links to a report with a section discussing whether people are willing to pay for nets, as well as to this old blog post, which briefly argues that people won’t buy their own nets: previous hand-outs (from other charities) have resulted in a lack of local producers and an expectation of free nets. They also mention that nets have some positive externalities and mostly benefit children, who aren’t the ones paying, which gives some reason to subsidize them.
Blind spots and biases can be harmful to your goals without being harmful to your reproductive fitness. Being wrong about which future situations will make you (permanently) happier is an excellent example of such a blind spot.
Shah et al.’s Value Learning Sequence is a short sequence of blog posts outlining the specification problem.
The link goes to the Embedded Agency sequence, not the value learning sequence (https://www.lesswrong.com/s/4dHMdK5TLN6xcqtyc)
“Indeed Pascal’s Mugging type issues are already present with the more standard infinities.”
Right, infinity of any kind (surreal or otherwise) doesn’t belong in decision theory.
But Pascal’s Mugging type issues are present with large finite numbers, as well. Do you bite the bullet in the finite case, or do you think that unbounded utility functions don’t belong in decision theory, either?
Satan’s Apple: Satan has cut a delicious apple into infinitely many pieces. Eve can take as many pieces as she likes, but if she takes infinitely many pieces she will be kicked out of paradise, and this will outweigh the apple. For any finite number i, it seems like she should take the i-th piece, but then she will end up taking infinitely many pieces.
Proposed solution for finite Eves (also a solution to Trumped, for finite Trumps who can’t count to surreal numbers):
After having eaten n pieces, Eve’s decision isn’t between eating n pieces and eating n+1 pieces, it’s between eating n pieces and whatever will happen if she eats the n+1st piece. If Eve knows that the future Eve will be following the strategy “always eat the next apple piece”, then it’s a bad decision to eat the n+1st piece (since it will lead to getting kicked out of paradise).
So what strategy should Eve follow? Consider the problem of programming a strategy that an Eve-bot will follow. In this case, the best strategy is the strategy that leads to the largest finite number of pieces being eaten. What this strategy is depends on the hardware, but if the hardware is finite, then there exists such a strategy (perhaps count the number of pieces and stop when you reach N, for the largest N you can store and compare with). Generalising to (finite) humans, the best strategy is the strategy that results in the largest finite number of pieces eaten, among all strategies that a human can precommit to.
Of course, if we allow infinite hardware, then the problem is back again. But that’s at least not a problem that I’ll ever encounter, since I’m running on finite hardware.
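The counting strategy for a finite Eve-bot can be sketched directly (a minimal illustration; `max_count` stands in for the largest number the hardware can store and compare with):

```python
def make_eve_bot(max_count):
    """Strategy for a finite Eve-bot: eat pieces until an internal
    counter reaches the largest value the hardware can represent."""
    count = 0
    def decide():
        nonlocal count
        if count < max_count:
            count += 1
            return "eat"   # still below the cap: take the next piece
        return "stop"      # cap reached: stop, and stay in paradise
    return decide

# Simulate: the bot eats exactly max_count pieces, then stops.
decide = make_eve_bot(1000)
pieces_eaten = 0
while decide() == "eat":
    pieces_eaten += 1
print(pieces_eaten)  # 1000 -- finite, so she keeps paradise
```

The point is just that the stopping rule is guaranteed to terminate, whatever finite cap the hardware imposes.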
However, for the other two I ‘just see’ the correct answer. Is this common for other people, or do you have a different split?
I think I figured out and verified the answer to all 3 questions in 5-10 seconds each, when I first heard them (though I was exposed to them in the context of “Take the cognitive reflection test which people fail because the obvious answer is wrong”, which always felt like cheating to me).
If I recall correctly, the third question was easier than the second question, which was easier than bat & ball: I think I generated the correct answer as a suggestion for 2 and 3 pretty much immediately (alongside the supposedly obvious answers), and I just had to check them. I can’t quite remember my strategy for bat & ball, but I think I generated the $0.10 ball, $1.00 bat answer, saw that the difference was $0.90 instead of $1.00, adjusted to $0.05, $1.05, and found that that one was correct.
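That check-and-adjust step is just redoing the problem’s arithmetic (a sketch of the strategy described above):

```python
# Bat and ball: together they cost $1.10; the bat costs $1.00 more than the ball.
ball, bat = 0.10, 1.00                    # the 'obvious' first guess
assert abs((bat + ball) - 1.10) < 1e-9    # the total checks out...
print(bat - ball)                         # ...but the difference is 0.90, not 1.00

ball, bat = 0.05, 1.05                    # shift half the $0.10 error onto the bat
assert abs((bat + ball) - 1.10) < 1e-9
assert abs((bat - ball) - 1.00) < 1e-9    # both constraints now hold
```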
I suspect that this is less true for the other two problems—ratios and exponential growth are topics that a mathematical or scientific education is more likely to build intuition for.
This seems to be contradicted by:
the bat and ball question is the most difficult on average – only 32% of all participants get it right, compared with 40% for the widgets and 48% for the lilypads. It also has the biggest jump in success rate when comparing university students with non-students.
If there are better replacements in general, then you will be inclined to replace things more readily.
The social analog is that in a community where friends are more replaceable—for instance, because everyone is extremely well selected to be similar on important axes—it should be harder to be close to anyone, or to feel safe and accepted.
I can come up with a countervailing effect here, as well. Revealing problems is a risk: you might get help and be in a more trusting friendship, or you might be dumped. If there are lots of good replacements around, then getting dumped matters less, since you can find someone else. This predicts that people in communities that gather similar people might expose their problems more often, despite being replaced a higher fraction of the time.
Another difference between cars and friends is that you’re going to get equally good use out of your car regardless of how you feel about it, but your friendship is going to be different if you can credibly signal that you won’t replace it (taking the selfish-rational-individual model to the extreme, you probably want to signal that you’d replace it if the friend started treating you worse, but that you wouldn’t leave it just because your friend revealed problems). In a close community, that signal might get worse if you repeatedly replace friends, which predicts that you’d be less likely to replace friends in closer communities.
No empirical evidence of any of this.
Participants scoring in the bottom quartile on our humor test (...) overestimated their percentile ranking
A less well-known finding of Dunning-Kruger is that the best performers will systematically underestimate how good they are, by about 15 percentile points.
Isn’t this exactly what you’d expect if people were good bayesians receiving scarce evidence? Everyone starts out assuming that they’re in the middle, and as they find something easy or hard, they gradually update away from their prior. If they don’t have good information about how good other people are, they won’t update too much.
If you then look at the extremes, the very best and the very worst people, of course you’re going to see that they should extremify their beliefs. But if everyone followed that advice, you’d ruin the accuracy of the people more towards the middle, since they haven’t received enough evidence to distinguish themselves from the extremes.
(Similarly, I’ve heard that people often overestimate their ability on easy tasks and underestimate their ability on difficult tasks, which is exactly what you’d expect if they had good epistemics but limited evidence. If task performance is a function of task difficulty and talent for a task, and the only thing you can observe is your performance, then believing that you’re good at tasks you do well at and bad at tasks you fail at is the correct thing to do. As a consequence, saying that people overestimate their driving ability doesn’t tell you that much about the quality of their epistemics, in isolation, because they might be following a strategy that optimises performance across all tasks.)
The finding that people at the bottom overestimate their position by 46 percentile points is somewhat more extreme than this naïve model would suggest. As you say, however, it’s easily explained when you take into account that your ability to judge your performance on a task is correlated with your performance on that task. Thus, the people at the bottom are just receiving noise, so on average they stick with their prior and judge that they’re about average.
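The shrink-toward-the-prior story can be checked with a toy simulation (the noise level and shrinkage factor are made-up parameters, purely for illustration):

```python
import random

random.seed(0)
N = 100_000
NOISE = 25   # assumed noise (percentile points) on self-assessment

people = []
for _ in range(N):
    true_pct = random.uniform(0, 100)          # true percentile rank
    signal = true_pct + random.gauss(0, NOISE)  # noisy private evidence
    # Crude Bayesian estimate: shrink the noisy signal toward the
    # prior mean of 50. (Exact weights depend on prior and noise
    # variances; 0.5 is just an illustrative shrinkage factor.)
    estimate = 50 + 0.5 * (signal - 50)
    people.append((true_pct, estimate))

bottom = [e - t for t, e in people if t < 25]
top = [e - t for t, e in people if t > 75]
print(sum(bottom) / len(bottom))   # positive: bottom quartile overestimates
print(sum(top) / len(top))         # negative: top quartile underestimates
```

Every simulated agent follows the same sensible updating rule, yet the Dunning-Kruger pattern (bottom overestimates, top underestimates) falls out automatically.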
Of course, just because some of the evidence is consistent with people having good epistemics doesn’t mean that they actually do have good epistemics. I haven’t read the original paper, but it seems like people at the bottom actually think that they’re a bit above average, which seems like a genuine failure, and I wouldn’t be surprised if there are more examples of such failures which we can learn to correct. Impostor syndrome also seems like a case where people predictably fail in fixable ways (since they’d do better by estimating that they’re of average ability, in their group, rather than even trying to update on evidence).
But I do think that people often are too quick to draw conclusions from looking at a specific subset of people estimating their performance on a specific task, without taking into account how well their strategy would do if they were better or worse, or were doing a different task. This post fixes some of those problems, by reminding us that everyone lowering the estimate of their performance would hurt the people at the top, but I’m not sure if it correctly takes into account how the people in the middle of the distribution would be affected.
(The counter-argument might be that people who know about Dunning-Kruger are likely to be at the top of any distribution they find themselves in, but this seems false to me. I’d expect a lot of people to know about Dunning-Kruger (though I may be in a bubble), and there are lots of tasks where ability doesn’t correlate much with knowing about Dunning-Kruger. Perhaps humor is an example of this.)
Sure, there are lots of ways to break calculations. That’s true for any theory that’s trying to calculate expected value, though, so I can’t see how that’s particularly relevant for anthropics, unless we have reason to believe that any of these situations should warrant some special action. Using anthropic decision theory you’re not even updating your probabilities based on number of copies, so it really is only calculating expected value.
When you repeat this experiment a bunch of times, I think an SSA advocate can choose their reference class to include all iterations of the experiment. This will result in them assigning similar credences as SIA, since a randomly chosen awakening from all iterations of the experiment is likely to be one of the new copies. So the update towards SIA won’t be that strong.
This way of choosing the reference class lets SSA avoid a lot of unintuitive results. But it’s kind of a symmetric way of avoiding unintuitive results, in that it might work even if the theory is false.
(Which I think it is.)
I’m not sure. The simplest way that more copies of me could exist is that the universe is larger, which doesn’t imply any crazy actions, except possibly to bet that the universe is large/infinite. That isn’t a huge bullet to bite. From there you could probably get even more weight if you thought that copies of you were more densely distributed, or something like that, but I’m not sure what actions that would imply.
Speculation: The hypothesis that future civilisations spend all their resources simulating copies of you gets a large update. However, if you contrast it with the hypothesis that they simulate all possible humans, and your prior probability that they would simulate you in particular is inversely proportional to the number of possible humans (by some principle of indifference), the update is cancelled by the prior and is thus overwhelmed by the fact that it seems more interesting to simulate all humans than to simulate one of them over and over again.
Do you have any ideas of weird hypotheses that imply some specific actions?
Sure, SIA assigns very high probability to us being in a simulation. That conclusion isn’t necessarily absurd, though I think anthropic decision theory (https://arxiv.org/abs/1110.6437) with aggregative ethics is a better way to think about it, and yields similar conclusions. Brian Tomasik has an excellent article about the implications: https://foundational-research.org/how-the-simulation-argument-dampens-future-fanaticism
SSA and SIA aren’t exactly untestable. They both make predictions, and can be evaluated according to them, e.g. SIA predicts larger universes. It could be said to predict an infinite universe with probability 1, insofar as it at all works with infinities.
The anthropic bits in their paper look like SSA, rather than SIA.
My preferred way of doing anthropics while keeping probabilities around is to update your probabilities according to the chance that at least one of the decision-making agents that your decision is logically linked to exists, and then prioritise the worlds where there are more of those agents by acknowledging that you’re making the decision for all of them. This yields the same (correct) conclusions as SIA when you’re only making decisions for yourself, and as FNC when you’re making decisions for all of your identical copies, but it avoids the paradoxes brought up in this article, and it lets you take into account that you’re making decisions for all of your similar copies, which you want for Newcomb-like situations.
However, I think it’s possible to construct even more contorted scenarios where conservation of expected evidence is violated for this as well. If there are 2 copies of you, a coin is flipped, and:
If it’s heads the copies are presented with two different choices.
If it’s tails the copies are presented with the same choice.
then you know that you will update towards heads when you’re presented with a choice after a minute, since heads makes it twice as likely that anyone would be presented with that specific choice. I don’t know if there’s any way around this. Maybe if you update your probabilities according to the chance that someone following your decision theory is around, rather than someone making your exact choice, or something like that?
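Under one way of filling in the setup (two possible choices A and B; this labeling is my assumption, not from the original scenario), the predictable update can be computed exactly:

```python
from fractions import Fraction

# Model: two copies exist, and a fair coin is flipped.
# Heads: copy 1 is shown choice A, copy 2 is shown choice B.
# Tails: a second fair coin picks A or B, and BOTH copies are shown it.
p_heads = Fraction(1, 2)

# Probability that at least one copy is presented with choice A:
p_A_given_heads = Fraction(1)      # A is always shown to one copy
p_A_given_tails = Fraction(1, 2)   # A is shown only if it's picked

# Updating on "this choice (A) is being presented to someone":
posterior_heads = (p_heads * p_A_given_heads) / (
    p_heads * p_A_given_heads + (1 - p_heads) * p_A_given_tails)
print(posterior_heads)  # 2/3 -- a predictable update toward heads
```

Since you know in advance that whichever choice you see, you’ll move from 1/2 to 2/3 on heads, conservation of expected evidence is violated.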
Actually, I realise that you can get around this. If you use a decision theory that assumes that you are deciding for all identical copies of you, but that you can’t affect the choices of copies that have diverged from you in any way, math says you will always bet correctly.
Yes, it’s weird when you are motivated to force your future copy to do things.
If you couple these probability theories with the right decision theories, this should never come up. FNC yields the correct answer if you use a decision theory that lets you decide for all your identical copies (but not the ones who have had different experiences), and SIA yields the correct answer if you assume that you can’t affect the choices of the rest of your copies.
We will use a prisoner’s dilemma where mutual cooperation produces utility 2, mutual defiction (sic) produces utility 0, and exploitation produces utility 3 for the exploiter and 0 for the exploited. Each player will also pay a penalty of ε times its depth.
Am I reading this correctly if I think that (cooperate, defect) would produce (0, 3) and (defect, defect) would produce (0, 0)? Is that an error? Because in other parts of the text it looks like (defect, defect) should be (1, 1). Also, (cooperate, defect) would be a Nash equilibrium if my interpretation is correct.
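Checking that claim mechanically, under the literal reading of the quoted payoffs (ignoring the depth penalty):

```python
# Payoff matrix under the literal reading of the quoted text:
# rows/columns are the two players' actions.
C, D = "C", "D"
payoff = {
    (C, C): (2, 2),
    (C, D): (0, 3),   # exploited gets 0, exploiter gets 3
    (D, C): (3, 0),
    (D, D): (0, 0),   # the text says 0, not the usual 1
}

def is_nash(a, b):
    """(a, b) is a Nash equilibrium if neither player can strictly
    gain by unilaterally switching their action."""
    row_ok = all(payoff[(a, b)][0] >= payoff[(alt, b)][0] for alt in (C, D))
    col_ok = all(payoff[(a, b)][1] >= payoff[(a, alt)][1] for alt in (C, D))
    return row_ok and col_ok

print(is_nash(C, D))  # True: with DD = (0, 0), the exploited player
                      # gains nothing by switching to defect
print(is_nash(D, D))  # True as well
print(is_nash(C, C))  # False: either player gains by defecting
```

So (cooperate, defect) really is a (weak) Nash equilibrium under this reading, which is another sign that (defect, defect) was probably meant to be (1, 1).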
I think the example with selfishness is wrong even on technical grounds. It’s pretty easy to construct examples where people will help even though they’ll suffer from it, and while you can concoct weird reasons why even this would be selfish (like insane hyperbolic discounting), Occam’s razor says we should go with the simple explanation, i.e. people actually care about others. Nate’s post about it is good: http://mindingourway.com/the-stamp-collector/
I agree with that description of FDT. And looking at the experiment from the outside, betting at 1:2 odds is the algorithm that maximizes utility, since heads and tails have equal probabilities. But once you’re in the experiment, tails is twice as probable as heads (according to your updating procedure), and FDT cares twice as much about the worlds in which tails happens, thus recommending 1:4 odds.
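The two break-even odds can be computed side by side (a sketch assuming the standard Sleeping-Beauty setup, where tails means the bet is offered at two awakenings):

```python
from fractions import Fraction

half, third = Fraction(1, 2), Fraction(1, 3)

def breakeven_loss(p_tails, tails_weight, win=Fraction(1)):
    """Loss on heads that makes the bet zero-EV, when each tails-world
    win is counted `tails_weight` times (once per awakening/copy)."""
    # Solve: p_tails * tails_weight * win - (1 - p_tails) * loss == 0
    return p_tails * tails_weight * win / (1 - p_tails)

# From the outside: P(tails) = 1/2, but the bet is made at both
# tails awakenings, so it counts twice.
print(breakeven_loss(half, 2))       # 2 -> break even at 1:2 odds

# From the inside (thirder update): P(tails) = 2/3, and FDT still
# counts the bet once per copy.
print(breakeven_loss(2 * third, 2))  # 4 -> break even at 1:4 odds
```

The double-counting of tails-worlds enters twice from the inside, once through the updated probability and once through caring about both copies, which is where the 1:4 comes from.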