Perhaps the best theory for this role is CDT but perhaps it is instead BT, which many people think reasons better in the psychopath button scenario.
Hmm… What is BT, and what’s the psychopath button? The terms don’t appear in the Sequences or the LessWrong Wiki. Searching the whole site, I found a few references to “Benchmark Theory” which I presume is what you mean, but no definition.
Do you define them elsewhere in your FAQ, or give references to where they are defined?
Yes, these are both explained earlier in the FAQ.
If you are independently interested then the Psychopath Button is described here (paywall): http://philreview.dukejournals.org/content/116/1/93.citation (ETA: http://fitelson.org/few/few_05/egan.pdf)
And Benchmark Theory is described here: http://www-personal.umich.edu/~ericsw/2fef/gandalf.ltr.pdf (I’m not sure if this is a draft or a pre-print)
It is also discussed here (another paywall, unfortunately): http://philreview.dukejournals.org/content/119/1/1.abstract
ETA: While these may not have received much discussion on LW, they’ve attracted a fair bit of attention in academia which is why they’re being mentioned in an introductory FAQ.
Thanks for this. I am a bit surprised they haven’t cropped up more on Less Wrong, if they are indeed standard in the literature. I thought I’d come across pretty much every variant of chewing gum, smoking lesion, and Newcomb by now… but clearly not.
Incidentally, having very quickly glanced at “Psychopath button”, I wonder if the decider should first imagine a “safe psychopath button” which would kill every psychopath in the world apart from the presser. Consider whether you would push that button. If you are sure you would push it (and under the preferences described in the problem, the decider should be sure) then you get strong evidence that you are a psychopath yourself, so CDT says you shouldn’t push the original button. So I can’t see a very convincing counter-example to CDT here.
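To make that reasoning concrete, here is a minimal sketch of the CDT calculation as a function of your credence that you are a psychopath. The utilities are made-up numbers chosen only for illustration (they are not taken from the paper), and the function names are hypothetical. The only assumption carried over from the problem is that dying is much worse than a psychopath-free world is good.

```python
# Illustrative only: the utilities below are assumptions, not values from the paper.
U_DIE = -100.0                      # you press and turn out to be a psychopath
U_WORLD_WITHOUT_PSYCHOPATHS = 10.0  # you press and survive
U_STATUS_QUO = 0.0                  # you do not press


def cdt_push_value(p_psychopath: float) -> float:
    """Causal expected utility of pressing, given credence p that you are a psychopath."""
    return p_psychopath * U_DIE + (1.0 - p_psychopath) * U_WORLD_WITHOUT_PSYCHOPATHS


def cdt_recommends_push(p_psychopath: float) -> bool:
    """CDT presses only if pressing beats the status quo at the current credence."""
    return cdt_push_value(p_psychopath) > U_STATUS_QUO


def indifference_credence() -> float:
    """Credence at which pressing and not pressing have equal causal expected utility."""
    return (U_WORLD_WITHOUT_PSYCHOPATHS - U_STATUS_QUO) / (
        U_WORLD_WITHOUT_PSYCHOPATHS - U_DIE
    )


if __name__ == "__main__":
    for p in (0.01, 0.05, 0.5, 0.9):
        print(p, cdt_push_value(p), cdt_recommends_push(p))
    print("indifference credence:", indifference_credence())
```

With these assumed numbers, CDT presses only while the credence stays below 1/11. So if imagining the "safe" button pushes your credence well above that threshold, CDT evaluated at the updated credence says not to press the original button.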
Yes, you might be interested in http://www-personal.umich.edu/~jjoyce/papers/rscdt.pdf