It starts with the sense that, if something doesn’t feel viscerally obvious, there is something left to be explained.
It’s a bottom-up process. I don’t decide in advance that certain images will convince me, then conjure them up and play them in front of myself in the hope that they’ll convince my s1.
Instead I “become” my s1, take on a skeptical attitude, and ask myself what the fuss is all about.
Warning: the following might give you nightmares, if you’re imaginative enough.
In this case, what happened was something like “okay, well I guess at some point we’re going to have pretty strong optimizers. Fine. So what? Ah, I guess that’s gonna mean we’re going to have some machines that carry out commands for us. Like what? Like *picture of my living room magically tidying itself up*. Really? Well yeah I can see that happening. And I suppose this magical power can also be pretty surprising. Like *blurry picture/sense of surprising outcome*. Is this possible? Yeah like *memory of this kind of surprise*. What if this surprise was like 1000x stronger? Oh fuck...”
I guess the point is that convincing a person, or a subagent, is best explained as an internal decision to be convinced, not as an outside force of convincingness. So if you want to convince a part of you that feels like something outside of you, first you have to become it. You do this by sincerely endorsing whatever it has to say. Then, once that part feels like you, you (formerly it) decide to re-evaluate the thing that the other subagent (formerly you) disagreed with.
A bit like internal double crux, but instead of going back and forth you just do one round. Guess you could call it an internal ITT (Ideological Turing Test).