I believe one of your assumptions could stand to be examined: the final risk to others should be multiplied by the chance that the immunity typically conferred by having antibodies holds for this variant of the coronavirus and for any mutations that exist now or will exist by then. Antibodies are probably effective, and I haven’t heard anything about this virus being an especially rapid mutator, but I’m personally not more than 90% certain of all this.
A variant on this topic:
Notice when providing evidence X for a position P you believe in.
Bonus points for reviewing recent memories to see if you have supported P repeatedly, especially to the exclusion of evidence to the contrary.
Feel revulsion at having become the puppet of P.
Introduce a nudge away from P. Some examples:
Provide some good evidence counter to P.
If you cannot point to specific counter evidence, try to at least describe what counter evidence would look like.
State just how surprised you would be to see the evidence X if the position P were false. Can you rank it relative to other pieces of evidence under consideration? If the evidence is really weak, ask to have it weighted as such.
This seems sloppy, as it relies on the sense of revulsion to determine how much of a counter-nudge to give. It should still be useful, I hope.
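One way to make the "how surprised would you be" step less dependent on gut feeling is to read it as a likelihood ratio. The sketch below is only an illustration: the function names and the example probabilities are made up for this purpose, and it assumes you can put rough numbers on how likely the evidence X is if P is true and if P is false.

```python
import math


def likelihood_ratio(p_x_given_p, p_x_given_not_p):
    """How much more likely evidence X is if P is true than if P is false."""
    return p_x_given_p / p_x_given_not_p


def posterior(prior, p_x_given_p, p_x_given_not_p):
    """Bayes' rule: belief in P after seeing X, starting from `prior`."""
    numerator = prior * p_x_given_p
    return numerator / (numerator + (1 - prior) * p_x_given_not_p)


def rank_evidence(evidence):
    """Order evidence names from strongest to weakest.

    `evidence` maps a name to (P(X | P), P(X | not P)). Strength is the
    absolute log likelihood ratio, so evidence against P counts as strong too.
    """
    return sorted(
        evidence,
        key=lambda name: abs(math.log(likelihood_ratio(*evidence[name]))),
        reverse=True,
    )


# Hypothetical numbers, purely for illustration.
evidence = {
    "anecdote": (0.6, 0.5),          # barely more expected if P is true
    "controlled study": (0.9, 0.1),  # would be surprising if P were false
}
ranking = rank_evidence(evidence)
no_update = posterior(0.5, 0.8, 0.8)  # evidence equally likely either way
```

Evidence that is about as likely whether or not P holds has a ratio near 1 and should barely move you, which is the case where asking to have it "weighted as such" applies.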
The exercise to train this with:
Propose a character facing a choice, especially on topics that are muddled by being high-profile (e.g. Jane Senator must decide how to vote on extending unemployment benefits).
Provide a small selection of evidence that the character has considered, and state that their position after seeing just that evidence is for, against or undecided.
Ask the participants what additional evidence they think the character should consider.
I did it. Where’s my cookie?
Eliezer’s comment describes the importance of Jumping Out Of The System, which I attribute to the “cross-domain” aspect of intelligence, but I don’t see this defined anywhere in the formula given for intelligence, which so far only covers “efficient” and “optimizer”.
First, a quick-and-dirty description of the process: Find an optimization process in domain A (whether or not it helps attain goals). Determine one or many mapping functions between domains A and B. Use a mapping to apply the optimization process to achieve a goal in domain B.
I think the heart of crossing domains is in the middle step—the construction of a mapping between domains. Plenty of these mappings will be incomplete, mere projections that lose countless dimensions, but they still occasionally allow for useful portings of optimization processes. This is the same skill as abstraction or generalization: turning data into simplified patterns, turning apples and oranges into numbers all the same. The measure of this power could then be the maximum distance from domain A to domain B that the agent can draw mappings across. Or maybe the maximum possible complexity of a mapping function (or is that the same thing)? Or the number of possible mappings between A and B? Or speed; it just would not do to run through every possible combination of projections between two domains. So here, then, is itself a domain that can be optimized in. Is the measure of being cross-domain just a measure of how efficiently one can optimize in the domain of “mapping between domains”?
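The quick-and-dirty process above can be sketched in a few lines of Python. Everything here is a toy chosen for illustration: the domains (integers and fixed-perimeter rectangles), the function names, and the greedy optimizer are all assumptions, not anything given in the original post.

```python
def hill_climb_int(score, start, lo, hi):
    """Domain A optimizer: greedy local search over integers in [lo, hi]."""
    x = start
    while True:
        candidates = [c for c in (x - 1, x + 1) if lo <= c <= hi]
        best = max(candidates, key=score, default=x)
        if score(best) <= score(x):
            return x  # no neighbor improves the score
        x = best


def port_optimizer(optimizer_a, to_b, goal_b):
    """Reuse a domain-A optimizer on a domain-B goal via a mapping A -> B."""
    def solve(*args, **kwargs):
        # Score each point of A by the goal value of its image in B.
        best_a = optimizer_a(lambda a: goal_b(to_b(a)), *args, **kwargs)
        return to_b(best_a)
    return solve


# Domain B: rectangles with perimeter 20; the goal is maximal area.
def int_to_rectangle(width):
    """The mapping: an integer width picks out one rectangle."""
    return (width, 10 - width)


def area(rect):
    return rect[0] * rect[1]


solve_rectangle = port_optimizer(hill_climb_int, int_to_rectangle, area)
best_rectangle = solve_rectangle(start=1, lo=1, hi=9)
```

Note that the mapping is a lossy projection of the kind described above: it only reaches integer-sided rectangles, yet the ported optimizer still does useful work in domain B. The hard part, finding `int_to_rectangle` in the first place, is done by hand here, which is exactly the step the comment argues is the heart of being cross-domain.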
Eliezer: “It makes you more moral (at least in my experience and observation) because it gives you the opportunity to make moral choices about things that would otherwise be taken for granted, or decided for you by your brain.”
I have to take specific issue with this (despite being further down in the comments than I think will attract anyone’s attention). This post and its comments discuss a process by which a mind can modify its behavior through reflection, but while Eliezer and many others may use this power to strengthen their morality, it can just as easily be used to further any behavioral goal. “This compassion is getting in the way of my primary goal of achieving greater wealth; best to ignore it from now on.” How could things like slavery or torture happen among humans who seem to all have the capacity for compassion, if not for our overriding capacity to selectively ignore parts of ourselves we deem inappropriate? It is a power which makes us more adaptive, giving us the opportunity to make choices about things that we would otherwise take for granted, but it does not inherently point towards morality. I do think it would be accurate to describe people without much skill in reflection as being limited in the morality they can apply to themselves. Notably, I cannot presently imagine an FAI that does not include this ability (except maybe the degenerate case of an FAI that does nothing).
I also think the process described is just one example of the many routes by which we modify our behavior. Regret is the example that pops out to me immediately. When we screw things up enough to cause an intense, negative, emotional reaction, we tend to avoid those behaviors which previously led to that situation. The exertion of control over emotional reactions is functionally the same; one mental construct inhibits action on the account of another. Personally, with respect to this example, I would much rather use reflection than regret, both for the cost of use and for the time-frame, though not by enough to agree with PJ Eby’s statement that “your concern about losing the negative feelings is irrational”. I would find it very difficult to accept a proposed FAI that wasn’t terrified of not being friendly (in proportion to the power it exerted over the fate of humanity), nor regretful of having failed to reach positive outcomes (in proportion to the distance from positive), or that lacked whatever AI structures map to the same human emotions.
Hyperbole as a perversion of projection: arguments like “...and next you’ll be killing AI developers who disagree with FAI, to prevent them posing an existential threat,” which contain both sufficiently clear reasoning and sufficiently unknowable elements to sound possible, even plausible. This is used to discredit the original idea, not the fantastical extrapolation.
A particular flavor of “if it ain’t broke, don’t fix it” that points to established traditions as “having worked for ages”. Playing off the fear of the unknown? The meme of traditions in general adds weight to many of these.
I second “cultural relativity” as being an extension of “everyone having a right to their opinion”, but in both cases point to them as also being tools to find things in one’s own life that are arbitrary and in need of evaluation on a more objective basis.
Beside the point, but you can calculate arbitrary digits of pi with the formula explained in this article.
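The linked article is not reproduced here, so which formula it explains is unknown. One well-known formula with this property is the Bailey-Borwein-Plouffe (BBP) formula, which yields the hexadecimal digit of pi at a given position without computing the earlier ones. Assuming that is the kind of formula meant, a minimal sketch looks like this (function names are illustrative, and floating-point precision limits it to modest positions):

```python
def _bbp_series(j, n):
    """Fractional part of sum over k of 16**(n - k) / (8*k + j)."""
    s = 0.0
    # Terms with k <= n: modular exponentiation keeps the numbers small.
    for k in range(n + 1):
        denom = 8 * k + j
        s = (s + pow(16, n - k, denom) / denom) % 1.0
    # Terms with k > n shrink by a factor of 16 each step.
    k = n + 1
    while True:
        term = 16.0 ** (n - k) / (8 * k + j)
        if term < 1e-17:
            break
        s += term
        k += 1
    return s % 1.0


def pi_hex_digit(n):
    """Hex digit of pi at position n after the point (n = 0 is the first)."""
    x = (
        4 * _bbp_series(1, n)
        - 2 * _bbp_series(4, n)
        - _bbp_series(5, n)
        - _bbp_series(6, n)
    ) % 1.0
    return "0123456789abcdef"[int(x * 16)]


first_digits = "".join(pi_hex_digit(i) for i in range(8))
```

In hexadecimal, pi begins 3.243f6a88..., which is what the first eight positions should reproduce.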