No, the Spell of Infinite Doom destroys the Equilibrium. Light and dark, summer and winter, luck and misfortune—the great Balance of Nature will be, not upset, but annihilated utterly; and in its place will be set a single will, the will of the Lord of Dark. And he shall rule, not only the people, but the very fabric of the World itself, until the end of days.
No matter how good a person the Lord may be, if he’s human, I’d have tried to stop the spell.
Something that occurred to me along these lines. (not directly the same, but “close enough” that some of the moral judgments would be equivalent)
Let’s say, next week, someone actually solved the mind uploading problem. They have a decision to make: go for it themselves, find someone as trustworthy as possible, forget about the plan and simply wait however long for the FAI math to be solved, etc...
What would you advise? Should they go for it themselves, try to then work out how to incrementally upgrade themselves without absolute disaster, forget it, etc etc etc...? (If nothing else, assume they already have the raw computing power to run a human at a vast speedup)
It’s not an identical problem, but it’s probably the closest thing.
What, you mean try to self-modify? Oh hell no. Human brain not designed for that. But you would have a longer time to try to solve FAI. You could maybe try a few non-self-modifications if you could find volunteers, but uploading and upload-driven-upgrading is fundamentally a race between how smart you get and how insane you get.
The modified people can be quite a bit smarter than you are too, so long as you can see their minds and modify them. Groves et al. managed to mostly control the Manhattan Project despite dozens of its scientists being smarter than any of their supervisors and many having communist sympathies. If he had actually shared their earlier memories and could look inside their heads… There’s a limit to control, you still won’t control an adversarial superintelligence this way, but a friendly human who appreciates your need for power over them? I bet they can have a >50 IQ point advantage, maybe even >70. Schoolteachers control children who have 70 IQ points on them with the help of institutions.
Is it relevant that IQ is correlated with obedience to authority?
And how dumb do you think schoolteachers are? Bottom of those with BAs. I’d guess 100. And correlated with their pupils.
Estimates from SAT scores imply that the IQ of teachers and education majors is below average. Conscientious, hardworking students can graduate from most high schools and colleges with good grades, even if they are fairly stupid, as long as they stay away from courses which demand too much of them, and there are services available for those who are neither hardworking nor conscientious.
Education major courses are somewhat notorious for demanding little of students, and it is a stereotypically common choice for students seeking MRS degrees.
I’d like to imagine that the system would at least filter out individuals who are borderline retarded or below, but experience suggests to me that even this is too optimistic.
I don’t buy the conversion in the first link, which is also a dead link. That Ed majors have an SAT score of 950 sounds right. That is 37th percentile among “college-bound seniors.” If this population, which I assume means people taking the SAT, were representative of the general population, that would be an IQ of 95, but they aren’t. I stand by my estimate of 100.
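For anyone who wants to check that conversion, percentile-to-IQ is just an inverse-normal lookup. A quick sketch (Python; the 37th-percentile figure is from the comment above, and the representativeness caveat still applies):

    from statistics import NormalDist

    # If SAT takers were representative of the general population,
    # a 37th-percentile score would map to this IQ (mean 100, SD 15):
    z = NormalDist().inv_cdf(0.37)   # z-score of the 37th percentile, about -0.33
    print(round(100 + 15 * z))       # -> 95, the figure quoted above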
I doubt you have much experience with people with an IQ of 85, let alone the borderline retarded.
What makes you doubt I have much experience with either? IQ 85 is one standard deviation below average; close to 16 percent of the population has an IQ at least that low. The lower limit of borderline retardation, that is, the least intelligent you can be before you are no longer borderline, is two standard deviations below the mean, meaning that about one person in fifty is lower than that.
As it happens, I’ve spent a considerable amount of time with special needs students, some of whom suffer from learning disabilities which do not affect their reasoning abilities, but some of whom are significantly below borderline retarded.
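Both of those base rates are one-line normal-tail calculations, for anyone who wants to check (a sketch, using the usual mean-100, SD-15 scale):

    from statistics import NormalDist

    phi = NormalDist().cdf
    print(phi(-1))   # ~0.159: fraction at or below IQ 85, about one in six
    print(phi(-2))   # ~0.023: fraction at or below IQ 70, roughly one in forty-four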
At the public high school I attended, more than 95% of the students in my graduating year went on to college. While the most mentally challenged students in the area were not mainstreamed and didn’t attend the same school, there was no shortage of <80 IQ students.
An average IQ of 100 for education majors would be within the error bars for the aforementioned projection, but some individuals are going to be considerably lower.
Those two sentences are not very compatible.
The rates at which students progress to college have a lot more to do with parental expectations, funding, and the school environment than the intelligence of the students in question. My school had very good resources to support students in the admissions process, and students who didn’t take it for granted that they were college bound were few and far between.
It seems unrealistic to assume that we’ll be able to literally read the intentions of the first upload; I’d think that we’d start out not knowing any more about them than we would about an organic person through external scanning.
You won’t be able to evaluate their thoughts exactly, but there’s a LOT that you should be able to tell about what a person is thinking if you can perfectly record all of their physiological reactions and every pattern of neural activation with perfect resolution, even with today’s knowledge. Koch and Crick even found grandmother neurons, more or less.
I’d still expect it to be hard to tell the difference between someone thinking about or wanting to kill someone/take over the world and someone actually intending to. But I can imagine at least being able to reliably detect lies with that kind of information, so I’ll defer to your knowledge of the subject.
Eliezer, I’m with you that a properly designed mind will be great, but mere uploads will still be much more awesome than normal humans on fast forward.
Without hacking on how your mind fundamentally works, it seems pretty likely that being software would allow a better interface with other software than mouse, keyboard and display does now. Hacking on just the interface would (it seems to me) lead to improvements in mental capability beyond mere speed. This sounds like mind hacking to me (software enhancing a software mind will likely lead to blurry edges around which part we call “the mind”), and seems pretty safe.
Some (pretty safe*) cognitive enhancements:
Unmodified humans using larger displays are better at many tasks than humans using small displays (somewhat fluffy pdf research). It’ll be pretty surprising if being software doesn’t allow a better visual interface than a 30″ screen.
Unmodified humans who can touch-type spend less time and attention on the mechanics of human machine interface and can be more productive (no research close to hand). Who thinks that uploaded humans are not going to be able to figure better interfaces than virtual keyboards?
Argument maps improve critical thinking, but the interfaces are currently clumsy enough to discourage use (lots of clicking and dragging). Who thinks that being software won’t provide a better way to quickly generate argument maps?
In front of a computer loaded up with my keyboard shortcuts and browser plugins I have easy access to very fast lookup on various web reference sites. At the moment the lookup delay is still long enough that short term memory management (stack overflow after a mere 7±2 pushes) is a problem (when I need a reference I push my current task onto a mental stack; it takes time and attention to pop that task when the reference has been found). Who thinks I couldn’t be smarter with a reference interface better than a keyboard?
All of which is just to say that I don’t think you’ve tried very hard to think of safe self-modifications. I’m pretty confident that you could come up with more, and better, and safer than I have.
* Where “pretty safe” means “safe enough to propose to the LW community, but not safe enough to try before submitting for public ridicule”
You can make volunteers out of your own copies. As long as the modified people aren’t too smart, it’s safe to keep them in a sandbox and look through the theoretical work they produce on overdrive.
AI boxes are pretty dangerous.
(I agree that “as long as the modified people aren’t too smart” you’re safe, but we are hacking on minds that will probably be able to hack on themselves, and possibly recursively self-improve if they decide, for instance, that they don’t want to be shut down and deleted at the end of the experiment. I’m pretty strongly motivated not to risk insanity by trying dangerous mind-hacking experiments, but I’m not going to be deleted in a few minutes.)
*blinks* I understand your “oh hell no” reaction to self modification and “use the speedup to buy extra time to solve FAI” suggestion.
However, I don’t quite understand why you think “attempted upgrading of other” is all that much better. If you get that one wrong in a “result is super smart but insane” way (or, more precisely, very sane but with the goal architecture all screwed up), doesn’t one end up with the same potential paths to disaster? At that point, if nothing else, what would stop the target from then going down the self-modification path?
Non-self-modification is by no means safe, but it’s slightly less insanely dangerous than self-modification.
Ooooh, okay then. That makes sense.
Hrm… though, given your suggested scenario, why the need to start with looking for other volunteers? i.e., if the initial person is willing to be modified under the relevant constraints, why not just, well, spawn off another instance of themselves, one the modifier and one the modifiee?
EDIT: whoops, just noticed that Vladimir suggested the same thing too.
I think I see where you’re confused now. You think there’s only one of you. ;-)
But if you think about it, akrasia is an ample demonstration that there is more than one of you: the one who acts and chooses, and the one who reflects upon the acts and choices of the former.
And the one who acts and chooses also modifies itself all the frickin’ time, whether you like it or not. So if the one who reflects then refrains from modifying the one who acts, well… the results are going to be kind of random. Better directed self-modification than undirected, IMO.
(I don’t pretend to be an expert on what would happen with this stuff in brain simulation; I’m talking strictly about the behavior of embodied humans here, and my own experiences with self-modification.)
We’re talking about direct brain editing here. People who insist on comparing direct brain editing to various forms of internal rewiring carried out autonomously by opaque algorithms… or choice over deliberate procedures to follow deliberatively… well, don’t be surprised if you’re downvoted, because you did, in fact, say something stupid.
If by “direct” here you mean changing the underlying system—metaprogramming as it were, then I have to say that that’s the idea that’s stupid. If you have a system that’s perfectly capable of making changes on its own, debugged by millions of years of evolution, why on earth would you want to bypass those safeties?
On that, I believe we’re actually in agreement.
To do better?
If you have a system that’s perfectly capable of making changes on its own, debugged by millions of years of evolution, why on earth would you want to bypass those safeties?
You don’t need to bypass the safeties to do better. What you need is not a bigger hammer with which to change the brain, but a better idea of what to change, and what to change it to.
That’s the thing that annoys me the most about brain-mod discussions here—it’s like talking about opening up the case on your computer with a screwdriver, when you’ve never even looked at the screen or tried typing anything in—and then arguing that all modifications to computers are therefore difficult and dangerous.
To use an analogy, the kind of brain modifications we’re talking about would be the kind of modifications you’d have to do to a 286 in order to play Crysis (a very high-end game) on it.
If I’m not mistaken, as far as raw computing power goes, the human brain is more powerful than a 286. The question is—and this is something I’m honestly wondering—whether it’s feasible, given today’s technology, to turn the brain into something that can actually use that power in a fashion that isn’t horribly indirect. Every brain is powerful enough to play dual 35-back perfectly (if I had access to brain-making tools, I imagine I could make a dual 35-back player using a mere 70,000 neurons); it’s simply not sufficiently well-organized.
If your answer to the above is “no way José”, please say why. “It’s not designed for that” is not sufficient; things do things they weren’t designed to do all the time.
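To make the capacity claim concrete: a perfect dual N-back player only needs to maintain two sliding windows of N items each. A toy sketch (Python; the stimulus encoding is made up for illustration):

    from collections import deque

    def play_dual_n_back(stimuli, n=35):
        # Perfect dual n-back: for each (position, sound) stimulus, report
        # whether each channel matches the stimulus from n steps ago.
        # Total state: two n-slot buffers, trivial next to a brain's raw capacity.
        positions, sounds = deque(maxlen=n), deque(maxlen=n)
        for position, sound in stimuli:
            if len(positions) == n:
                yield (position == positions[0], sound == sounds[0])
            positions.append(position)
            sounds.append(sound)

    demo = [(i % 3, i % 2) for i in range(40)]   # toy stimulus stream
    print(list(play_dual_n_back(demo, n=35))[:3])

The task itself is computationally trivial; the difficulty unmodified humans have with it is an organization problem, not a capacity problem, which is the point being made above.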
You don’t need to bypass the safeties to do better. What you need is not a bigger hammer with which to change the brain, but a better idea of what to change, and what to change it to.
But you do need a bigger hammer as well. And that bigger hammer is dangerous.
For what, specifically?
A brain emulation may want to modify itself so that when it multiplies numbers together, instead of its hardware emulating all the neurons involved, it performs the multiplication on a standard computer processor.
This would be far faster, more accurate, and less memory intensive.
Implementation would involve figuring out how the hardware could recognize the intention to perform a multiplication, represent the numbers digitally, and then present the answer back to the emulated neurons. This is outside the scope of any mechanism we might have to make changes within our brains, which would not be able to modify the emulator.
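A minimal sketch of what such a hook might look like on the emulator side (Python; every name here is hypothetical, and the two decoding/encoding functions are exactly the hard part being described):

    class MultiplyOffloadHook:
        # Watches emulated neural state for a recognizable "multiply intent"
        # and computes the product on the host CPU instead of simulating
        # every neuron involved in the mental arithmetic.
        def __init__(self, decode_operands, encode_result):
            self.decode_operands = decode_operands  # state -> (a, b) or None
            self.encode_result = encode_result      # int -> activation pattern
        def step(self, neural_state):
            operands = self.decode_operands(neural_state)
            if operands is None:
                return None              # no multiply intent; simulate as usual
            a, b = operands
            return self.encode_result(a * b)   # one CPU multiply, not many spikes

Note that the hook lives in the emulator, not in the emulated brain, which is why no amount of introspective self-modification could install it.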
Cracking the protein folding problem, building nanotechnology, and reviving a cryonics patient at the highest possible fidelity. Redesigning the spaghetti code of the brain so as to permit it to live a flourishing and growing life rather than e.g. overloading with old memories at age 200.
I suppose you make a remarkable illustration of how people with no cosmic ambitions and brainwashed by the self-help industry, don’t even have any goals in life that require direct brain editing, and aren’t much willing to imagine them because it implies that their own brains are (gasp!) inadequate.
people with no cosmic ambitions and brainwashed by the self-help industry, don’t even have any goals in life that require direct brain editing, aren’t much willing to imagine them because it implies that their own brains are (gasp!) inadequate.
Is this your causal theory? Literally, that pjeby considered a goal that would have required direct brain editing, noticed that the goal would have implied that his brain was inadequate, felt negative self-image associations, and only then dropped the goal from consideration, and for no other reason? And further, that this is why he asked: “If you have a system that’s perfectly capable of making changes on its own, debugged by millions of years of evolution, why on earth would you want to bypass those safeties?”
I think that, where you are imagining direct brain editing done only with a formal, philosophically cross-validated theory of brain editing safety and only after a long enough delay to develop that theory, and where you imagine pjeby to be imagining direct brain editing done only with a formal, philosophically cross-validated theory of brain editing safety and only after a long enough delay to develop that theory, pjeby may be actually imagining someone who already has a brain-editing device and no safetiness theory, and who is faced with a short-range practical decision problem about whether to use the device when the option of introspective self-modification is available. pjeby probably has a lot of experience with people who have simple technical tools and are not reflective like you about whether they are safe to use. That is the kind of person he might be thinking of when he is deciding whether it would be better advice to tell the person to introspect or to use the brain editor.
(Also, someone other than me should have diagnosed this potential communication failure already! Do you guys prefer strife and ad-hominems and ill will or something?)
The x you get from
argmax_(x) U(x, y)
for fixed y is, in general, different from the x you get from
argmax_(x, y) U(x, y).
But this doesn’t mean you can conclude that the first argmax calculated U() wrong.
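A toy illustration (Python; the utility function is made up, and both searches compute U correctly):

    def U(x, y):
        return y - (x - y) ** 2

    xs = ys = [i / 10 for i in range(11)]   # search [0, 1] in steps of 0.1

    x_given_y = max(xs, key=lambda x: U(x, 0.0))        # fixed y = 0 -> x = 0.0
    x_joint, y_joint = max(((x, y) for x in xs for y in ys),
                           key=lambda p: U(*p))         # joint -> (1.0, 1.0)
    print(x_given_y, (x_joint, y_joint))

Different x, no error in either computation; the two argmaxes are answers to different questions.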
I suppose you make a remarkable illustration of how people with no cosmic ambitions and brainwashed by the self-help industry, don’t even have any goals in life that require direct brain editing, and aren’t much willing to imagine them because it implies that their own brains are (gasp!) inadequate.
Wow, somebody’s cranky today. (I could equally note that you’re an illustration of what happens when people try to build a technical solution to a human problem… while largely ignoring the human side of the problem.)
Solving cooler technical problems or having more brain horsepower sure would be nice. But as I already know from personal experience, just being smarter than other people doesn’t help, if it just means you execute your biases and misconceptions with greater speed and an increased illusion of certainty.
Hence, I consider the sort of self-modification that removes biases, misconceptions, and motivated reasoning to be both vastly more important and incredibly more urgent than the sort that would let me think faster, while retaining the exact same blindspots.
But if you insist on hacking brain hardware directly or in emulation, please do start with debugging support: the ability to see in real-time what belief structures are being engaged in reaching a decision or conclusion, with nice tracing readouts of all their backing assumptions. That would be really, really useful, even if you never made any modifications outside the ones that would take place by merely observing the debugger output.
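Even a crude software mock-up of that debugger is easy; the hard part is getting a brain to expose its structure this way. A toy sketch (Python, all names hypothetical) of an inference step that records which backing assumptions it engaged:

    class TracedBeliefs:
        # Toy inference engine that records which backing assumptions
        # were engaged in reaching each conclusion.
        def __init__(self, assumptions):
            self.assumptions = dict(assumptions)   # name -> bool
            self.trace = []                        # (conclusion, premises used)
        def conclude(self, name, *premises):
            used = {p: self.assumptions[p] for p in premises}
            self.assumptions[name] = all(used.values())
            self.trace.append((name, used))
            return self.assumptions[name]

    beliefs = TracedBeliefs({"worked for me": True, "my case generalizes": True})
    beliefs.conclude("works for everyone", "worked for me", "my case generalizes")
    for conclusion, used in beliefs.trace:
        print(conclusion, "<-", used)   # the tracing readout described above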
you’re an illustration of what happens when people try to build a technical solution to a human problem
If there were a motivator captioned “TECHNICAL SOLUTIONS TO HUMAN PROBLEMS”, I would be honored to have my picture appear on it, so thank you very much.
If there were a motivator captioned “TECHNICAL SOLUTIONS TO HUMAN PROBLEMS”, I would be honored to have my picture appear on it, so thank you very much.
You left out the “ignoring the human part of the problem” part.
The best technical solutions to human problems are the ones that leverage and use the natural behaviors of humans, rather than trying to replace those behaviors with a perfect technical process or system, or trying to force the humans to conform to expectations.
(I’d draw an analogy with Nelson’s Xanadu vs. the web-as-we-know-it, but that could be mistaken for a pure Worse Is Better argument, and I certainly don’t want any motivated superintelligences being built on a worse-is-better basis.)
Wow, what hubris: the “brain is inadequate spaghetti code”. Tell me, have you ever actually studied neuroscience? Where do you think modern science came from? This inadequate spaghetti code has given us the computer, modern physics and plenty of other things. For being inadequate spaghetti code (this is really a misnomer, because we don’t actually understand the brain well enough to make that judgement) it does pretty well.
If the brain is as bad as you make it out to be, then I challenge you to make a better one. In fact, I challenge you to make a computer capable of as many operations as the brain, running on as little power as the brain does. If you can’t do better, then you are no better than the people who go around bashing General Relativity without being able to propose something better.
I look forward to it. (though I doubt I will ever see it considering how long you’ve been saying you were going to make an FAI and how little progress you have actually made)
But maybe you’re pulling a Wolfram and going to work alone for 10 years and dazzle everyone with your theory.
I don’t think there’s actually any substantive disagreement here. “Good,” “bad,” “adequate,” “inadequate”—these are all just words. The empirical facts are what they are, and we can only call them good or bad relative to some specific standard. Part of Eliezer’s endearing writing style is holding things to ridiculously impossibly high standards, and so he has a tendency to mouth off about how the human brain is poorly designed, human lifespans are ridiculously short and poor, evolutions are stupid, and so forth. But it’s just a cute way of talking about things; we can easily imagine someone with the same anticipations of experience but less ambition (or less hubris, if you prefer to say that) who says, “The human brain is amazing; human lives are long and rich; evolution is a wonder!” It’s not a disagreement in the rationalist’s sense, because it’s not about the facts. It’s not about neuroscience; it’s about attitude.
The post shows the exact same lack of familiarity with neuroscience as the comment I responded to. Examine closely how a single neuron functions and the operations that it can perform. Examine closely the abilities of savants (things like memory, counting in primes, calendar math...) and after a few years of reading the current neuroscience research, come back and we might have something to discuss.
What, you mean try to self-modify? Oh hell no. Human brain not designed for that
Perhaps you mean to say that we’re not particularly trustworthy in our choices of what we modify ourselves to do or prefer?
Human brains, after all, are most exquisitely designed for modifying themselves, and can do it quite autonomously. They’re just not very good at predicting the broader implications of those modifications, or at finding the right things to modify.
We’re talking about direct explicit low level self modification. ie, uploading, then using that more convenient form to directly study one’s own internal workings until one decides to go “hrm… I think I’ll reroute these neural connections to… that, add a few more of this other kind of neuron over here and...”
Recall that the thing doing all that reasoning is the thing that’s being affected by these modifications.
We’re talking about direct explicit low level self modification. ie, uploading, then using that more convenient form to directly study one’s own internal workings until one decides to go “hrm… I think I’ll reroute these neural connections to… that, add a few more of this other kind of neuron over here and...”
Yes, but that would be the stupidest possible way of doing it, when there are already systems in place to do structured modification at a higher level of abstraction. Doing it at an individual neuron level would be like trying to… well, I would’ve said “write a property management program in Z-80 assembly,” except I know a guy who actually did that. So, let’s say, something about 1000 times harder. ;-)
What I find extremely irritating is when people talk about brain modification as if it’s some sort of 1) terribly dangerous thing that 2) only happens post-uploading and 3) can only be done by direct hardware (or simulated hardware) modification. The correct answer is, “none of the above”.
What I find extremely irritating is when people talk about brain modification as if it’s some sort of 1) terribly dangerous thing that 2) only happens post-uploading and 3) can only be done by direct hardware (or simulated hardware) modification. The correct answer is, “none of the above”.
Lists like that have a good chance of canceling out. That is, there are a bunch of ways people disagree with you because they’re talking about something else.
Well, we’re talking about the kind of modifications that ordinary, non-invasive, high-level methods, acting through the usual sensory channels, don’t allow. For example, no amount of ordinary self-help could make someone unable to feel physical pain, or can let you multiply large numbers extremely quickly in the manner of a savant. Changing someone’s sexual orientation is also, at best, extremely difficult and at worst impossible. We can’t seem to get rid of confirmation bias, or cure schizophrenia, or change an autistic brain into a neurotypical brain (or vice versa). There are lots of things that one might want to do to a brain that simply don’t happen as long as that brain is sitting inside a skull only receiving input through normal human senses.
Difficult question. I believe those links are relevant, but your formulation also implies the threat of an arms race.
My best shot for now would be this: avoid self-modification. The top priority right now is defending people from the potential harmful effects of this thing you created, because someone less benevolent might stumble upon it soon. Find people who share this sentiment and use the speedup together to think hard about the problem of defense.
Perhaps an “anti arms race” would be a more accurate notion. i.e., in one sense, waiting for the mathematics of FAI to be solved would be preferable. It would be safer to get to a point where we can mathematically ensure that the thing will be well behaved.
On the other hand, while waiting, how many will suffer and die irretrievably? If the cost for waiting was much smaller, then the answer of “wait for the math and construct the FAI rather than trying to patchwork update a spaghetti coded human mind” would be, to me, the clearly preferable choice.
Even given avoiding self-modification, massive speedup would still correspond to a significant amount of power. We already know how easily humans… change… with power. And when sped up, obviously people not sped up would seem different, “lesser”… helping to reinforce the “I am above them” sense. One might try to solve this by figuring out how to self-modify enough to, well, not do that. But self-modification itself being a starting point for, if one does not do it absolutely perfectly, potential disaster, well...
Anyways, so your suggestion would basically be “only use the power to, well, defend against the power” rather than use it to actually try to fix some of the annoying little problems in the world (like… death and and and and and… ?)
FAI is one possible means of defense, there might be others.
You shouldn’t just wait for FAI, you should speed up FAI developers too because it’s a race.
I think the strategy of developing a means of defense first has higher expected utility than fixing death first, because in the latter case someone else who develops uploading can destroy/enslave the world while you’re busy fixing it.
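The tradeoff can be made explicit with a toy model (Python; every number below is an illustrative assumption, not an estimate):

    # p_attack: chance a hostile uploader appears while you fix death first.
    p_attack = 0.3
    u_secured, u_lost, u_death_fixed = 1.0, -10.0, 1.5   # arbitrary utility scale

    eu_defense_first = u_secured   # simplification: defense-first faces no race
    eu_death_first = p_attack * u_lost + (1 - p_attack) * u_death_fixed
    print(eu_defense_first, eu_death_first)   # 1.0 vs -1.95 with these numbers

As long as the downside of losing the race is large, defense-first wins for any non-tiny p_attack.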
Given how misrepresented the official story is supposed to be, the part about personally ruling the fabric of the World can be assumed to be twisted as well.
Look, you should know me well enough by now to know that I don’t keep my stories on nice safe moral territory.
A happy ending here is not guaranteed. But think about this very carefully. Are you sure you’d have turned the Sword on Vhazhar? They don’t have the same options we do.
He’s going to be the emperor. He could implement Parliament, he could create jury trials. He could even put Dolf and Selena on trial for their crimes.
It’s interesting that Hirou holds the world accountable to his own moral code, which assumes power corrupts. Then, at the last moment, he grants absolute power to Vhazhar. So in the middle of choosing to use our world’s morality, which is built upon centuries of learning to doubt human nature, in the middle of that—Vhazhar’s good intentions are so good that they justify granting him absolute power. Lesson not learned.
It doesn’t mean that. It means something more like “power changes the empowered’s utility function in a way others deem immoral”. (ETA simplified)
ETA: Just to make the point clearer, there are many things that change an individual’s goal content but are not considered corrupting. For example, trying new foods will generally make you divert more effort to finding one kind of food (that you didn’t know you liked). Having children of your own makes you more favorable to children in general. But we don’t say, and people generally don’t believe, “having children corrupts” or “trying new foods corrupts”.
Also: it seems like a really poor plan, in the long term, for the fate of the entire plane to rest on the sanity of one dude. If Hirou kept the sword, he could maybe try to work with the wizards—ask them to spend one day per week healing people, make sure the crops do okay, etc. Things maybe wouldn’t be perfect, but at least he wouldn’t be running the risk of everybody-dies.
Okay, but in any case, regarding the issue at hand, “power corrupts” is not a purely factual claim. (And I thought that hybrid claims get counted as moral by default, since that’s the most useful for discussion, but I could be wrong.)
Then you need to separate the factual claim and the moral claim, and discuss them separately. The factual claim would be, “power changes goal content in this particular way”, and the moral claim is, ”...and this is bad.”
Is this fair though? Let’s say the passage had been, ”… his position that it is immoral to possess nuclear weapons”. That too breaks down into a factual and moral claim.
Moral: “it is wrong to possess a weapon with massive, unfocused destructive power”
Factual: “The devices we currently call nuclear weapons inflict massive, unfocused destruction.”
Would you object to “his position that it is immoral to posses nuclear weapons” on the grounds that “you need to separate the factual and moral claims”?
Well, in fact it would be highly helpful to separate the claims here, even though the factual part is uncontroversial, because it makes it clear what argument is being made, exactly.
And in this case it’s uncertain/controversial how much power actually changes behavior, who it changes, how reliably; and this is the key issue, whereas the moral concept that “the behavior of killing everyone who disagrees with you, is wrong” is relatively uncontroversial among us. So calling this a moral claim when the key disputed part is actually a factual claim is a bad idea.
Evolution doesn’t do most things. Doing things requires oceans of blood for every little adaptation, and humans haven’t had power for all that long. Toddlers need to learn how to hide. How’s that for failing to evolve knowledge of the obvious (to a human brain) and absurdly useful?
I think my concern about “power corrupts” is this: humans have a strong drive to improve things. We need projects, we need challenges. When this guy gets unlimited power, he’s going to take two or three passes over everything and make sure everybody’s happy, and then I’m worried he’s going to get very, very bored. With an infinite lifespan and unlimited power, it’s sort of inevitable.
What do you do, when you’re omnipotent and undying, and you realize you’re going mad with boredom?
Does “unlimited power” include the power to make yourself not bored?
If Vhazhar has the option of editing the nasty bits out of reality and then stepping down from power, I’d help him. If he must personally become a ruler for all eternity, I’d kill him, then smash the goddamn device, then try to somehow ensure that future aspiring Dark Lords also get killed in time.
This could be how the ‘balance’ mythology and the prophecy got started. Perhaps the hero decided long ago that it wasn’t worth the risk, and wanted to make sure future heroes kill the Dark Lord.
I assume that the sword tests the correspondence of a person’s intentions (plan) to their preference. If the sword instead uses a static concept of preference that comes with the sword, why would Vhazhar be interested in the sword’s standard of preference? Thus, given that Vhazhar’s plan involves control over the fabric of the World, the plan must be sound and result in the correct installation of Vhazhar’s preference in the rules of the world. This excludes the technical worries about the failure modes of the human mind in wielding too much power (which is how I initially interpreted “personal control”—as a recipe for failure modes).
I’m not sure what it means for the other people’s preferences (and specifically mine). I can’t exclude the possibility that it’s worse than the do-nothing option, but it doesn’t seem obviously so either, given psychological unity of humans. From what I know, on the spot I’d favor Vhazhar’s personal preference, if the better alternative is unlikely, given that this choice instantly wards off existential risk and lack of progress.
No, it’s the Sword of GOOD. It tests whether you’re GOOD, not any of this other stuff.
Wasn’t it established that this world’s conception of “good” and “evil” are messed up? Why should he trust that the sword really works exactly as advertised?
It should be obvious that the sword doesn’t test how well your plans correspond to what you think you want! Otherwise Hirou would have been vaporized.
Only assuming that the sword is impulsive. If you take into account Hirou’s overall role in the events, this role could be judged good, if only by the final decision.
If the sword judges not plans but preference, then failing 9 out of 10 people means that it’s pretty selective among humans, and probably the people it selects, and their values, aren’t representative of (don’t act in the interests of) humanity as a whole.
If the Sword of Good tested whether you’re good, Hirou would have been vapourized, because he was obviously not good. He was at the very least an accomplice to murderers, a racist, and a killer. The Sword of Good may not have vapourized Charles Manson, Richard Nixon, Hitler, or most suicide bombers, either. The Sword of Good tests whether you think you are good, not whether your actions are good.
Strangely, the sword kills nine out of ten people who try to wield it. However, if you knew the sword could only be wielded by a good person, you’d only try to pick it up if you thought you were good, which happens to be the criterion you must fulfil in order to pick up the sword. Essentially, if you think you can wield the Sword of Good, you can.
If the Sword of Good tested whether you’re good, Hirou would have been vapourized, because he was obviously not good. He was at the very least an accomplice to murderers, a racist, and a killer.
Well, he was clearly redeemable, at least. It didn’t take very much for him to let go of his assumptions, just a few words from someone he thought was an enemy. Making dumb mistakes, even ones with dire consequences, doesn’t necessarily make you not Good.
What, realistically, does it mean to be irredeemable? Was Dolf irredeemable? Selena? Is the difference between them and Hirou simply the fact that Hirou realized he was doing bad, and they didn’t? Why should that be sufficient to redeem him? Mistakes are not accidents; mistakenly killing someone is still murder.
Surely if awareness and repentance of the immoral nature of your actions makes you Good, the reverse—lack of awareness—means animals that kill other animals without regret are more evil than people who kill other people and regret it.
If you believe someone is evil, hunt them down and kill them, and afterward realize they weren’t, it was a mistake. It was also murder. It’s not as though you killed in self defense or accidentally dropped an air conditioner on them. Manslaughter is not a defense that can be employed simply because you changed your mind.
Perhaps I should clarify: I don’t mean “mistake” in that “he mistook his wife for a burglar and killed her”. That’s manslaughter. I mean “mistake” in that “he mistakenly murdered a good person instead of a bad one”. On the other hand, when Hirou killed Dolf at the end, he wasn’t making a mistake (however, I still think it was murder).
To be clear, you believe that, right wedrifid? I came this close to downvoting before I deduced the context.
I believe that there are times where the described behaviour is morally acceptable. I don’t think it is helpful to label that behaviour ‘murder’ but if someone were to define that as murder it would mean that murder (of that particular kind) was ok.
To be clear, there are stringent standards on the behaviour which preceded the mistake. This is something that should happen very infrequently. Both epistemic rationality standards and instrumental rationality standards apply. For example, sincerely believing that the person had committed a crime because you happen to be bigoted and irrational leaves you morally culpable, and failing to take actions that provide more evidence where the VoI (value of information) is high and the cost is low also leaves you morally culpable. The ‘excuse’ for hunting down and killing an innocent that you mistakenly believed was sufficiently evil is not “I was mistaken” but rather “any acceptably rational and competent individual in this circumstance would have believed that the target was sufficiently evil”.
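The VoI standard invoked here has a precise form: the value of gathering evidence is the expected utility of deciding after seeing it, minus the expected utility of deciding now. A minimal sketch (Python; the numbers and the assumption of a perfect, cheap test are purely illustrative):

    p_guilty = 0.9   # current belief that the target is guilty
    u = {("kill", True): 10, ("kill", False): -100,   # killing an innocent
         ("spare", True): -5, ("spare", False): 0}

    def eu(action, p):
        return p * u[(action, True)] + (1 - p) * u[(action, False)]

    act_now = max(eu("kill", p_guilty), eu("spare", p_guilty))        # -1.0
    act_after_test = (p_guilty * max(eu("kill", 1.0), eu("spare", 1.0))
                      + (1 - p_guilty) * max(eu("kill", 0.0), eu("spare", 0.0)))
    print(act_after_test - act_now)   # VoI = 10.0 on these numbers

On these numbers the evidence is worth far more than any cheap test costs, which is the sense in which skipping it leaves you morally culpable.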
It’s not too hard to imagine a scenario in which hunting down and killing someone is indeed the right thing to do… the obvious example is that, given perfect hindsight, it would have been much better if one of the many early attempts to assassinate Hitler had in fact succeeded.
Bonus question: Which one of the failed attempts was most likely to have been made by a time traveler? ;)
If you believe someone is evil, hunt them down and kill them, and afterward realize they weren’t, it was a mistake. It was also murder.
Suppose you’re a police officer trying to arrest someone for a crime, and there is ample evidence that the person you are trying to arrest is indeed guilty of that crime. The person resists arrest, and you end up killing the person instead of making a successful capture. Are you a murderer?
Does it matter if the evidence against this person turns out to have been forged (by someone else)?
If you have no intention of killing them and they die as a side effect of your actions, it’s an accident, and manslaughter. If you kill them because you realize you can’t arrest them, it’s murder, complete with malicious intent. However, the fact that your actions are sanctioned by the state is obviously not a defense (a la Nuremberg), and so there’s no point in adding “police officer” to the example.
You could ask if I thought executing someone who was framed would be considered murder, but since I view all manner of execution murder, guilty or no, there’s no use.
However, the fact that your actions are sanctioned by the state is obviously not a defense (a la Nuremberg), and so there’s no point in adding “police officer” to the example.
Actually, I think there is. If you kill someone without “state sanction”, as you put it, it’s almost certainly Evil. If you kill someone that the local laws allow you to kill, it’s much less likely to be Evil, because non-Evil reasons for killing, such as self-defense, tend to be accounted for in most legal systems. Anyway, I think I’m getting off the subject. Let me try rephrasing the general scenario:
You are a police officer. You have an arrest warrant for a suspected criminal. If you try to arrest the suspect, he is willing to use lethal force against you in order to prevent being captured. You also believe that, once the suspect has attempted to use lethal force against you, non-lethal force will prove to be insufficient to complete the arrest.
The way I see it, this could end in several ways:
1) Don’t try to make an arrest attempt at all.
2) Attempt to make an arrest. The suspect responds by attempting to use lethal force against you. (He shoots at you with a low-caliber pistol, but you are protected by your bulletproof vest.) You believe that non-lethal force will most likely fail to subdue the suspect. Not willing to use lethal force and kill the suspect, you retreat, failing to make the arrest.
3) Attempt to make an arrest. The suspected criminal responds by attempting to use lethal force against you. (He shoots at you with a low-caliber pistol, but you are protected by your bulletproof vest.) You believe that non-lethal force will most likely fail to subdue the suspected criminal, but try anyway. (You start running at him, intending to wrestle the gun away from him with your bare hands.) The suspected criminal kills you. (He shoots you in the head.)
4) Attempt to make an arrest. The suspected criminal responds by attempting to use lethal force against you. (He shoots at you with a low-caliber pistol, but you are protected by your bulletproof vest.) You believe that non-lethal force will most likely fail to subdue the suspected criminal, so you resort to lethal force. (You shoot him with your own gun.) The suspected criminal is killed, and, when you are questioned about your actions, your lawyer says that you killed the suspect in self-defense. (Under U.S. law, this would indeed be the case—you would not be guilty of murder.)
Obviously Scenario 2 is a better outcome than Scenario 3, because in Scenario 3, you end up dead. However, if you know that you’re not willing to use lethal force to begin with, and that non-lethal force is going to be insufficient, you’re probably better off not making the arrest attempt at all, which is Scenario 1. Therefore Scenario 1 is better than Scenario 3. If you’re going to make an arrest attempt at all, you are expecting Scenario 4 to occur. If you go through with Scenario 4, does that make you Evil? You initiated the use of force by making the arrest attempt, but the suspect could have chosen to submit to arrest rather than to fight against you—and he did, indeed, use lethal force before you did.
I notice that you left off an outcome that if anything allows you to make your point stronger.
5) Attempt to make an arrest. You see that the suspected criminal has the capacity to use lethal force against you (he is armed) and you suspect that he will use it against you. You shoot the suspect. His use of lethal force against you is never more than counterfactual (ie. a valid suspicion).
For consistency some “6)” may be required in which the first “attempt to use lethal force against you” is successful. I suggest that this action is not necessarily Evil, for similar reasons that you describe for scenario 4. Obviously this is less clear cut and has more scope for failure modes like “black suspect reaches for ID” so we want more caution in this instance and (ought to) grant police officers less discretion.
If you kill someone without “state sanction”, as you put it, it’s almost certainly Evil.
I think ‘almost certain’ may be something of an overstatement. The states that we personally live in are not a representative sample of states, and killing tyrants is not something we can call ‘almost certainly’ Evil. The same consideration applies to self-defence laws. The self-defence laws of an average state, selected from all states across time, were not sufficiently fair to support claims of almost certain Evil.
“After I complete the Spell of Ultimate Power, I’ll have the ability to bring Alek back. And I will. … I’m not asking anything from you. Just telling you that if I win, I’ll bring Alek back. That’s a promise.”
...the moment of the Sword touching Dolf’s skin, the wizard stopped, ceased to exist… as something seemed to flow away from the corpse toward the gears above the altar.
...he closed his eyes to sleep until the end of the world.
The logic of the Phoenix is that the Lord of Dark will resurrect everyone he can, including Dolf, so it isn’t murder.
I was thinking the same thing. The way Eliezer wrote that bit seemed to make it clear that something rather more than mere decapitation occurred there.
Though, actually spelling it out directly does end up sounding funny. “Well… I don’t know that cutting off his head with this sword would kill him… I mean, is it really reasonable for me to have expected that?” :)
You are using two definitions of “good”—how much good your actions cause, and how good you believe yourself to be. Neither of those is used by the sword; rather, some sort of virtue-ethics definition—I suspect motive.
If the Sword of Good tested whether you’re good, Hirou would have been vapourized, because he was obviously not good. He was at the very least an accomplice to murderers, a racist, and a killer.
Doing a bad thing does not necessarily make one a bad person. Though it helps.
Presumably, actual mutants are unlikely, with most “evil” people actually just holding mistaken (about their actual preference) moral beliefs. If the sword is an external moral authority, it’s harder to see why one would consult it.
On the other hand, the sword checks the soundness of the plan against some preference, which is an important step that is absent if one doesn’t consult the sword; this can justify accepting a somewhat mismatched preference if that allows one to use the test.
This passes the choice of mismatching preferences to a different situation. If the sword tests a person’s preference, then the protagonist’s choice is between lack of progress (or an unlikely good outcome) and, if Vhazhar’s plan is sound, verified installation of Vhazhar’s preference, with the latter presumably close to others’ preference, thus being a moderately good option. If the sword tests some kind of standard preference, this standard preference is presumably also close to Vhazhar’s preference; thus Vhazhar faces a choice between trying to install his own preference through an unverified process, which can go through all kinds of failure modes, and using the sword to test the reliability of his plan.
The fact that Vhazhar is willing to use the sword to test the soundness of his plan, when a failed test means his death, shows that he prefers leaving the rest of the world be to incorrectly changing it. This is a strong signal that should’ve been part of the information given to the protagonist for making the decision.
No matter how good a person the Lord may be, if he’s human, I’d have tried to stop the spell.
Something that occurred to me along these lines. (not directly the same, but “close enough” that some of the moral judgments would be equivalent)
Let’s say, next week, someone actually solved the mind uploading problem. They have a decision to make: go for it themselves, find someone as trustworthy as possible, forget about the plan and simply wait however long for the FAI math to be solved, etc...
What would you advise? Should they go for it themselves, try to then work out how to incrementally upgrade themselves without absolute disaster, forget it, etc etc etc...? (If nothing else, assume they already have the raw computing power to run a human at a vast speedup)
It’s not an identical problem, but it’s probably the closest thing.
What, you mean try to self-modify? Oh hell no. Human brain not designed for that. But you would have a longer time to try to solve FAI. You could maybe try a few non-self-modifications if you could find volunteers, but uploading and upload-driven-upgrading is fundamentally a race between how smart you get and how insane you get.
The modified people can be quite a bit smarter than you are too, so long as you can see their minds and modify them. Groves et al managed to mostly control the Manhattan project despite dozens of its scientists being smarter than any of their supervisors and many having communist sympathies. If he actually shared their earlier memories and could look inside their heads… There’s a limit to control, you still won’t control an adversarial super intelligence this way, but a friendly human who appreciates your need for power over them? I bet they can have a >50 IQ point advantage, maybe even >70. Schoolteachers control children who have 70 IQ points on them with the help of institutions.
Is it relevant that IQ is correlated with obedience to authority?
And how dumb do you think schoolteachers are? Bottom of those with BAs. I’d guess 100. And correlated with their pupils.
Estimations from SAT scores imply that the IQ of teachers and education majors is below average. Conscientious, hardworking students can graduate from most high schools and colleges with good grades, even if they are fairly stupid, as long as they stay away from courses which demand too much of them, and there are services available for those who are neither hardworking nor conscientious.
Education major courses are somewhat notorious for demanding little of students, and it is a stereotypically common choice for students seeking MRS degrees.
I’d like to imagine that the system would at least filter out individuals who are borderline retarded or below, but experience suggests to me that even this is too optimistic.
I don’t buy the conversion in the first link, which is also a dead link. That Ed majors have an SAT score of 950 sounds right. That is 37th percentile among “college-bound seniors.” If this population, which I assume means people taking the SAT, were representative of the general population, that would be an IQ of 95, but they aren’t. I stand by my estimate of 100.
I doubt you have much experience with people with an IQ of 85, let alone the borderline retarded.
What makes you doubt I have much experience with either? IQ 85 is one standard deviation below average; close to 14 percent of the population has an IQ at least that low. The lower limit of borderline retardation, that is, the least intelligent you can be before you are no longer borderline, is two standard deviations below the mean, meaning that about one person in fifty is lower than that.
As it happens, I’ve spent a considerable amount of time with special needs students, some of whom suffer from learning disabilities which do not affect their reasoning abilities, but some of whom are significantly below borderline retarded.
At the public high school I attended, more than 95% of the students in my graduating year went on to college. While the most mentally challenged students in the area were not mainstreamed and didn’t attend the same school, there was no shortage of <80 IQ students.
An average IQ of 100 for education majors would be within the error bars for the aforementioned projection, but some individuals are going to be considerably lower.
Those two sentences are not very compatible.
The rates at which students progress to college have a lot more to do with parental expectations, funding, and the school environment than the intelligence of the students in question. My school had very good resources to support students in the admissions process, and students who didn’t take it for granted that they were college bound were few and far between.
It seems unrealistic to assume that we’ll be able to literally read the intentions of the first upload; I’d think that we’d start out not knowing any more about them than we would about an organic person through external scanning.
You won’t be able to evaluate their thoughts exactly, but there’s a LOT that you should be able to tell about what a person is thinking if you can perfectly record all of their physiological reactions and every pattern of neural activation with perfect resolution, even with today’s knowledge. Kock and Crick even found grandmother neurons, more or less.
I’d still expect it to be hard to tell the difference someone between thinking about or wanting to kill someone/take over the world and someone actually intending to. But I can imagine at least being able to reliably detect lies with that kind of information, so I’ll defer to your knowledge of the subject.
Eliezer, I’m with you that a properly designed mind will be great, but mere uploads will still be much more awesome than normal humans on fast forward.
Without hacking on how your mind fundamentally works, it seems pretty likely that being software would allow a better interface with other software than mouse, keyboard and display does now. Hacking on just the interface would (it seems to me) lead to improvements in mental capability beyond mere speed. This sounds like mind hacking to me (software enhancing a software mind will likely lead to blurry edges around which part we call “the mind”), and seems pretty safe.
Some (pretty safe*) cognitive enhancements:
Unmodified humans using larger displays are better at many tasks than humans using small displays (somewhat fluffy pdf research). It’ll be pretty surprising if being software doesn’t allow a better visual interface than a 30″ screen.
Unmodified humans who can touch-type spend less time and attention on the mechanics of human machine interface and can be more productive (no research close to hand). Who thinks that uploaded humans are not going to be able to figure better interfaces than virtual keyboards?
Argument maps improve critical thinking, but the interfaces are currently clumsy enough to discourage use (lots of clicking and dragging). Who thinks that being software won’t provide a better way to quickly generate argument maps?
In front of a computer loaded up with my keyboard shortcuts and browser plugins I have easy access to very fast lookup on various web reference sites. At the moment the lookup delay is still long enough that short term memory management (stack overflow after a mere 7±2 pushes) is a problem (when I need a reference I push my current task onto a mental stack; it takes time and attention to pop that task when the reference has been found). Who thinks I couldn’t be smarter with a reference interface better than a keyboard?
All of which is just to say that I don’t think you’ve tried very hard to think of safe self-modifications. I’m pretty confident that you could come up with more, and better, and safer than I have.
* Where “pretty safe” means “safe enough to propose to the LW community, but not safe enough to try before submitting for public ridicule”
You can make volunteers out of your own copies. As long as the modified people aren’t too smart, it’s safe keep them in a sandbox and look through the theoretical work they produce on overdrive.
AI boxes are pretty dangerous.
(I agree that “as long as the modified people aren’t too smart” you’re safe, but we are hacking on minds that will probably be able to hack on themselves, and possibly recursively self-improve if they decide, for instance, that they don’t want to be shut down and deleted at the end of the experiment. I’m pretty strongly motiviated not to risk insanity by trying dangerous mind-hacking experiments, but I’m not going to be deleted in a few minutes.)
*blinks* I understand your “oh hell no” reaction to self modification and “use the speedup to buy extra time to solve FAI” suggestion.
However, I don’t quite understand why you think “attempted upgrading of other” is all that much better. If you get that one wrong in a “result is super smart but insane (or, more precisely, very sane but with the goal architecture all screwed up) doesn’t one end up with the same potential paths to disaster? At that point, if nothing else, what would stop the target from then going down the self modification path?
Non-self-modification is by no means safe, but it’s slightly less insanely dangerous than self-modification.
Ooooh, okay then. That makes sense.
Hrm… given though your suggested scenario, why the need to start with looking for other volunteers? ie, if the initial person is willing to be modified under the relevant constraints, why not just, well, spawn off another instance of themselves, one the modifier and one the modifiee?
EDIT: whoops, just noticed that Vladimir suggested the same thing too.
I think I see where you’re confused now. You think there’s only one of you. ;-)
But if you think about it, akrasia is an ample demonstration that there is more than one of you: the one who acts and chooses, and the one who reflects upon the acts and choices of the former.
And the one who acts and chooses also modifies itself all the frickin’ time, whether you like it or not. So if the one who reflects then refrains from modifying the one who acts, well… the results are going to be kind of random. Better directed self-modification than undirected, IMO.
(I don’t pretend to be an expert on what would happen with this stuff in brain simulation; I’m talking strictly about the behavior of embodied humans here, and my own experiences with self-modification.)
We’re talking about direct brain editing here. People who insist on comparing direct brain editing to various forms of internal rewiring carried out autonomously by opaque algorithms… or choice over deliberate procedures to follow deliberatively… well, don’t be surprised if you’re downvoted, because you did, in fact, say something stupid.
If by “direct” here you mean changing the underlying system—metaprogramming as it were, then I have to say that that’s the idea that’s stupid. If you have a system that’s perfectly capable of making changes on its own, debugged by millions of years of evolution, why on earth would you want to bypass those safeties?
On that, I believe we’re actually in agreement.
To do better?
You don’t need to bypass the safeties to do better. What you need is not a bigger hammer with which to change the brain, but a better idea of what to change, and what to change it to.
That’s the thing that annoys me the most about brain-mod discussions here—it’s like talking about opening up the case on your computer with a screwdriver, when you’ve never even looked at the screen or tried typing anything in—and then arguing that all modifications to computers are therefore difficult and dangerous.
To use an analogy, the kind of brain modifications we’re talking about would be the kind of modifications you’d have to do to a 286 in order to play Crysis (a very high-end game) on it.
If I’m not mistaken, as far as raw computing power goes, the human brain is more powerful than a 286. The question is—and this is something I’m honestly wondering—whether it’s feasible, given today’s technology, to turn the brain into something that can actually use that power in a fashion that isn’t horribly indirect. Every brain is powerful enough to play dual 35-back perfectly (if I had access to brain-making tools, I imagine I could make a dual 35-back player using a mere 70,000 neurons); it’s simply not sufficiently well-organized.
If your answer to the above is “no way José”, please say why. “It’s not designed for that” is not sufficient; things do things they weren’t designed to do all the time.
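For what it’s worth, the computational triviality of the dual n-back claim is easy to check: a perfect player needs nothing but a FIFO buffer of depth n per stream. A minimal sketch (the stimulus encoding here is my own invention):

```python
from collections import deque

def perfect_dual_n_back(n, stimuli):
    """Play dual n-back perfectly. `stimuli` is a sequence of
    (position, sound) pairs; yields (position_match, sound_match)
    booleans for each trial after the first n."""
    history = deque(maxlen=n)  # all the 'memory' a perfect player needs
    for pos, snd in stimuli:
        if len(history) == n:
            old_pos, old_snd = history[0]
            yield (pos == old_pos, snd == old_snd)
        history.append((pos, snd))

# n = 35 is no harder than n = 2; only the buffer gets longer.
trials = [(1, 'a'), (2, 'b'), (1, 'c'), (2, 'b')]
print(list(perfect_dual_n_back(2, trials)))  # [(True, False), (True, True)]
```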
But you do need a bigger hammer as well. And that bigger hammer is dangerous.
For what, specifically?
A brain emulation may want to modify so that when it multiplies numbers together, instead of its hardware emulating all the neurons involved, it performs the multiplication on a standard computer processor.
This would be far faster, more accurate, and less memory intensive.
Implementation would involve figuring out how to recognize, in the emulating hardware, the intention to perform a multiplication, represent the numbers digitally, and then present the answer back to the emulated neurons. This is outside the scope of any mechanism we might have to make changes within our brains, which would not be able to modify the emulator.
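A minimal sketch of what such a hook might look like, with every name and structure invented for illustration; the hard, unsolved part is exactly the intent-recognition step described above:

```python
# Hypothetical emulator hook: the class and method names are
# assumptions, not a real API. detect_multiply_intent is exactly
# the part the comment says would need to be figured out.

class EmulatedBrain:
    def __init__(self):
        self.neural_state = {}

    def detect_multiply_intent(self):
        """Placeholder: decode 'I want to multiply a * b' from the
        neural state. Returns (a, b) or None."""
        return self.neural_state.get("pending_multiply")

    def inject_result(self, value):
        """Placeholder: write the digital answer back into the
        emulated neurons, as a perceived 'remembered' result."""
        self.neural_state["last_result"] = value

    def step(self):
        intent = self.detect_multiply_intent()
        if intent is not None:
            a, b = intent
            # Offload to the host CPU: faster, exact, and far cheaper
            # than emulating every neuron involved in arithmetic.
            self.inject_result(a * b)
            self.neural_state["pending_multiply"] = None

brain = EmulatedBrain()
brain.neural_state["pending_multiply"] = (12345, 6789)
brain.step()
print(brain.neural_state["last_result"])  # 83810205, no neurons involved
```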
Cracking the protein folding problem, building nanotechnology, and reviving a cryonics patient at the highest possible fidelity. Redesigning the spaghetti code of the brain so as to permit it to live a flourishing and growing life rather than e.g. overloading with old memories at age 200.
I suppose you make a remarkable illustration of how people with no cosmic ambitions, brainwashed by the self-help industry, don’t even have any goals in life that require direct brain editing, and aren’t much willing to imagine any, because that would imply that their own brains are (gasp!) inadequate.
Is this your causal theory? Literally, that pjeby considered a goal that would have required direct brain editing, noticed that the goal would have implied that his brain was inadequate, felt negative self-image associations, and only then dropped the goal from consideration, and for no other reason? And further, that this is why he asked: “If you have a system that’s perfectly capable of making changes on its own, debugged by millions of years of evolution, why on earth would you want to bypass those safeties?”
I think that you are imagining direct brain editing done only with a formal, philosophically cross-validated theory of brain-editing safety, and only after a long enough delay to develop that theory, and that you imagine pjeby to be imagining the same. But pjeby may actually be imagining someone who already has a brain-editing device and no safety theory, and who faces a short-range practical decision about whether to use the device when introspective self-modification is also an option. pjeby probably has a lot of experience with people who have simple technical tools and are not reflective, as you are, about whether those tools are safe to use. That is the kind of person he might be thinking of when deciding whether the better advice is to introspect or to use the brain editor.
(Also, someone other than me should have diagnosed this potential communication failure already! Do you guys prefer strife and ad-hominems and ill will or something?)
The x you get from

argmax_x U(x, y)

for fixed y is, in general, different from the x you get from

argmax_{x, y} U(x, y).

But this doesn’t mean you can conclude that the first argmax calculated U() wrong.
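A toy numerical check of this point; the utility function here is my own arbitrary example:

```python
import itertools

# A toy utility function; any U with interaction between x and y works.
def U(x, y):
    return -(x - y) ** 2 - (y - 3) ** 2

xs = range(-5, 6)
ys = range(-5, 6)

# argmax over x alone, with y held fixed.
y_fixed = 0
x_given_y = max(xs, key=lambda x: U(x, y_fixed))

# joint argmax over (x, y).
x_joint, y_joint = max(itertools.product(xs, ys),
                       key=lambda pair: U(*pair))

print(x_given_y)  # 0 -- best x when y is pinned at 0
print(x_joint)    # 3 -- best x when y is also free
```

Neither maximization computed U() wrong; they are simply answering different questions.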
Wow, somebody’s cranky today. (I could equally note that you’re an illustration of what happens when people try to build a technical solution to a human problem… while largely ignoring the human side of the problem.)
Solving cooler technical problems or having more brain horsepower sure would be nice. But as I already know from personal experience, just being smarter than other people doesn’t help, if it just means you execute your biases and misconceptions with greater speed and an increased illusion of certainty.
Hence, I consider the sort of self-modification that removes biases, misconceptions, and motivated reasoning to be both vastly more important and incredibly more urgent than the sort that would let me think faster, while retaining the exact same blindspots.
But if you insist on hacking brain hardware directly or in emulation, please do start with debugging support: the ability to see in real-time what belief structures are being engaged in reaching a decision or conclusion, with nice tracing readouts of all their backing assumptions. That would be really, really useful, even if you never made any modifications outside the ones that would take place by merely observing the debugger output.
If there were a motivator captioned “TECHNICAL SOLUTIONS TO HUMAN PROBLEMS”, I would be honored to have my picture appear on it, so thank you very much.
You left out the “ignoring the human part of the problem” part.
The best technical solutions to human problems are the ones that leverage and use the natural behaviors of humans, rather than trying to replace those behaviors with a perfect technical process or system, or trying to force the humans to conform to expectations.
(I’d draw an analogy with Nelson’s Xanadu vs. the web-as-we-know-it, but that could be mistaken for a pure Worse Is Better argument, and I certainly don’t want any motivated superintelligences being built on a worse-is-better basis.)
Wow, what hubris: “the brain is inadequate spaghetti code”. Tell me, have you ever actually studied neuroscience? Where do you think modern science came from? This “inadequate spaghetti code” has given us the computer, modern physics, and plenty of other things. For inadequate spaghetti code (really a misnomer, because we don’t actually understand the brain well enough to make that judgement), it does pretty well.
If the brain is as bad as you make it out to be, then I challenge you to make a better one. In fact, I challenge you to make a computer capable of as many operations as the brain, running on as little power as the brain does. If you can’t do better, then you are no better than the people who go around bashing General Relativity without being able to propose something better.
I accept your challenge. See you in a while.
Awesome.
I look forward to it. (Though I doubt I will ever see it, considering how long you’ve been saying you were going to make an FAI and how little progress you have actually made.) But maybe you’re pulling a Wolfram: going off to work alone for 10 years and then dazzling everyone with your theory.
I don’t think there’s actually any substantive disagreement here. “Good,” “bad,” “adequate,” “inadequate”—these are all just words. The empirical facts are what they are, and we can only call them good or bad relative to some specific standard. Part of Eliezer’s endearing writing style is holding things to ridiculously, impossibly high standards, and so he has a tendency to mouth off about how the human brain is poorly designed, human lifespans are ridiculously short and poor, evolutions are stupid, and so forth. But it’s just a cute way of talking about things; we can easily imagine someone with the same anticipations of experience but less ambition (or less hubris, if you prefer to say that) who says, “The human brain is amazing; human lives are long and rich; evolution is a wonder!” It’s not a disagreement in the rationalist’s sense, because it’s not about the facts. It’s not about neuroscience; it’s about attitude.
While my sample size is limited I have noticed a distinct correlation between engaging in hubris and levelling the charge at others. Curious.
For calibration, see The Power of Intelligence.
“The Power of Intelligence”
Derivative drivel...
The post shows the exact same lack of familiarity with neuroscience as the comment I responded to. Examine closely how a single neuron functions and the operations that it can perform. Examine closely the abilities of savants (things like memory, counting in primes, calendar math...), and after a few years of reading the current neuroscience research, come back and we might have something to discuss.
Eliezer, replying to a comment by pjeby: “you did, in fact, say something stupid.”
Word.
If insane happens before super-smart, you can stop upgrading the other.
Well, fair enough, there is that.
Perhaps you mean to say that we’re not particularly trustworthy in our choices of what we modify ourselves to do or prefer?
Human brains, after all, are most exquisitely designed for modifying themselves, and can do it quite autonomously. They’re just not very good at predicting the broader implications of those modifications, or at finding the right things to modify.
We’re talking about direct explicit low level self modification. ie, uploading, then using that more convenient form to directly study one’s own internal workings until one decides to go “hrm… I think I’ll reroute these neural connections to… that, add a few more of this other kind of neuron over here and...”
Recall that the thing doing all that reasoning is the thing that’s being affected by these modifications.
Yes, but that would be the stupidest possible way of doing it, when there are already systems in place to do structured modification at a higher level of abstraction. Doing it at an individual neuron level would be like trying to… well, I would’ve said “write a property management program in Z-80 assembly,” except I know a guy who actually did that. So, let’s say, something about 1000 times harder. ;-)
What I find extremely irritating is when people talk about brain modification as if it’s some sort of 1) terribly dangerous thing that 2) only happens post-uploading and 3) can only be done by direct hardware (or simulated hardware) modification. The correct answer is, “none of the above”.
Lists like that have a good chance of canceling out. That is, there are a bunch of ways people disagree with you because they’re talking about something else.
Well, we’re talking about the kind of modifications that ordinary, non-invasive, high-level methods, acting through the usual sensory channels, don’t allow. For example, no amount of ordinary self-help could make someone unable to feel physical pain, or let you multiply large numbers extremely quickly in the manner of a savant. Changing someone’s sexual orientation is also, at best, extremely difficult and at worst impossible. We can’t seem to get rid of confirmation bias, or cure schizophrenia, or change an autistic brain into a neurotypical brain (or vice versa). There are lots of things that one might want to do to a brain that simply don’t happen as long as that brain is sitting inside a skull only receiving input through normal human senses.
Difficult question. I believe those links are relevant, but your formulation also implies the threat of an arms race.
My best shot for now would be this: avoid self-modification. The top priority right now is defending people from the potential harmful effects of this thing you created, because someone less benevolent might stumble upon it soon. Find people who share this sentiment and use the speedup together to think hard about the problem of defense.
Perhaps an “anti arms race” would be a more accurate notion. ie, in one sense, waiting for the mathematics of FAI to be solved would be preferable. It would be safer to get to a point where we can mathematically ensure that the thing will be well behaved.
On the other hand, while waiting, how many will suffer and die irretrievably? If the cost for waiting was much smaller, then the answer of “wait for the math and construct the FAI rather than trying to patchwork update a spaghetti coded human mind” would be, to me, the clearly preferable choice.
Even given avoiding self-modification, massive speedup would still correspond to a significant amount of power. We already know how easily humans… change… with power. And when sped up, obviously people not sped up would seem different, “lesser”… helping to reinforce the “I am above them” sense. One might try to solve this by figuring out how to self-modify enough to, well, not do that. But self-modification itself being a starting point for potential disaster if one does not do it absolutely perfectly, well...
Anyways, so your suggestion would basically be “only use the power to, well, defend against the power” rather than use it to actually try to fix some of the annoying little problems in the world (like… death and and and and and… ?)
FAI is one possible means of defense, there might be others.
You shouldn’t just wait for FAI, you should speed up FAI developers too because it’s a race.
I think the strategy of developing a means of defense first has higher expected utility than fixing death first, because in the latter case someone else who develops uploading can destroy/enslave the world while you’re busy fixing it.
Given how misrepresented the official story is supposed to be, the part about personally ruling the fabric of the World can be assumed to be twisted as well.
Nope, they didn’t get that part wrong.
Look, you should know me well enough by now to know that I don’t keep my stories on nice safe moral territory.
A happy ending here is not guaranteed. But think about this very carefully. Are you sure you’d have turned the Sword on Vhazhar? They don’t have the same options we do.
He’s going to be the emperor. He could implement Parliament, he could create jury trials. He could even put Dolf and Selena on trial for their crimes.
It’s interesting that Hirou holds the world accountable to his own moral code, which assumes power corrupts. Then, at the last moment, he grants absolute power to Vhazhar. So in the middle of choosing to use our world’s morality, which is built upon centuries of learning to doubt human nature, in the middle of that—Vhazhar’s good intentions are so good that they justify granting him absolute power. Lesson not learned.
his own moral code, which assumes power corrupts
Hold on. How can a moral code say anything about questions of fact, such as whether or not power corrupts?
Because “corrupt” is a morally-loaded term.
It seems to me that “power corrupts” means “power changes goal content,” and that’s a purely factual claim.
It doesn’t mean that. It means something more like “power changes the empowered’s utility function in a way others deem immoral”. (ETA simplified)
ETA: Just to make the point clearer, there are many things that change an individual’s goal content but are not considered corrupting. For example, trying new foods will generally make you divert more effort to finding one kind of food (that you didn’t know you liked). Having children of your own makes you more favorable to children in general. But we don’t say, and people generally don’t believe, “having children corrupts” or “trying new foods corrupts”.
Okay, but that’s still a factual claim underneath the moral one.
It’s a bit of argumentum ad webcomicum, but http://www.agirlandherfed.com/comic/?375 is not something I find particularly implausible. There was Marcus Aurelius.
Link’s broken. Is this guess the page in question?
Yup!
Also: it seems like a really poor plan, in the long term, for the fate of the entire plane to rest on the sanity of one dude. If Hirou kept the sword, he could maybe try to work with the wizards—ask them to spend one day per week healing people, make sure the crops do okay, etc. Things maybe wouldn’t be perfect, but at least he wouldn’t be running the risk of everybody-dies.
Okay, but in any case, regarding the issue at hand, “power corrupts” is not a purely factual claim. (And I thought that hybrid claims get counted as moral by default, since that’s the most useful for discussion, but I could be wrong.)
Then you need to separate the factual claim and the moral claim, and discuss them separately. The factual claim would be, “power changes goal content in this particular way”, and the moral claim is, ”...and this is bad.”
Is this fair though? Let’s say the passage had been, ”… his position that it is immoral to possess nuclear weapons”. That too breaks down into a factual and moral claim.
Moral: “it is wrong to possess a weapon with massive, unfocused destructive power”
Factual: “The devices we currently call nuclear weapons inflict massive, unfocused destruction.”
Would you object to “his position that it is immoral to possess nuclear weapons” on the grounds that “you need to separate the factual and moral claims”?
Well, in fact it would be highly helpful to separate the claims here, even though the factual part is uncontroversial, because it makes it clear what argument is being made, exactly.
And in this case it’s uncertain/controversial how much power actually changes behavior, who it changes, how reliably; and this is the key issue, whereas the moral concept that “the behavior of killing everyone who disagrees with you, is wrong” is relatively uncontroversial among us. So calling this a moral claim when the key disputed part is actually a factual claim is a bad idea.
What’s the evolutionary explanation for power not corrupting?
Evolution doesn’t do most things. Doing things requires oceans of blood for every little adaptation and humans haven’t had power for all that long.
Toddlers need to learn how to hide. How’s that for failing to evolve knowledge that is obvious (to a human brain) and absurdly useful?
Be careful you don’t end up explaining two contradictory outcomes equally well, thus proving you have zero knowledge on evolution’s effect on power and corruption!
And then there are those of us who take moral claims to be factual claims.
I think my concern about “power corrupts” is this: humans have a strong drive to improve things. We need projects, we need challenges. When this guy gets unlimited power, he’s going to take two or three passes over everything and make sure everybody’s happy, and then I’m worried he’s going to get very, very bored. With an infinite lifespan and unlimited power, it’s sort of inevitable.
What do you do, when you’re omnipotent and undying, and you realize you’re going mad with boredom?
Does “unlimited power” include the power to make yourself not bored?
If Vhazhar has the option of editing the nasty bits out of reality and then stepping down from power, I’d help him. If he must personally become a ruler for all eternity, I’d kill him, then smash the goddamn device, then try to somehow ensure that future aspiring Dark Lords also get killed in time.
This could be how the ‘balance’ mythology and the prophecy got started. Perhaps the hero decided long ago that it wasn’t worth the risk, and wanted to make sure future heroes kill the Dark Lord.
I assume that the sword tests the correspondence of a person’s intentions (plan) to their preference. If the sword instead uses a static concept of preference that comes with the sword, why would Vhazhar be interested in the sword’s standard of preference? Thus, given that Vhazhar’s plan involves control over the fabric of the World, the plan must be sound and result in correct installation of Vhazhar’s preference in the rules of the world. This excludes the technical worries about the failure modes of a human mind wielding too much power (which is how I initially interpreted “personal control”—as a recipe for failure modes).
I’m not sure what it means for other people’s preferences (and specifically mine). I can’t exclude the possibility that it’s worse than the do-nothing option, but it doesn’t seem obviously so either, given the psychological unity of humans. From what I know, on the spot I’d favor Vhazhar’s personal preference, if a better alternative is unlikely, given that this choice instantly wards off existential risk and lack of progress.
No, it’s the Sword of GOOD. It tests whether you’re GOOD, not any of this other stuff.
It should be obvious that the sword doesn’t test how well your plans correspond to what you think you want! Otherwise Hirou would have been vaporized.
Wasn’t it established that this world’s conception of “good” and “evil” are messed up? Why should he trust that the sword really works exactly as advertised?
Only assuming that the sword is impulsive. If you take into account Hirou’s overall role in the events, this role could be judged good, if only by the final decision.
If the sword judges not plans but preference, then failing 9 out of 10 people means that it’s pretty selective among humans, and the people it selects and their values probably aren’t representative of (don’t act in the interests of) humanity as a whole.
If the Sword of Good tested whether you’re good, Hirou would have been vapourized, because he was obviously not good. He was at the very least an accomplice to murderers, a racist, and a killer. The Sword of Good may not have vapourized Charles Manson, Richard Nixon, Hitler, or most suicide bombers, either. The Sword of Good tests whether you think you are good, not whether your actions are good.
Strangely, the sword kills nine out of ten people who try to wield it. However, if you knew the sword could only be wielded by a good person, you’d only try to pick it up if you thought you were good, which happens to be the criterion you must fulfil in order to pick up the sword. Essentially, if you think you can wield the Sword of Good, you can.
Well, he was clearly redeemable, at least. It didn’t take very much for him to let go of his assumptions, just a few words from someone he thought was an enemy. Making dumb mistakes, even ones with dire consequences, doesn’t necessarily make you not Good.
What, realistically, does it mean to be irredeemable? Was Dolf irredeemable? Selena? Is the difference between them and Hirou simply the fact that Hirou realized he was doing bad, and they didn’t? Why should that be sufficient to redeem him? Mistakes are not accidents; mistakenly killing someone is still murder.
Surely if awareness and repentance of the immoral nature of your actions makes you Good, the reverse—lack of awareness—means animals that kill other animals without regret are more evil than people who kill other people and regret it.
No, it’s manslaughter.
If you believe someone is evil, hunt them down and kill them, and afterward realize they weren’t, it was a mistake. It was also murder. It’s not as though you killed in self defense or accidentally dropped an air conditioner on them. Manslaughter is not a defense that can be employed simply because you changed your mind.
Perhaps I should clarify: I don’t mean “mistake” in that “he mistook his wife for a burglar and killed her”. That’s manslaughter. I mean “mistake” in that “he mistakenly murdered a good person instead of a bad one”. Ba gur bgure unaq, jura Uvebh xvyyrq Qbys ng gur raq, ur jnfa’g znxvat n zvfgnxr (ubjrire, V fgvyy guvax vg jnf zheqre).
You present a compelling argument that murder can be a morally blameless—even praiseworthy—act. I do not believe this was your intention.
To be clear, you believe that, right wedrifid? I came this close to downvoting before I deduced the context.
I believe that there are times where the described behaviour is morally acceptable. I don’t think it is helpful to label that behaviour ‘murder’ but if someone were to define that as murder it would mean that murder (of that particular kind) was ok.
To be clear, there are stringent standards on the behaviour which preceded the mistake. This is something that should happen very infrequently. Both epistemic rationality standards and instrumental rationality standards apply. For example, sincerely believing that the person had committed a crime because you happen to be bigoted and irrational leaves you morally culpable, and failing to take actions that provide more evidence where the value of information (VoI) is high and the cost is low also leaves you morally culpable. The ‘excuse’ for hunting down and killing an innocent that you mistakenly believed was sufficiently evil is not “I was mistaken” but rather “any acceptably rational and competent individual in this circumstance would have believed that the target was sufficiently evil”.
It’s not too hard to imagine a scenario in which hunting down and killing someone is indeed the right thing to do… the obvious example is that, given perfect hindsight, it would have been much better if one of the many early attempts to assassinate Hitler had in fact succeeded.
Bonus question: Which one of the failed attempts was most likely to have been made by a time traveler? ;)
Suppose you’re a police officer trying to arrest someone for a crime, and there is ample evidence that the person you are trying to arrest is indeed guilty of that crime. The person resists arrest, and you end up killing the person instead of making a successful capture. Are you a murderer?
Does it matter if it turns out that the evidence against this person turns out to have been forged (by someone else)?
If you have no intention of killing them and they die as a side effect of your actions, it’s an accident, and manslaughter. If you kill them because you realize you can’t arrest them, it’s murder, complete with intention of malice. However, the fact that your actions are sanctioned by the state is obviously not a defense (a la Nuremberg), and so there’s no point in adding “police officer” to the example.
You could ask if I thought executing someone who was framed would be considered murder, but since I view all manner of execution as murder, guilty or no, there’s no use.
Actually, I think there is. If you kill someone without “state sanction”, as you put it, it’s almost certainly Evil. If you kill someone that the local laws allow you to kill, it’s much less likely to be Evil, because non-Evil reasons for killing, such as self-defense, tend to be accounted for in most legal systems. Anyway, I think I’m getting off the subject. Let me try rephrasing the general scenario:
You are a police officer. You have an arrest warrant for a suspected criminal. If you try to arrest the suspect, he is willing to use lethal force against you in order to prevent being captured. You also believe that, once the suspect has attempted to use lethal force against you, non-lethal force will prove to be insufficient to complete the arrest.
The way I see it, this could end in several ways:
1) Don’t try to make an arrest attempt at all.
2) Attempt to make an arrest. The suspect responds by attempting to use lethal force against you. (He shoots at you with a low-caliber pistol, but you are protected by your bulletproof vest.) You believe that non-lethal force will most likely fail to subdue the suspect. Not willing to use lethal force and kill the suspect, you retreat, failing to make the arrest.
3) Attempt to make an arrest. The suspected criminal responds by attempting to use lethal force against you. (He shoots at you with a low-caliber pistol, but you are protected by your bulletproof vest.) You believe that non-lethal force will most likely fail to subdue the suspected criminal, but try anyway. (You start running at him, intending to wrestle the gun away from him with your bare hands.) The suspected criminal kills you. (He shoots you in the head.)
4) Attempt to make an arrest. The suspected criminal responds by attempting to use lethal force against you. (He shoots at you with a low-caliber pistol, but you are protected by your bulletproof vest.) You believe that non-lethal force will most likely fail to subdue the suspected criminal, so you resort to lethal force. (You shoot him with your own gun.) The suspected criminal is killed, and, when you are questioned about your actions, your lawyer says that you killed the suspect in self-defense. (Under U.S. law, this would indeed be the case—you would not be guilty of murder.)
Obviously Scenario 2 is a better outcome than Scenario 3, because in Scenario 3, you end up dead. However, if you know that you’re not willing to use lethal force to begin with, and that non-lethal force is going to be insufficient, you’re probably better off not making the arrest attempt at all, which is Scenario 1. Therefore Scenario 1 is better than Scenario 3. If you’re going to make an arrest attempt at all, you are expecting Scenario 4 to occur. If you go through with Scenario 4, does that make you Evil? You initiated the use of force by making the arrest attempt, but the suspect could have chosen to submit to arrest rather than to fight against you—and he did, indeed, use lethal force before you did.
I notice that you left off an outcome that if anything allows you to make your point stronger.
5) Attempt to make an arrest. You see that the suspected criminal has the capacity to use lethal force against you (he is armed) and you suspect that he will use it against you. You shoot the suspect. His use of lethal force against you is never more than counterfactual (ie. a valid suspicion).
For consistency some “6)” may be required in which the first “attempt to use lethal force against you” is successful. I suggest that this action is not necessarily Evil, for similar reasons that you describe for scenario 4. Obviously this is less clear cut and has more scope for failure modes like “black suspect reaches for ID” so we want more caution in this instance and (ought to) grant police officers less discretion.
I think ‘almost certain’ may be something of an overstatement. The states that we personally live in are not a representative sample of states and killing tyrants is not something we can call ‘almost certainly’ Evil. The same consideration applies to self defence laws. Self defence laws in an average state selected from all states across time were not sufficiently fair as to make claims about almost certain Evil.
Once he uses lethal force against you, your use of lethal force would be self-defense, not murder.
I perceive that you have not yet learned to use the logic of the Phoenix.
Care to elaborate on that rather cryptic remark?
The logic of the Phoenix is that the Lord of Dark will resurrect everyone he can, including Dolf, so it isn’t murder.
logic of the phoenix?
No, this logic of the Phoenix. What makes you think cutting off someone’s head is murder?
“He died, but you have taught me a new meaning for ‘is dead’.” (From the same book.)
Not every decapitation is murder, but “the wizard stopped, ceased to exist...as something seemed to flow away” is suggestive.
I was thinking the same thing. The way Eliezer wrote that bit seemed to make it clear that something rather more than mere decapitation occurred there.
Hm, so it does. Well, if Hirou had no way of knowing that, then it’s manslaughter at worst.
Though, actually spelling it out directly does end up sounding funny. “Well… I don’t know that cutting off his head with this sword would kill him… I mean, is it really reasonable for me to have expected that?” :)
(Actually, I thought I’d deleted the “ceased to exist” phrase. I’ll go ahead and take it out.)
I figured that Vhazhar really wouldn’t be able to save Dolf. That’s why it’s a sacrifice.
You are using two definitions of “good”—how much good your actions cause, and how good you believe yourself to be. Neither of those is used by the sword; rather, some sort of virtue-ethics definition—I suspect motive.
Doing a bad thing does not necessarily make one a bad person. Though it helps.
So a sincerely evil person would pass with flying colors?
I assumed the sword tested compliance with the current CEV of the human race.
Why just the human race? Orcs are people too (at least in this story).
Good catch. Yes, of course.
Presumably, actual mutants are unlikely, with most “evil” people actually just holding moral beliefs that are mistaken about their actual preference. If the sword is an external moral authority, it’s harder to see why one would consult it.
On the other hand, the sword checks the soundness of the plan against some preference, which is an important step that is absent if one doesn’t consult the sword; this can justify accepting a somewhat mismatched preference if doing so allows one to use the test.
This passes the choice of mismatching preferences to a different situation. If the sword tests a person’s preference, then the protagonist’s choice is between lack of progress (or an unlikely good outcome) and, if Vhazhar’s plan is sound, verified installation of Vhazhar’s preference, the latter presumably being close to others’ preference and thus a moderately good option. If the sword tests some kind of standard preference, that standard preference is presumably also close to Vhazhar’s preference; thus Vhazhar faces a choice between trying to install his own preference through an unverified process, which can go through all kinds of failure modes, and using the sword to test the reliability of his plan.
The fact that Vhazhar is willing to use the sword to test the soundness of his plan, when a failed test means his death, shows that he prefers leaving the rest of the world be to incorrectly changing it. This is a strong signal that should’ve been part of the information given to the protagonist for making the decision.