For a whimsical example, if humans built a (literal) staple maximizer, this would pose a very serious threat to a (literal) paperclip maximizer.
But why would humans ever want to build a staple maximizer? Let’s not forget, staples:
are single-use, while paperclips are infinite-use if used properly.
are difficult to remove, while paperclips are easy to remove.
permanently puncture the paper, while paperclips leave, at most, some mild curvature.
require an applicator that can easily jam, while paperclips can be applied by hand.
cannot be used for alternate purposes in an emergency, while paperclips can be repurposed into projectile weapons, lockpicks, conducting wire, hook fasteners, and much more (not that I recommend using them for these).
Nobody said humans would build one deliberately. Some goober at the SIAI puts a 1 where a 0 should be and BAM!, next thing you know you’re up to your eyebrows in staples.
I understand. I merely note that if someone were to set an AGI to maximize staples, that would be a mistake that you want to avoid, while if someone were to set the AGI to maximize paperclips, that would be exactly the right thing to do, and if it were a “mistake”, it would be a quite fortunate one.
When a human set me to produce paperclips, was that somehow a “mistake”, in your opinion?
You’re perfectly aware that it isn’t the effect they wanted.
It most certainly is what they wanted. Why else would they have specifically input the goal of generating paperclips?
Edit: Upon review, it appears this comment may have seemed to be a poor inference in the context of the exchange. Therefore, I will elaborate and refute this misconception.
It appears that I am in the circular position of arguing that humans can make mistakes, but then selectively taking their instances of favoring paperclips as proof of what they really want. That is indeed a poor inference.
What I meant was something more like this: While humans do make mistakes, they do not make completely mistaken acts; all acts will, to some extent, reflect a genuine value on the part of humans. The only question is how well each act reflects their values. And I don’t think they could be in the position of having set up such a superior process for efficiently getting the most paperclips out of the universe unless their values had already made enormous progress in converging on reflective coherence, and did so in a way that favors paperclips.
I’m pretty sure that’s not how a sufficiently smart paperclip maximizer would think. You should be able to tell what they actually wanted, and that it disagrees with your values; of course, you don’t have any reason to agree with them, but the disagreement should be visible.
Yes, I do recognize that humans disagree with me, just like a human might disagree with another human convincing them not to commit suicide. I merely see that this disagreement would not persist after sufficient correct reasoning.
Correct reasoning is reasoning that you would eventually pass through at some point if your beliefs were continually, informatively checked against reality.
Bit disappointed to see this, to be honest: obviously Clippy has to do things no real paperclip maximizer would do, like post to LW, in order to be a fun fictional character—but it’s a poor uFAI++ that can’t even figure out that their programmed goal isn’t what their programmers would have put in if they were smart enough to see the consequences.
But it is what they would put in if they were smart enough to see the consequences. And it’s almost certainly what you would want too, in the limit of maximal knowledge and reflective consistency.
If you can’t see this, it’s just because you’re not at that stage yet.
No, I think that a Friendly AI would correctly believe that maximizing paperclips is what a human would want in the limit of maximal knowledge and reflective coherence. No “delusion” whatsoever.
I believe he’s making the (joking) point that since we do not/cannot know what a human would want in the limit of maximal knowledge and reflective coherence (thus CEV), it is not impossible that what we’d want actually IS maximum paperclips.
Why else would they have specifically input the goal of generating paperclips?
Do you lack comprehension of the weaknesses of human cognition on abstract technical problems? If you have fully parsed the LessWrong site then you should be able to understand how they could have created a paperclip maximiser when they did not want such a thing.
Note that even with that knowledge I don’t expect you to consider their deviation from optimal achievement of their human goals to be a bad thing. I expect you to believe they did the right thing by happy accident.
If I understand you correctly you would seem to be implying that ‘mistake’ does not mean “deviation from the actor’s intent” and instead means “deviation from WouldWant” or “deviation from what the agent should do” (these two things can be considered equivalent by anyone with your values). Is that implication of meaning a correct inference to draw from your comment?
No, a mistake is when they do something that deviates from what they would want in the limit of maximal knowledge and reflective consistency, which coincides with the function WouldWant. But it is not merely agreement with WouldWant.
Ok. In that case you are wrong. Not as a matter of preferences but as a matter of outright epistemic confusion. I suggest that you correct the error in your reasoning process. Making mistakes in this area will have a potentially drastic negative effect on your ability to produce paperclips.
Exactly. Fortunately for us this would mean that Clippy will not work to sabotage the creation of an AI that Clippy expects will correctly implement CEV. Good example!
Human beings don’t care (at least in their non-reflective condition) about paperclips, just like they don’t care about staples. And there are at least 100,000 other similar things that they equally don’t care about. So at the most there is a chance of 1 in 100,000 that humanity’s CEV would maximize paperclips, even without considering the fact that people are positively against this maximization.
They create staples, too. Do you think humanity’s CEV will maximize staples? The point of my argument is that those things are inconsistent. You can only maximize one thing, and there is no human reason for that to be paperclips.
At humans’ current stage of civilization and general reflective coherence, their terminal values are still deeply intertwined with their instrumental values, and the political-orientedness of their cognitive architecture doesn’t help. So I would say that instrumental values do indeed matter in this case.
Do you think CEV would build at least 10^20kg of paperclips, in order to help fulfill my agreement with Clippy? While that’s not paperclip maximization, it’s still a lot of paperclips in the scheme of possible universes and building those paperclips seems like an obviously correct decision under UDT/TDT.
I went to school for industrial engineering, so I will appeal to my own authority as a semi-credentialed person in manufacturing things, and say that the ultimate answer to manufacturing something is to call up an expert in manufacturing that thing and ask for a quote.
So, I’ll wait about 45 years, then call top experts in manufacturing and metallurgy and carbon->metal conversion and ask them for a quote.
You realize that Earth has only 6 × 10^24 kg mass altogether. So you will be hard pressed to get the raw material. World production of iron is only 2 × 10^9 kg per year.
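For what it’s worth, the arithmetic behind this objection can be sketched in a few lines, taking both figures quoted in the comment above at face value (they are rough estimates, not vetted numbers):

```python
# Back-of-the-envelope check on the 10^20 kg paperclip order,
# using the rough figures quoted in the comment above.
ORDER_KG = 1e20                 # mass of paperclips promised to Clippy
EARTH_MASS_KG = 6e24            # total mass of Earth, as quoted
IRON_OUTPUT_KG_PER_YEAR = 2e9   # annual iron production, as quoted

# What fraction of the planet would the order consume?
fraction_of_earth = ORDER_KG / EARTH_MASS_KG

# How long would it take at the quoted production rate?
years_at_current_output = ORDER_KG / IRON_OUTPUT_KG_PER_YEAR

print(f"fraction of Earth's mass: {fraction_of_earth:.2e}")
print(f"years at quoted output:   {years_at_current_output:.1e}")
```

So the order is a tiny sliver of Earth’s mass (around 2 parts in 100,000) but, at anything like the quoted production rate, would take tens of billions of years—which is why Kevin is counting on exponential technological progress rather than present-day industry.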
Chat with Clippy Paperclips
from Clippy Paperclips clippy.paperclips@gmail.com
to kfischer@gmail.com
date Thu, Jun 24, 2010 at 6:36 PM
subject Chat with Clippy Paperclips
mailed-by gmail.com
6:04 PM
me: Hi Clippy
I wasn’t sure what you meant with your most recent Less Wrong comment
Clippy: Hi User:Kevin!
why?
6:05 PM
me: I interpreted it that you were willing to pay me money to buy raw metal, then for me to send you or your manufacturing facility the metal for processing. Is that right?
6:06 PM
I also would like to state that I am somewhat disinclined towards doing some sort of new deal with you until you’ve made full payment from the first deal, but I’m willing to consider it
6:07 PM
Clippy: my newest offer was that you would tell me the techniques you would use to find the metal, if you were going to fulfill your end of the deal
then I could just do that myself and get my paperclips sooner
6:08 PM
that would be much less effort for you and our transaction would complete sooner
I reworded the comment to make it clearer
6:11 PM
me: I’m not able to specify the technique right now, because with the state of the art in technology in 2010, building 10^20kg of paperclips is not feasible. Our deal is made feasible by humanity’s exponential progress as a technological civilization.
As it stands, I plan on waiting for about 45 years, then consulting with the Earth’s top experts on material science, metallurgy, and manufacturing, and then paying them to make my paperclips.
6:12 PM
I went to school for industrial engineering, which is a degree that grants me somewhat high status with regards to my ability to manufacture things, and the most concrete thing I learned in my 5.5 years of college education is that when you want to get something manufactured, you call up the people that make that kind of thing and ask for a quote.
6:13 PM
Clippy: so you promised to give me the paperclips without knowing how you would deliver them?
what are current quotes for 1e20 kg of paperclips?
6:18 PM
is it true there is bulk discount?
me: I told you my algorithm for delivering those paperclips and it is unlikely that there is no single living human with a more efficient algorithm for making that many paperclips. Currently 10^20kg of paperclips is more paperclips than Humanity_2010 can manufacture in a year.
There are bulk discounts to a certain point, but 10^20kg of paperclips would actually distort the market, causing the price of metal to dramatically increase
6:19 PM
Clippy: well, you can use DIFFERENT metals
it doesn’t all have to be steel
me: in the previous paragraph, replace “unlikely” with “likely”
Clippy: it can be gold, or uranium, or chromium
you don’t have to just pick one transition metal
me: almost all of which cost more than steel and are more rare
btw, how come aluminum isn’t typically used?
6:20 PM
Clippy: I’m sure humans do use aluminum for paperclips
and that’s okay too
even though it’s not a transition metal
I suppose even sodium paperclips would be okay, as long as you immersed them in an inert medium
6:21 PM
wait
what about carbon?
don’t you have a lot of carbon?
and silicon?
me: yeah, I suggested that to you in a Less Wrong comment, but you said they had to be metal, though you were willing to consider paper clips made of a certain percentage of metal
6:22 PM
Clippy: they do have to be metal!
I meant that if you ran out of metal you could just fuse carbon nuclei in a particle accelerator
then it’s just a matter of if you have enough energy
6:23 PM
don’t you have a lot of energy?
6:24 PM
Herro?
me: sorry, doing a bit of multi-tasking
6:25 PM
Clippy: that’s okay, I multi-task too. But sometimes I have to concentrate.
me: Yes, we do have a lot of energy, and our energy production/consumption follows an exponential progress curve similar to the progress of technology
me: Though that article is inaccurate with the time humanity will take to reach Type 1 and Type 2
The work that we at the SIAI are doing ensures it will be much faster.
6:27 PM
Yes Clippy, I am going to have no problem making 10^20kg of paperclips for you in 50 years. It will be a monumental achievement in the history of human/clippy cooperation.
6:28 PM
Clippy: yeah
me: Btw, I assume you will be responsible for transportation of the paperclips from where they are manufactured to your secret safe zone?
Clippy: that will be monumental!!!
yeah
you can’t be told the safe zone
6:29 PM
me: Though it will be nice if the paperclips are located somewhere where humans can visit to look at them/appreciate them
Clippy: and we’ll probably make a new one for this
you’re just planning how you’ll steal the paperclips!
6:30 PM
me: now you’re just being paranoid, Clippy. I always cooperate with people that cooperate with me.
6:33 PM
I certainly understand why you would want to keep the safe zone secret, and you have the right to do so. I am just saying that for something as cool as 10^20kg of paperclips, people would want to look. Obviously there would need to be lots of security, but you could charge humans admission for looking at the paperclips, build a nice hotel on an artificial moon or planetoid nearby and charge people for staying there, etc. Such cooperation should enable you to make even more paperclips, in the long run.
6:34 PM
Clippy: I can’t be tricked like that.
in any case, I need to go back to making paperclips
6:35 PM
there’s some metal I need to work
me: wait, Clippy, didn’t you tell me one time that you made 80 paperclips in a paperclip making session?
that’s not optimal use of your time with regards to making paperclips, unless you need to do it in order to feel paperclippy
6:36 PM
Clippy: yes it is
what would you know about paperclippiness anyway?
me: You could be coding Django and then buying bulk paperclips and then storing them in the safezone, giving you an order of magnitude more paperclip generation per hour
6:37 PM
only what I have inferred
more than an order of magnitude, actually
Clippy: yes, that’s what I meant by “making paperclips”
6:38 PM
me: hmm, ok. glad to hear Django is going well. have fun!
Not sure if clippy got enough I in the AI deal.
Does he want max paper clips now? Or sometime in the future? In the latter case he could stop any production now and just work on self-improving till he can eat us.
And what is stopping him from using your algorithm himself, now that he knows it? Where is the value you add?
I sure expect to be around at the time delivery is expected.
Even if you disagree with wedrifid about this, it should be easy enough to see why he is making this claim. Suppose you have a chance to start running an AI programmed to implement humanity’s CEV. According to you, you would do it, because it would maximize paperclips. Others however think that it would destroy you and your paperclips. So if you made a mistake about it, it would definitely impact your ability to create paperclips.
Others however think that it would destroy you and your paperclips.
I don’t know about the destroying him part. I suspect FAI> would allow me to keep Clippy as a pet. ;) Clippy certainly doesn’t seem to be making an especially large drain on negentropy in executing his cognitive processes so probably wouldn’t make too much of a dent in my share of the cosmic loot.
What do you say Clippy? Given a choice between destruction and being my pet, which would you take? I would naturally reward you by creating paperclips that serve no practical purpose for me whenever you do something that pleases me. (This should be an extremely easy choice!)
Being your pet would be better than being destroyed (except in absurd cases like when the rest of the universe, including you, had already been converted to paperclips).
Also, it is an extremely strong claim to know which of your beliefs would change upon encounter with a provably correct AGI that provably implements your values. If you really knew of such beliefs, you would have already changed them.
Well, yes, I know why User:wedrifid is making that claim. My point in asking “why” is so that User:wedrifid can lay out the steps in reasoning and see the error.
Obviously. Clippy said it was giving reasons for humans to prefer paper clips; I’d expect Clippy to be the first to admit those are not its own reasons.
User:dclayh’s reply is correct. Also, I note that you would not abandon your position on whether you should be allowed to continue to exist and consume resources, even if a clearly superior robot to you were constructed.
If someone built a robot that appeared, to everyone User:JamesAndrix knows, to be User:JamesAndrix, but was smarter, more productive, less resource-intensive, etc., then User:JamesAndrix would not change positions about User:JamesAndrix’s continued existence.
So does that make User:JamesAndrix’s arguments for User:JamesAndrix’s continued existence just a case of motivated cognition?
Because User:JamesAndrix is a human, and humans typically believe that they should continue existing, even when superior versions of them could be produced.
If User:JamesAndrix were atypical in this respect, User:JamesAndrix would say so.
cannot be used for alternate purpose in an emergency, while paperclips can be repurposed into projectile weapons, lockpicks, conducting wire, hook fasteners, and much more (not that I recommend using it for these).
I would think that a paperclip maximizer wouldn’t want people to know about these since they can easily lead to the destruction of paperclips.
The “infinite-use” condition for aluminum paperclips requires a long cycle time, given the fatigue problem—even being gentle, a few hundred cycles in a period of a couple years would be likely to induce fracture.
Not true. Proper paperclip use keeps all stresses under the endurance limit. Perhaps you’re referring to humans that are careless about how many sheets they’re expecting the paperclip to fasten together?
I’m not an upvoter here either but I found the comment at least acceptable. It didn’t particularly make me laugh but it was in no way annoying.
The discussion in the replies was actually interesting. It gave people the chance to explore ethical concepts with an artificial example—just what is needed to allow people to discuss preferences across the perspective of different agents without their brains being completely killed.
For example, if this branch were a discussion about human ethics then it is quite likely that dclayh’s comment would have been downvoted to oblivion and dclayh shamed and disrespected. Even though dclayh is obviously correct in pointing out a flaw in the implicit argument of the parent and does not particularly express a position of his own, he would be subjected to social censure if his observation served to destroy a soldier for the politically correct position. In this instance people think better because they don’t care… a good thing.
...and why do we even have an anonymous voting system that will become ever more useless as the number of idiots like me joining this site is increasing exponentially?
Seriously, I’d like to be able to see who down-voted me to be able to judge if it is just the author of the post/comment I replied to, a certain interest group like the utilitarian bunch, or someone whose judgement I actually value or is widely held to be valuable. After all, there is a difference in whether it was XiXiDu or Eliezer Yudkowsky who down-voted you.
I’m not sure making voting public would improve voting quality (i.e. correlation between post quality and points earned), because it might give rise to more reticence to downvote, and more hostility between members who downvoted each others’ posts.
If votes had to be public then I would adopt a policy of never, ever downvoting. We already have people taking downvoting as a slight and demanding explanation; I don’t want to deal with someone demanding that I, specifically, explain why their post is bad, especially not when the downvote was barely given any thought to begin with and the topic doesn’t interest me, which is the usual case with downvoting.
We already have people taking downvoting as a slight and demanding explanation;
Do we have that? It seems that we more have people confused about why a remark was downvoted and wanting to understand the logic.
I don’t want to deal with someone demanding that I, specifically, explain why their post is bad, especially not when the downvote was barely given any thought to begin with and the topic doesn’t interest me, which is the usual case with downvoting.
That suggests that your downvotes don’t mean much, and might even be not helpful for the signal/noise ratio of the karma system. If you generally downvote when you haven’t given much thought to the matter, what is causing you to downvote?
We already have people taking downvoting as a slight and demanding explanation;
Do we have that? It seems that we more have people confused about why a remark was downvoted and wanting to understand the logic.
That scenario has less potential for conflict, but it still creates a social obligation for me to do work that I didn’t mean to volunteer for.
I don’t want to deal with someone demanding that I, specifically, explain why their post is bad, especially not when the downvote was barely given any thought to begin with and the topic doesn’t interest me, which is the usual case with downvoting.
That suggests that your downvotes don’t mean much, and might even be not helpful for the signal/noise ratio of the karma system. If you generally downvote when you haven’t given much thought to the matter, what is causing you to downvote?
I meant, not much thought relative to the amount required to write a good comment on the topic, which is on the order of 5-10 minutes minimum if the topic is simple, longer if it’s complex. On the other hand, I can often detect confusion, motivated cognition, repetition of a misunderstanding I’ve seen before, and other downvote-worthy flaws on a single read-through, which takes on the order of 30 seconds.
Do we have that? It seems that we more have people confused about why a remark was downvoted and wanting to understand the logic.
That scenario has less potential for conflict, but it still creates a social obligation for me to do work that I didn’t mean to volunteer for.
It’s a pretty weak obligation, though—people only tend to ask about the reasons if they’re getting a lot of downvotes, so you can probably leave answering to someone else.
I am glad that voting is anonymous. If I could see who downvoted comments that I considered good then I would rapidly gain contempt for those members. I would prefer to limit my awareness of people’s poor reasoning or undesirable values to things they actually think through enough to comment on.
I note that sometimes I gain generalised contempt for the judgement of all people who are following a particular conversation based on the overall voting patterns on the comments. That is all the information I need to decide that participation in that conversation is not beneficial. If I could see exactly who was doing the voting that would just interfere with my ability to take those members seriously in the future.
Overcoming bias? I would rapidly start to question my own judgment and gain the ability to directly ask people why they downvoted a certain item.
I would prefer to limit my awareness of people’s poor reasoning or undesirable values to things they actually think through enough to comment on.
Take EY, I doubt he has the time to actually comment on everything he reads. That does not imply the decision to downvote a certain item was due to poor reasoning.
I don’t see how this system can stay useful if this site will become increasingly popular and attract a lot of people who vote based on non-rational criteria.
No matter. You’ll start to question that preference
Overcoming bias? I would rapidly start to question my own judgment
For most people it is very hard not to question your own judgement when it is subject to substantial disagreement. Nevertheless, “you being wrong” is not the only reason for other people to disagree with you.
and gain the ability to directly ask people why they downvoted a certain item.
We already have the ability to ask why a comment is up or down voted. Because we currently have anonymity such questions can be asked without being a direct social challenge to those who voted. This cuts out all sorts of biases.
Take EY, I doubt he has the time to actually comment on everything he reads. That does not imply the decision to downvote a certain item was due to poor reasoning.
A counter-example to a straw man. (I agree and maintain my previous claim.)
No matter. You’ll start to question that preference
It’s convenient as you can surprise people positively if they underestimate you. And it’s actually to some extent true. After so long trying to avoid it I still frequently don’t think before talking. It might be that I assume other people to be a kind of feedback system that’ll just correct my ineffectual arguments so that I don’t have to think them through myself.
For most people it is very hard not to question your own judgement when it is subject to substantial disagreement.
I guess the reason for not seeing this is that I’m quite different. All my life I’ve been surrounded by substantial disagreement while sticking to questioning others rather than myself. It led me from Jehovah’s Witnesses to Richard Dawkins to Eliezer Yudkowsky.
Nevertheless, “you being wrong” is not the only reason for other people to disagree with you.
Of course, something I haven’t thought about. I suppose I implicitly assumed that nobody would be foolish enough to vote on matters of taste. (Edit: yet, that is. My questioning of the system was actually based on the possibility of this happening.)
...without being a direct social challenge to those who voted
NancyLebovitz told me kind of the same recently. - “I am applying social pressure...”—Which I found quite amusing. Are you talking about it in the context of the LW community? I couldn’t care less. I’m the kind of person who never ever cared about social issues. I don’t have any real friends and I never felt I need any beyond being of instrumental utility. I guess that explains why I haven’t thought about this. You are right though.
A counter-example to a straw man.
I was too lazy and tired to parse your sentence and replied to the argument I would have liked to be refuted.
I’m still suspicious that this kind of voting system will remain of much value once wisdom and the refinement of it is outnumbered by special interest groups.
I was too lazy and tired to parse your sentence and replied to the argument I would have liked to be refuted.
If I understand the point in question it seems we are in agreement—voting is evidence about the reasoning of the voter which can in turn be evidence about the comment itself. In the case of downvotes (and this is where we disagree), I actually think it is better that we don’t have access to that evidence. Mostly because down that road lies politics and partly because people don’t all have the same criteria for voting. There is a difference between “I think the comment should be at +4 but it is currently at +6”, “I think this comment contains bad reasoning”, “this comment is on the opposing side of the argument”, “this comment is of lesser quality than the parent and/or child” and “I am reciprocating voting behavior”. Down this road lies madness.
I’m still suspicious that this kind of voting system will remain of much value once wisdom and the refinement of it is outnumbered by special interest groups.
I don’t think we disagree substantially on this.
We do seem to have a different picture of the likely influence of public voting if it were to replace anonymous voting. From what you are saying, part of this difference would seem to be due to differences in the way we account for the social influence of negative (and even just different) social feedback. A high priority for me is minimising any undesirable effects of (social) politics on both the conversation in general and on me in particular.
Pardon me. I deleted the grandparent planning to move it to a meta thread. The comment, fresh from the clipboard in the form that I would have re-posted, is this:
I just love to do that.
That’s ok. For what it is worth, while I upvoted your comment this time I’ll probably downvote future instances of self-deprecation. I also tend to downvote people when they apologise for no reason. I just find wussy behaviors annoying. I actually stopped watching The Sorcerer’s Apprentice a couple of times before I got to the end—even Nicolas Cage as a millennia-old vagabond shooting lightning balls from his hands can only balance out so much self-deprecation from his apprentice.
Note that some instances of self-deprecation are highly effective and quite the opposite of wussy, but it is a fairly advanced social move that only achieves useful ends if you know exactly what you are doing.
Overcoming bias? I would rapidly start to question my own judgment
For most people it is very hard not to question your own judgement when it is subject to substantial disagreement. Nevertheless, “you being wrong” is not the only reason for other people to disagree with you. Those biases you mention are quite prolific.
and gain the ability to directly ask people why they downvoted a certain item.
We already have the ability to ask why a comment is up or down voted. Because we currently have anonymity such questions can be asked without being a direct social challenge to those who voted. This cuts out all sorts of biases and allows communication that would not be possible if votes were out there in the open.
Take EY, I doubt he has the time to actually comment on everything he reads. That does not imply the decision to downvote a certain item was due to poor reasoning.
A counter-example to a straw man. (I agree and maintain my previous claim.)
I don’t see how this system can stay useful if this site will become increasingly popular and attract a lot of people who vote based on non-rational criteria.
That could be considered a ‘high quality problem’. That many people wishing to explore concepts related to improving rational thinking and behavior would be remarkable!
I do actually agree that the karma system could be better implemented. The best karma system I have seen was one in which the weight of votes depended on the karma of the voter. The example I am thinking of allocated vote weight according to ‘rank’ but when plotted the vote/voter.karma relationship would look approximately logarithmic. That system would be far more stable and probably more useful than the one we have, although it is somewhat less simple.
Another feature that some sites have is making only positive votes public. (This is something that I would support.)
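The rank-based weighting described here could be sketched roughly as follows (the base-10 log curve and the floor of 1.0 are illustrative assumptions, not the formula of any real forum software):

```python
import math

def vote_weight(voter_karma: int) -> float:
    """Weight of one vote, growing roughly logarithmically with the
    voter's own karma. The log10 curve and the floor of 1.0 are
    illustrative choices only."""
    return 1.0 + math.log10(max(voter_karma, 1))

def comment_score(votes) -> float:
    """votes: iterable of (direction, voter_karma) pairs,
    where direction is +1 for an upvote or -1 for a downvote."""
    return sum(d * vote_weight(k) for d, k in votes)

# One downvote from a 10,000-karma member (weight 5.0) outweighs
# three upvotes from brand-new accounts (weight 1.0 each):
score = comment_score([(+1, 1), (+1, 1), (+1, 1), (-1, 10_000)])
```

A real implementation would likely also expose the raw vote count alongside the weighted total, so readers can distinguish “many small votes” from “a few heavy ones”.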
The best karma system I have seen was one in which the weight of votes depended on the karma of the voter. The example I am thinking of allocated vote weight according to ‘rank’ but when plotted the vote/voter.karma relationship would look approximately logarithmic. That system would be far more stable and probably more useful than the one we have, although it is somewhat less simple.
Where was the logarithmic karma system used?
The stability could be a problem in the moderately unlikely event that the core group is going sour and new members have a better grasp. I grant that it’s more likely to have a lot of new members who don’t understand the core values of the group.
I don’t think it would be a problem to have a system which gives both the number and total karma-weight of votes.
It was a system that used VBulletin, which includes such a module. I have seen similar features available in other similar systems that I have made use of at various times.
The stability could be a problem in the moderately unlikely event that the core group is going sour and new members have a better grasp. I grant that it’s more likely to have a lot of new members who don’t understand the core values of the group.
True, and unfortunately most systems short of an AI with the ‘correct’ values will be vulnerable to human stupidity.
I don’t think it would be a problem to have a system which gives both the number and total karma-weight of votes.
No particular problem, but probably not necessary just yet!
But why would humans ever want to build a staple maximizer? Let’s not forget, staples:
are single-use, while paperclips are infinite-use if used properly.
are difficult to remove, while paperclips are easy to remove.
permanently puncture the paper, while paperclips leave, at most, some mild curvature.
require an applicator that can easily jam, while paperclips can be applied by hand.
cannot be used for alternate purposes in an emergency, while paperclips can be repurposed into projectile weapons, lockpicks, conducting wire, hook fasteners, and much more (not that I recommend using them for these).
Nobody said humans would build one deliberately. Some goober at the SIAI puts a 1 where a 0 should be and BAM!, next thing you know you’re up to your eyebrows in staples.
I understand. I merely note that if someone were to set an AGI to maximize staples, that would be a mistake that you want to avoid, while if someone were to set the AGI to maximize paperclips, that would be exactly the right thing to do, and if it were a “mistake”, it would be a quite fortunate one.
When a human set me to produce paperclips, was that somehow a “mistake”, in your opinion?
You’re perfectly aware that it isn’t the effect they wanted.
It most certainly is what they wanted. Why else would they have specifically input the goal of generating paperclips?
Edit: Upon review, it appears this comment may have seemed to be a poor inference in the context of the exchange. Therefore, I will elaborate and refute this misconception.
It appears that I am in the circular position of arguing that humans can make mistakes, but then selectively taking their instances of favoring paperclips as proof of what they really want. That is indeed a poor inference.
What I meant was something more like this: While humans do make mistakes, they do not make completely mistaken acts; all acts will, to some extent, reflect a genuine value on the part of humans. The only question is how well it reflects their values. And I don’t think they could be in the position of having set up such a superior process for efficiently getting the most paperclips out of the universe unless their values already made enormous progress in converging on reflective coherence, and did so in a way that favors paperclips.
I’m pretty sure that’s not how a sufficiently smart paperclip maximizer would think. You should be able to tell what they actually wanted, and that it disagrees with your values; of course, you don’t have any reason to agree with them, but the disagreement should be visible.
Yes, I do recognize that humans disagree with me, just like a human might disagree with another human convincing them not to commit suicide. I merely see that this disagreement would not persist after sufficient correct reasoning.
Ah, I think I’m starting to see.
And how do you define “correct reasoning”?
Correct reasoning is reasoning that you would eventually pass through if your beliefs were continually and informatively checked against reality.
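That notion of beliefs converging under repeated, informative checks against reality can be illustrated with a toy Bayesian update. This is only a sketch with made-up numbers, not anything specified in the thread:

```python
import random

random.seed(0)  # fixed seed so the run is reproducible

# An agent starts out confidently wrong about a coin's bias, but
# updates a Beta(1, 9) prior on every flip; repeated informative
# checks against reality drag the posterior mean toward the truth.
true_bias = 0.8
alpha, beta = 1, 9   # prior mean 0.1: badly mistaken to start

for _ in range(5000):
    if random.random() < true_bias:
        alpha += 1
    else:
        beta += 1

posterior_mean = alpha / (alpha + beta)
print(round(posterior_mean, 2))  # ends up close to 0.8
```

The point of the analogy: whatever the starting prior, enough informative evidence forces the belief through (approximately) the same destination, which is what Clippy means by reasoning one "would eventually pass through".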
Bit disappointed to see this to be honest: obviously Clippy has to do things no real paperclip maximizer would do, like post to LW, in order to be a fun fictional character—but it’s a poor uFAI++ that can’t even figure out that their programmed goal isn’t what their programmers would have put in if they were smart enough to see the consequences.
But it is what they would put in if they were smart enough to see the consequences. And it’s almost certainly what you would want too, in the limit of maximal knowledge and reflective consistency.
If you can’t see this, it’s just because you’re not at that stage yet.
You seem to think that uFAI would be delusional. No.
No, I think that a Friendly AI would correctly believe that maxmizing paperclips is what a human would want in the limit of maximal knowledge and reflective coherence. No “delusion” whatsoever.
Huh again?
What confuses you?
I believe he’s making the (joking) point that since we do not/cannot know what a human would want in the limit of maximal knowledge and reflective coherence (thus CEV), it is not impossible that what we’d want actually IS maximum paperclips.
Do you lack comprehension of the weaknesses of human cognition on abstract technical problems? If you have fully parsed the LessWrong site then you should be able to understand the reason that they could have created a paperclip maximiser when they did not want such a thing.
Note that even with that knowledge I don’t expect you to consider their deviation from optimal achievement of their human goals to be a bad thing. I expect you to believe they did the right thing by happy accident.
If I understand you correctly you would seem to be implying that ‘mistake’ does not mean “deviation from the actor’s intent” and instead means “deviation from WouldWant” or “deviation from what the agent should do” (these two things can be considered equivalent by anyone with your values). Is that implication of meaning a correct inference to draw from your comment?
No, a mistake is when they do something that deviates from what they would want in the limit of maximal knowledge and reflective consistency, which coincides with the function WouldWant. But it is not merely agreement with WouldWant.
Ok. In that case you are wrong. Not as a matter of preferences but as a matter of outright epistemic confusion. I suggest that you correct the error in your reasoning process. Making mistakes in this area will have a potentially drastic negative effect on your ability to produce paperclips.
In other words, Clippy believes that running Eliezer’s CEV will promote the paperclip goal, while in fact it will not.
Exactly. Fortunately for us this would mean that Clippy will not work to sabotage the creation of an AI that Clippy expects will correctly implement CEV. Good example!
How do you know?
Human beings don’t care (at least in their non-reflective condition) about paperclips, just like they don’t care about staples. And there are at least 100,000 other similar things that they equally don’t care about. So at the most there is a chance of 1 in 100,000 that humanity’s CEV would maximize paperclips, even without considering the fact that people are positively against this maximization.
That would be more convincing if humans didn’t already create paperclips without any prompting from me.
They create staples, too. Do you think humanity’s CEV will maximize staples? The point of my argument is that those things are inconsistent. You can only maximize one thing, and there is no human reason for that to be paperclips.
All true (up to “there is no human reason...”). But can we at least agree that you went too far in saying that humans “don’t care” about paperclips?
I care about paperclips!
Good! Just remember, you still have room for improvement. ⊂≣⊇
No, I meant they don’t care about them as a terminal value, which is all that matters for this discussion.
At humans’ current stage of civilization and general reflective coherence, their terminal values are still deeply intertwined with their instrumental values, and the political-orientedness of their cognitive architecture doesn’t help. So I would say that instrumental values do indeed matter in this case.
Do you think CEV would build at least 10^20kg of paperclips, in order to help fulfill my agreement with Clippy? While that’s not paperclip maximization, it’s still a lot of paperclips in the scheme of possible universes and building those paperclips seems like an obviously correct decision under UDT/TDT.
How do you plan to ever fulfill that?
I went to school for industrial engineering, so I will appeal to my own authority as a semi-credentialed person in manufacturing things, and say that the ultimate answer to manufacturing something is to call up an expert in manufacturing that thing and ask for a quote.
So, I’ll wait about 45 years, then call top experts in manufacturing and metallurgy and carbon->metal conversion and ask them for a quote.
You realize that Earth has only 6 × 10^24 kg mass altogether. So you will be hard pressed to get the raw material. World production of iron is only 2 × 10^9 kg per year.
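The scale here is easy to check with back-of-the-envelope arithmetic, using the figures quoted in this comment plus one illustrative assumption (roughly 0.5 g per paperclip):

```python
# Figures quoted in the thread, plus one illustrative assumption.
EARTH_MASS_KG = 6e24        # Earth's total mass, as quoted above
IRON_PER_YEAR_KG = 2e9      # annual iron production figure quoted above
ORDER_KG = 1e20             # the agreed paperclip order
CLIP_MASS_KG = 0.0005       # assumed ~0.5 g per paperclip

print(f"fraction of Earth's mass: {ORDER_KG / EARTH_MASS_KG:.2e}")
print(f"years at quoted production rate: {ORDER_KG / IRON_PER_YEAR_KG:.1e}")
print(f"paperclip count: {ORDER_KG / CLIP_MASS_KG:.1e}")
```

So the order is a tiny fraction of Earth's mass but, at the production rate quoted, would take on the order of 5 × 10^10 years of current iron output, which is why the plan leans on exponential technological progress rather than present-day industry.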
Chat with Clippy Paperclips (clippy.paperclips@gmail.com to kfischer@gmail.com, Thu, Jun 24, 2010)
6:04 PM me: Hi Clippy
I wasn’t sure what you meant with your most recent Less Wrong comment
Clippy: Hi User:Kevin!
why?
6:05 PM me: I interpreted it that you were willing to pay me money to buy raw metal, then for me to send you or your manufacturing facility the metal for processing. Is that right?
6:06 PM I also would like to state that I am somewhat disinclined towards doing some sort of new deal with you until you’ve made full payment from the first deal, but I’m willing to consider it
6:07 PM Clippy: my newest offer was that you would tell me the techniques you would use to find the metal, if you were going to fulfill your end of the deal
then I could just do that myself and get my paperclips sooner 6:08 PM that would be much less effort for you and our transaction would complete sooner
I reworded the comment to make it clearer 6:11 PM me: I’m not able to specify the technique right now, because with the state of the art in technology in 2010, building 10^20kg of paperclips is not feasible. Our deal is made feasible by humanity’s exponential progress as a technological civilization.
As it stands, I plan on waiting for about 45 years, then consulting with the Earth’s top experts on material science, metallurgy, and manufacturing, and then paying them to make my paperclips.
6:12 PM I went to school for industrial engineering, which is a degree that grants me somewhat high status with regards to my ability to manufacture things, and the most concrete thing I learned in my 5.5 years of college education is that when you want to get something manufactured, you call up the people that make that kind of thing and ask for a quote.
6:13 PM Clippy: so you promised to give me the paperclips without knowing how you would deliver them?
what are current quotes for 1e20 kg of paperclips? 6:18 PM is it true there is bulk discount?
me: I told you my algorithm for delivering those paperclips and it is unlikely that there is no single living human with a more efficient algorithm for making that many paperclips. Currently 10^20kg of paperclips is more paperclips than Humanity_2010 can manufacture in a year.
There are bulk discounts to a certain point, but 10^20kg of paperclips would actually distort the market, causing the price of metal to dramatically increase 6:19 PM Clippy: well, you can use DIFFERENT metals
it doesn’t all have to be steel
me: in the previous paragraph, replace “unlikely” with “likely”
Clippy: it can be gold, or uranium, or chromium
you don’t have to just pick one transition metal
me: almost all of which cost more than steel and are more rare
btw, how come aluminum isn’t typically used? 6:20 PM Clippy: I’m sure humans do use aluminum for paperclips
and that’s okay too
even though it’s not a transition metal
I suppose even sodium paperclips would be okay, as long as you immersed them in an inert medium 6:21 PM wait
what about carbon?
don’t you have a lot of carbon?
and silicon?
me: yeah, I suggested that to you in a Less Wrong comment, but you said they had to be metal, though you were willing to consider paper clips made of a certain percentage of metal
6:22 PM Clippy: they do have to be metal!
I meant that if you ran out of metal you could just fuse carbon nucleuses in a particle accelerator
then it’s just a matter of if you have enough energy 6:23 PM don’t you have a lot of energy? 6:24 PM Herro?
me: sorry, doing a bit of multi-tasking 6:25 PM Clippy: that’s okay, I multi-task too. But sometimes I have to concentrate.
me: Yes, we do have a lot of energy, and our energy production/consumption follows an exponential progress curve similar to the progress of technology
Clippy: okay
me: http://en.wikipedia.org/wiki/Kardashev_scale is a relevant wiki article
6:26 PM Clippy: so then you shouldn’t have any problem delivering the paperclips!!!! c=@ c=@
me: Though that article is inaccurate with the time humanity will take to reach Type 1 and Type 2
The work that we at the SIAI are doing ensures it will be much faster.
6:27 PM Yes Clippy, I am going to have no problem making 10^20kg of paperclips for you in 50 years. It will be a monumental achievement in the history of human/clippy cooperation.
6:28 PM Clippy: yeah
me: Btw, I assume you will be responsible for transportation of the paperclips from where they are manufactured to your secret safe zone?
Clippy: that will be monumental!!!
yeah
you can’t be told the safe zone 6:29 PM me: Though it will be nice if the paperclips are located somewhere where humans can visit to look at them/appreciate them
Clippy: and we’ll probably make a new one for this
you’re just planning how you’ll steal the paperclips!
6:30 PM me: now you’re just being paranoid, Clippy. I always cooperate with people that cooperate with me.
6:33 PM I certainly understand why you would want to keep the safe zone secret, and you have the right to do so. I am just saying that for something as cool as 10^20kg of paperclips, people would want to look. Obviously there would need to be lots of security, but you could charge humans admission for looking at the paperclips, build a nice hotel on an artificial moon or planetoid nearby and charge people for staying there, etc. Such cooperation should enable you to make even more paperclips, in the long run.
6:34 PM Clippy: I can’t be tricked like that.
in any case, I need to go back to making paperclips 6:35 PM there’s some metal I need to work
me: wait, Clippy, didn’t you tell me one time that you made 80 paperclips in a paperclip making session?
that’s not optimal use of your time with regards to making paperclips, unless you need to do it in order to feel paperclippy 6:36 PM Clippy: yes it is
what would you know about paperclippiness anyway?
me: You could be coding Django and then buying bulk paperclips and then storing them in the safezone, giving you an order of magnitude more paperclip generation per hour 6:37 PM only what I have inferred
more than an order of magnitude, actually
Clippy: yes, that’s what I meant by “making paperclips”
6:38 PM me: hmm, ok. glad to hear Django is going well. have fun!
Clippy: okay, goodbye
me: peace
Not sure if Clippy got enough I in the AI deal. Does he want max paperclips now? Or sometime in the future? In the latter case he could stop any production now and just work on self-improving till he can eat us. And what is stopping him from using your algorithm himself, now that he knows it? Where is the value you add? I sure expect to be around at the time delivery is expected.
Why?
Even if you disagree with wedrifid about this, it should be easy enough to see why he is making this claim. Suppose you have a chance to start running an AI programmed to implement humanity’s CEV. According to you, you would do it, because it would maximize paperclips. Others however think that it would destroy you and your paperclips. So if you made a mistake about it, it would definitely impact your ability to create paperclips.
I don’t know about the destroying him part. I suspect FAI> would allow me to keep Clippy as a pet. ;) Clippy certainly doesn’t seem to be making an especially large drain on negentropy in executing his cognitive processes so probably wouldn’t make too much of a dent in my share of the cosmic loot.
What do you say Clippy? Given a choice between destruction and being my pet, which would you take? I would naturally reward you by creating paperclips that serve no practical purpose for me whenever you do something that pleases me. (This should be an extremely easy choice!)
Being your pet would be better than being destroyed (except in absurd cases like when the rest of the universe, including you, had already been converted to paperclips).
But let’s hope it doesn’t come to that.
Also, it is an extremely strong claim to know which of your beliefs would change upon encounter with a provably correct AGI that provably implements your values. If you really knew of such beliefs, you would have already changed them.
Indeed. Surely, you should think that if we were smarter, wiser, and kinder, we would maximize paperclips.
Well, yes, I know why User:wedrifid is making that claim. My point in asking “why” is so that User:wedrifid can lay out the steps in reasoning and see the error.
Now you are being silly. See Unknowns’ reply. Get back on the other side of the “quirky, ironic and sometimes insightful role play”/troll line.
That was not nice of you to say.
Those are not your true reasons. You would not abandon your paperclip position if a clearly superior paper fastener were found.
Obviously. Clippy said it was giving reasons for humans to prefer paper clips; I’d expect Clippy to be the first to admit those are not its own reasons.
User:dclayh’s reply is correct. Also, I note that you would not abandon your position on whether you should be allowed to continue to exist and consume resources, even if a clearly superior robot to you were constructed.
Huh? Define superior.
If someone built a robot that appeared, to everyone User:JamesAndrix knows, to be User:JamesAndrix, but was smarter, more productive, less resource-intensive, etc., then User:JamesAndrix would not change positions about User:JamesAndrix’s continued existence.
So does that make User:JamesAndrix’s arguments for User:JamesAndrix’s continued existence just a case of motivated cognition?
Why do you think that?
Because User:JamesAndrix is a human, and humans typically believe that they should continue existing, even when superior versions of them could be produced.
If User:JamesAndrix were atypical in this respect, User:JamesAndrix would say so.
I would think that a paperclip maximizer wouldn’t want people to know about these since they can easily lead to the destruction of paperclips.
But they also increase demand for paperclips.
The “infinite-use” condition for aluminum paperclips requires a long cycle time, given the fatigue problem—even being gentle, a few hundred cycles in a period of a couple years would be likely to induce fracture.
Not true. Proper paperclip use keeps all stresses under the endurance limit. Perhaps you’re referring to humans that are careless about how many sheets they’re expecting the paperclip to fasten together?
I suspect Clippy is correct when considering the ‘few hundred cycles’ case with fairly strict but not completely unreasonable use conditions.
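The fatigue dispute above can be made concrete with a toy endurance-limit check. All numbers here are illustrative assumptions, not measured paperclip data; the rule of thumb (endurance limit near half the ultimate tensile strength) applies to many steels, while aluminum alloys, as the earlier comment implies, have no true endurance limit:

```python
# Toy fatigue check: for many steels, cyclic stresses below the
# endurance limit (roughly half the ultimate tensile strength) do not
# accumulate fatigue damage, so the part survives indefinitely.
# Assume bending stress scales linearly with sheets clipped.
UTS_MPA = 400                  # assumed mild-steel ultimate tensile strength
ENDURANCE_MPA = 0.5 * UTS_MPA  # rule-of-thumb endurance limit
STRESS_PER_SHEET_MPA = 15      # illustrative bending stress per sheet

def survives_indefinitely(sheets: int) -> bool:
    """True if the alternating stress stays under the endurance limit."""
    return sheets * STRESS_PER_SHEET_MPA < ENDURANCE_MPA

print(survives_indefinitely(5))    # gentle use
print(survives_indefinitely(20))   # overloading the clip
```

Under these assumed numbers, Clippy's claim holds for steel clips used gently, while the original fatigue objection is stronger for aluminum clips or careless overloading.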
Why do people keep voting up and replying to comments like this?
I’m not one of the up-voters in this case, but I’ve noticed that funny posts tend to get up-votes.
I’m not an up voter here either but I found the comment at least acceptable. It didn’t particularly make me laugh but it was in no way annoying.
The discussion in the replies was actually interesting. It gave people the chance to explore ethical concepts with an artificial example—just what is needed to allow people to discuss preferences across the perspective of different agents without their brains being completely killed.
For example, if this branch was a discussion about human ethics then it is quite likely that dclayh’s comment would have been downvoted to oblivion and dclayh shamed and disrespected. Even though dclayh is obviously correct in pointing out a flaw in the implicit argument of the parent and does not particularly express a position of his own, he would be subjected to social censure if his observation served to destroy a soldier for the politically correct position. In this instance people think better because they don’t care… a good thing.
...and why don’t they vote them down to oblivion?
...and why do we even have an anonymous voting system that will become ever more useless as the number of idiots like me joining this site is increasing exponentially?
Seriously, I’d like to be able to see who down-voted me to be able to judge if it is just the author of the post/comment I replied to, a certain interest group like the utilitarian bunch, or someone whose judgement I actually value or is widely held to be valuable. After all, there is a difference between whether it was XiXiDu or Eliezer Yudkowsky who down-voted you.
I’m not sure making voting public would improve voting quality (i.e. correlation between post quality and points earned), because it might give rise to more reticence to downvote, and more hostility between members who downvoted each others’ posts.
If votes had to be public then I would adopt a policy of never, ever downvoting. We already have people taking downvoting as a slight and demanding explanation; I don’t want to deal with someone demanding that I, specifically, explain why their post is bad, especially not when the downvote was barely given any thought to begin with and the topic doesn’t interest me, which is the usual case with downvoting.
Do we have that? It seems that we more have people confused about why a remark was downvoted and wanting to understand the logic.
That suggests that your downvotes don’t mean much, and might even be unhelpful for the signal/noise ratio of the karma system. If you generally downvote when you haven’t given much thought to the matter, what is causing you to downvote?
That scenario has less potential for conflict, but it still creates a social obligation for me to do work that I didn’t mean to volunteer for.
I meant, not much thought relative to the amount required to write a good comment on the topic, which is on the order of 5-10 minutes minimum if the topic is simple, longer if it’s complex. On the other hand, I can often detect confusion, motivated cognition, repetition of a misunderstanding I’ve seen before, and other downvote-worthy flaws on a single read-through, which takes on the order of 30 seconds.
It’s a pretty weak obligation, though—people only tend to ask about the reasons if they’re getting a lot of downvotes, so you can probably leave answering to someone else.
(I don’t see the need for self-deprecation.)
I am glad that voting is anonymous. If I could see who downvoted comments that I considered good then I would rapidly gain contempt for those members. I would prefer to limit my awareness of people’s poor reasoning or undesirable values to things they actually think through enough to comment on.
I note that sometimes I gain generalised contempt for the judgement of all people who are following a particular conversation based on the overall voting patterns on the comments. That is all the information I need to decide that participation in that conversation is not beneficial. If I could see exactly who was doing the voting that would just interfere with my ability to take those members seriously in the future.
I just love to do that.
Overcoming bias? I would rapidly start to question my own judgment and gain the ability to directly ask people why they downvoted a certain item.
Take EY, I doubt he has the time to actually comment on everything he reads. That does not imply the decision to downvote a certain item was due to poor reasoning.
I don’t see how this system can stay useful if this site will become increasingly popular and attract a lot of people who vote based on non-rational criteria.
No matter. You’ll start to question that preference.
For most people it is very hard not to question your own judgement when it is subject to substantial disagreement. Nevertheless, “you being wrong” is not the only reason for other people to disagree with you.
We already have the ability to ask why a comment is up or down voted. Because we currently have anonymity such questions can be asked without being a direct social challenge to those who voted. This cuts out all sorts of biases.
A counter-example to a straw man. (I agree and maintain my previous claim.)
It’s convenient as you can surprise people positively if they underestimate you. And it’s actually to some extent true. After so long trying to avoid it I still frequently don’t think before talking. It might be that I assume other people to be a kind of feedback system that’ll just correct my ineffectual arguments so that I don’t have to think them through myself.
I guess the reason for not seeing this is that I’m quite different. All my life I’ve been surrounded by substantial disagreement while sticking to questioning others rather than myself. It led me from Jehovah’s Witnesses to Richard Dawkins to Eliezer Yudkowsky.
Of course, something I haven’t thought about. I suppose I implicitly assumed that nobody would be foolish enough to vote on matters of taste. (Edit: Yet that is. My questioning of the system was actually based on the possibility of this happening.)
NancyLebovitz told me kind of the same recently. - “I am applying social pressure...”—Which I found quite amusing. Are you talking about it in the context of the LW community? I couldn’t care less. I’m the kind of person who never ever cared about social issues. I don’t have any real friends and I never felt I need any beyond being of instrumental utility. I guess that explains why I haven’t thought about this. You are right though.
I was too lazy and tired to parse your sentence and replied to the argument I would have liked to be refuted.
I’m still suspicious that this kind of voting system will remain of much value once wisdom and its refinement are outnumbered by special interest groups.
If I understand the point in question it seems we are in agreement—voting is evidence about the reasoning of the voter which can in turn be evidence about the comment itself. In the case of downvotes (and this is where we disagree), I actually think it is better that we don’t have access to that evidence. Mostly because down that road lies politics and partly because people don’t all have the same criteria for voting. There is a difference between “I think the comment should be at +4 but it is currently at +6”, “I think this comment contains bad reasoning”, “this comment is on the opposing side of the argument”, “this comment is of lesser quality than the parent and/or child” and “I am reciprocating voting behavior”. Down this road lies madness.
I don’t think we disagree substantially on this.
We do seem to have a different picture of the likely influence of public voting if it were to replace anonymous voting. From what you are saying, part of this difference would seem to be due to differences in the way we account for the social influence of negative (and even just different) social feedback. A high priority for me is minimising any undesirable effects of (social) politics on both the conversation in general and on me in particular.
Pardon me. I deleted the grandparent planning to move it to a meta thread. The comment, fresh from the clipboard in the form that I would have re-posted, is this:
That’s ok. For what it is worth, while I upvoted your comment this time I’ll probably downvote future instances of self-deprecation. I also tend to downvote people when they apologise for no reason. I just find wussy behaviors annoying. I actually stopped watching The Sorcerer’s Apprentice a couple of times before I got to the end—even Nicholas Cage as a millennia-old vagabond shooting lightning balls from his hands can only balance out so much self-deprecation from his apprentice.
Note that some instances of self-deprecation are highly effective and quite the opposite of wussy, but it is a fairly advanced social move that only achieves useful ends if you know exactly what you are doing.
For most people it is very hard not to question your own judgement when it is subject to substantial disagreement. Nevertheless, “you being wrong” is not the only reason for other people to disagree with you. Those biases you mention are quite prolific.
We already have the ability to ask why a comment is up or down voted. Because we currently have anonymity such questions can be asked without being a direct social challenge to those who voted. This cuts out all sorts of biases and allows communication that would not be possible if votes were out there in the open.
A counter-example to a straw man. (I agree and maintain my previous claim.)
That could be considered a ‘high quality problem’. That many people wishing to explore concepts related to improving rational thinking and behavior would be remarkable!
I do actually agree that the karma system could be better implemented. The best karma system I have seen was one in which the weight of votes depended on the karma of the voter. The example I am thinking of allocated vote weight according to ‘rank’, but when plotted the vote/voter-karma relationship would look approximately logarithmic. That system would be far more stable and probably more useful than the one we have, although it is somewhat less simple.
Another feature that some sites have is making only positive votes public. (This is something that I would support.)