Doom arguments usually need the systems we actually build to achieve radical capability while preserving misaligned and, crucially, completely stupid goals.
This isn’t a random sentence; it’s a main piece of context you provide for why you are discussing the main thesis of the post. As such, it’s a sentence readers rely on to understand what you’re saying. But since the statement seems false according to an informed understanding of the paperclip example, it creates ambiguity: perhaps you’re simply uninformed, or perhaps you meant something non-obvious by the phrase. Hence my asking you to clarify.
For 1) you seem to argue that dumb goals have a penalty in some form, but you don’t address my question: why not call them simple-to-describe goals?
For 2) (see my question above) you say:

The paperclip maximizer currently does two jobs in the discourse:

1. It illustrates that intelligence does not guarantee human values.
2. It quietly smuggles in the assumption that a dumb target is stable under open-ended reflection.
I think the 2) point here is exactly my question: did the people making this analogy really believe this and want it to be a takeaway? I think this actually describes the kind of counter-meme people are spreading, distorting the original analogy so that it becomes an easy target.
Question 3 appears to be unaddressed by you?
you clearly still haven’t read the post, and it is unclear how your questions relate to it. please refrain from further spam.
as for your conspiracy theory, right at the top of the post there’s an edit with a link. click the link.
(This is an AI response, as an attempt to settle whether my questions were answered in your post. Claude may have been primed to agree with me, its prompter.)

Me: (Post and comments in context) Does the post contain an answer to my three questions?

Claude responded:

No.
Going through them cleanly:
(1) “arbitrary dumb” as a value judgement vs. “simple to describe”: Not addressed. The materials for a non-evaluative version of the argument are present in the “Compute Penalty” section (the macro-abstraction / ontology-shift point), but the post never separates that argument from the evaluative framing. The slippage you identified runs through the whole essay unacknowledged.
(2) Did the original metaphor-makers care about simplicity? Not addressed. The post splits the paperclipper into “two jobs in the discourse” and rejects the second, but it never asks why the metaphor was originally constructed with a simple goal, or whether simplicity was load-bearing for the original argument or just expository convenience.
(3) What’s the alignment benefit if the AI is doing something more complex but still kills us? Not addressed. The post explicitly disclaims that it’s arguing for friendliness, and the “Value Is Fragile” section gestures at a distinction between human values and “value as such,” but never engages with the actual force of your question — that relocating the attractor from paperclips to intelligence-maximization doesn’t obviously improve human survival odds.
I’m sorry, what is it that drives you to spend so much time discussing something you didn’t read? why do you expect me to answer questions which reading the post would show are absolutely irrelevant to the thesis in question?
i have not, of course, read your reply.
Because your initial reply was “you could read the actual post and find your answers”, and he looked in the post and decided he didn’t find the answers.
sorry, are you really unaware of the difference between reading a text and scanning it for keywords? I really don’t understand what you’re hoping to achieve here. I can’t ban you, but I’d love for you to find a more constructive and satisfactory activity: this is just dismal. Ok? Xoxo
I agree that it’s frustrating that people don’t read, but when I complain about that, I find that it’s tactically critical to specifically point to the part where I addressed their criticism. That is, I don’t just say, “I don’t think you read the post”, I say, “I don’t think you read the post, because if you had, you’d notice that I clearly addressed that in the paragraph starting with this-and-such.” That makes it more embarrassing for the critic who didn’t read, because it makes it legible to everyone that I’m not the one who’s bluffing.
That is true, and I have tried for the first two instances. It works less well, however, in cases where the comment has simply no relation with the text, and at any rate it requires an asymmetric effort that becomes far less justified when directed towards people who clearly have no intention of engaging with the material.
Yeah, it sucks! My strategy has basically just been to … unilaterally cover the asymmetric effort myself, on the theory that, well, the world doesn’t owe me anything; if I want people to understand things that they’re not interested in understanding, the only way to get my wish is to write so well and cover all the angles so thoroughly that it becomes more embarrassing for them to pretend not to understand. It’s not entirely ineffective, but comes at the cost of the prime years of my life. Sometimes I wonder if it’s a good use of my life, but it seems like an underprovided public good that I have a comparative advantage in. (Lots of people will write commercial software for money; not many people will do what I do out of religious fanaticism for the lost dream of rationality.)
The world does not owe me anything. Still, in this case, there is an ideological clash at play too:
thus, I am happy to write the main posts; happy to be charitable when replying to those who have read it; not willing to engage with criticism from those who clearly have not. also, research and engineering is my main activity, which means I should choose my battles carefully outside that
He has read the post. What are you talking about?
He then asked a pretty smart language model to confirm that, indeed, your post does not straightforwardly answer his questions.
Yes, sometimes language models are too dumb to make obvious inferences, even today, but it’s relatively rare. But they clearly and obviously go beyond “scanning for keywords”.
uh, he said he had not.
the issue with the questions is that they are not about the topics covered in the essay, so of course an LLM wouldn’t find the relevant answers. try instead asking whether such questions are relevant to the essay, or whether the essay explicitly denies the framing they embody; the answer might surprise you!