there are books on the topic

Does anyone know if this book is any good? I’m planning to get more familiar with interpretability research, and ‘read a book’ has just appeared in my set of options.


I think the culprit is ‘overturned’. That makes it sound like their counterarguments were a done deal or something. I’ll reword that to ‘rebutted and reframed in finer detail’.

Yeah, I think overturned is the word I took issue with. How about ‘disputed’? That seems to be the term that remains agnostic about whether there is something wrong with the original argument or not.

Perhaps your impression from your circle differs from mine regarding what proportion of AIS researchers prioritise work on the fast takeoff scenario?

My impression is that gradual takeoff has gone from a minority to a majority position on LessWrong, primarily due to Paul Christiano, but not an overwhelming majority. (I don’t know how it differs among Alignment Researchers.)

I believe the only data I’ve seen on this was in a thread where people were asked to make predictions about AI stuff, including takeoff speed and timelines, using the new interactive prediction feature. (I can’t find this post—maybe someone else remembers what it was called?) I believe that was roughly compatible with the sizeable minority summary, but I could be wrong.

Eliezer Yudkowsky’s portrayal of a single self-recursively improving AGI (later overturned by some applied ML researchers)

I’ve found myself doubting this claim, so I’ve read the post in question. As far as I can tell, it’s a reasonable summary of the fast takeoff position that many people still hold today. If all you meant to say was that there was disagreement, then fine—but saying ‘later overturned’ makes it sound like there is consensus, not that people are still having the same disagreement they’ve had for 13 years. (And your characterization in the paragraph I’ll quote below also gives that impression.)

In hindsight, judgements read as simplistic and naive in similar repeating ways (relying on one metric, study, or paradigm and failing to factor in mean reversion or model error there; fixating on the individual and ignoring societal interactions; assuming validity across contexts):


Here is a construction of : We have that is the inverse of . Moreover, is the inverse of . [...]

Yeah, that’s conclusive. Well done! I guess you can’t divide by zero after all ;)

I think the main mistake I’ve made here is to assume that inverses are unique without questioning it, which of course doesn’t make sense at all if I don’t yet know that the structure is a field.

My hunch is that any bidirectional sum of integer powers of x which we can actually construct is “artificially complicated” and can be rewritten as a one-directional sum of integer powers of x. So, this would mean that your number system is what you get when you take the union of Laurent series going in the positive and negative directions, where bidirectional coordinate representations are far from unique. I’d be delighted to hear a justification of this or a counterexample.

So, I guess one possibility is that, if we take equivalence classes of elements that are equal in this structure, the resulting set of classes is isomorphic to the Laurent numbers. But another possibility could be that it all collapses into a single class—right? At least I don’t yet see a reason why that can’t be the case (though I haven’t given it much thought). You’ve just proven that some elements equal zero; perhaps it’s possible to prove it for all elements.

You’ve understood correctly minus one important detail:

The structure you describe (where we want elements and their inverses to have finite support)

Not elements *and* their inverses! Elements *or* their inverses. I’ve shown an example to demonstrate that you quickly get infinite inverses, and you’ve come up with an abstract argument for why finite inverses won’t cut it:

To show that nothing else works, let $f$ and $g$ be any two nonzero sums of finitely many integer powers of $x$. Then, the leading term (product of the highest power terms of $f$ and $g$) will be some nonzero thing. But also, the smallest term (product of the lowest power terms of $f$ and $g$) will be some nonzero thing. Moreover, we can’t get either of these to cancel out. So, the product can never be equal to $1$. (Unless both are monomials.)
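The leading-term/smallest-term argument is easy to sanity-check mechanically. Representing finite-support elements as dicts from exponent to coefficient (my own throwaway representation, not anything from the thread), the product's highest and lowest exponents are the sums of the factors' highest and lowest exponents, so the product has at least two terms unless both factors are monomials:

```python
import random

def mul(f, g):
    # Convolution product of two finite-support formal series,
    # each given as {exponent: coefficient}.
    out = {}
    for i, a in f.items():
        for j, b in g.items():
            out[i + j] = out.get(i + j, 0) + a * b
    return {n: c for n, c in out.items() if c != 0}

random.seed(0)
for _ in range(1000):
    # Random nonzero finite-support elements with positive coefficients
    # (positivity rules out any accidental cancellation of interior terms).
    f = {random.randint(-3, 3): random.uniform(0.1, 1) for _ in range(random.randint(1, 4))}
    g = {random.randint(-3, 3): random.uniform(0.1, 1) for _ in range(random.randint(1, 4))}
    p = mul(f, g)
    # Leading and trailing exponents add; neither coefficient can vanish.
    assert max(p) == max(f) + max(g) and min(p) == min(f) + min(g)
    if len(f) > 1 or len(g) > 1:
        assert len(p) >= 2  # a non-monomial product can never equal 1
print("leading/trailing term argument holds on all samples")
```

This only samples cases rather than proving anything, but it matches the abstract argument: the product of the two extreme terms survives, so the product of non-monomials always has at least two terms.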

In particular, the example you gave does have an inverse. Perhaps a better way to describe this set is ‘all you can build in finitely many steps using addition, inverse, and multiplication, starting from only elements with finite support’. Perhaps you can construct infinite-but-periodical elements with infinite-but-periodical inverses; if so, those would be in the field as well (if it’s a field).

If you can construct such an element, it would not be a field. But constructing one may be impossible.

I’m currently completely unsure if the resulting structure is a field. If you take a bunch of finite elements, take their infinite-but-periodical inverses, and multiply those inverses, the resulting number again has a finite inverse due to the argument I’ve shown in the previous comment. But if you use addition on one of them, things may go wrong.

A larger structure to take would be formal Laurent series in $x$. These are sums of finitely many negative powers of $x$ and arbitrarily many positive powers of $x$. This set is closed under multiplicative inverses.

Thanks; this is quite similar—although not identical.

Edit: this structure is not a field as proved by just_browsing.

Here is a wacky idea I’ve had forever.

There are a bunch of areas in math where you get expressions of the form $\frac{0}{0}$ and they resolve to some number, but it’s not always the same number. I’ve heard some people say that $\frac{0}{0}$ “can be any number”. Can we formalize this? The formalism would have to include $4 \cdot 0$ as something different from $3 \cdot 0$, so that if you divide the first by 0, you get 4, but the second gives 3.

Here is a way to turn this into what may be a field or ring. Each element is a function $f : \mathbb{Z} \to \mathbb{R}$, which reads as the formal sum $\sum_{n \in \mathbb{Z}} f(n)\, x^n$. Addition is component-wise ($(f+g)(n) = f(n) + g(n)$; this makes sense), and multiplication is multiplication of formal sums, so we get the convolution rule $(f \cdot g)(n) = \sum_{i+j=n} f(i)\, g(j)$.

This becomes a problem once elements with infinite support are considered, i.e., functions that are nonzero at infinitely many values, since then the sum may not converge. But it’s well defined for numbers with finite support. This is all similar to how polynomials are handled formally, except that polynomials only go in one direction (i.e., they’re functions from $\mathbb{N}$ rather than $\mathbb{Z}$), and that also solves the non-convergence problem. Even if infinite polynomials are allowed, multiplication is well-defined since for any $n$, there are only finitely many pairs of natural numbers $(i, j)$ such that $i + j = n$.
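For concreteness, here is a minimal sketch of this arithmetic for finite-support elements, representing each function as a Python dict from integer exponent to real coefficient (the representation and function names are mine, not anything canonical):

```python
# A formal bidirectional power series with finite support,
# represented as {exponent: coefficient} with integer exponents.

def add(f, g):
    # Component-wise addition: (f + g)(n) = f(n) + g(n).
    out = dict(f)
    for n, c in g.items():
        out[n] = out.get(n, 0) + c
    return {n: c for n, c in out.items() if c != 0}

def mul(f, g):
    # Convolution: (f * g)(n) = sum over i + j = n of f(i) * g(j).
    # Well-defined here because both supports are finite.
    out = {}
    for i, a in f.items():
        for j, b in g.items():
            out[i + j] = out.get(i + j, 0) + a * b
    return {n: c for n, c in out.items() if c != 0}

# x is the formal symbol playing the role of 'zero'; x^1 is {1: 1}.
x = {1: 1}
four_zeros = mul({0: 4}, x)   # '4 times zero', i.e. 4x
three_zeros = mul({0: 3}, x)  # '3 times zero', i.e. 3x
# The two are distinct elements, and 'dividing by zero'
# (multiplying by x^-1) recovers 4 and 3 respectively:
x_inv = {-1: 1}
print(mul(four_zeros, x_inv))   # {0: 4}
print(mul(three_zeros, x_inv))  # {0: 3}
```

Zero-stripping after each operation keeps the finite-support invariant explicit; it is a convenience of this sketch, not part of the definition.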

The additively neutral element in this setting is the zero function and the multiplicatively neutral element is $x^0$, i.e., the function that is $1$ at $0$ and $0$ everywhere else. Additive inverses are easy; $(-f)(n) = -f(n)$. The interesting part is multiplicative inverses. Of course, there is no inverse of the zero function, so we still can’t divide by the ‘real’ zero. But I believe all elements with finite support do have a multiplicative inverse (there should be a straightforward inductive proof for this). Interestingly, those inverses are not finite anymore, but they are periodical: the inverse of a monomial is just another monomial, but the inverse of an element with two or more terms is an infinite periodical series.

I *think* this becomes a field with well-defined operations if one considers only the elements with finite support and elements with inverses of finite support. (The product of two elements-whose-inverses-have-finite-support should itself have an inverse of finite support because $(fg)^{-1} = f^{-1} g^{-1}$.) I wonder if this structure has been studied somewhere… probably without anyone thinking of the interpretation considered here.
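The inductive proof sketch can be mirrored numerically: once the lowest-exponent coefficient is known to be nonzero, each coefficient of the inverse is forced by the previous ones. A rough illustration, with my own names and the dict-of-coefficients representation as an assumption:

```python
def inverse_coeffs(f, terms=8):
    # Compute the first `terms` coefficients of the multiplicative
    # inverse of a finite-support element f ({exponent: coefficient}),
    # starting from the inverse's lowest exponent. Solves
    # (f * g)(n) = 1 if n == 0 else 0, one coefficient at a time.
    m = min(f)   # lowest exponent of f
    a = f[m]     # its (nonzero) lowest coefficient
    g = {}
    for k in range(terms):
        n = -m + k  # next exponent of g to determine
        # The convolution at exponent m + n involves only g-coefficients
        # already computed, plus the new one g[n] multiplied by a.
        target = 1 if m + n == 0 else 0
        acc = sum(f[i] * g.get(m + n - i, 0) for i in f if i != m)
        g[n] = (target - acc) / a
    return g

# Inverse of 1 + x: coefficients 1, -1, 1, -1, ... (periodical, as claimed).
print(inverse_coeffs({0: 1, 1: 1}, terms=6))
```

This only produces a truncation of the inverse, but it shows why the induction goes through: each step is a single linear equation with nonzero pivot $a$.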

There are a bunch of sequences, like the value learning sequence, that have structured formatting in the sequence overview (the page the link goes to): something like a headline, a bunch of posts, another headline, a few more posts.

How is this done? When I go into the sequence editor, I only see one text field where I can write something which then appears in front of the list of posts.

This post is similar to the one Eliezer Yudkowsky wrote.

Cool, thanks.

While QRI is only occasionally talked about on LessWrong, I personally continue to think that they’re doing the most exciting research that exists today, provided you take a utilitarian perspective. I’ve donated to MIRI in the past, in part because their work seems highly non-replaceable. I still stand by that reason, but it applies even more to QRI. Even if there is only a small chance that formalizing consciousness is both possible and practically feasible, the potential upside seems enormous. Success in formalizing suffering wouldn’t solve AI alignment (for several reasons, one of them being Inner Optimizers), but I imagine it would be extremely helpful. There is nothing approximating a consensus on the related philosophical problems in the community, and positions on those issues seem to have a significant causal influence on what research is being pursued.

It helps that I share most if not all of the essential philosophical intuitions that motivate QRI’s research. On the other hand, research should be asymmetrical with regard to what’s true. In the world where moral realism is false and suffering isn’t objective or doesn’t have structure, beliefs to the contrary (which many people in the community hold today) could lead to bad alignment-related decisions. In that case, any attempts to quantify suffering would inevitably fail, and that would itself be relevant evidence.

Re personal opinion: what is your take on the feasibility of human experiments? It seems like your model is compatible with IDA working out even though no one can ever demonstrate something like ‘solve the hardest exercise in a textbook’ using participants with limited time who haven’t read the book.

This is an accurate summary, minus one detail:

The judge decides the winner by evaluating whether the final statement is true or not.

“True or not” makes it sound symmetrical, but the choice is between ‘very confident that it’s true’ and ‘anything else’. Something like ‘80% confident’ goes into the second category.

One thing I would like to be added is just that I come out moderately optimistic about Debate. It’s not too difficult for me to imagine the counterfactual world where I think about FC and find reasons to be pessimistic about Debate, so I take the fact that I didn’t as non-zero evidence.

I think the Go example really gets to the heart of why I think Debate doesn’t cut it.

Your comment is an argument against using Debate to settle moral questions. However, what if Debate is trained on Physics and/or math questions, with the eventual goal of asking “what is a provably secure alignment proposal?”

Before offering an “X is really about Y” signaling explanation, it’s important to falsify the “X is about X” hypothesis first. Once that’s done, signaling explanations require, at minimum:

1. An action or decision by the receiver that the sender is trying to motivate.

2. (2.1) An explanation for why the receiver is listening for signals in the first place, and (2.2) why the sender is trying to communicate them.

3. A language that the sender has reason to think the receiver will understand and believe as the sender intended.

4. A physical mechanism for sending and receiving the signal.

(Added numbers for reference.)

I think 1, 2.1, and 3 are all wrong, in that none of them are required for a signaling hypothesis to be plausible. I believe you’re assuming that signaling is effective and/or rational, but this is a mistake. Signaling was optimized to be effective in the ancestral environment, so there’s no reason why it should still be effective today. As far as I can tell, it generally is not.

As an example, consider men wearing solid shoes in the summer despite finding them uncomfortable. There is no action this is trying to motivate, and there is no reason to expect the receiver is listening—in fact, there is often good reason to expect that they are not listening (in many contexts, people really don’t care about your shoes). Nonetheless, I think conformity signaling is the correct explanation for this behavior.

The pilot example is problematic because in this case, signaling is part of a high-level plan, which makes it a non-central example. Most of the time, signaling is motivated by evolutionary instincts, like the fear of standing out; in the case of religion, I think this is most of the story. Those instincts can then translate into high-level behavior like going to church, but it’s not the beginning of the causal chain.

Thanks for hosting this contest. The overconfidence thing in particular is a fascinating data point. When I was done with my function that output final probabilities, I deliberately made it way more agnostic, thinking that at least now my estimates are modest—but it turns out that I should have gone quite a bit farther with that adjustment.

I’m also intrigued by the variety of approaches for analyzing strings. I solely looked at frequencies of monotone groups (i.e., how many single 0s, how many 00s, how many 000s, how many 111s, etc.), and as a result, I have widely different estimates (compared to the winning submissions) on some of the strings where other methods were successful.
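For what it’s worth, the run-counting itself takes only a few lines, e.g. with itertools.groupby; this is a sketch of the general approach, not my actual contest code:

```python
from collections import Counter
from itertools import groupby

def run_frequencies(s):
    # Count how often each maximal monotone run ('0', '00', '1', '111', ...)
    # occurs in a binary string. groupby yields consecutive equal characters.
    return Counter("".join(g) for _, g in groupby(s))

print(run_frequencies("00100110"))
# Counter({'00': 2, '1': 1, '11': 1, '0': 1})
```

These frequencies can then feed whatever scoring rule one uses to judge how random-looking a string is.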

I think it’s an OK description of what we do in terms of epistemic rationality. I’m not so sure it captures the instrumental part. The biggest impact that joining this community had on my life was that I started really taking actions to further my goals.