Rafael Harth comments on [Meta] New moderation tools and moderation guidelines

Rafael Harth 9 Jul 2025 11:56 UTC
2 points
Alright—apologies for the long delay, but this response meant I had to reread the Scaling Hypothesis post, and I had some motivation/willpower issues in the last week. But I reread it now.

I agree that the post is deliberately offensive in parts. E.g.:

But I think they lack a vision. As far as I can tell: they do not have any such thing, because Google Brain & DeepMind do not believe in the scaling hypothesis the way that Sutskever, Amodei and others at OA do. Just read through machine learning Twitter to see the disdain for the scaling hypothesis. (A quarter year on from GPT-3 and counting, can you name a single dense model as large as the 17b Turing-NLG—never mind larger than GPT-3?)

Google Brain is entirely too practical and short-term focused to dabble in such esoteric & expensive speculation, although Quoc V. Le’s group occasionally surprises you.

or (emphasis added)

OA, lacking anything like DM’s long-term funding from Google or its enormous headcount, is making a startup-like bet that they know an important truth which is a secret: “the scaling hypothesis is true!” So, simple DRL algorithms like PPO on top of large simple architectures like RNNs or Transformers can emerge, exploiting the blessings of scale, and meta-learn their way to powerful capabilities, enabling further funding for still more compute & scaling, in a virtuous cycle. [...]

and probably the most offensive is the ending (I won’t quote it, to avoid cluttering the reply, but it’s in Critiquing the Critics, especially from “What should we think about the experts?” onward). You’re essentially accusing all the skeptics of falling victim to a bundle of biases/signaling incentives, rather than disagreeing with you for rational reasons. So you were right, this is deliberately offensive.

But I think the answer to the question—well actually let’s clarify what we’re debating, that might avoid miscommunication. You said this in your initial reply:

I can definitely say on my own part that nothing of major value I have done as a writer online—whether it was popularizing Bitcoin or darknet markets or the embryo selection analysis or writing ‘The Scaling Hypothesis’—would have been done if I had cared too much about “vibes” or how it made the reader feel. (Many of the things I have written definitely did make a lot of readers feel bad. And they should have. There is something wrong with you if you can read, say, ‘Scaling Hypothesis’ and not feel bad. I myself regularly feel bad about it! But that’s not a bad thing.) Even my Wikipedia editing earned me doxes and death threats.

So in a nutshell, I think we’re debating something like “will what I advocate mean you’ll be less effective as a writer” or more narrowly “will what I’m advocating for mean you couldn’t have written really valuable past pieces like the Scaling Hypothesis”. To me it still seems like the answer to both is a clear no.

The main thing is, you’re treating my position as if it’s just “always be nice”, which isn’t correct. I’m very utilitarian (about commenting and in general) (one of my main insights from the conversation with Zack is that this is a genuine difference). I’ve argued repeatedly that Said’s comment is ineffective, basically because of what Scott said in How Not to Lose an Argument. It was obviously ineffective at persuading Gordon. Now Said argued that persuading the author isn’t the point, which I can sort of grant, but I think it will be similarly ineffective for anyone sympathetic to religion for the same reasons. So it’s not that I terminally value being nice,^[1] it’s that being nice is generally instrumentally useful, and would have been useful in Said’s case. But that doesn’t mean it’s necessarily always useful.

I want to call attention to my rephrasing of Said’s post. I still claim that this post would have been much more effective in criticizing Gordon’s post. Gordon would have reacted in a more constructive way, and again, I think everyone else who sympathizes with religion is essentially in the same position. This seems to me like a really important point.

So to clarify, I would not have objected to the Scaling Hypothesis post despite some rudeness. The rudeness has a purpose (the bolded sentence is the one that I remembered most from reading it all the way back, which is evidence for your claim that “those were some of the most effective parts”). And the context is also importantly different; you’re not directly replying to a skeptic; the post was likely to be read by lots of people who are undecided. And the fact that it was a super high effort post also matters because ‘how much effort does the other person put into this conversation’ is always one of the important parameters for vibes.

I also wanna point out that your response was contradictory in an important way. (This isn’t meant as a gotcha, I think it captures the difference between “always be nice” and “maximize vibes for impact under the constraint of being honest and not misleading”.) Because you said that you wouldn’t have been successful if you worried about vibes, but also that you made the Scaling Hypothesis post deliberately offensive, which means you did care about vibes, you just didn’t optimize them to be nice in this case.

Idk if this is worth adding, but two days ago I remembered something you wrote that I had mentally tagged as “very rude”, and where following my principles would mean you’re “not allowed” to write that. (So if you think that was important to write in this way, then we have a genuine disagreement.) That was your response to now-anonymous on your Clippy post, here. Here, my take (though I didn’t reread, this is mostly from memory) is something like
- the critique didn’t make a lot of sense because it boiled down to “you’re asserting that people would do xyz, but xyz is stupid”, which is a non sequitur (“people do xyz” and “xyz is stupid” can both be true)
- your response was needlessly aggressive and you “lost” the argument in the sense that you failed to persuade the person who complained
- it was absolutely possible to write a better reply here; you could have just made the above point (i.e., “it being stupid doesn’t mean it’s unrealistic”) in a friendly tone and the result would probably have been that the commenter realizes their mistake; the same is achieved with fewer words and it arguably makes you look better. I don’t see the downside.
1. Strictly speaking I do terminally value being nice a little bit because I terminally value people feeling good/bad, but I think the ‘improve everyone’s models about the world’ consideration dominates the calculation.