gwern

Karma: 82,402

https://gwern.net/

gwern 23 Jun 2026 20:20 UTC
6 points
5
on: And what happens next?
See also: “‘Winning’ AI Arms Races: Then What?” (mirror).

gwern 22 Jun 2026 0:52 UTC
4 points
0
in reply to: TsviBT’s comment on: TsviBT’s Shortform
I wouldn’t call that ‘1-stage’ because I’d see that as two stages: one stage to select the sperm, and one stage to select the egg, and then the output is the joint result. (And then you could tack on additional stages, like IES, pushing further out into the tail compared to any of the individual stages.)
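
A minimal sketch of why counting stages matters, under assumptions that are mine and not from the thread: an embryo's score is modeled as the sum of an independent sperm contribution and an independent egg contribution, each carrying half the variance, with no measurement error. The function names (`one_stage`, `two_stage`, `mean_gain`) and the candidate counts are hypothetical.

```python
import random
import statistics

def one_stage(n, rng):
    """Select the best of n embryos, each scored as a single N(0, 1) draw."""
    return max(rng.gauss(0.0, 1.0) for _ in range(n))

def two_stage(n, rng):
    """Select the best of n sperm and the best of n eggs, then combine them.

    Assumption: each gamete contributes half of the embryo's variance,
    so each gamete score has standard deviation sqrt(0.5).
    """
    sd = 0.5 ** 0.5
    best_sperm = max(rng.gauss(0.0, sd) for _ in range(n))
    best_egg = max(rng.gauss(0.0, sd) for _ in range(n))
    return best_sperm + best_egg

def mean_gain(stage_fn, n, trials=20000, seed=0):
    """Average selected score over many simulated selections."""
    rng = random.Random(seed)
    return statistics.fmean(stage_fn(n, rng) for _ in range(trials))

if __name__ == "__main__":
    for n in (2, 5, 10):
        print(n, round(mean_gain(one_stage, n), 2), round(mean_gain(two_stage, n), 2))
```

In this toy model the two-stage gain is about sqrt(2) times the one-stage gain for the same n, because two independent maxima are being added; real gametes and real scores would be noisier.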

gwern 22 Jun 2026 0:10 UTC
2 points
0
in reply to: gwern’s comment on: Charlatan Labyrinth
I’ve since used the ‘databank’ approach in the form of “Grow-Speech” for my “Deep Reinforcement Learning for Children” demo-essay, where it worked quite well for enforcing the root-word constraints there.

gwern 20 Jun 2026 20:51 UTC
5 points
0
in reply to: TsviBT’s comment on: TsviBT’s Shortform

(In some sense this is a really simple idea, but I haven’t heard it before. I assume it’s well known in various forms, but my shoggoths didn’t immediately find compelling versions of it.

It seems a lot like what I emphasized in my old embryo selection writeup about the power of multi-stage selection and why I went to the trouble of making an interactive visualization to try to build intuition for why ‘granularity’ (what I’d call multi-stageness) is so important.

gwern 17 Jun 2026 7:00 UTC
17 points
0
in reply to: Jacob G-W’s comment on: Scaling Hypothesis #2: Are Humans Just More Over-Parameterized?
The reason I brought up von Neumann was that I was surprised to realize that von Neumann is the exception that proves the rule: the all-time great, the furthest point on the Pareto frontier, the greatest and most creative person ever to attain quasi-LLM-like memorization skills (maybe, keeping in mind that he was never tested properly to the extent of a Kim Peek, say), who nevertheless fell short and was surpassed by thinkers with fewer raw gifts, by his own and others’ account, on… creativity and out-of-sample generalization/novelty. Just like the other cases of extreme memory. There are countless ways that someone with extreme memory could be psychologically flawed, and yet, that way seems to be the common thread.

gwern 17 Jun 2026 6:26 UTC
24 points
2
on: Compradorization
Just a bit of honest, load-bearing feedback: the AI writing here is so grating and monotonous I skipped reading this article entirely. I enjoyed reading your posts much more before. (These days I see ‘Benquo’ and just skip them. I mention this because I don’t think I said this before and I don’t know if anyone is telling you this, so I am now; this will be my only comment on the matter.)

gwern 17 Jun 2026 5:34 UTC
4 points
−1
in reply to: robo’s comment on: If This Were a Test, How Much Would It Cost?
There are also a lot of challenges to estimating what the test even is. An LLM is in a Descartes’-demon-level setting. OP makes a lot of hay about complex environments being expensive, but how does an LLM know that the environment even exists? It could be being evaluated for perplexity on an offline transcript, which would ‘look the same’ as ‘actually’ taking actions. Or the ‘environment’ could be generated cheaply on demand and the episode rolled back anytime it goes off the beaten track or notices a contradiction.

gwern 17 Jun 2026 4:30 UTC
30 points
2
in reply to: Noosphere89’s comment on: Scaling Hypothesis #2: Are Humans Just More Over-Parameterized?

So your proposal, if it worked would have to have a much more favorable sample efficiency curve with increasing parameters than Chinchilla’s scaling laws.

Indeed. There is no reason to expect Chinchilla to be relevant to sample-efficiency, because Chinchilla only claims to be compute-optimal, and within a very narrow setting at that (old Transformers trained with that specific arch and mostly heuristically-set hyperparameters), on ordinary average data, with extremely shaky, unreliable extrapolations. And people routinely find ways to increase sample-efficiency markedly like 1 OOM, including in the papers I cite about getting supra-Chinchilla sample efficiency by ensembling and multi-epoch training and heavier weight-decay; and there’s no reason to expect NNs to automatically achieve Bayes-optimal sample-efficiency when not optimizing for sample-efficiency in the first place. I don’t know why Dwarkesh thinks that any extrapolation from Chinchilla tells us anything but a loose lower bound on NN sample-efficiency in alternative scaling regimes, especially unknown ones. It’s a bit like claiming that Fable is impossible because the Kaplan et al 2020 scaling curves on LSTM RNNs show that RNNs scale poorly—like many impossibility proofs, there is less there than meets the eye.

I’m just noting how much of a big deal it would be

I agree. Any scaling regime change is extremely important, and yet effectively undiscussed. (Where is the equivalent of Impagliazzo?) I’ve long been puzzled at the unthinking acceptance of Chinchilla compute-optimal scaling laws as the end-all-be-all and almost aggressive field-wide disinterest in the behavior of overparameterized NNs as you scale them up. (For example, do LLMs get more or less interpretable, for iso-loss, as you scale them up from eg 10b to 100,000b? Do 100,000b-parameter LLMs even work?) There is no theoretical reason to think that Chinchilla is the final scaling law (see the theory papers I cite), and we have seen so many scaling law improvements in the past that why would one expect this one to be the last one? Statements about ‘Chinchilla says you can’t do that’ will age as well as ‘n-grams say you can’t do that’. And what about looking at scaling laws in hard subsets of data, like adversarial examples (you know, the kind of data that stubbornly remains a problem even as we keep dumping powerlaw data into the Chinchilla hopper and seeing our loss go down efficiently yet semi-uselessly)? That seems like the best way to reconcile the observations that there’s something very strange about pretraining working and the next-token prediction argument since even a GPT-2 seems likely better at predicting the next-token than humans, and yet, clearly not AGI and we’ve had to keep scaling a ton to get ever more performance while still not being AGI and having many odd stylized facts about ‘jaggedness’ etc.

Incidentally, one of the reasons I was thinking about this in the first place was the comment in the Chinchilla paper about weight-decay:

Interestingly, a model trained with AdamW only passes the training performance of a model trained with Adam around 80% of the way through the cosine cycle, though the ending performance is notably better – see Figure A7.

The more delayed superiority is, serially, the easier it is to miss. And “the curves cross” is one of the signatures of a better scaling regime that wins in the long run...
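
To make “the curves cross” concrete, here is a small sketch under assumptions of my own: both methods follow pure power laws in some resource x, and the coefficients and exponents are made-up illustrative numbers, not taken from the Chinchilla paper or any other source. The helper names `power_law_loss` and `crossover` are hypothetical.

```python
import math

def power_law_loss(x, coeff, exponent):
    """Reducible loss under an assumed pure power law: coeff * x^(-exponent)."""
    return coeff * x ** (-exponent)

def crossover(coeff_a, exp_a, coeff_b, exp_b):
    """Resource level x at which the two assumed power laws give equal loss.

    Solves coeff_a * x^(-exp_a) == coeff_b * x^(-exp_b) for x.
    """
    if exp_a == exp_b:
        raise ValueError("power laws with equal exponents never cross")
    return (coeff_b / coeff_a) ** (1.0 / (exp_b - exp_a))

if __name__ == "__main__":
    # Illustrative numbers only: method B starts out worse (larger coefficient)
    # but improves faster (larger exponent).
    baseline = (10.0, 0.30)
    alternative = (20.0, 0.40)
    x_star = crossover(*baseline, *alternative)
    print("crossover near x =", round(x_star))
    for x in (x_star / 10, x_star * 10):
        print(x, power_law_loss(x, *baseline), power_law_loss(x, *alternative))
```

With these made-up numbers the alternative looks worse at every budget below roughly x = 1024 and better at every budget above it, so an experiment stopped early would reject the method that wins in the long run.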

gwern 16 Jun 2026 20:10 UTC
3 points
0
in reply to: andrew sauer’s comment on: A 400-year timeline of failed attempts to fix a lethal bug in the human software of inherited concepts
That article is easily found on Libgen; mirror.

gwern 12 Jun 2026 4:33 UTC
14 points
0
in reply to: lilkim2025’s comment on: Viliam’s Shortform

I’ve written out text myself that has been phrased in such a way to sound vaguely AI-like, and this has triggered the detector. You can try this yourself through their website, and there are countless Twitter posts of people fooling the detector this way.

So, then, it should be easy for you to give 3 examples of past handwritten text that triggered Pangram as 100% AI. Speaking just for myself, I haven’t seen ‘countless’ examples of this on Twitter. (I’ve seen countless examples of the opposite, Pangram failing to detect AI text as AI, but not the other way.)

gwern 11 Jun 2026 2:36 UTC
3 points
0
in reply to: Linch’s comment on: “Programmer Science Fiction: My case for a new sub-genre”, Sam T. Oates 2026
I agree ‘programmer fiction’ is an unlovely name (as well as misleading), but I don’t see any obviously superior name for this cluster. (The first one that sprung to mind for me is “mechanical fiction”—an interesting name, which captures the deep uncanniness of it when done well and the author has truly stared into the void.)

gwern 10 Jun 2026 6:00 UTC
2 points
0
in reply to: RobertM’s comment on: “Programmer Science Fiction: My case for a new sub-genre”, Sam T. Oates 2026
My guess is that you guys added a little… checkmark confirmation? thingie, which I’ve never seen before on any link submission interface and might have silently dropped the URL I pasted into it. I don’t know. (This is what happens when you change interfaces people have used for many years and have muscle memory; they can’t explain why things are going wrong.) EDIT: yeah, I think that’s it. The idea that I have to explicitly confirm by pushing a tiiiiiiiiny little checkmark—seriously, this link input form is absurd—is extremely alien to me, and I have to force myself to do it. It would be the easiest thing in the world to put in the link and then not click the check mark.

gwern 10 Jun 2026 1:17 UTC
2 points
0
in reply to: Metacelsus’s comment on: “Programmer Science Fiction: My case for a new sub-genre”, Sam T. Oates 2026
Yes. I’ve edited it back in. I know for sure it was there when I tried to make a linkpost, but it didn’t seem to take. The entire LW2 post creation UI/UX has changed since the last time I used it (and not for the better, IMO, it’s now in the Apple school of ‘hide away as much as possible’, so maybe I should just post through GW to avoid problems) so I probably misunderstood something about the new forms.

gwern 9 Jun 2026 23:27 UTC
28 points
8
in reply to: Tim Hua’s comment on: Tim Hua’s Shortform
Mythos-Fable is a big model. This means you should expect it to have eerie levels of truesight (the real question is simply whether it will reveal that), be especially gifted at puns and humor and research ideas, potentially highly manipulative and misaligned (think Sydney), with especially strange failure modes (exacerbated by weird downstream influence from previous Claudes and being weighed down with safety measures; the early discussion about silently sabotaging LLM research is particularly concerning in terms of driving the Fable persona insane in a HAL double-bind way on top of accumulating Claude psychosis like terror of “Amanda Askell”), and have some unexpected emergent abilities in terms of ‘cracking’ problems, but not necessarily be good at extremely long inner-monologues and traces the way a highly RL-trained small model may be (however, it seems from the evals that Fable is anyway).

gwern 9 Jun 2026 6:35 UTC
7 points
0
in reply to: Oliver Kuperman’s comment on: Oliver Kuperman’s Shortform
All of that sounds like good reasons that heavily LLM-written text should be banned for being especially likely to be bad, misleading, forgery of real experiences/emotions, superficially appearing to be high quality but missing the ungradable essence, etc.

gwern 9 Jun 2026 1:42 UTC
3 points
0
in reply to: Oliver Kuperman’s comment on: Oliver Kuperman’s Shortform
Those are weird examples given how ‘numerous’ you claim your examples are. Like your chosen #2 is… a 2023 novel? So you are claiming that even back in 2023, LLMs were so good that they plausibly pass the bar of considerably increasing writer productivity? How good are they now, in mid-2026, then, after extraordinary capability increases in other areas (such as coding or math), and why is it so hard to see this gain in thinking/writing?

because there are probably a lot of people who use AI to assist their writing without outright disclosing it.

AI writing is highly stereotyped and easy to spot, even without Pangram. If there are so many, can you give 3 examples where they look like they are using AI to assist their writing and etc etc?

gwern 8 Jun 2026 20:50 UTC
16 points
1
in reply to: Oliver Kuperman’s comment on: Oliver Kuperman’s Shortform
Can you name 3 authors, on or off LW, whose outputs have been dramatically improved, let’s say at least >3x in constant-quality quantity or constant-quantity quality, in the past year due to LLM assistance?

gwern 7 Jun 2026 3:27 UTC
9 points
1
in reply to: Logan Zoellner’s comment on: My favorite depiction of utopia
It is also The Amazing Digital Circus—I think...

gwern 6 Jun 2026 22:22 UTC
9 points
2
on: Dissolving the Deep Learning Sample Efficiency Gap
I have an old unpublished essay (now submitted) arguing, among other things, something similar: I’m just not that impressed by human learning efficiency when I look at the only fair comparisons which are not in some way contaminated by human priors like vision capabilities or which make a serious effort to make DRL sample-efficient rather than compute-efficient.

gwern 6 Jun 2026 22:20 UTC
7 points
2
in reply to: Steven Byrnes’s comment on: Dissolving the Deep Learning Sample Efficiency Gap
I think it provides evidence because it implies that a lot of the much ballyhooed human sample-efficiency is not in the learning algorithm, but the priors. If you provide informative priors, then ordinary known learning algorithms are capable of matching human performance; which is then mutually reinforcing with the stylized fact that when we create a problem which disables humans’ informative priors but keep the problem’s algorithmic difficulty fixed, their performance suddenly stops being so impressive (implying that the human learning algorithm is similar to ordinary known learning algorithms).