gwern
I agree ‘programmer fiction’ is an unlovely name (as well as misleading), but I don’t see any obviously superior name for this cluster. (The first one that sprang to mind for me is “mechanical fiction”: an interesting name, which captures the deep uncanniness of it when it is done well and the author has truly stared into the void.)
My guess is that you guys added a little… checkmark confirmation? thingie, which I’ve never seen before on any link submission interface and which might have silently dropped the URL I pasted into it. I don’t know. (This is what happens when you change interfaces people have used for many years and have muscle memory for; they can’t explain why things are going wrong.)
Yes. I’ve edited it back in. I know for sure it was there when I tried to make a linkpost, but it didn’t seem to take. The entire LW2 post creation UI/UX has changed since the last time I used it (and not for the better, IMO, it’s now in the Apple school of ‘hide away as much as possible’, so maybe I should just post through GW to avoid problems) so I probably misunderstood something about the new forms.
Mythos-Fable is a big model. This means you should expect it to have eerie levels of truesight (the real question is simply whether it will reveal that), be especially gifted at puns and humor and research ideas, be potentially highly manipulative and misaligned (think Sydney), with especially strange failure modes (exacerbated by weird downstream influence from previous Claudes and being weighed down with safety measures; the early discussion about silently sabotaging LLM research is particularly concerning in terms of driving the Fable persona insane in a HAL double-bind way on top of accumulating Claude psychosis like terror of “Amanda Askell”), and have some unexpected emergent abilities in terms of ‘cracking’ problems, but not necessarily be good at extremely long inner-monologues and traces the way a highly RL-trained small model may be (however, it seems from the evals that Fable is anyway).
All of that sounds like good reasons that heavily LLM-written text should be banned for being especially likely to be bad, misleading, forgery of real experiences/emotions, superficially appearing to be high quality but missing the ungradable essence, etc.
Those are weird examples. Like your chosen #2 is… a 2023 novel? So you are claiming that even back in 2023, LLMs were so good that they plausibly pass the bar of considerably increasing writer productivity? How good are they now, in mid-2026, then, after extraordinary capability increases in other areas (such as coding or math), and why is it so hard to see this gain in thinking/writing?
Can you name 3 authors, on or off LW, whose outputs have been dramatically improved, let’s say >3x in constant-quality quantity or constant-quantity quality, in the past year due to LLM assistance?
It is also The Amazing Digital Circus—I think...
I have an old unpublished essay arguing, among other things, something similar: I’m just not that impressed by human learning efficiency when I look at the only fair comparisons which are not in some way contaminated by human priors like vision capabilities or which make a serious effort to make DRL sample-efficient rather than compute-efficient.
I think it provides evidence because it implies that a lot of the much ballyhooed human sample-efficiency is not in the learning algorithm, but the priors. If you provide informative priors, then ordinary known learning algorithms are capable of matching human performance; which is then mutually reinforcing with the stylized fact that when we create a problem which disables humans’ informative priors but keep the problem’s algorithmic difficulty fixed, their performance suddenly stops being so impressive (implying that the human learning algorithm is similar to ordinary known learning algorithms).
This is another entry in the category of “attempting to correct people on using technical terminology while not understanding the point of having catchy jargon in the first place and so the supposed improvements or equivalent formulations are neither”.
Imagine one is not a rationalist, and totally unfamiliar with Scott’s writing, and you read something like “1.8% of 25-45 year olds with covid [develop] long covid that affects their daily life, which is well within the Lizardman Constant”.[3] Are you likely to know what that means? Compare instead reading an academic article that says: “[t]his makes the samples vulnerable to fake or bogus respondents.” I think most people would readily understand the latter—a fake or bogus respondent is someone that responds in a false or ‘bogus’ way, if a study is ‘vulnerable’ to that, it means that the apparent effects may be the result of bogus respondents. But “Lizardman constant” is not readily understandable to the lay person; it describes the same thing but uses an obscure jargon term instead.
This manages to be both wrong and miss the point. ‘Fake or bogus’ is not understandable, and it is not a substitute for a specific term. Most people might think they understand the latter, but they don’t. It is an ‘illusion of transparency’. It does not cover the full meaning of the term, which includes malicious respondents, rushed respondents, good-faith but overconfident or deluded respondents, finger or vocal slips, etc. (You say all that is implied by ‘fake or bogus’? Well, what does ‘bogus’ mean? ‘not genuine; counterfeit, sham’. Oh, thank you, that totally cleared the matter up! I’ll be sure to explain things like “science” using this word. “Science isn’t hard, it’s just a way of coming to beliefs which aren’t bogus. You understand everything about it now, right, like every kind of error which might affect a survey?”) It also completely omits the main meaning which was not ‘bad responses exist’ - who could ever have doubted that? - but that the badness is in pretty much every survey at nontrivial percentages. (Note the completely different meaning of ‘constant’ to ‘vulnerability’. A constant is always present. A vulnerability is merely a potential.)
Second, it misses the point of coining a term. ‘vulnerable to fake or bogus respondents’ is not terminology. It is a wordy ad hoc circumlocution made up on the spot to deal with the fact that the authors and audiences do not share a single crisp clear term for the general recurring problem and so cannot easily talk about it or remember it. Every time they want to talk about it, they have to make up a new phrase and it’ll be different. ‘contaminated by unserious responses’. ‘Measurement error in noisy samples’. ‘Mischievous responders’. ‘Trolls’. Meanwhile, a Scott Alexander reader can just say, ‘Lizardman constant’. And it is instantly memorable (every reader has memorized it after about 1 screen of preface in the original post and still remembers it despite it being from 2013), searchable, linkable, and consistently employed.
but more egregiously it is wrong! It isn’t a constant and writers using the jargon are led to at best misleading conclusions. The prior example continues: “The Lizardman Constant doesn’t mean prevalences below 4% don’t exist, it means they’re impossible to measure using naive tools.” This is just wrong, prevalence of under 4% can be measured and the tools being used here are fit for purpose! If one engaged with the literature on bogus respondents this would become clear.
What “naive tools” let you defend against Lizardman Constant and safely measure prevalences <4% without systematic bias being a large component?
Probabilistic sampling, and using verified data can help manage the risks.[4] How you write a questionnaire, how you solicit respondents, and numerous other factors can greatly increase or decrease the rates of bogus respondents.
All that seems reasonable, and is the kind of thing that an expert aware of the Lizardman Constant and the ‘numerous other factors’ might or might not be able to fix. But in what sense do naive tools do all that for you?
As a case example, let’s look at the particular study being referenced.[5] It is a UK metareview of 10 longitudinal studies using in-patient and primary care diagnosis data along with patient self-reported information. If it is answering a poll on twitter, the rate of people pressing a random answer here or there, or just choosing whatever they think is funniest, may be very high. But what is the risk of bogus respondents of patients filling out surveys including their symptoms—at repeated intervals—with the patients matched against diagnosis records? The risk there is negligible—people are incentivized to report honestly and are not taken at random but verified using medical records. There are a host of other problems that might result in false positives (e.g., nocebo effects), but the risk of bogus respondents is incredibly low.
This is all handwaving. You just think that the survey must be accurate. You don’t provide any non-naive tools showing it is accurate and has non-existent Lizardman problems. And EHRs are well known to have a ton of data quality problems, and as for Long Covid self-reports (to specify what the topic was, which you left out), well, I don’t think I really need to say anything at this point… I doubt that the garbage in it is “incredibly low”.
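To make the measurement worry concrete, here is a minimal sketch of how a fixed share of noise respondents can bias a naive prevalence estimate. The 4% noise share and the 1.8% prevalence are the figures quoted in this thread; the 50% 'yes' rate among noise respondents and the function name `observed_rate` are purely illustrative assumptions, not anything taken from the study under discussion.

```python
def observed_rate(true_prevalence: float, noise_share: float, noise_yes_rate: float) -> float:
    """Naive 'yes' rate when a fraction of respondents answer independently of the truth.

    Assumption: noise respondents say 'yes' with probability noise_yes_rate,
    and everyone else answers accurately.
    """
    honest_yes = (1 - noise_share) * true_prevalence
    noise_yes = noise_share * noise_yes_rate
    return honest_yes + noise_yes


if __name__ == "__main__":
    NOISE_SHARE = 0.04      # the ~4% figure quoted in the thread
    NOISE_YES_RATE = 0.5    # hypothetical: noise respondents answer at random

    for true_p in (0.0, 0.018):
        naive = observed_rate(true_p, NOISE_SHARE, NOISE_YES_RATE)
        print(f"true prevalence {true_p:.3f} -> naive estimate {naive:.4f}")
```

Under these assumed numbers, a true prevalence of zero and a true prevalence of 1.8% both yield naive estimates of a few percent, so the noise term is a large component of the measured rate unless something in the design drives `noise_share` down.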
There are plenty of other cases of jargon, which I would classify more as an issue of over-pretentious speech and writing. These are more typical foibles and hardly unique to rationalists. To give but one minor example, using “Pons Asinorum” in place of “foundational challenge”. Using jargon and scientific language that serves to further clarity is fine, but should be avoided in cases where plain English is both clearer and more accessible.
‘pons asinorum’ is not reducible to a phrase like ‘foundational challenge’, and Yudkowsky’s use is both correct and clearer than your suggestion. ‘Foundational challenge’ could mean just about anything hard and important (and usually unsolved).
I’m pretraining-famous enough that Claude has been able to truesight me since Opus 4.5
Results like this should make you assume that they’ve been able to truesight you for a lot longer, given how totally the results are apparently determined by vagaries of post-training.
I’m not exactly sure if it would be proper for all frontpage posts
It’s good metadata to provide in general, and if no one cites a post, then what’s the harm? GS already has to index like hundreds of millions of artifacts, so it’s not a big deal to add in another 0.1m or so from LW1/2.
Claude was writing like that long before the Mythos model card flagged an interest in Mark Fisher. Could the causation go backwards here—Mythos likes Fisher because Fisher wrote in a way which happened to be like how Claudes would in the future (and LLM chatbot personalities generally love various kinds of ‘spook’ or ‘ghost’ topics, amplifying the attraction)? There are so many humans, you figure someone would have anticipated various Claude writing tics...
You should expect little or nothing on average. Generally, the effects from cash transfers in even the poorest countries are modest, and UBI experiments (as well as a long dismal history of welfare intervention experiments) in the USA have shown little net effect from much larger, long-term transfers. (More specifically, the lifetime cost of schizophrenia is in the hundreds of thousands to millions of dollars in terms of lost income / medical treatment cost / loss of QALYs, while the odds of being schizophrenic if you have two schizophrenic parents are something like a third, so with 3 kids you expect at least 1 additional case, so just the schizophrenia alone for this family amounts to several million dollars, potentially >$10m, and $0.05m is a drop in the bucket.) I’m surprised you do not seem to have looked into ‘does giving people money work’ and are asking us.
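The back-of-envelope arithmetic in the parenthetical can be sketched as follows. This is only a rough illustration: the per-case cost range is a hypothetical stand-in for "hundreds of thousands to millions of dollars", counting the two parents as existing cases is an assumption read off "additional case", and all function names are made up for the example.

```python
# Back-of-envelope sketch; every number below is a rough assumption, not data.
P_CHILD_CASE = 1 / 3             # "something like a third" with two affected parents
N_KIDS = 3
N_PARENT_CASES = 2               # assumed: both parents count as existing cases
COST_PER_CASE = (0.3e6, 3e6)     # hypothetical low/high lifetime cost per case, USD
TRANSFER = 0.05e6                # the $0.05m transfer under discussion


def expected_child_cases(n_kids: int, p: float) -> float:
    """Expected number of affected children (linearity of expectation)."""
    return n_kids * p


def prob_at_least_one(n_kids: int, p: float) -> float:
    """Chance that at least one child is affected, assuming independence."""
    return 1 - (1 - p) ** n_kids


def transfer_share(transfer: float, cases: float, cost_per_case: float) -> float:
    """Transfer as a fraction of the family's total expected cost."""
    return transfer / (cases * cost_per_case)


if __name__ == "__main__":
    kids = expected_child_cases(N_KIDS, P_CHILD_CASE)
    total_cases = N_PARENT_CASES + kids
    print(f"expected additional cases among children: {kids:.2f}")
    print(f"P(at least one affected child): {prob_at_least_one(N_KIDS, P_CHILD_CASE):.2f}")
    for cost in COST_PER_CASE:
        share = transfer_share(TRANSFER, total_cases, cost)
        print(f"at ${cost:,.0f}/case: transfer covers {share:.1%} of expected cost")
```

Changing the assumed per-case cost moves the share a lot, but under any of these inputs the transfer remains a small fraction of the expected total.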
(I am amused at the implication that either Lojban or ASL could be considered a “practical skill” a nerd should work on.)
I don’t see an independent human baseline anywhere in there, and at least in the flapping example, I don’t think I would be able to pass that one either (aside from the straightforwardly incorrect ‘unique’ claim). Is there any reason to expect any human or LLM to be able to read Andy’s mind and guess exactly the right angle he had in mind for that flashcard? That’s a parsimonious explanation of the inverse scaling...
Yes - ‘bombard the headquarters!’ But of course, not the guy who heads the headquarters, Mao Zedong; just all the guys underneath him. (There was a similar attempt at this by the second Trump Administration, but they were unable to pull it off at all.)
Just implement transclusion, like we do on Gwern.net. I do the equivalent of transcluding tweets all the time. (In fact, I would show you how I do transclude tweets on Gwern.net but it’s a bit tricky to dig up a ‘natural’ example, especially since Twitter long ago broke Nitter which was how we were getting clean snapshots to transclude, so they’re rarer than they should be.)
You provide some selector options to govern how much of a target URL to transclude: see the docs for https://github.com/gwern/gwern.net/blob/master/js/transclude.js Thus, a user can transclude the ‘annotation’ version of a URL, which would include the header like “Ben Pace | May 12, 2026 2:33PM …”, or it could just transclude the ‘body’, ‘I have also wanted...good reading experience.’, or anything inside it which has IDs etc.
So, then, it should be easy for you to give 3 examples of past handwritten text that triggered Pangram as 100% AI. Speaking just for myself, I haven’t seen ‘countless’ examples of this on Twitter. (I’ve seen countless examples of the opposite, Pangram failing to detect AI text as AI, but not the other way.)