LessWrong team member / moderator. I’ve been a LessWrong organizer since 2011, with roughly equal focus on the cultural, practical and intellectual aspects of the community. My first project was creating the Secular Solstice and helping groups across the world run their own version of it. More recently I’ve been interested in improving my own epistemic standards and helping others to do so as well.
Raemon
Wow that is weird. We’ll look into it.
I’ve asked the other devs but would be somewhat surprised if we did this intentionally. What page were you on when you got it?
My somewhat-dissenting-mod-opinion:
I feel more worried than I think Habryka and Robert are about LLM-content corroding LW culture, and I think I maybe have a slightly different take on why "LLM-generated text is not testimony" matters. (For me, the most important thing isn't that people are making 'I' statements that are false if an LLM said them. More significant to me is the implicit vouching that each statement is interesting and is some kind of worthwhile piece of a broader conversation, whether it's your opinion or not)
If I were making the policy and new features, I’d be framing it like:
Look, I know AI is increasingly going to be a legitimate part of some people’s workflows. But, I think for >75% of LW users, it’s a mistake to use LLMs as a particularly significant part of your writing process, and I think the site should be somewhat pumping away from it.
I think there are very occasional new users for whom it's the right call to use LLMs, but, I'd think of it more like "the people I trust to write LLM content are people who either have written, or pretty obviously could write, multiple posts that get 100+ karma." And even then I am slightly worried about people falling down a trap that slightly fucks their reasoning long-term.
(I think LW users that are paying attention can handle writing and reading LLM-assisted content, but, humans who are not concentrating are not general intelligences. I've already seen some clearly LLM-slop posts that got upvoted to like 20 karma, and on my last shortform about 2026-era-LLM-slop, a user gave an example of an LLM-written comment they thought was decent, a few users pushed back, and then they updated: "oh, actually, yeah that was not as meaningful a comment as I thought")
I resonated with Justis’ Don’t Let LLMs Write for You, including the comments where some people pushed back and were like “but, I mean, clearly letting them write for you a bit is reasonable” and Justis responding “okay, but, like, the vast majority of people who think they are skilled enough to do this are making a mistake.” Yeah there is some nuance here, but, I think this makes more sense as a default recommendation.
Mechanistically, the main thing I'd change is to make the LLM-blocks look more like "blockquotes" than "slightly different paragraphs", with an implied cultural nudge of "LLM text is not main body text; it's a thing you can quote, not a thing that half-paying-attention-at-lunch people who are skimming should risk misinterpreting as part of the main text."
(caveat: I think processes like Neel's, which involve a lot of LLM editing but begin and end with a lot of human involvement, are fine, in particular if the end result isn't distinguishable).
There are downsides to this that habryka is more worried about, and I think I might change my mind later when LLM writing actually improves, but I don't think we're there yet and it doesn't make sense to pre-emptively pave the way for LLM-assisted-writing-world.
I'm not sure what I expect habryka/Robert to rule here, but I think it's at least notably different:
text that was written by an LLM and then edited or revised by a human
vs
text that was narrated by a human, transcribed and cleaned up by an LLM, then edited or revised by a human again
I think one answer is "does the resulting text score highly on Pangram or not?", and "does this smell like LLM?" also feeds into the decision. In the case of @Neel Nanda's linked posts, they all score a 0.0 on our LLM detector. (I haven't looked into them that hard). So I would guess it is fine to not put them in the LLM block.
Man I had previously done this when it was a standalone webapp, and somehow as an embedded widget I do feel a lot more trapped.
I don’t think the opening point here is very reasonable. (I think the latter points feel more reasonable)
Inspecting it though, its core assumptions are clearly wrong! Here are the first two:
1) Regulations demand external, independent audits. […]
2) Regulations demand actions following concerning evaluations. […] There is no such regulation.
My current understanding is that getting good regulations is a chicken-and-egg problem. It's difficult to pass a regulation if the technical implementation details necessary to execute it don't yet exist. You can try to create broad FDA-like authorities that can invent the technical details later, but, that requires more political will and is more at risk of turning into molochian molasses without really doing anything.
One point of the Evals Plan was to lay the groundwork to make it possible to make regulations later.
That said, I agree something feels pretty off, vibes-wise, about the Evals / AI Companies dynamic.
You might argue “predictably, the orgs require a culture that can interface with the labs, and this has spillover dynamics that make it less likely that real/useful regulation gets passed.” But, this post didn’t really connect those dots.
Was there any special scaffolding you used?
If the owner is the new user, I’m not sure what real alternatives there are to reviewing the user then.
The user who created the doc is treated as the primary author. I'm not sure offhand how our system treats coauthors who haven't yet been previously reviewed. My guess is that if there is an unreviewed coauthor, the post should ideally get flagged for after-the-fact review, but posting shouldn't be blocked on it.
If the creating-user wasn’t meaningfully the primary author but just happened to create the post for historical reasons, then I think you can just copy-paste the doc into a new post by either whoever was the primary author, or, if there wasn’t really a primary author, idk, pick whoever else makes marginally more sense to be the document-owner. (You can also message a mod to do this for you but you can self-serve immediately with a pretty simple copy-paste)
If the creating user was the primary author, I think it’s alas just correct for it to go through normal review.
This post says “Prologue”, I’m assuming this means you’re intending followup posts? I’m curious for like… the scope of what’s to follow. Like, do you currently have 1 post queued up or like 10 or you’re not sure yet?
Overall, LLMs seem pretty incoherent to me, and incapable of having “real,” “novel,” or “scientific” thoughts; I don’t feel like I can trust them with anything important.
These feel like very different unrelated statements to me (not sure if you meant to imply they are connected). I think you can do real chunks of novel/scientific thought while being too incoherent to see it all the way through.
I’m not sure how you’re defining “real”/”novel”/”scientific” thoughts. I’m pretty sure they can and do, the thing they don’t do is persistently and strategically follow through on them and string them together in a useful way.
Curated. I liked this for a few reasons:
First, it’s just sorta heartwarming to see people both working hard and creatively problem-solving.
It gave me a bit of a gears-level picture of what went into the logistics of navigating the pandemic.
On a somewhat meta-level, it introduced a new genre that maybe I wish I was exposed to more often, which is "concrete stories about how someone solved a difficult problem", to flesh out my creative toolkit for solving difficult problems.
You are at least saying words here that sound like you are aware this problem is hard. (The people I'm talking about, so far, tend to not even note that this sort of question has a problem to overcome. Of course, it's easy for AIs to learn to start adding some Epistemic Caveat sentences).
I’d have to dig into the details to have more of an actual opinion on your thing.
Yeah that framing seems plausible, would have to think more.
I will bet money there will turn out to be some consideration here that is novel to LessWrong and requires changing some kind of policy or feature somewhere, when we get to the point that AIs first look like they are generating high quality alignment content.
Like, at the very least, I think we will want to have some kind of conversation about "okay but is it actually generating high quality alignment content, or, is it sycophanting us? Are misaligned or slightly-misaligned AIs subtly manipulating us?" when we hit that point.
My current guess is, the people are sort of earnestly trying to onboard themselves, but they are missing layers and layers of judgment that make them not anywhere close to being able to be a good contributor. They are mostly showing up on LessWrong after having already done the work and asked "where should I post this?", with the AI saying "LessWrong", but, they don't really have interest in scrapping their work, reading the sequences, and starting over.
I think the actual content is… possibly real, in the sense that they built and ran code that does the vaguely mechinterp-buzzwordy things they said, but, the way they were evaluating the AI was by asking it simple English sentences after doing the mechinterp-buzzwordy things, and the questions they asked don't really mean anything.
I think we probably should be doing something more to try to capture these people more productively but the first step is “yep, your first round of work was basically fake, here is how to start over.”
I do think this is a good alternative to at least consider, although I think some of your details are off.
In particular, re:
And it could defuse the cheap thrill of saying “look, I’m doing cutting-edge research despite having no background in the field!” when that actually means “I asked Claude to do something I only halfway understand.” The thrill, I think, involves the conceit that it’s fundamentally “your” work; having to spell out Claude’s extensive role ruins the fun (in a good way).
Most of these people are pretty excited to share that it was coauthored by Claude.
I'm not actually sure about all our exact policies (@habryka may have clearer takes), but, I think if the AI wasn't responsible for choosing any of the phrasing, just cleaning up grammatical stuff, then it doesn't need to be in the AI block.
(I expect the default rule to approximately be "does our AI detector detect any AI in a given paragraph? If so, the post is probably flagged/delisted barring special circumstances.")
How about use of AI for anonymization purposes such as the recent Possessed Machines essay?
Yes, this would be a straightforward case for putting most paragraph blocks in the AI tag.
If someone is able to use AI to generate a large number of high-quality alignment posts that don’t read like AI slop, I would call that mission fucking accomplished.
If we get to this point, probably we re-evaluate the policy, and/or have a conversation about how to relate to AI content as a community. But, the problem is we need to distinguish "AI is generating high-quality alignment posts" from "AI is generating what looks like high-quality alignment posts", and we're almost certainly going to spend at least one generation of frontier models where it's only doing the latter.
We may generally want to raise our quality bar. But, fwiw I don’t know that we’ve actually lowered the bar for student tier content. Or, idk maybe we do, but, I don’t think MATS scholars are particularly below our bar (depends on the scholar/project). Just because they’re not contributing frontier conceptual progress doesn’t mean they’re not, like, exploring an interesting corner of the world and writing up some useful stuff about it.
I mention MATS scholars because their work is structurally similar to the current generation of slop (i.e. it sorta looks like the slop is imitating entry-level mechinterp work in particular).
Seems like a false binary. What if I speak English as a second language and only use AI to clean up my prose before publishing? What if the ideas came from an AI chat but I did all the editing myself?
If you wrote it yourself, it doesn't go in the AI tag. If you used the AI to translate, it does. (The point of the tag is to filter for things people have written themselves, which are a kind of testimony that AI writing is not).
If you used a service to do an actual strict translation of something you wrote yourself in another language, I'm not 100% sure which call we'll make once that AI tag exists. But, that's the instruction we currently give people who submit AI-slop-feeling things and say they wrote in a second language. (i.e. compare elsethread where Dagon uses an AI to summarize his point, and it loses key nuance, to what it might have said if he'd written it in Spanish and asked for a strict translation)
(I went and tested having an LLM translate Dagon's comment from English to Spanish, and then [in another chat] to English again. It changed the ordering of a couple words but was basically the same)
Curated. I had been vaguely worried about OpenClaw proliferation going off the rails somehow. But, this spells out a lot of specific gears that I hadn't previously been tracking, and gives me some tools to model the overall dynamics. It updated me toward thinking this sort of thing will probably be happening on the sooner side. (Maybe this is good, because it may create smaller-scale warning shots?)
Congratulations on finding a new specific reason for me to be alarmed about AI.