I think if I was going solely based on your comments, I would not believe that creation of a paper that could go in a 1st quartile social science journal is possible with current technology.
Doesn’t seem at all ruled out by what I said (though even then I do not think you could reliably do that without an expert human in the loop).
I expect your current course of action will effectively take that option off of the table in practice.
I don’t really understand how you came to this conclusion. There is no prohibition on using LLMs to do research that ends up published on LessWrong. I am sure that a large percentage of the work necessary for Ryan Greenblatt’s latest piece of empirical research published on LessWrong was done by LLMs. This is totally fine. I expect this ratio to become more skewed over time.
But instead I would suggest you simply discourage posts if you can tell it’s AI, rather than using the honor system.
I am confused by what you think we’re currently doing and what we’ll be doing in the future. We do already have both automated systems and human review for detecting whether posts contain non-trivial amounts of LLM-generated content in them. We are not going to stop using those systems the minute we introduce content blocks where LLM-generated content is permissible. We are moving to a more permissive regime with respect to LLM-generated content than the one we’re currently in.
Doesn’t seem at all ruled out by what I said (though even then I do not think you could reliably do that without an expert human in the loop).
If I do this with a human in the loop, it will still count as LLM-generated and you will require it to be tagged as such, correct?
I am confused by what you think we’re currently doing and what we’ll be doing in the future.
I think you are currently prohibiting LLM writing and you will soon require it to be tagged as such, which will still de facto stigmatize experimenting with automated alignment work and nudge the leading edge of “LLMs-for-research” elsewhere. You’re forcing people to jump through two hoops: (a) produce good automated alignment research, (b) convince people it’s worth a read even though it’s AI. I’m saying (a) should be enough. The skills to accomplish (a) and (b) may be very different btw. And the best people at (a) are not necessarily the people who are already ingroup such as Ryan Greenblatt.
If I do this with a human in the loop, it will still count as LLM-generated and you will require it to be tagged as such, correct?
Yes.
I think you are currently prohibiting LLM writing and you will soon require it to be tagged as such, which will still de facto stigmatize experimenting with automated alignment work and nudge the leading edge of “LLMs-for-research” elsewhere. You’re forcing people to jump through two hoops: (a) produce good automated alignment research, (b) convince people it’s worth a read even though it’s AI. I’m saying (a) should be enough. The skills to accomplish (a) and (b) may be very different btw. And the best people at (a) are not necessarily the people who are already ingroup such as Ryan Greenblatt.
Once more I implore you to look at the list of rejected posts on our moderation page and tell me that you think the signal to noise ratio would be improved by allowing unmarked LLM content on LessWrong.
I do understand your concern. But I think you are ignoring the enormous costs of adopting your policy now, while current LLMs are not able to produce automated alignment research[1]. And if we enter a regime where LLMs are able to get useful[2] alignment research done in a basically automated way, then frankly I think we will have entered a completely different regime where we will need to be rethinking quite a lot of how we relate to the world. (Also, >80% it comes from labs first, so frankly I am not that worried about the second hoop.)
[1] At all, I think, but certainly not at sufficiently low cost that it’s dominated by the marginal cost of having a human expert verify their result and either do the write-up themselves or lean on their reputation to get the necessary eyeballs.
OK so from my perspective, this favors my point then? You seem to agree that guiding Claude Code to produce a top quality social science paper is fairly possible. You haven’t given any particular reason to believe that social science work is fundamentally different-in-kind from AI safety work—indeed I expect there is a fair amount of social science which could be relevant to AI safety! We both agree that many naive LLM posts are crap. So why would I invest weeks or months in prompting and guiding Claude Code for alignment research, if my post will get placed in the same bin as the “naive LLM crap”? Can I pay some sort of karma fee to be placed in a different bin? Can users with at least 100 karma have some other sort of “trustworthy LLM use” credit account which gets “overdrafted” if I am found to repeatedly produce LLM crap?
I do understand your concern.
Thanks for saying this. Sorry if I’m repeating myself too much.
if we enter a regime where LLMs are able to get useful[2] alignment research done in a basically automated way, then frankly I think we will have entered a completely different regime where we will need to be rethinking quite a lot of how we relate to the world.
A lot of credible people are claiming that diffusion will be a rate-limiting step on LLM adoption. “The future is already here – it’s just not very evenly distributed.” In my view you are thinking too much in terms of binary “regimes” and too little in terms of seizing opportunities when they arise.
>80% it comes from labs first, so frankly I am not that worried about the second hoop.
OK but the stuff I’m seeing online about automated production of academic papers is coming from academics, not labs. The value of automated alignment research seems high enough that we should encourage random academics to contribute to it, if they believe they have a contribution to make?
Once more I implore you to look at the list of rejected posts on our moderation page and tell me that you think the signal to noise ratio would be improved by allowing unmarked LLM content on LessWrong.
Are we talking about this page? Based on a quick ctrl-f for “LLM Writing”, the current policy has been invoked manually around 12 times in the past 6 months, with the vast majority of invocations being automated. It looks like there were 507 accepted posts in February alone, based on https://www.lesswrong.com/allPosts. So currently, under 1% of posts are manual LLM-rejections? From my POV, up to 10% of LLM posts would plausibly be worth it for VoI purposes.
The automated LLM detection is likely a valuable signal, but it’s quite compatible with my advocated policy of filtering based on content rather than filtering based on the honor system (as are manual rejections!)
Anyways my position is not simply “allow unmarked LLM content on LW and let it rip”, I’ve already elaborated a number of alternatives in this thread. I don’t want to become a broken clock so I’ll just encourage you once more to brainstorm and evaluate alternative approaches here. It seems you will have to solve this problem “for real” eventually regardless of what you do. If you’re going to deploy the planned change, I encourage you to see it as a stopgap and start thinking about what’s next right away. Best of luck.
[2] By my standards.
N hfre fbcuvfgvpngrq rabhtu gb cebqhpr n gbc dhnyvgl cncre jvgu na YYZ pna yvxryl nyfb qrsrng YYZ qrgrpgvba. uggcf://kxpq.pbz/125/ Qba’g rapbhentr crbcyr gb qrsrng YYZ qrgrpgvba, whfg znxr vg fb gung gur yrggre bs gur cbyvpl fnlf vg vf YYZ qrgrpgvba juvpu znggref (abg YYZ hfntr cre fr), fb vs na ubarfg crefba svaqf n jnl gb nhgbzngr uhzna-vaqvfgvathvfunoyr nyvtazrag erfrnepu, gurl pna or erjneqrq jvgu n ybg bs xnezn sbe qbvat fb. (Va bgure jbeqf: qba’g arprffnevyl nfx hfref gb hfr gur ubabe flfgrz va ynoryvat YYZ-bevtvangrq pbagrag; fvzcyl hfr lbhe rkvfgvat nhgbzngrq qrgrpgbe gb ynory cbfgf sbe ernqref, cyhf ghar vg bire gvzr gb cevbevgvmr “penc qrgrpgvba” bire “YYZ qrgrpgvba” cre fr.) (V’z znatyvat guvf cnentencu jvgu ebg13 gb znxr vg uneqre sbe qhzore YYZf gb svaq, gb uryc “novyvgl gb qrsrng YYZ qrgrpgvba” fgnl hfrshy nf n dhnyvgl fvtany.)