I think extensive use of LLMs should be flagged at the beginning of a post, but “uses an LLM in any part of its production process whatsoever” would probably result in the majority of posts being flagged and make the flag useless for filtering. For example, I routinely use LLMs to check my posts for errors (that the LLM can detect), and I imagine most other people do so as well (or should, if they don’t already).
Unfortunately this kind of self-flagging/reporting is ultimately not going to work as a way of protecting individuals or society against AI-powered manipulation, and I doubt there will be a technical solution (e.g. an AI content detector or some other kind of defense) either, short of solving metaphilosophy. I’m not sure it will do more good than harm even in the short run, because it can give a false sense of security and punish the honest / reward the dishonest, but I still lean towards trying to establish “extensive use of LLMs should be flagged at the beginning of a post” as a norm.
Yes, I mostly agree. Unless the providers themselves log all responses and expose some API to check for LLM generation, we’re probably out of luck here, and the incentives to defect are strong.
One thing I was thinking about (similar to what speedrunners do) is just making a self-recording or screen recording of actually writing out the content/post. This could probably be verified by an AI or a neutral third party. Something like a “proof of work” for writing your own content.
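To make that a bit more concrete, here is a minimal sketch of what the verifier side might look like, assuming the “proof” is a keystroke/edit log rather than a full screen recording; the log format, function name, and pacing threshold are all made up for illustration, not any real tool’s API:

```python
import json
from statistics import median

def verify_writing_log(post_text: str, log_path: str) -> bool:
    """Replay a keystroke/edit log and check that it reproduces the
    final post with human-ish typing pace.

    Assumed log format (invented for this sketch): a JSON list of
    events like {"t": 12.3, "pos": 40, "insert": "a"}, ordered by time.
    """
    with open(log_path) as f:
        events = json.load(f)
    if len(events) < 2:
        return False

    # Replay the log to reconstruct the text character by character.
    buf = []
    for e in events:
        buf.insert(e["pos"], e["insert"])
    if "".join(buf) != post_text:
        return False  # the log does not actually produce the published post

    # Crude pacing check: the median gap between events should not be
    # implausibly fast for a human typist (the threshold is a guess).
    gaps = [b["t"] - a["t"] for a, b in zip(events, events[1:])]
    return median(gaps) > 0.05
```

Of course this only checks that some process produced a plausible log; it says nothing about who or what was at the keyboard.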
Grammarly has https://www.grammarly.com/authorship if you want to prove that you wrote something.
If it became common to demand and check proofs of (human) work, there would be a strong incentive to use AI to generate such proofs, which doesn’t seem very hard to do.
I don’t expect the people on LW that I read to intentionally lie about stuff.