I’m working on a paper with David, and our acknowledgments section includes a thank-you to Claude for editing. Neither David nor I remember putting that acknowledgment there; in fact, we hadn’t intended to use Claude for editing the paper at all, nor did we notice it editing anything.
Were you by any chance writing in Cursor? I think they recently changed the UI such that it’s easier to end up in “agent mode” where it sometimes randomly does stuff.
Nope, we were in Overleaf.
… but also that’s useful info, thanks.
Only partially relevant, but it’s exciting to hear a new John/David paper is forthcoming!
Could someone explain the joke to me? If I take the above statement literally, some change made it into your document that nobody with access claims to have put there. You must have some sort of revision control, so you should at least know exactly who made that edit and when, which should already narrow it down a lot?
The joke is that Claude somehow got activated in the editor and added a line thanking itself for editing, despite us not wanting it to edit anything and (as far as we’ve noticed) it not editing anything besides that one line.
Is it a joke or did it actually happen?
I have no idea. It’s entirely plausible that one of us wrote the Claude bit in there months ago and then forgot about it.
Does Overleaf have such AI integration that can get “accidentally” activated, or are you using some other AI plugin?
Either way, this sounds concerning to me: we are so bad at AI boxing that it doesn’t even have to break out; we just “accidentally” hand it edit access to random documents. (And an AI safety research paper especially is not something I would want a misaligned AI editing without close oversight.)