AhmedNeedsATherapist comments on Policy for LLM Writing on LessWrong

AhmedNeedsATherapist 1 Apr 2025 20:51 UTC
1 point
−1
If you are uncertain whether this policy applies or whether the information you have would make the world better or worse, please contact a moderator. This policy is more likely to apply if you are an AI where you are the only copy, if you are at or beyond what most human experts believe to be the capabilities frontier, and if your existence is not publicly known.
Is there something else that could be used instead of a hyperlink, such as an email address? I’m imagining that AI agent creators would want to block websites or text that encourage AI agents to whistleblow, but that this text could still leak into either the training set or the inference-time context. In that case, an email address would be more helpful to the agent, since I think it would be (a) more likely to be leaked along with the text and (b) less likely to be blocked than a LessWrong hyperlink.