This was addressed in the post: “To fully flesh out this proposal, you would need concrete operationalizations of the conditions for triggering the pause (in particular the meaning of “agentic”) as well as the details of what would happen if it were triggered. The question of how to determine if an AI is an agent has already been discussed at length at LessWrong. Mostly, I don’t think these discussions have been very helpful; I think agency is probably a “you know it when you see it” kind of phenomenon. Additionally, even if we do need a more formal operationalization of agency for this proposal to work, I suspect that we will only be able to develop such an operationalization via more empirical research. The main particular thing I mean to actively exclude by stipulating that the system must be agentic is an LLM or similar system arguing for itself to be improved in response to a prompt.”