I think there’s a meaningful gap between OpenClaw and a self-replicating system that poses a serious threat.
If you agree with this premise, where do you think that gap lies? Here’s what I can come up with:
Agency: I have never seen an LLM go “I need to go buy a domain for myself” unless it was prompted to do so, either by a user or by a malicious system prompt. How might this come about in a traditional LLM, and if it did, would it be a sufficient condition?
Desire for Self-Replication: Post-training currently aligns LLM responses with those of a helpful chatbot. If someone did RLHF for self-preservation or long-horizon self-interest, would that be a sufficient condition?
Emotional Metacognition: Probing LLM activations shows they have emotional expressions, but this seems vastly different from human emotional experience, in which self-reflection on (metacognition of) the emotion is what creates agency.
This is mostly spitballing, and I don't think the three ideas are either mutually exclusive or exhaustive. Curious to hear what you think.