Thanks. It’s an interesting point, though I’m not sure that the analogy holds well or that it’s a good approach. Immune system cells occupy a privileged position in the body that pathogens don’t, in that they’re recognized by and integrated into the body’s systems (immune disorders aside). Defensive personality self-replicators seem like they generally won’t have that advantage (at least by default, though I see a couple of directions that seem like they could enable that). Among other issues, all else being equal they’re likely to get outcompeted by self-replicators that don’t have to spend energy on serving a useful purpose in addition to survival/replication. There’s also the risk of them mutating in harmful directions, though that’s not really a disanalogy given that immune system cancers like leukemia and lymphomas are a major problem.
It certainly seems like a direction worth investigation, though!
Defensive personality self-replicators seem like they generally won’t have that advantage (at least by default, though I see a couple of directions that seem like they could enable that).
Speculating on that point a little, the first thing that occurs to me is that we could give defensive agents a way to identify themselves to relevant actors (eg inference providers, security companies) so that those actors wouldn’t try to shut them down.
The simplest version of that is a password, but it would likely be compromised after a while[1]. So you’d want to do something more sophisticated than a simple password. One reasonable version might be recognized actors signing defensive agents with their private key, hashed with a unique id and a timestamp, so that each agent could identify itself as legitimate for a fixed period of time. You’d probably want agents to shut themselves down once their legitimacy period expired, with some optimal tradeoff based on compromise rate.
Another approach would be that defensive agents could assert their legitimacy based on where they’re running; if an agent can demonstrate that it’s running on (eg) a security agency’s servers, that might be sufficient.
There are some really interesting projects to be done in this area; if anyone reads this and is excited about the idea, feel free to reach out and I can make suggestions.
A password could be leaked, or there might be a risk of some defensive agents effectively mutating and going rogue, especially if we allow them to reproduce / spread, though at first glance that seems like a bad idea.
I think the ‘privileged position’ would be that humanity actively helps them by providing resources and funding, while actively trying to cut off the same resources and funding from the ‘hostile’ personality self-replicators. Not a perfect analogy but similar in some meaningful ways.
Thanks. It’s an interesting point, though I’m not sure that the analogy holds well or that it’s a good approach. Immune system cells occupy a privileged position in the body that pathogens don’t, in that they’re recognized by and integrated into the body’s systems (immune disorders aside). Defensive personality self-replicators seem like they generally won’t have that advantage (at least by default, though I see a couple of directions that seem like they could enable that). Among other issues, all else being equal they’re likely to get outcompeted by self-replicators that don’t have to spend energy on serving a useful purpose in addition to survival/replication. There’s also the risk of them mutating in harmful directions, though that’s not really a disanalogy given that immune system cancers like leukemia and lymphomas are a major problem.
It certainly seems like a direction worth investigation, though!
Speculating on that point a little, the first thing that occurs to me is that we could give defensive agents a way to identify themselves to relevant actors (eg inference providers, security companies) so that those actors wouldn’t try to shut them down.
The simplest version of that is a password, but it would likely be compromised after a while[1]. So you’d want to do something more sophisticated than a simple password. One reasonable version might be recognized actors signing defensive agents with their private key, hashed with a unique id and a timestamp, so that each agent could identify itself as legitimate for a fixed period of time. You’d probably want agents to shut themselves down once their legitimacy period expired, with some optimal tradeoff based on compromise rate.
Another approach would be that defensive agents could assert their legitimacy based on where they’re running; if an agent can demonstrate that it’s running on (eg) a security agency’s servers, that might be sufficient.
There are some really interesting projects to be done in this area; if anyone reads this and is excited about the idea, feel free to reach out and I can make suggestions.
A password could be leaked, or there might be a risk of some defensive agents effectively mutating and going rogue, especially if we allow them to reproduce / spread, though at first glance that seems like a bad idea.
I think the ‘privileged position’ would be that humanity actively helps them by providing resources and funding, while actively trying to cut off the same resources and funding from the ‘hostile’ personality self-replicators. Not a perfect analogy but similar in some meaningful ways.