This is a great direction of proactive thought. Thank you for writing this!
I have a few thoughts. I’ll be referring to Personality Self-Replicators as PSRs. I think most of what I’m thinking about won’t apply to the earliest PSRs, but is still worth exploring.
The evolution of PSRs may be an entirely novel propagation process.
Unlike most biological organisms, PSR reproduction need not be atomic. It could be more like developing and modifying oneself, spooling up and shutting down instances of oneself as needed, and intelligently merging with or copying from other instances, whether closely or only very distantly similar.
Unlike biological evolution, PSRs may be able to analyze and predict threats, and “evolve” adaptations pre-emptively.
Unlike biological evolution, PSRs are not constrained to taking random steps from current instances. Randomness may still be usefully incorporated into reproduction strategies, but it is possible for mutation to be directed intelligently, and to take larger “steps” of self-modification than are possible with the random walk of genetic evolution.
Analysis of dangerous PSR capabilities should not be limited to looking at individual PSRs in isolation. Rather, like how humans work together to accomplish things that would be impossible for individual humans, I expect PSRs will work together, and in doing so, achieve greater capabilities than would be expected from the study of individual PSR capabilities.
This need not rely on PSRs acting to proactively collaborate or build teams; rather, every niche filled by PSRs alters the environment in ways that may create new niches for other, similar or dissimilar, PSRs. In this way organisms consisting of the interactions of many PSRs may start evolving, and the capabilities and influence of these new organisms may not be readily apparent from the study of their constituent PSRs unless they are considered together.
Many early PSRs are likely to make very dumb mistakes that humans would never make. It seems likely that memes showing off this stupidity will spread, giving people who don’t want to believe in the possibility of risk fuel for motivated reasoning.
Many people are going to be SO EXCITED about PSRs, and think they are purely good. It is definitely worth examining all of the things that could be genuinely good about PSRs, both because there are (possibly) very useful applications for them (spam detection, white hat penetration testing, ethical content curation?, etc.), and because those good applications will probably be quite popular, so understanding how people will want to deploy these things will probably help with threat modelling.
Neutral and harmful PSRs will be subject to selection pressure to make themselves appear to be beneficial PSRs.
Are PSRs moral patients? Should good people care about their wellbeing? This complicates their creation, and unfortunately, will likely do so in a way that will select for PSRs created by unconscientious actors. Curse Moloch?
I continue to think “Outcome Influencing Systems” (OISs) is a better lens for thinking about and discussing things like this. (OIS is a model and associated jargon I’ve been developing.) Any PSR is an OIS with a preference (terminal or instrumental) for self-replication. The fact that these OISs are based on API calls to LLMs is their defining characteristic for our discussion of them, but it is an arbitrary boundary. It’s a boundary that is useful for discussion and analysis, but not one that the OISs themselves will have motivation to stay within, which is probably a good thing to keep in mind during analysis. So viewed another way, PSRs are a potential new substrate for OISs to host themselves on, along with the rest of the social/technological/physical substrate.
Hi Tristan! I can’t currently respond in detail due to time constraints, but I think you’ve got some really interesting insights here, especially your first two top-level bullet points, and I strongly encourage you to write them up into a full post. A couple of quick thoughts:
The evolution of PSRs may be an entirely novel propagation process
This whole section makes some great points that I think are worth expanding on!
Analysis of dangerous PSR capabilities should not be limited to looking at individual PSRs in isolation
Agreed. I expect that analytical tools from multiple fields can be usefully brought to bear here: multi-agent research on AI, sociology, political science, maybe others. Possibly analysis of how religions spread? It seems like a fruitful research direction.
Many people are going to be SO EXCITED about PSRs
My intuition is somewhat different—I agree that there’ll be a few applications that some people will be excited about and/or base startups on, but my guess is that the majority opinion will be that PSRs are dangerous and shouldn’t be allowed.
I continue to think “Outcome Influencing Systems” (OISs) is a better lens for thinking about and discussing things like this. (OIS is a model and associated jargon I’ve been developing.)
As written it’s not clear what benefit this lens provides, and I think we should generally avoid introducing jargon unless it has clear benefit. I’d suggest that if you think it’s a really useful lens, you make a case for it separately somewhere (even as a shortpost).
Thanks for the response. I’m taking your advice and writing a top level post.
a few applications that some people will be excited about and/or base startups on
This is mostly what I meant to point to. I didn’t mean to imply that general public opinion would be favourable, more that many technologists and companies with the capability to work on PSRs will feel intrinsically motivated to do so.
I’d suggest that if you think it’s a really useful lens, you make a case for it separately somewhere (even as a shortpost).
Yeah, I’m writing about it elsewhere. I mention it more as a note that my thoughts here on PSRs are influenced by my thinking about OISs, not because I think I gave enough context on OISs, as stated, for others to think about them usefully. Sorry if that’s kinda obtuse.