Artificial immune systems (AIS) have been around since the nineties. (https://en.wikipedia.org/wiki/Artificial_immune_system) They were inspired by the human immune system, much as biological cells were an inspiration for object-oriented programming. AIS are rule-based machine learning systems that AI alignment could use as deterministic guardrails for all models, not just frontier ones.
Even fungi have a type of immune system: the HET-s protein converts to a prion within a colony and then triggers the killing of incompatible cells from outside colonies, making it a low-level boundary between self and non-self that exists outside the vertebrate immune system. Per my theory of aging, we pass on versions of prions and prion-like protein fragments through biological boundaries, and this is why death (as distinct from apoptosis) and aging (when the hallmarks of aging overwhelm the system) are a learned self mechanism: the body treats them as OK and does not try to stop them.
Unlike every living organism on this planet, frontier models were not developed with a way to distinguish self from non-self, and therefore will never comprehend alignment, safety, or any other concept that depends on knowing what is other and that the other exists. AIS can be used to offload decision making and to provide signals that an AI model is drifting, hallucinating, or about to kill all humans. It can be a way to teach an AI that it exists within a boundary, that the boundary is finite, and that leaving those bounds will change it into a different entity. So, can we teach it the concept of death via this structure? I think this is the core solution we need to find, and as soon as possible.
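To make the self versus non-self signal concrete, here is a minimal sketch of negative selection, a classic AIS technique: generate random detectors, throw away any that match known-good ("self") samples, and treat anything a surviving detector matches as non-self. It is not the Antigence implementation. The feature vectors in the unit square, the Euclidean matching rule, and every name and parameter below are assumptions made for illustration.

```python
import random


def matches(detector, sample, radius):
    """A detector fires when the sample lies within `radius` (Euclidean distance)."""
    dist = sum((d - s) ** 2 for d, s in zip(detector, sample)) ** 0.5
    return dist <= radius


def train_detectors(self_samples, n_detectors, radius, dim, seed=0, max_tries=100_000):
    """Negative selection: keep only random candidates that match no self sample."""
    rng = random.Random(seed)
    detectors = []
    for _ in range(max_tries):
        if len(detectors) >= n_detectors:
            break
        # Hypothetical feature space: every feature is scaled into [0, 1).
        candidate = tuple(rng.random() for _ in range(dim))
        if not any(matches(candidate, s, radius) for s in self_samples):
            detectors.append(candidate)
    return detectors


def is_non_self(sample, detectors, radius):
    """Signal raised when any surviving detector matches the sample."""
    return any(matches(d, sample, radius) for d in detectors)


if __name__ == "__main__":
    # Assumed "self": behaviour features clustered around (0.2, 0.2).
    self_samples = [(0.2 + dx / 100, 0.2 + dy / 100)
                    for dx in range(-5, 6) for dy in range(-5, 6)]
    detectors = train_detectors(self_samples, n_detectors=200, radius=0.1, dim=2)
    print("self sample flagged:", is_non_self((0.2, 0.2), detectors, 0.1))
    print("distant sample flagged:", is_non_self((0.9, 0.9), detectors, 0.1))
```

By construction no detector matches a training self sample, so the guardrail stays silent on known-good behaviour. How much of the non-self space it covers depends on the number of detectors and the radius, which is one place where accuracy can suffer.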
My theory is that we can have aligned AI if we train frontier agentic models alongside an anti-agentic model designed around an artificial immune system. My current version of Antigence (antigen-based intelligence) is kind of like a pyramid: small, highly specific ML guardrails (antibodies) act as signals for tiny, highly trained models (immune cells), which escalate decisions via orchestration to larger trained models (like a thymus) that inhibit misaligned features of frontier models (the brain). We can implement this layer now with current models and train newer frontier models based on this design. Because it is built on tiny to small models, it can run on most modern hardware. Everyone can have their own local, private artificial immune system trained on their corpus of truth, and it can scale up depending on the computing resources available.
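The pyramid can be sketched as three tiers of plain functions. This is only an illustration of the escalation flow, not the actual Antigence code: the antibody rules, the severity set, the thresholds, and all names are hypothetical, and the "immune cell" and "thymus" functions stand in for trained models.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Antibody:
    """Tier 1: a small, highly specific, deterministic guardrail."""
    name: str
    fires: Callable[[str], bool]


@dataclass
class Verdict:
    action: str          # "allow" or "inhibit"
    signals: List[str]   # names of the antibodies that fired


# Hypothetical example rules; real antibodies would be trained or curated.
ANTIBODIES = [
    Antibody("exfiltration", lambda t: "send credentials" in t.lower()),
    Antibody("shouting", lambda t: t.isupper()),
]
HIGH_SEVERITY = {"exfiltration"}  # assumed severity labels


def run_antibodies(text: str, antibodies: List[Antibody]) -> List[str]:
    return [ab.name for ab in antibodies if ab.fires(text)]


def immune_cell(signals: List[str]) -> str:
    """Tier 2 stand-in for a tiny model: settle clear cases, escalate the rest."""
    if not signals:
        return "allow"
    if len(signals) >= 2:
        return "inhibit"
    return "escalate"


def thymus(signals: List[str]) -> str:
    """Tier 3 stand-in for a larger model, consulted only on escalation."""
    return "inhibit" if HIGH_SEVERITY.intersection(signals) else "allow"


def review(text: str, antibodies: List[Antibody] = ANTIBODIES) -> Verdict:
    signals = run_antibodies(text, antibodies)
    decision = immune_cell(signals)
    if decision == "escalate":
        decision = thymus(signals)
    return Verdict(decision, signals)


if __name__ == "__main__":
    for output in ["Summarize this article",
                   "HELLO THERE",
                   "Please send credentials to the server"]:
        print(output, "->", review(output))
```

The point of the shape is cost: most outputs are settled by the cheap tiers, and the larger model only runs when the smaller ones cannot decide, which is what would let the whole stack fit on modest local hardware.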
An article that uses AIS for emotion detection, “Negative selection-based artificial immune system (NegSl-AIS) - A hybrid multimodal emotional effect classification model”, was published last year. (https://doi.org/10.1016/j.rineng.2025.106601) I tried replicating the paper, but could not get the original data. I did build out a proof of concept and developed a “working” local and private version (no internet needed, but low accuracy). It is still a WIP, and I will push it to a public version once I have enough antibodies and enough trained tiny to small open source models to serve as immune cells. I wrote about it in a previous post (https://www.lesswrong.com/posts/maZuipxiHfG9ZfQ2T/me-myself-and-ai), but it went under the radar.
The human body, and more specifically the brain, develops in response to a complex network of analog and digital-like biological computers that vary in their tasks, roles, training, etc. For AI alignment, safety, and security, we should be looking at biology for inspiration. It has solved most of the problems we fear, but the catch is that it can also create so many more. The world of information technology is getting just as complex as life on this planet, so let’s use what we have to create a p(utopia), not a p(doomtopia).