causalitylimited
The immune system analogy is apt, but I’m also thinking about the mechanics. The defender can’t access agent files, so it has to work from behavioral signals alone. That extends the analogy nicely: the immune system doesn’t inspect DNA, it just looks for surface markers.
So behavioral anomaly detection can only be done by 1) infrastructure providers, because they have the telemetry (for example, a replicator might call multiple LLMs every n seconds), and 2) LLM API providers, because they can see call patterns and detect markers of no-human-in-the-loop operation.
Still, the replication signature will largely consist of things like provisioning new servers, using throwaway emails, outbound file copies, and automated account creation, all of which will also show up in deployment activity by programs that are not self-replicating. Bot farms for Amazon reviews, for example, are not a good or even legal use, but they are different from self-replication. So this is not an easy problem.
An even harder problem will come when replicators learn or optimize to blend in with legitimate activity. It looks like an arms race, especially if they have some human support, given how varied human motivations are.
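To make the false-positive point concrete, here is a minimal sketch of what a naive signal-based detector might look like. Everything in it is an assumption for illustration: the signal names, the weights, the threshold, and the example activity counts are all made up, and a real provider would have far richer telemetry.

```python
from dataclasses import dataclass

# Hypothetical weights per behavioral signal (illustrative, not measured).
WEIGHTS = {
    "servers_provisioned": 2.0,
    "throwaway_emails": 1.5,
    "outbound_file_copies": 1.0,
    "automated_signups": 1.5,
}


@dataclass
class ActivityWindow:
    """Counts of observed events for one account in one time window."""
    label: str
    counts: dict


def replication_score(window):
    # Weighted sum of surface markers; no access to the agent's files.
    return sum(WEIGHTS[name] * window.counts.get(name, 0) for name in WEIGHTS)


def flag(windows, threshold):
    # Return the labels of every window at or above the threshold.
    return [w.label for w in windows if replication_score(w) >= threshold]


# Made-up example data.
windows = [
    ActivityWindow("replicator", {
        "servers_provisioned": 3, "throwaway_emails": 2,
        "outbound_file_copies": 4, "automated_signups": 2,
    }),
    ActivityWindow("review_bot_farm", {
        "servers_provisioned": 2, "throwaway_emails": 6,
        "automated_signups": 6,
    }),
    ActivityWindow("ordinary_web_app", {
        "servers_provisioned": 1, "outbound_file_copies": 1,
    }),
]

flagged = flag(windows, threshold=10.0)
print(flagged)
```

With these invented numbers the bot farm scores higher than the replicator, so any threshold that catches the replicator also catches the bot farm. Surface markers alone don't separate the two.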
I am here because someone said this is where I belong.
I’ve wanted to write and never had the time. Recently, I made time. Of all the writing projects I’ve started over the years, I decided to pick up the philosophical essays, because most of the ideas were fresh and, more importantly, I knew I could actually deliver a few before this window of time closes.
Before starting to formally post on the internet, I was sending thoughts to friends and family and getting no responses: no pushback, no agreement. I am aware this kind of thinking tends to produce long texts, because one is trying to be exact when presenting an argument. In any case, I told a friend about it, and he said, “It’s because you belong on LessWrong and you are living on X.” (I wasn’t living on X, just lurking, and I had read a couple of essays here without noticing the source. But that’s beside the point.)
I am hoping he was right.