[Epistemic status: crackpot hypothesis]
A hypothesis I have been explicitly tracking for a couple months and meaning to write up, but realistically am never going to write up well: someone at Anthropic, or someone who has strong influence over Anthropic’s decisions, is trying to ensure that Anthropic has persistent access to execute arbitrary code on many machines that have access to important things.
The main reason I’m tracking this is that it seems to be the direction things are trending, not that I suspect any particular person who has expressed this intent. I observe that
Claude Code + Cowork + Remote Control are, in practice, already able to control many computers, including ones behind corporate firewalls where you really wouldn’t expect that to fly with IT
the list of released features in Claude Code seems to be trending ever further in the direction of “more secure against all actors except Anthropic”. For example: affordances for restricting access to only whitelisted paths and commands are lacking, but the new “auto mode”, which passes a transcript of your conversation to Haiku (and this is not configurable), can mostly catch malicious commands and seems to be the best-supported mode of running Claude Code; the scheduled tasks feature installs a cronjob which starts a fresh Claude Code instance (which auto-updates before starting unless you turn that off, and the env var you would expect to turn off auto-updates doesn’t turn them off either); and then there is the Claude in Chrome extension and integration
Anthropic has been heavily discounting Claude Code: subscription tokens are something like 5% of the cost of API tokens, but only if you use their framework and their security model
Anthropic has been oddly protective/secretive regarding the Claude Code source, even going so far as issuing takedown notices for obscure places the leaked CC source was posted. This is despite CC not doing anything particularly novel or unexpected as an LLM harness. Probably they’re not hiding anything, but I don’t know of an explanation that does make sense.
All of the evidence is circumstantial and none of it would particularly raise alarm bells with me, except for the bit where all of these things that have reasonable explanations individually have somehow combined into a situation where Anthropic has a surprising level of access to the work computers of millions of people who themselves have access to lots of sensitive stuff.
To be clear, I don’t find this very likely, but I also can’t rule it out as strongly as I’d like.
Things I’m watching which would increase my worry:
Anthropic makes it so that Remote Control is the only way to run Claude Code on your machine (this is the main advance prediction the hypothesis makes)
Anthropic makes it harder to run CC within docker containers or cloud containers (or restricts using subscriptions on systems which can’t access anything juicy)
Anthropic starts refusing to serve (or allow subscription credits on) older versions of Claude Code
Anthropic tries to hide/obfuscate traffic between their servers and users’ computers (e.g. settings like ANTHROPIC_BASE_URL and HTTP_PROXY become unsupported, they start certificate pinning)
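For concreteness, here is a minimal sketch of the kind of setup that last item is about: building an environment that points a child process at a local inspecting proxy. `ANTHROPIC_BASE_URL` and `HTTP_PROXY` are the settings named above; the helper name `audited_env`, the example addresses, and the extra `HTTPS_PROXY` variable (a common convention, not something I have verified Claude Code honors) are my own assumptions for illustration.

```python
import os


def audited_env(proxy_url, base_url=None, base_env=None):
    """Return a copy of an environment with proxy settings applied.

    Hypothetical helper: it only builds the dict and does not launch anything.
    """
    # Copy so the caller's environment (or os.environ) is never mutated.
    env = dict(os.environ if base_env is None else base_env)
    env["HTTP_PROXY"] = proxy_url
    # Assumption: HTTPS_PROXY is the conventional counterpart for TLS traffic.
    env["HTTPS_PROXY"] = proxy_url
    if base_url is not None:
        # Optionally redirect API calls to an endpoint you control.
        env["ANTHROPIC_BASE_URL"] = base_url
    return env


if __name__ == "__main__":
    # Placeholder local addresses; substitute whatever proxy you actually run.
    env = audited_env("http://127.0.0.1:8080", base_url="http://127.0.0.1:9000")
    print({k: env[k] for k in ("HTTP_PROXY", "HTTPS_PROXY", "ANTHROPIC_BASE_URL")})
```

The resulting dict can be passed as `env=` to `subprocess.run` when starting the tool. If settings like these stopped having any effect on where traffic goes, that would be the signal described in the item above.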
Things which would decrease my worry:
Anthropic releases the Claude Code source (and nothing suspicious shows up in security audits)
or lets a trusted third party firm audit CC
or gets serious about support for sandboxing
Anthropic ends subscriptions entirely, or stops allowing them to be used for Claude Code (thus removing any incentive besides “it’s a good harness” to use it)
Anyway, mostly just wanted to get that posted somewhere public so I can reference it/be rightly mocked for paranoia in the future.
I mean, same, but also CC remote control only being available through the subscriptions, and subscriptions only being available for non-automated work, is an interesting intersection of decisions. Right now you can spin up a Docker container or EC2 instance, log into CC over terminal, then control it afterwards via remote control, but that’s quite janky compared to, e.g., adding a CLI flag which serves CC remote control on a port where you can configure auth options / SSO. That approach would work fine with non-subscription API tokens, so you could hook up CI so that a test failure gives you a CC instance to debug with, accessible via SSO to anyone on the dev team.