Myron Hedderson comments on Thane Ruthenis’s Shortform

Myron Hedderson 14 Apr 2026 19:14 UTC
3 points
−1
I have not updated much based on “doing project Glasswing makes me update in favour of Anthropic being a better company than I expected.”

However, I’ve updated a bit towards the people within Anthropic all or nearly all being better than I expected, based on the fact that they have had a powerful hacking machine for a while and no catastrophes have occurred yet. “You can hack into anything” or “you could exfiltrate and sell this for billions” are powerful temptations for an amoral power-seeking individual who has been biding their time. If there are any such individuals with access to Mythos, they have chosen not to make a move (at least as far as we can tell at this time), whereas I would expect an intelligent amoral person who planned to eventually abuse their position for personal gain to maybe decide that this is the time to chance it, before a bunch of security holes get fixed. And an average but amoral person with no such plans could still see these results and have fantasies of power which cause them to try abusing Mythos’ abilities.

So, I’ve updated in favour of the quality of people at Anthropic, but not because of Glasswing specifically. More generally, they had access to a very powerful thing which it would be tempting to abuse, and no abuse has yet surfaced. It’s like a kid having access to a marshmallow, and choosing not to eat the marshmallow. A wise and thoughtful person might think “marshmallows are bad for us, let’s not eat it” or “there are greater returns to waiting”, but if you give 10 kids 10 marshmallows, someone is going to eat one immediately, unless you’re dealing with an exceptional group of kids.

But yeah, I don’t see an obvious way for Anthropic as a company to benefit more from exploiting a bunch of security vulnerabilities than it will benefit from maintaining public goodwill towards the organization, and it does seem that whatever route people have in mind feels pretty obvious to them, rather than being devious and convoluted. “Blackmail a bunch of powerful people with information they would prefer not become public” is a strategy an individual could try, but not something that would work as corporate strategy for an organization looking to make billions in revenue. Likewise, “hacking --> ransomware/extortion” or “hacking --> blackmail key people --> gain political power” are strategies that make sense for individuals or small groups, but not for Anthropic.

Well… actually, blackmailing key people to achieve political outcomes is a thing Anthropic could do, and I wouldn’t yet have evidence if it had happened. But I don’t expect they would do that, as a company, because I’d think the ratio at Anthropic of people with a conscience who have qualms about blackmail to sociopaths who do not is such that any sociopaths have to hide most of the time.