I don’t expect Anthropic to stick to any of their policies once competitive pressure means they have to train, deploy, and release or be left behind. None of their commitments are of a kind they couldn’t walk back.
Anthropic accelerates capabilities more than safety; they don’t even support regulation, and many people internally are misled about Anthropic’s efforts. None of their safety work has meaningfully contributed to solving any of the problems you’d have to solve to have a chance of building something much smarter than you that doesn’t kill you.
I’d be mildly surprised if there’s a consensus at Anthropic that they can solve superalignment. The evidence they’re getting shows, according to them, that we live in an alignment-is-hard world.
If any of these arguments are Anthropic’s, I would love for them to say that out loud.