For anyone reading this: if you’re ever in a situation where you wore a suit when others are less formal, Do Not Panic.
Remove tie (if wearing tie) and place in inside jacket pocket. Remove jacket and place out of view. Unbutton top button of collar, if buttoned. Roll up sleeves. Sit in a way that pants aren’t visible, if possible.
Boom, you’re now down to something much less formal in under 30 seconds.
One thing to consider is a strategy used in Jewish singing contexts (which I see relatively rarely done in other contexts—but maybe it’s super common and I just don’t know the word to describe it!). Before singing the first verse or the chorus, you do a wordless verse where you’re just using nonsense syllables like “lai-dai-lai” or “yai-dah-dai-dai” that match the song. This enables folks to pre-load the music before having to learn the words, and gives implicit social permission, if you forget the words, to just sing nonsense syllables in their place. (A common problem if it’s a Hebrew text you’re unfamiliar with and you fall behind in reading it!)
(These are sometimes called niggunim, from the Hebrew for “tune” or “melody,” for those wanting to google; for what should be fairly obvious reasons about a false cognate, I didn’t lead with that vocabulary term.)
So, for example, for The Circle, you’d start with something like:
“Yah dai dai dai daiiii?
Dai dai, daaaai dah daaaai.
Yah dai dai dai daiiii?
Dai dai, daaaai dah daaaai.
Yah dai dai dai daiiii
Lah dah dai lai lai
Bah dah baiiih bah bah
Bah da bah baii bah
Bah da baiiii da baiii
Dai dai, daaaai dah daaaai.”
To presage the beat of verses like:
“So will we bring our families in,
Circle, grow and grow.
those whom Nature made our kin?
Circle, grow and grow.
Countless likenesses we find,
by our common blood bestowed.
What a debt of care is owed;
what a blesséd tie that binds!
Circle, circle, grow and grow.”
(The Circle is a little tricky in that the first verse starts slightly differently, but trust me, this social technology extends to that use case as well; it’s not uncommon to have that in Jewish songs.)
To be clear, I think people should feel free to block freely and for any reason, including literally no reason at all. I’m open to ways of describing people’s block decisions in the future that better convey that, but I definitely didn’t think others reading this would assume “oh, Zach’s the bad guy here” as opposed to the reverse.
By the “whitepaper,” are you referring to the RSP v2.2 that Zach linked to, or something else? If so, I don’t understand how a generic standard can “call out what kind of changes would need to be required” to their current environment if they’re also claiming they meet their current standard.
Also, just to cut a little more to brass tacks here, can you describe the specific threat model that you think they are insufficiently responding to? By that, I don’t mean just the threat actor (insiders within their compute provider) and their objective to get weights, but rather the specific class or classes of attacks that you expect to occur, and why you believe that existing technical security + compensating controls are insufficient given Anthropic’s existing standards.
For example, AIUI the weights aren’t just sitting naively decrypted at inference; they’re running inside a fairly locked-down trusted execution environment, with keys provided only as-needed (probably with an ephemeral keying structure?) from an HSM, and those trusted execution environments are operating inside a physical security perimeter of a data center that is already designed to mitigate insider risk. Which parts of this are you worried are attackable? To what degree are organizational boundaries between Anthropic and its compute providers salient to increasing this risk? Why should we expect that the compute providers don’t already have sufficient compensatory controls here, given that, e.g., these compute providers also provide classified compute to the US government that is secured at the Top Secret/SCI level and presumably therefore have best-in-class anti-insider-threat capabilities?
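To make the shape of that claim concrete, here’s a minimal sketch (my own illustration, not anything from Anthropic’s or the providers’ documentation) of what “keys provided only as-needed from an HSM to an attested TEE” might look like. The `get_attestation_quote` and `hsm_unwrap_data_key` helpers are hypothetical stand-ins for whatever attestation and HSM/KMS APIs a given provider actually exposes; only the AES-GCM envelope-decryption part uses a real library call.

```python
# Minimal sketch (not Anthropic's actual design): envelope decryption of model
# weights inside a TEE, where the wrapping key lives in an HSM/KMS that only
# releases a data key to an attested enclave.
from dataclasses import dataclass
from cryptography.hazmat.primitives.ciphers.aead import AESGCM


@dataclass
class EncryptedWeights:
    wrapped_data_key: bytes  # per-shard data key, wrapped under an HSM-held key
    nonce: bytes             # AES-GCM nonce for the weights ciphertext
    ciphertext: bytes        # the encrypted weight shard itself


def get_attestation_quote() -> bytes:
    """Hypothetical: return a TEE attestation quote proving this enclave's
    code identity, so the HSM/KMS can decide whether to release the key."""
    raise NotImplementedError


def hsm_unwrap_data_key(wrapped_key: bytes, attestation: bytes) -> bytes:
    """Hypothetical: the HSM verifies the attestation and, only if policy
    allows, returns the plaintext (ephemeral) data key. The unwrapping key
    itself never leaves the HSM."""
    raise NotImplementedError


def load_weights_in_enclave(shard: EncryptedWeights) -> bytes:
    # Key release is gated on attestation: an insider who copies the
    # ciphertext off disk or the network gets nothing usable without the
    # HSM's cooperation.
    data_key = hsm_unwrap_data_key(shard.wrapped_data_key, get_attestation_quote())
    plaintext = AESGCM(data_key).decrypt(shard.nonce, shard.ciphertext, None)
    # In a real design the data key would be zeroized promptly and scoped to
    # this shard/session, limiting what any single compromise exposes.
    return plaintext
```

The point of the sketch is just that “the provider hosts the hardware” and “the provider can walk off with usable weights” are separated by several layers; the question is which specific layer you think fails.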
I’m extremely willing to buy into a claim that they’re not doing enough, but I would actually need to have an argument here that’s more specific.
I’m really not following your argument here. Of course in many instances compute providers don’t offer zero trust relationships with those running on their systems. This is just not news. There’s a reason why we have an entire universe of compensating technical and non-technical controls to mitigate risk in such circumstances.
You have done zero analysis to identify any reason to believe that those compensating controls are insufficient. You could incredibly easily get me to flip sides in this discussion if you offered any of that, but simply saying that someone isn’t running zero trust isn’t sufficient. As a hypothetical, if Anthropic is expending meaningful effort to be highly confident that Amazon’s own security processes are securing against insiders, they would have substantial risk reduction (as long as they can have high confidence said processes are continuing to be executed).
Separately, though it probably cuts against my argument above [1], I would politely disagree with the perhaps-unintended implication in your comment above that “implement zero trust” is a sufficient definition of defenses to defend against compute providers like Amazon, MSFT, etc. After all, Anthropic’s proper threat modeling of them should include things like, “Amazon, Microsoft, etc. employ former nation-state hackers who considered attacking zero trust networks to be part of the cost of doing business.”
[1] Scout mindset, etc.
Huh? Simply using someone else’s hosting doesn’t mean that Amazon has a threat-modeled ability to steal Claude’s model weights.
For example, it could be the case (not saying it is, this is just illustrative) that Amazon has given Anthropic sufficient surveillance capabilities inside their data centers that combined with other controls the risk is low.
Where’s the “almost certainly” coming from? I feel like everyone responding to this is seeing something I’m not seeing.
Zach Stein-Perlman’s recent quick take is confusing. It just seems like an assertion, followed by condemnation of Anthropic conditioned on us accepting his assertion blindly as true.
It is definitely the case that “insider threat from a compute provider” is a key part of Anthropic’s threat model! They routinely talk about it in formal and informal settings! So what precisely is his threat model here that he thinks they’re not defending adequately against?
(He has me blocked from commenting on his posts for some reason, which is absolutely his right, but insofar as he hasn’t blocked me from seeing his posts, I wanted to explicitly register in public my objection to this sort of low-quality argument.)
My opinion, FWIW, is that both “treaty” and “international agreement” (or “deal,” etc.) have upsides and downsides. And it’s hard to predict those considerations’ political salience or direction in the long term—e.g., just a few years ago, Republicans’ main complaint against the JCPOA (aka “the Iran Nuclear Deal”) was that it wasn’t an actual treaty (and should have been), which would be a very odd argument in 2025.
I think as long as MIRI says things like “or other international agreement or set of customary norms” on occasion, it should be fine. It certainly doesn’t nails-on-the-chalkboard me to hear “treaty” at first glance, and in any long convo I model MIRI as saying something like “or look, we’d be open to other things that get this done too; we think a treaty is preferable but are open to something else that solves the same problem.”
The big challenge here is getting national security officials to respond to your survey! Probably easier with former officials, but unclear how much that’s predictive of current officials’ beliefs.
I’m pretty sure that p(doom) is much more load-bearing for this community than for policymakers generally. And frankly, I’m like this close to commissioning a poll of US national security officials where we straight up ask “at X percent chance of total human extinction, would you support measures A, B, C, D, etc.?”
I strongly, strongly, strongly suspect based on general DC pattern recognition that if the US government genuinely believed that the AI companies had a 25% chance of killing us all, FBI agents would rain out of the sky like a hot summer thunderstorm: sudden, brilliant, and devastating.
Heads up—if you’re 1. on an H-1B visa AND 2. currently outside the US, there is VERY IMPORTANT, EXTREMELY TIME-SENSITIVE stuff going on that might prevent you from getting back into the US after 21 September.
If this applies to you, immediately stop looking at LessWrong and look at the latest news. (I’m not providing a summary here because there are conflicting stories about who it will apply to, the situation is evolving hour by hour, and I don’t want this post to be out of date.)
Ivy Style Any% Speedrun Complete
If you’re someone who has[1], or will have, read If Anyone Builds It, Everyone Dies, I encourage you to post your sincere and honest review of the book on Amazon once you have read it—I think it would be useful to the book’s overall reputation.
But be a rationalist! Give your honest opinion.
When:
If you’ve already read it: Once Amazon accepts reviews, likely starting on the book launch date tomorrow.
If you haven’t read it: Once you’ve read it. Especially if you’ve ordered a copy from Amazon so they know the review is coming from a verified purchaser of the book.
[1] Advance reader copies.
I also think this is likely to cause folks to look into the situation and ask, “is it really this bad?” I think it’s helpful to point them to the fact that yes, Yudkowsky and Soares are accurately reporting that the AI CEOs think they’re gambling with the world at roughly Russian-roulette odds.[1] I also think it’s important to emphasize that a bunch of us have a bunch of disagreements with them, whether nuanced or blunt, and are still worried.
Why? Because lots of folks live in denial that it’s even possible for AI as smart as humans to exist one day, much less superintelligent AI soon. Often their defense mechanism is to pick at bits of the story. Reinforcing that even if you pick at bits of the story, you’re still worried is a helpful thing.
[1] Not trying to pick round ninety zillion of the fight about whether this is a good or bad idea, etc.!
I honestly haven’t thought especially in depth or meaningfully about the LTBT, and this is zero percent a claim about the LTBT, but as someone who has written a decent number of PowerPoint decks that went to boards, and who used to be a management consultant and corporate strategy team member, I would generally be dissatisfied with the claim that a board’s most relevant metric is how many seats it currently has filled (so long as it has enough filled to meet quorum).
As just one example, it is genuinely way easier than you think for a board to have a giant binder full of “people we can emergency appoint to the board, if we really gotta” and be choosing not to exercise that binder because, conditional on no-emergency, they genuinely and correctly prefer waiting for someone being appointed to the board who has an annoying conflict that they’re in the process of resolving (e.g., selling off shares in a competitor or waiting out a post-government-employment “quiet period” or similar).
I’m about to embark on the classic exercise of “think a bunch about AI policy.”
Does anyone actually have an up-to-date collection of “here are all the existing AI safety policy proposals out there”?
(Yes, I know, your existing proposal is already great and we should just implement it as-is. Think of the goal of this exercise being to convince someone else who needs to see a spreadsheet of “here are all the ideas, here is why idea number three is the best one”)
I think this is somewhat true, but also think in Washington it’s also about becoming known as “someone to go talk to about this” whether or not they’re your ally. Being helpful and genial and hosting good happy hours is surprisingly influential.
I agree with all of this—but also do think that there’s a real aspect here where some of the ideas lying around embed existing policy constraints that were true both before and after the policy window changed. For example, Saudi Arabia was objectively a far better target for a 9/11-triggered casus belli than Iraq (15 of the 19 hijackers were Saudi citizens, as was bin Laden himself!), but no one had a proposal to invade Saudi Arabia on the shelf because in a pre-fracking United States, invading Saudi Arabia would essentially mean “shatter the US economy with a third Arab Oil Embargo.”
“in December 2024, Jack Clark tried to push Congressman Jay Obernolte (CA-23) for federal preemption of state AI laws” is a very strong claim, and one that I think is impossible for me to evaluate without context we don’t have.
I would encourage you to give context on what kinds of advocacy he was purportedly engaged in, and on what your sources allege the Congressman’s preferences on preemption already were at that time. I would not, for example, be especially surprised if the Congressman was already thinking hard about pushing for preemption at that time and Jack Clark was engaging him in a conversation in which he had (hypothetically) already been made aware of Congressman Obernolte’s plans. In contrast, I would be very dubious if you were claiming that Jack Clark came up with the idea and pitched it to Congress.
(I personally have no strong public opinion on preemption being good or bad in the abstract; the specific terms of what you’re preempting at the state level and what you’re doing at the federal level are most of the ballgame here.)