In February 2024, Sam Altman tweeted that ChatGPT generates 100B words per day, which is about 200 GB of text per day (though this has likely grown significantly). At scale, exfiltration over the output token channel becomes viable, and it’s the one channel you can’t just turn off. If model weights are 1,000 GB and egress limits are 800 GB/day, it would take just 1.25 days to exfiltrate the weights across any channel.
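A quick sanity check of the arithmetic above; all inputs are the comment's stated assumptions, not measurements (the ~2 bytes/word figure is a rough average I'm assuming to get from 100B words to ~200 GB):

```python
# Napkin math from the comment above; figures are assumptions, not measurements.
words_per_day = 100e9        # Altman's claimed 100B words/day
bytes_per_word = 2           # assumed rough average for English text
text_gb_per_day = words_per_day * bytes_per_word / 1e9  # ~200 GB/day

weights_gb = 1000            # assumed model-weight size (~1 TB)
egress_gb_per_day = 800      # assumed egress limit
days_to_exfiltrate = weights_gb / egress_gb_per_day     # 1.25 days
```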
I don’t quite follow this—why would the egress limits be at 800GB/day? Presumably the 200GB of text is spread across multiple data centers, each of which could have its own (lower) egress limit. (I assume you’re adding more traffic for non-text data—is that the other 600GB?) I imagine this could make a difference of an OOM or so, making egress limits quite a bit more appealing (e.g., slowing exfiltration to 10 days rather than 1 day) if true?
Apologies, this could have been better explained. The largest factor is expected growth in inference. Google went from processing ~500 trillion tokens per month in May to ~1 quadrillion tokens per month in July. A bit of napkin math:
It’s uncertain how many of these are output tokens. Based on OpenRouter’s numbers, a plausible estimate is 10%, or 100 trillion output tokens per month (back in July). At roughly 2 bytes per token, this is 200 TB per month, or ~7 TB per day. I’m not sure how many data centers they have—if there are 10 data centers, then it would be ~700 GB per day each.
It has been 5 months since then; at that doubling rate (~2x every 2 months), a naive extrapolation suggests this is now ~6x larger.
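The napkin math above can be sketched as follows; the output fraction, bytes-per-token, and data-center count are the comment's assumptions, not known figures:

```python
# Sketch of the napkin math above; all inputs are assumptions from the comment.
tokens_per_month = 1e15      # ~1 quadrillion tokens/month (Google, July)
output_fraction = 0.10       # guessed share of output tokens (OpenRouter-based)
bytes_per_token = 2          # assumed rough average for text

output_tokens = tokens_per_month * output_fraction        # 100 trillion/month
tb_per_month = output_tokens * bytes_per_token / 1e12     # ~200 TB/month
tb_per_day = tb_per_month / 30                            # ~6.7 TB/day

num_data_centers = 10        # hypothetical split
gb_per_day_per_dc = tb_per_day * 1000 / num_data_centers  # ~670 GB/day each

# Naive extrapolation: doubling every 2 months (May -> July),
# so 5 months later the volume is ~2**(5/2) ≈ 5.7x larger.
growth = 2 ** (5 / 2)
```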
Great work!