in part because it might throw off other information they don’t necessarily want to.
What other information does it release? All I can see is that it tells you how many agent trajectories they’re running. They could obfuscate that by also generating a bunch of hashes of random strings and publishing those too.
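A minimal sketch of the decoy idea (my illustration, not a vetted design): pad the published batch to a fixed size with hashes of random strings, so observers learn only the padded count, not the true trajectory count.

```python
import hashlib
import secrets

def publish_batch(trajectories: list[bytes], total: int) -> list[str]:
    """Return `total` hex digests: one per real trajectory, padded out
    with hashes of random strings so only the fixed batch size leaks."""
    assert len(trajectories) <= total
    real = [hashlib.sha256(t).hexdigest() for t in trajectories]
    decoys = [hashlib.sha256(secrets.token_bytes(32)).hexdigest()
              for _ in range(total - len(trajectories))]
    batch = real + decoys
    secrets.SystemRandom().shuffle(batch)  # don't reveal which entries are real
    return batch

batch = publish_batch([b"traj-1", b"traj-2"], total=10)
```

A real hash per trajectory is still recoverable later (rehash the trajectory and look it up in the batch), while the decoys are indistinguishable from it to outsiders.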
You don’t even have to publish one hash per trajectory. You can just publish a commitment every 50ms. ChatGPT thinks it is doable with zero-knowledge proofs. (transcript)
This is a good one-paragraph description:
We keep a private append-only log of system events. Every 50 milliseconds, regardless of activity, we publish a short opaque signed anchor that commits to the current hidden log state without revealing its size or whether new events occurred. Later, if an incident needs to be disclosed, we can reveal a selected entry or short segment along with a zero-knowledge proof showing that it really was part of the previously anchored log and that the log was extended only by appending, not editing or deletion.
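The paragraph above can be sketched in code. This is my own toy construction, not the ZK scheme described: a salted hash chain kept private, with a nonce-randomized signed anchor published every tick so consecutive anchors look identical whether or not events occurred. Disclosure here just opens the chain links directly, which leaks more than a real zero-knowledge membership/append-only proof would; the class name, methods, and HMAC-based "signature" are all stand-ins.

```python
import hashlib
import hmac
import secrets

class AnchoredLog:
    """Toy stand-in for the anchored append-only log described above."""

    def __init__(self, signing_key: bytes):
        self.key = signing_key
        self.state = b"\x00" * 32   # hidden head of the hash chain
        self.entries = []           # kept private: (event, salt, prev_state)

    def append(self, event: bytes) -> None:
        salt = secrets.token_bytes(16)
        self.entries.append((event, salt, self.state))
        self.state = hashlib.sha256(self.state + salt + event).digest()

    def anchor(self) -> tuple[bytes, bytes]:
        """Called every 50ms regardless of activity. A fresh nonce makes
        consecutive anchors unlinkable even when nothing was appended."""
        nonce = secrets.token_bytes(16)
        commitment = hashlib.sha256(self.state + nonce).digest()
        sig = hmac.new(self.key, commitment, hashlib.sha256).digest()
        return commitment, sig

    def open_entry(self, i: int) -> tuple[bytes, bytes, bytes]:
        """Disclosure: reveal entry i plus what's needed to recompute the
        next chain state (a stand-in for a ZK proof of inclusion)."""
        return self.entries[i]

def verify_link(event: bytes, salt: bytes, prev_state: bytes) -> bytes:
    """Recompute the chain state that should follow the opened entry."""
    return hashlib.sha256(prev_state + salt + event).digest()
```

A verifier who receives an opened entry can chain `verify_link` forward through subsequent openings until the result matches a state committed by a previously published anchor; the ZK version would prove that relation without revealing the neighboring entries.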
Nice nice, yeah I’m sure people could design something clever here & I’d be excited to learn if people are working on it (not a good fit for me directly)
Yeah that’s fair that you could probably obscure the signal—maybe this is just FUD from me. But two things I was imagining:
Absolute volumes and trends over time (how much AI labor are they using)
Relative volumes at certain times/days (how much it dips when the human employees aren’t online tells you something about how hands-on a role they need to be playing)
Also just a general concern of ‘you might be leaking information that you didn’t anticipate, and so it’s more secure to say less’
Also, you’d be creating a pretty valuable target if certain hash functions do eventually fall (think how much the companies want to avoid distillation today), but maybe that’s such a bad scenario that it’s not worth accounting for?