The baseline measure of proof burden is simply the ratio of lines of proof to lines of code. For production-grade software verification projects, this ratio is 10×--100×.
Models that are bad at verification will do worse, i.e., incur a higher ratio.
On ambitious projects (e.g., AlphaProof when it came out), verification might increase capabilities, leading to a verification burden < 1.
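
To make the baseline ratio concrete, here is a minimal sketch of the lines-of-proof over lines-of-code calculation. The function name `proof_burden` and the line counts are hypothetical, chosen only for illustration; they are not measurements from any real project.

```python
def proof_burden(proof_lines: int, code_lines: int) -> float:
    """Return the baseline proof burden: lines of proof per line of code."""
    if code_lines <= 0:
        raise ValueError("code_lines must be positive")
    if proof_lines < 0:
        raise ValueError("proof_lines must be non-negative")
    return proof_lines / code_lines


# Hypothetical sizes, made up for illustration only.
ratio = proof_burden(proof_lines=20_000, code_lines=1_000)
print(f"proof burden: {ratio:.0f}x")  # falls inside the 10x--100x range
```

Counting lines is a crude proxy, since it ignores how hard each proof line was to write, but it is cheap to compute and easy to compare across projects.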