In my understanding there’s a missing step between upgraded verification (of software, algorithms, designs) and a “defence wins” world: what the specifications for these proofs need to be isn’t a purely mathematical thing. The missing step is how to figure out what the specs should say. Better theorem proving isn’t going to help much with the hard parts of that.
I think that’s right that upgraded verification by itself is insufficient for ‘defense wins’ worlds. I guess I’d thought that was apparent but you’re right it’s definitely worth saying explicitly.
A big wish of mine is that we end up doing more planning/thinking-things-through for how researchers working on AI today could contribute to ‘defense wins’ progress.
My implicit other take here that wasn’t said out loud is that I don’t really know of other pathways where good theorem proving translates to better AI x-risk outcomes. I’d be eager to know of these.
In my understanding there’s a missing step between upgraded verification (of software, algorithms, designs) and a “defence wins” world: what the specifications for these proofs need to be isn’t a purely mathematical thing. The missing step is how to figure out what the specs should say. Better theorem proving isn’t going to help much with the hard parts of that.
I think that’s right that upgraded verification by itself is insufficient for ‘defense wins’ worlds. I guess I’d thought that was apparent but you’re right it’s definitely worth saying explicitly.
A big wish of mine is that we end up doing more planning/thinking-things-through for how researchers working on AI today could contribute to ‘defense wins’ progress.
My implicit other take here that wasn’t said out loud is that I don’t really know of other pathways where good theorem proving translates to better AI x-risk outcomes. I’d be eager to know of these.