Copying over some brief thoughts on how working on automated theorem proving relates to working on aligned AGI:
I think a pure-mathematical theorem prover is more likely to be beneficial and less likely to be catastrophic than STEM-AI / PASTA
I think it’s correspondingly going to be less useful
I’m optimistic that it could be used to upgrade formal software verification and cryptographic algorithm verification (a minimal sketch of what this looks like follows these bullets)
With this, I think you can tell a story about how the development of better formal theorem provers can help make information security a “defense wins” world, where information security and privacy are a globally strong default
There are some scenarios (e.g. ANI surveillance of AGI development) where this makes things worse, but I think in expectation it makes things better
There are some ways this could be developed where it ends up accelerating AGI research significantly (i.e. research done to further theorem proving ends up unlocking key breakthroughs toward AGI), but I think this is unlikely
One of the reasons I think this is unlikely is that current theorem-proving environments are much closer to “AlphaGo on steroids” than to “read and understand all mathematics papers ever written”
I think if we move towards the latter, then I’m less differentially optimistic about theorem proving as a direction for beneficial AI research (and it falls back to the general background level of AGI research more broadly)
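To make the verification bullet above concrete, here is a minimal Lean 4 sketch of what “implementation meets specification” means; the `clamp` function and `clamp_le` theorem are hypothetical names invented for illustration, not drawn from any real verified codebase:

```lean
-- A toy instance of formal software verification: an implementation,
-- a formal specification, and a machine-checked proof that the
-- implementation satisfies the specification.
-- (`clamp` and `clamp_le` are hypothetical, for illustration only.)

-- Implementation: clamp x into the interval [lo, hi].
def clamp (lo hi x : Nat) : Nat :=
  max lo (min hi x)

-- Specification and proof: assuming lo ≤ hi, the result never
-- exceeds the upper bound. Stronger automated provers would
-- discharge goals like this at much larger scale.
theorem clamp_le (lo hi x : Nat) (h : lo ≤ hi) :
    clamp lo hi x ≤ hi := by
  unfold clamp   -- goal: max lo (min hi x) ≤ hi
  omega          -- linear arithmetic over Nat, incl. min/max
```

The security story is that if provers like this scale up, properties like memory safety or protocol correctness could become machine-checked by default rather than audited by hand.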
In my understanding there’s a missing step between upgraded verification (of software, algorithms, designs) and a “defense wins” world: what the specifications for these proofs need to say isn’t a purely mathematical question. The missing step is figuring out what the specs should say, and better theorem proving isn’t going to help much with the hard parts of that.
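To illustrate that gap with a contrived Lean 4 sketch (both names hypothetical, for illustration): the theorem below is machine-checked and true, yet the spec it certifies misses the property anyone actually cares about.

```lean
-- A prover can certify "implementation meets spec", but it has no
-- opinion on whether the spec is the right one.
-- (`checkPassword` and `checkPassword_decides` are hypothetical.)

-- An insecure implementation: accepts every input.
def checkPassword (_input : String) : Bool := true

-- A spec that sounds reassuring but misses the point: "checkPassword
-- always returns a definite answer". It verifies, yet the function is
-- still insecure. Writing a spec that says "only correct passwords
-- are accepted" (and deciding that that is the right property) is the
-- hard, non-mathematical part.
theorem checkPassword_decides (s : String) :
    checkPassword s = true ∨ checkPassword s = false :=
  Or.inl rfl
```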
I think that’s right: upgraded verification by itself is insufficient for ‘defense wins’ worlds. I guess I’d thought that was apparent, but you’re right that it’s definitely worth saying explicitly.
A big wish of mine is that we end up doing more planning and thinking-through of how researchers working on AI today could contribute to ‘defense wins’ progress.
My other take here, which I didn’t say out loud, is that I don’t really know of other pathways where good theorem proving translates into better AI x-risk outcomes. I’d be eager to learn of any.