> Straining the analogy, the mole-hunters get stronger and faster each time they whack a mole (because the AI gets stronger). My claim is that it isn’t so implausible that this process could asymptote soon, even if the mole-mother (the latent generator) doesn’t get uncovered (until very late in the process, anyway).
This is highly disanalogous to the AI safety case, where playing whack-a-mole carries a very high risk of doom, so the hunt for the mole-mother is clearly important.
In the AI safety case, making the mistake of going after a baby mole instead of the mole-mother is a critical error.
In the AI capabilities case, you can hunt for baby moles and look for patterns and learn and discover the mole-mother that way.
A frontier-lab safety researcher myopically focusing on whacking baby moles is bad news for safety in a way that a frontier-lab capabilities researcher doing the same isn’t for capabilities.
> A frontier-lab safety researcher myopically focusing on whacking baby moles is bad news for safety in a way that a frontier-lab capabilities researcher doing the same isn’t for capabilities.
Thanks for clarifying.
> Straining the analogy, the mole-hunters get stronger and faster each time they whack a mole (because the AI gets stronger). My claim is that it isn’t so implausible that this process could asymptote soon, even if the mole-mother (the latent generator) doesn’t get uncovered (until very late in the process, anyway).
I do feel some pull in this direction, but that’s probably because the sheer disvalue of the consequences of this “asymptoting” warps my assessment of the plausibilities. When I try to disentangle these factors, I’m left with a vague “Rather unlikely to asymptote, but I’d surely rather not have anyone test this hypothesis.”