> Straining the analogy, the mole-hunters get stronger and faster each time they whack a mole (because the AI gets stronger). My claim is that it isn’t so implausible that this process could asymptote soon, even if the mole-mother (the latent generator) doesn’t get uncovered (until very late in the process, anyway).
This is highly disanalogous to the AI safety case, where playing whack-a-mole carries a very high risk of doom, so the hunt for the mole-mother is clearly important.
In the AI safety case, making the mistake of going after a baby mole instead of the mole-mother is a critical error.
In the AI capabilities case, you can hunt for baby moles and look for patterns and learn and discover the mole-mother that way.
A frontier-lab safety researcher myopically focusing on whacking baby moles is bad news for safety in a way that a frontier-lab capabilities researcher doing the same isn’t for capabilities.
> A frontier-lab safety researcher myopically focusing on whacking baby moles is bad news for safety in a way that a frontier-lab capabilities researcher doing the same isn’t for capabilities.
Thanks for clarifying.
> Straining the analogy, the mole-hunters get stronger and faster each time they whack a mole (because the AI gets stronger). My claim is that it isn’t so implausible that this process could asymptote soon, even if the mole-mother (the latent generator) doesn’t get uncovered (until very late in the process, anyway).
I do feel some pull in this direction, but that’s probably because the sheer disvalue of the consequences of this “asymptoting” warps my assessment of the plausibilities. When I try to disentangle these factors, I’m left with a vague “Rather unlikely to asymptote, but I’d surely rather not have anyone test this hypothesis.”