Isn’t a version of this logic kinda implicit in what people are already doing? Like the MIRI switch to outreach could be seen as trying to make arguments already understood in the AI safety community legible to the wider public. Or put another way, legibility is a two-place word, and the degree of “legibility of AI concerns” already present in the x-risk-adjacent community is sufficient to imply that we shouldn’t be building AI given our current level of knowledge. Like if the median voter had the degree of legible understanding of AI x-risk that Dario has (probably, behind closed doors at least? or even Sam Altman?), civilization probably wouldn’t permit people to try building AGI. The issue is that the general public and powerful decision-makers don’t even have this degree of legible understanding, so the bottleneck is convincing them.
Yes, some people are already implicitly doing this, but if we don’t make it explicit:
1. We can’t explain to the people not doing it (i.e., those working on already-legible problems) why they should switch directions.
2. Even MIRI is doing it suboptimally because they’re not reasoning about it explicitly. I think they’re focusing too much on one particular x-safety problem (AI takeover caused by misalignment) that’s highly legible to themselves but not to the public/policymakers, and that’s problematic because what happens if someone comes up with an alignment breakthrough? Their arguments would be invalidated, and in the public’s/policymakers’ eyes there would no longer be any reason to keep holding back AGI/ASI, even though plenty of illegible x-safety problems would still remain.