Who is well-incentivized to check whether AGI is a long way off? Right now, I see two camps: AI capabilities researchers and AI safety researchers. Both groups seem incentivized to portray the capabilities of modern systems as “trending toward generality.” Having a group of credible experts focused on critically examining that claim of “AI trending toward AGI,” and in dialogue with both capabilities and safety researchers, seems valuable.
This is a slightly orthogonal answer, but “humans who understand the risks” have a strong bias-driven incentive to believe that AGI is far off (in that it’s aversive to think that bad things are going to happen to you personally).
A more direct answer is: there is a wide range of people who say they work on “AI safety,” but almost none of them work on “avoiding doom from AGI.” They’re mostly working on problems like “make the AI more robust / less racist / etc.” These are valuable things to do, but to the extent that they compete with the “avoid doom” researchers for money, status, and influence, they have an incentive to downplay the odds of doom. And indeed this happens a fair amount, e.g. in articles arguing that “avoid doom” is a distraction from problems that are here right now.
To put it in appropriately Biblical terms, let’s imagine we have a few groups of civil engineers. One group is busily building the Tower of Babel, and bragging that it has grown so tall, it’s almost touching heaven! Another group is shouting “if the tower grows too close to heaven, God will strike us all down!” A third group is saying, “all that shouting about God striking us down isn’t helping us keep the tower from collapsing, which is what we should really be focusing on.”
I’m wishing for a group of engineers who are focused on asking whether building a taller and taller tower really gets us closer and closer to heaven.
That’s a good point.
I’m specifically interested in finding people who are well-incentivized to gather, make, and evaluate arguments about the nearness of AGI. This task should be their primary professional focus.
I see this activity as different from, or a specialized subset of, measuring AI progress. AI can progress in capabilities without progressing toward AGI, and without progressing in a way that is likely to eventually produce AGI. For example, new releases of an expert system for medical diagnosis might show steady progress in capabilities without showing any progress toward AGI.
Likewise, I see it as distinct from making claims about the risk of AGI doom. The risk that an AGI would be dangerous seems, to me, mostly orthogonal to whether or not it is close at hand. This follows naturally from Eliezer Yudkowsky’s point that we have to get AGI right on the “first critical try.”
Finally, I see this activity as distinct from accepting and repeating arguments or claims about AGI nearness. As you point out, AI safety researchers who work on more prosaic forms of harm seem biased or incentivized to downplay AI risk, and perhaps AGI nearness as well. That is a tendency to accept and repeat such claims, rather than a tendency to “gather, make, and evaluate arguments,” which is what I’m interested in.
It seems to me that one of the challenges here is a “no true Scotsman” problem: a tendency to move the goalposts (or simply feel disappointed) when a task thought to be hard for AI, and achievable only by AGI, turns out to be achievable by a non-general system after all.
Scott wrote a post just today that seems quite relevant to this question. As I read it, his argument is “AI is advancing in capabilities faster than you think.” However, as I’m speculating here, we can accept that claim while still thinking “AI is moving toward AGI slower than it seems.” Or not! It just seems to me that making lists of what AI can and cannot do, and then tracking its success rate across successive releases, is not clearly a way to track AGI progress. I’d like to see somebody who knows what they’re about examine that question, or perhaps synthesize multiple perspectives on how AI becomes AGI, and show how a given unit of narrow-capabilities progress fits into a narrative of AGI progress under each of those perspectives.
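To make that last point concrete, here is a toy sketch in Python. Everything in it is hypothetical: the tasks, the releases, and especially the weights, which stand in for different “perspectives” on what counts as progress toward generality. The point is only that the same checklist of successes can register as solid capabilities progress under one weighting and almost no progress toward generality under another.

```python
# Toy illustration: the same pass/fail results imply different "AGI progress"
# depending on which theory of generality you weight the tasks by.
# All tasks, releases, and weights are made up for the sake of the example.

tasks = ["arithmetic", "translation", "medical_diagnosis", "novel_tool_use"]

# Hypothetical pass (1) / fail (0) results for two successive releases.
results = {
    "release_1": {"arithmetic": 1, "translation": 1, "medical_diagnosis": 0, "novel_tool_use": 0},
    "release_2": {"arithmetic": 1, "translation": 1, "medical_diagnosis": 1, "novel_tool_use": 0},
}

# Two invented perspectives: one counts every capability equally; the other
# only gives real credit for transfer to novel tasks.
perspectives = {
    "raw_capabilities": {t: 1.0 for t in tasks},
    "generality_weighted": {"arithmetic": 0.0, "translation": 0.1,
                            "medical_diagnosis": 0.1, "novel_tool_use": 1.0},
}

for name, weights in perspectives.items():
    total = sum(weights.values())
    for release, scores in results.items():
        progress = sum(weights[t] * scores[t] for t in tasks) / total
        print(f"{name:>20} | {release}: {progress:.2f}")
```

On these made-up numbers, the raw success rate jumps from 0.50 to 0.75 between releases, while the generality-weighted score barely moves (about 0.08 to 0.17). That divergence, between capability gains and progress toward generality, is exactly the kind of thing I’d want this hypothetical group of experts to measure.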