[discussion of Ryan’s point is ongoing on the MIRI slack, but I have a response to this comment that doesn’t hinge on that; other contributors likely disagree with me]
The FAR work totally rocks. However, I don’t think that ‘humans can use other AIs to discover adversarial strategies that succeed only if the humans know they’re playing against the AI’ is cleanly an example of the AI failing to be superhuman in the usual sense. You’re changing what it means to be human to include the use of tools that are upstream of the AI (and it’s not at all inevitable that humans will do this in every case), and then changing the definition of superhuman downstream of that.
In the context of the analogy, this looks to me like it ~commits you to a Vitalik-esque defensive-tech view. That’s a view I at least intend to reject, and it doesn’t feel especially important to kneel to it in our framing (i.e. the definition of our central concept: superintelligence).