After thinking and talking about it more, I still think “AGI safety” is the best term I’ve got so far. Or, “AI safety,” in contexts where we don’t mind being less specific, and are speaking to an audience that doesn’t know what “AGI” means.
Basically, (1) I think your objections to “safe AGI” mostly don’t hold for “AGI safety,” and (2) I think the audience you seem most concerned about (technophiles) isn’t the right audience to be most concerned about.
Maybe Schneier wouldn’t get behind something called “safe computing” or “secure computing,” but he happily works in a field called “computer security.” The latter phrasing suggests that we can get some degree of security (or safety) even though we can never make systems 100% safe or secure. Scientists don’t object to people working on “computer security,” and I haven’t seen technophiles object to it either. Heck, many of them work in computer security. “X security” and “X safety” don’t imply to anyone I know that “you must spend infinite money on infinitesimal risks.” They just imply you’re trying to provide some reasonable level of safety and security, and people like that. Technophiles want their autonomous car to be reasonably safe just like everyone else does.
I think your worry that “safety” implies there’s a small class of threat pathways that need to be patched, rather than implying that an AGI needs to be designed from the ground up to stably optimize for your idealized values, is more of a concern. But it’s a small concern. A term like “Friendly AI” is a non-starter for many smart and/or influential people, whereas “AGI safety” serves as a rung in Wittgenstein’s ladder from which you can go on to explain that the challenge of AGI safety is not to patch a small class of threat pathways but instead to build a system from the ground up to ensure desirable behavior.
(Here again, the analogy to other safety-critical autonomous systems is strong. Such systems are often, like FAI, built from the ground up for safety and/or security precisely because in such autonomous systems there isn’t a small class of threat pathways. Instead, almost all possible designs you might come up with don’t do what you intended in some system states or environments. See e.g. my interviews with Michael Fisher and Benjamin Pierce. But that’s not something even most computer scientists will know anything about — it’s an approach to AI safety work that would have to be explained after they’ve already got a foot on the “AGI safety” rung of the expository ladder.)
Moreover, you seem to be most worried about how our terminology will play to the technophile audience. But playing well to technophiles isn’t MIRI’s current or likely future bottleneck. Attracting brilliant researchers is. If we can attract brilliant researchers, funding (from technophiles and others) won’t be so hard. But it’s hard to attract brilliant researchers with a whimsical home-brewed term like “Friendly AI” (especially when it’s paired with other red flags like a shockingly-arrogant-for-academia tone and an apparent lack of familiarity with related work, but that’s a different issue).
As Toby reports, it’s also hard to get the ear of policy-makers with a term like “Friendly AI,” but I know you are less interested in reaching policy-makers than I am.
Anyway, naming things is hard, and I certainly don’t fault you (or was it Bostrom?) for picking “Friendly AI” back in the day, but from our current vantage point we can see better alternatives. Even LWers think so, and I’d expect them to be more sympathetic to “Friendly AI” than anyone else.
I’ll say again, “high assurance AI” better captures everything you described than “AI safety”.