Karnofsky’s focus on “tool AI” is useful, but his statement of it may confuse matters and needs refinement. I don’t think the distinction between “tool AI” and “agent AI” is sharp, or in quite the right place.
For example, the sort of robot cars we will probably have in a few years are clearly agents: you tell them to “come here and take me there” and they do it without further intervention on your part (when everything is working as planned). This is useful in a way that no amount or quality of question answering can be. Almost certainly there will be various flavors of robot cars available, and people will choose the ones they like (ones that don’t drive in scary ways, that get them where they want to go even if it isn’t well specified, that know when to make conversation and when to be quiet, etc.). As long as robot cars just drive themselves and people around, can’t modify the world autonomously to make their performance better, and are subject to continuing selection by their human users, they don’t seem to be much of a threat.
The key points here seem to be (1) limited scope, (2) embedding in a network of other actors and (3) humans in the loop as evaluators. We could say these define “tool AIs” or come up with another term. But either way the antonym doesn’t seem to be “agent AIs” but maybe something like “autonomous AIs” or “independent AIs”—AIs with the power to act independently over a very broad range, unchecked by embedding in a network of other actors or by human evaluation.
Framed this way, we can ask “Why would independent AIs exist?” If the reason is mad scientists, an arms race, or something similar then Karnofsky has a very strong argument that any study of friendliness is beside the point. Outside these scenarios, the argument that we are likely to create independent AIs with any significant power seems weak; Karnofsky’s survey more or less matches my own less methodical findings. I’d be interested in strong arguments if they exist.
Given this analysis, there seem to be two implications:
(1) We shouldn’t build independent AIs, and should organize to prevent their development if they seem likely.
(2) We should thoroughly understand the likely future evolution of a patchwork of diverse tool AIs, to see where dangers arise.
For better or worse, neither of these lends itself to tidy analytical answers, though analytical work would be useful for both. But both are very much susceptible to investigation, proposals, evangelism, etc.
These do lend themselves to collaboration with existing AI efforts. To the extent AI researchers perceive a significant risk of development of independent AIs in the foreseeable future, they will want to avoid that. I’m doubtful this is an active risk but could easily be convinced by evidence (not just abstract arguments), and I’m fairly sure they feel the same way.
Understanding the long-term evolution of a patchwork of diverse tool AIs should interest just about all major AI developers, AI project funders, and long-term planners who will be affected (which is just about all of them). Short-term bias and ceteris paribus bias will lead to lots of these folks not engaging with the issue, but I think it will seem relevant to an increasing number as the hits keep coming.
First, my own observation agrees with GreenRoot. My view is less systematic but covers a much longer period: I’ve been watching this area since the 70s. (Perhaps longer; I was fascinated in my teens by Leibniz’s injunction “Let us calculate”.)
Empirically, I think several decades of experiment have established that no obvious or simple approach will work. Unless someone has a major new idea, we should not pursue straightforward graphical representations.
On the other hand, we do have a domain where machine-usable representation of thought has been successful, and where in fact that representation has evolved fairly rapidly. That domain is “programming” in a broad sense.
Graphical representations of programs have been tried too, and all such attempts have been failures. (I was a project manager for such an attempt in the 80s.) The basic problem is that a program is naturally a high-dimensional object, and when mapped down into a two-dimensional picture it is about as comprehensible as a bowl of spaghetti.
The really interesting aspect of programming for representing arguments isn’t the mainstream “get it done” perspective, but the background work that has been done on tools for analyzing, transforming, and optimizing code. These tools all depend on extracting and maintaining the semantics of the code through a lot of non-trivial changes. Furthermore, over time the representations they use have evolved from imperative, time-bound ones toward declarative ones that describe relationships in a timeless way.
At the same time programming languages have evolved to move more of the “mechanical” semantics into runtimes or implicit operations during compile time, such as type inference. This turns out to be essential to keep down the clutter in the code, and to maintain global consistency.
The effect is that programming languages are moving closer to formal symbolic calculi, and program transformations are moving closer to automated proof checking (while automated proof checking is evolving to take advantage of some of these same ideas).
In my opinion, all of that is necessary for any kind of machine support of the semantics of rational discussion. But it is not sufficient. The problem is that our discussion allows, and realistically has to allow, a wide range of vagueness, while existing programming semantics are nowhere near vague enough. In our arguments we have to refer to only partially specified, or in some cases nearly unspecified, “things”, and then refine our specification of those things over time as necessary. (An extremely limited but useful form of this is already supported in advanced programming languages as “lazy”, potentially infinite data structures. These are vague only about how many terms of a sequence will be calculated: as many as you ask for, plus possibly more.)
For example, look at the first sentence of my paragraph above. What does “all of that” refer to? You know enough from context to understand my point. But if we actually ended up pursuing this as a project, by the time we could build anything that works we’d have an extremely complex understanding of the previous relevant work, and how to tie back to it. In the process we would have looked at a lot of stuff that initially seemed relevant (i.e., currently included in “all of that”) but that after due consideration we found we needed to exclude. If we had to specify “all of that” in advance (even in terms of sharp criteria for inclusion) we’d never get anywhere.
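As a concrete illustration of the limited “lazy” kind of vagueness mentioned earlier, here is a minimal sketch in Python. The language and the names `squares` and `take` are illustrative assumptions, not anything from a particular system. The sequence is left unspecified as to length, and only as many terms are computed as the consumer asks for.

```python
from itertools import count, islice


def squares():
    """Lazily yield 0, 1, 4, 9, ... with no fixed upper bound."""
    for n in count():
        yield n * n


def take(n, iterable):
    """Force only the first n terms of a lazy sequence."""
    return list(islice(iterable, n))


# The sequence itself never says how long it is; the caller decides.
first_five = take(5, squares())
print(first_five)
```

Note how narrow this is: the only thing left open is the number of terms. The rule generating each term is fully pinned down in advance, which is exactly what ordinary argument cannot afford.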
So any representation of arguments has to allow vagueness in all respects, and also allow the vagueness to be challenged and elaborated as necessary. The representation has to allow multiple versions of the argument, so different approaches can be explored. It has to allow different (partial) successes to be merged, resolving any inconsistencies by some combination of manual and machine labor. (We have pretty good tools for versioning and merging in programming, to the extent the material being manipulated has machine-checkable semantics.)
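To give a rough feel for what merging versions of an argument might look like, here is a minimal sketch of a three-way merge in Python. Everything in it is an assumption made for illustration: an argument is reduced to a dictionary from hypothetical claim ids to claim text, and `merge_claims` is a made-up helper, not an existing tool. Changes made on only one side are merged mechanically, and claims changed differently on both sides are flagged for manual resolution.

```python
def merge_claims(base, ours, theirs):
    """Three-way merge of {claim_id: text} dicts.

    Returns (merged, conflicts). A claim changed on only one side is taken
    from that side. A claim changed differently on both sides keeps its base
    text (or is omitted if it had none) and is reported as a conflict.
    """
    merged, conflicts = {}, []
    for key in sorted(set(base) | set(ours) | set(theirs)):
        b, o, t = base.get(key), ours.get(key), theirs.get(key)
        if o == t:
            value = o              # both sides agree (or both deleted it)
        elif o == b:
            value = t              # only "theirs" changed this claim
        elif t == b:
            value = o              # only "ours" changed this claim
        else:
            conflicts.append(key)  # changed differently: needs a human
            value = b
        if value is not None:
            merged[key] = value
    return merged, conflicts


# Hypothetical example: two people refine the same starting argument.
base = {"c1": "No simple approach works.", "c2": "Programs are a model."}
ours = {"c1": "No simple graphical approach works.", "c2": "Programs are a model."}
theirs = {"c1": "No simple approach works.", "c2": "Programs are a model.",
          "c3": "Vagueness must be allowed."}

merged, conflicts = merge_claims(base, ours, theirs)
print(merged)
print(conflicts)
```

This only works because equality of claim text is machine-checkable. Deciding whether two differently worded claims actually conflict is the part that still needs the manual labor.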
The tools for handling vagueness are coming along (in linguistic theory and statistical modeling) but they are not yet at the engineering cookbook level. However, if an effort to build semantic argumentation tools on a programming technology base got started now, the two trajectories would probably intersect in a fairly useful way a few years out.
The implications of all of this for AI would be interesting to discuss, but perhaps belong in another context.