You’re fundamentally assuming opaque AI, and ascribing intentions to it; this strikes me as generalizing from fictional evidence.
So, let’s talk about currently operational strong super-human AIs.
Take, for example, Bayesian spam filtering, which has the strong super-human ability to sort e-mails into the categories "spam" and "not spam". While the learned parameters of every token are opaque to a human observer, the algorithm itself is transparent: we know why it works, how it works, and what needs tweaking.
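To make the transparency point concrete, here is a minimal sketch of such a filter (the corpus, tokens, and equal class priors are invented for illustration). Every per-token parameter is a plain count we can print and inspect:

```python
import math
from collections import Counter

# Toy training data: tokenized spam and non-spam messages (invented).
spam_docs = [["buy", "pills", "now"], ["cheap", "pills", "buy"]]
ham_docs  = [["meeting", "agenda", "now"], ["lunch", "meeting", "plan"]]

def train(docs):
    counts = Counter(w for d in docs for w in d)
    return counts, sum(counts.values())

spam_counts, spam_total = train(spam_docs)
ham_counts, ham_total = train(ham_docs)
vocab = set(spam_counts) | set(ham_counts)

def log_likelihood(doc, counts, total):
    # Laplace smoothing so unseen tokens don't zero out the product.
    return sum(math.log((counts[w] + 1) / (total + len(vocab))) for w in doc)

def classify(doc):
    # Equal priors for simplicity; compare per-class likelihoods.
    spam_ll = log_likelihood(doc, spam_counts, spam_total)
    ham_ll = log_likelihood(doc, ham_counts, ham_total)
    return "spam" if spam_ll > ham_ll else "not spam"

print(classify(["buy", "cheap", "pills"]))    # → spam
print(classify(["meeting", "lunch", "plan"])) # → not spam
```

Nothing here is mysterious: if the filter misclassifies a message, we can look up exactly which token counts were responsible and retrain or tweak accordingly.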
This is what Holden is talking about when he says:
Among other things, a tool-AGI would allow transparent views into the AGI’s reasoning and predictions without any reason to fear being purposefully misled
In fact, the operational problem in AI R&D is that you cannot outsource understanding. Consider, e.g., neural networks trained with evolutionary algorithms: you can achieve a number of different tasks with these, but once training finishes, there is no way to reverse-engineer how the resulting algorithm works, making it impossible for humans to recognize conceptual shortcuts and thereby improve performance.
Neural networks, for instance, are in the dock not only because they have been hyped to high heaven (what hasn't?), but also because you could create a successful net without understanding how it worked: the bunch of numbers that captures its behaviour would in all probability be "an opaque, unreadable table...valueless as a scientific resource". ref
I think the assumption of at least a relatively opaque AI is justified. Except for maybe k-NN, decision trees, and linear classifiers, everything else we currently have to work with is more opaque than Naïve Bayes.
For spam filtering, if we wanted to bump up the ROC AUC a few percent, the natural place to go might be a Support Vector Machine classifier. The solution is transparent in that it boils down to optimizing a quadratic function over a convex domain, something that we can do efficiently and non-mysteriously. On the other hand, the solution produced is either a linear decision boundary in a potentially infinite-dimensional space or an unspeakably complicated decision surface in the original feature space.
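For reference, the "quadratic function over a convex domain" here is the standard soft-margin SVM dual (with \(\alpha_i\) the Lagrange multipliers, \(y_i \in \{-1, +1\}\) the labels, \(K\) the kernel, and \(C\) the box constraint):

```latex
\max_{\alpha} \; \sum_{i=1}^{n} \alpha_i
  - \frac{1}{2} \sum_{i=1}^{n} \sum_{j=1}^{n}
    \alpha_i \alpha_j \, y_i y_j \, K(x_i, x_j)
\quad \text{s.t.} \quad
0 \le \alpha_i \le C, \qquad \sum_{i=1}^{n} \alpha_i y_i = 0
```

The objective is concave and the constraints are linear, so the optimization itself is well understood; the opacity lives entirely in what the resulting kernel-expanded decision surface looks like back in feature space.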
Something like Latent Dirichlet Allocation is probably a better example of what a mid-level tool-A(not G)I looks like today.
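What makes LDA feel tool-like is that its generative story is explicit enough to write down in a few lines. A sketch of that story, forward-sampled with made-up parameters (these numbers are illustrative, not fitted; real LDA infers them from a corpus):

```python
import random

random.seed(0)

# Invented parameters: 2 topics over a 4-word vocabulary.
vocab = ["pills", "buy", "meeting", "agenda"]
topic_word = [
    [0.45, 0.45, 0.05, 0.05],  # topic 0 favours "spammy" words
    [0.05, 0.05, 0.45, 0.45],  # topic 1 favours "office" words
]
doc_topic = [0.7, 0.3]  # this document leans toward topic 0

def sample_doc(n_words):
    # LDA's generative story: for each word slot, draw a topic from the
    # document's topic mixture, then draw a word from that topic.
    doc = []
    for _ in range(n_words):
        topic = random.choices([0, 1], weights=doc_topic)[0]
        doc.append(random.choices(vocab, weights=topic_word[topic])[0])
    return doc

print(sample_doc(6))
```

Because every latent variable (topic mixtures, topic-word distributions) has a stated meaning, a human can read the fitted model off directly, which is exactly the transparency a tool-AI is supposed to offer.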
Edit: Please explain the downvote? I’d like to know if I’m making a technical mistake somewhere, because this is material I really ought to be able to get right.
Off topic question: Why do you believe the ability to sort email into spam and non-spam is super-human? The computerized filter is much, much faster, but I suspect that if you could get 10M sorts from me and 10M from the filter, I’d do better. Yes, that assumes away tiredness, inattention, and the like, but I think that’s more an issue of relative speed than anything else. Eventually, the hardware running the spam filter will break down, but not on a timescale relevant to the spam filtering task.
Yes, that assumes away tiredness, inattention, and the like, but I think that’s more an issue of relative speed than anything else
Exactly for those reasons. From the relevant utilitarian perspective, we care about those things much more deeply.
(also, try differentiating between "不労所得を得るにはまずこれ" (roughly "the first step to passive income", a typical spam subject) and "スラッシュドット・" ("Slashdot"))