I agree that there is a serious problem about what to do in possible futures where we have an AI that’s smart enough to be dangerous, but not powerful enough to implement something like CEV. Unfortunately I don’t think this analysis offers any help.
Of your list of ways to avoid a FOOM, option 1 isn’t really relevant (since if we’re all dead we don’t have to worry about how to program an AI). Option 2 is already ruled out with very high probability, because you don’t need any exotic physics to reach mind-boggling levels of hardware performance. For instance, performance estimates for nanotech rod-logic computers come in at around 10^9 op/sec per cubic micron, and electronic devices should beat that by several orders of magnitude. For comparison, the human brain seems to turn out around 10^15 op/sec in 1500 cc, or roughly 1 op/sec per cubic micron. So a specific technology like microchip manufacturing might top out, but one way or another ordinary human efforts to improve computer performance will eventually carry us far beyond any plausible FOOM requirement.
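To make the comparison concrete, here is the back-of-envelope arithmetic behind those figures. The numbers are the rough estimates quoted above (not measurements), and the whole exercise is only meaningful to within an order of magnitude:

```python
# Order-of-magnitude comparison of compute density.
# All figures are the rough estimates quoted in the text, not measurements.
rod_logic_density = 1e9          # op/sec per cubic micron (nanotech rod logic)
brain_ops = 1e15                 # op/sec, rough estimate for the human brain
brain_volume_um3 = 1500 * 1e12   # 1500 cc; 1 cc = 10^12 cubic microns

brain_density = brain_ops / brain_volume_um3
print(f"brain: ~{brain_density:.1f} op/sec per cubic micron")
print(f"rod-logic advantage: ~{rod_logic_density / brain_density:.1e}x")
```

On these estimates rod logic alone beats the brain's volumetric compute density by a factor of around a billion, before electronics enters the picture.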
Option 3 hinges on issues we won’t fully understand until we’re close to having a working AGI. But it’s hard to come up with a theory of intelligence that doesn’t boil down to a system of heuristic engines searching various abstract solution spaces, where at worst an exponential improvement in hardware yields only a linear improvement in the amount of search space covered per unit time. In real applications you can usually do much better than brute force, getting O(n) or even O(log n) solutions for the problems you actually care about, which implies that at some capability level a hard takeoff is inevitable.
But we have no way to know where that point is, and the complexity of CEV makes this an important issue. If an infrahuman AI can suddenly go FOOM and turn into an SI, then something like CEV might be practical. But if the FOOM moment doesn’t come until the AI is well into transhuman territory, we could spend years in a world of moderately superhuman AIs that need a much less complex approach to Friendliness.
Unfortunately this just leads us back to all the problems that Eliezer was trying to dodge in proposing CEV. If you want to make an AI reliably Friendly you have to be able to describe Friendliness in a way that is unambiguous, complete, and not susceptible to gaming, which can’t be done with sentences of English text. You’d have to first understand how human language processing works, then build an AI implementation which handles unnatural categories and other fun complications in a predictable way, and then compose your instructions with the design of the ‘parser’ in mind.
Which is a lot less fun than the traditional ‘let’s talk about what orders to give the godlike AI’ debate, but it’s about the least rigorous approach that has any chance of working.