Programmers, operating with partial insight, create a mind that performs a number of tasks very well, but can't really handle self-modification, let alone AI theory [...] This scenario seems less likely to my eyes, but it is not ruled out by any effect I can see.
Twelve and a half years later, does new evidence for the scaling hypothesis make this scenario more plausible? If we are able to create increasingly capable systems by throwing lots of compute at gradient descent, without really understanding how those systems work, then won't the systems themselves likely also fail to understand themselves well enough to "close the loop" on recursive self-improvement?