Here are three areas where I think I have a different perspective:
I think we should be careful to quantify how compute-efficient, how compressible, etc., these things actually are. I agree that vision relies on lots of heuristics. Vision is not possible in 100 lines of Python code with no ML! But are we talking about megabytes of heuristics, or gigabytes, or terabytes? I don’t know.
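As one rough, hedged data point (my own back-of-envelope, using standard published parameter counts): if we treat a trained model’s parameters as a crude proxy for its learned heuristics, today’s vision models land in the hundreds-of-megabytes range.

```python
# Back-of-envelope: how many bytes of learned "heuristics" does a
# standard trained model carry? Illustrative only; parameter counts
# are approximate, and parameters-as-bytes is a crude proxy.

BYTES_PER_FLOAT32 = 4

models = {
    "ResNet-50": 25.6e6,   # ~25.6M parameters (He et al. 2015)
    "VGG-16":    138e6,    # ~138M parameters
    "GPT-2":     1.5e9,    # ~1.5B parameters, included for scale
}

for name, n_params in models.items():
    megabytes = n_params * BYTES_PER_FLOAT32 / 1e6
    print(f"{name}: ~{megabytes:,.0f} MB of learned parameters")
# => ResNet-50: ~102 MB, VGG-16: ~552 MB, GPT-2: ~6,000 MB
```

Of course, raw parameter bytes say nothing about how far those heuristics could be compressed.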
It also seems to me that we can plausibly declare vision, audio, etc. to be “compute-efficient enough”—that explicit, high-level reasoning is the only bottleneck, and that the other 6 boxes are “solved problems”, or close enough anyway.
I think we shouldn’t confuse two types of “simple”: a “simple core learning algorithm” can still learn arbitrarily complicated heuristics from data.
I don’t think we currently know of an algorithm which would be an AGI if only we had enough compute. Yes, we have search algorithms, but I don’t think we have data structures that can encode the kind of arbitrary abstract understanding of the world that is needed, and I don’t think we have a way to build and sort through those data structures. (But I do think there are ideas out there that are getting a lot closer.) Supposedly, the human brain does about as much calculation as a supercomputer (published estimates of brain compute span several orders of magnitude, but ~10^15 operations per second is a common middle-of-the-road figure). But we can’t take a supercomputer and have it be a remote-work software engineer, doing all the things that software engineers do, like answering emails and debugging code. (Equivalently, we can’t take a normal computer and have it be a software engineer running in slow motion.)
Therefore, I think we have to say that we’re not putting our computer cycles to maximally efficient use in creating intelligence.
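To put very rough numbers on the brain-vs-supercomputer comparison above (these are order-of-magnitude figures only, and the brain estimate in particular is contested):

```python
# Crude brain-vs-supercomputer comparison. The brain figure is a
# widely cited middle-of-the-road estimate (published estimates span
# roughly 1e13 to 1e17 ops/sec); the supercomputer figure is the
# approximate peak of a top machine circa 2020 (Fugaku, ~4e17 FLOP/s).

BRAIN_OPS_PER_SEC = 1e15      # rough, contested estimate
SUPERCOMPUTER_FLOPS = 4e17    # order of magnitude only

ratio = SUPERCOMPUTER_FLOPS / BRAIN_OPS_PER_SEC
print(f"Supercomputer / brain: ~{ratio:.0f}x")
# => ~400x. Even granting the hardware, we don't know what software
#    to run on it to get a software engineer.
```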
The line of research I would most expect to lead to AGI soon would be algorithms explicitly trying to emulate (at a high level) the algorithms of human intelligence. While these tend to be remarkably sample-efficient (just as humans can learn a new word from a single exposure), they are not particularly easy computations to do today (they’re vaguely related to probabilistic programming), and they are also substantially different from ResNets and the other most popular ML approaches. Are they “compute-efficient”? I wouldn’t know how to answer that, because I think you can only fairly compare them to other algorithms “doing AGI”, and I don’t think there are any other such algorithms today. I would expect a pretty simple core learning algorithm that is (at first) painfully slow and inefficient to run, and which creates a not-at-all-simple mess of heuristics as it learns about the world.
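To make “learn a new word from a single exposure” concrete, here is a toy sketch of one-shot Bayesian concept learning, loosely in the spirit of Xu and Tenenbaum’s word-learning work (the hypothesis space and prior below are entirely made up for illustration):

```python
# Toy one-shot word learning via Bayesian inference over candidate
# meanings. Entirely illustrative: hypotheses and prior are made up.

# Each hypothesis is a candidate meaning: the set of objects the
# new word could refer to.
hypotheses = {
    "dalmatians": {"dalmatian"},
    "dogs":       {"dalmatian", "poodle", "terrier"},
    "animals":    {"dalmatian", "poodle", "terrier", "cat", "horse"},
}
prior = {"dalmatians": 0.2, "dogs": 0.4, "animals": 0.4}

def posterior(observed):
    """P(h | observed): likelihood is (1/|h|) per example if the
    hypothesis covers all examples, else 0 -- the 'size principle'."""
    unnorm = {}
    for h, extension in hypotheses.items():
        if all(x in extension for x in observed):
            unnorm[h] = prior[h] * (1 / len(extension)) ** len(observed)
        else:
            unnorm[h] = 0.0
    z = sum(unnorm.values())
    return {h: p / z for h, p in unnorm.items()}

# One exposure: someone points at a dalmatian and says "fep".
print(posterior(["dalmatian"]))
# => {"dalmatians": ~0.48, "dogs": ~0.32, "animals": ~0.19}
```

The point of the sketch is the size principle: narrower hypotheses that still fit the data gain probability quickly, so one or two examples can pin down a meaning.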
For example, one fast-takeoff-ish scenario I might imagine would be that someone gets AGI-level algorithms just barely working inefficiently on small toy problems, and then gets a 1000× speedup by translating the algorithms to CUDA-C++ / FPGA / whatever (or opening a collaboration with Google, or finding a better implementation, or whatever), and the result takes people by surprise—maybe even including the programmers, and certainly people outside the group.
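As a mundane illustration of how a large constant-factor speedup can come from implementation alone (nothing AGI-specific here; just the same computation mapped better onto hardware):

```python
# Same dot product two ways: a naive interpreted loop vs. NumPy's
# compiled implementation. On a typical machine the vectorized
# version is ~100-1000x faster -- no algorithmic change at all.
import time
import numpy as np

N = 10_000_000
a = np.random.rand(N)
b = np.random.rand(N)

t0 = time.perf_counter()
total = 0.0
for i in range(N):           # interpreted, one element at a time
    total += a[i] * b[i]
t1 = time.perf_counter()

t2 = time.perf_counter()
total_np = np.dot(a, b)      # compiled, SIMD-friendly inner loop
t3 = time.perf_counter()

print(f"loop: {t1 - t0:.3f}s   numpy: {t3 - t2:.5f}s")
print(f"speedup: ~{(t1 - t0) / (t3 - t2):.0f}x")
```

No algorithmic insight is involved; the gap is just interpreter overhead versus a compiled, vectorized inner loop, and translating to CUDA or an FPGA is the same move taken further.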