Finally, even if we later find out that lo and behold, the inference algorithm we hard-coded into our AGI circuits was actually not so great, and somebody comes along with a much better one . . . that is still not an argument for simulating the algorithm in software.
If we have many generations of rapid improvement of the algorithms this will be much easier if one doesn’t need to make new hardware each time.
Not at all true. The class of statistical inference algorithms, including Bayesian networks and the cortex, is extremely flexible and benefits greatly from being ‘hard-wired’.
The general trend should still occur this way. I’m also not sure that you can reach that conclusion about the cortex given that we don’t have a very good understanding of how the brain’s algorithms function.
The cortex doesn’t have a specialized vision circuit; there appears to be just one general purpose circuit it uses for everything. The visual regions become visual regions on account of ... processing visual input data.
That seems plausibly correct but we don’t actually know that. Given how much humans rely on vision it isn’t at all implausible that there have been subtle genetic tweaks that make our visual regions more effective in processing visual data (I don’t know the literature in this area at all).
To be more precise we should speak of computational complexity and bitops. The best known factorization algorithms have running time exponential in the number of input bits.
Incorrect, the best factoring algorithms are subexponential. See for example the quadratic sieve and the number field sieve, both of which have subexponential running time. This has been true since at least the early 1980s (there are other, now obsolete, algorithms that were around before then that may have had slightly subexponential running time; I don’t know enough about them in detail to comment).
But factoring small primes is still easy in the absolute cost sense.
Factoring primes is always easy: a prime p has no non-trivial factorizations. You seem to be confusing factorization with primality testing. The second is much easier than the first; we’ve had the Agrawal–Kayal–Saxena (AKS) algorithm, which is provably polynomial time, for about a decade. Prior to that we had a lot of efficient tests that were empirically faster than our best factorization procedures. We can determine the primality of numbers much larger than those we can factor.
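As a rough illustration of the gap between the two problems, here is a minimal sketch. It does not implement AKS; it swaps in the Miller-Rabin test with a fixed set of bases (deterministic for 64-bit inputs, probabilistic beyond that) and compares it with naive trial division as a stand-in for factoring. The function names are made up for this example.

```python
def is_probable_prime(n: int) -> bool:
    """Miller-Rabin with fixed bases (not AKS).

    Never rejects a prime. Deterministic for 64-bit n; for larger n a
    composite could in principle slip through, though this is very unlikely.
    """
    bases = (2, 3, 5, 7, 11, 13, 17, 19, 23, 29, 31, 37)
    if n < 2:
        return False
    for p in bases:
        if n % p == 0:
            return n == p
    # Write n - 1 as d * 2^s with d odd.
    d, s = n - 1, 0
    while d % 2 == 0:
        d //= 2
        s += 1
    for a in bases:
        x = pow(a, d, n)  # fast modular exponentiation
        if x == 1 or x == n - 1:
            continue
        for _ in range(s - 1):
            x = x * x % n
            if x == n - 1:
                break
        else:
            return False  # a is a witness that n is composite
    return True


def trial_division(n: int) -> list[int]:
    """Naive factoring: cost grows with the size of the second-largest prime factor."""
    factors = []
    d = 2
    while d * d <= n:
        while n % d == 0:
            factors.append(d)
            n //= d
        d += 1
    if n > 1:
        factors.append(n)
    return factors


if __name__ == "__main__":
    # Testing a 127-bit Mersenne prime takes a handful of modular exponentiations.
    print(is_probable_prime(2**127 - 1))
    # Factoring a ~50-bit product of two primes already takes ~500k loop iterations here.
    print(trial_division((2**19 - 1) * (2**31 - 1)))
```

The primality test scales with the number of digits, while the naive factoring loop scales with the size of the factor it has to find; better factoring algorithms narrow that gap but do not close it.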
Factoring is also easy in the algorithmic sense, as the best algorithms are very simple and short.
Really? The general number field sieve is simple and short? Have you tried to understand it or write an implementation? Simple and short compared to what exactly?
I’m having a hard time understanding what you really mean by saying “the narrow set of tasks which humans can do decently such as vision”. What about quantum mechanics, computer science, mathematics, game design, poetry, economics, sports, art, or comedy? One could probably fill a book with the narrow set of tasks that humans can do decently.
There are some tasks where we can argue that humans are doing a good job by comparison to others in the animal kingdom. Vision is a good example of this (we have some of the best vision of any mammal.) The rest are tasks which no other entities can do very well, and we don’t have any good reason to think humans are anywhere near good at them in an absolute sense. Note also that most humans can’t do math very well (Apparently 10% or so of my calculus students right now can’t divide one fraction by another). And the vast majority of poetry is just awful. It isn’t even obvious to me that the “good” poetry isn’t labeled that way in part simply from social pressure.
I’m not sure what you mean by this or how it relates. If you could do face recognition that fast ... it’s not impressive?
A lot of the tasks that humans have specialized in are not generally bottlenecks for useful computation. Improved facial recognition isn’t going to help much with most of the interesting stuff, like recursive self-improvement, constructing new algorithms, making molecular nanotech, finding a theory of everything, figuring out how Fred and George tricked Rita, etc.
The main computational cost of every main competing AGI route I’ve seen involves some sort of deep statistical inference, and this amounts to a large matrix multiplication, possibly followed by a non-linear step or a normalization. Neural nets, Bayesian nets, whatever: if you look at the mix of required instructions, it amounts to a massive repetition of simple operations that are well suited to hardware optimization.
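To make the "matrix multiplication plus a non-linear step or normalization" claim concrete, here is a minimal sketch of one such inference step in plain Python. The choice of softmax as the normalization and the names `matvec`, `softmax` and `layer` are assumptions for illustration, not something specified in the thread.

```python
import math


def matvec(weights: list[list[float]], x: list[float]) -> list[float]:
    """Dense matrix-vector product: the bulk of the work, all multiply-adds."""
    return [sum(w * xi for w, xi in zip(row, x)) for row in weights]


def softmax(z: list[float]) -> list[float]:
    """Non-linear step plus normalization: outputs are positive and sum to 1."""
    m = max(z)  # subtract the max for numerical stability
    exps = [math.exp(v - m) for v in z]
    total = sum(exps)
    return [e / total for e in exps]


def layer(weights: list[list[float]], x: list[float]) -> list[float]:
    """One inference step: multiply, then normalize."""
    return softmax(matvec(weights, x))


if __name__ == "__main__":
    w = [[1.0, 0.0], [0.0, 1.0]]
    print(layer(w, [0.0, 0.0]))
```

Nearly all of the arithmetic sits in `matvec`, which is the same multiply-add repeated once per weight. That regularity is what makes this kind of workload a natural fit for wide vector hardware.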
Incorrect, the best factoring algorithms are subexponential.
To clarify, subexponential does not mean polynomial, but super-polynomial.
(Interestingly, while factoring a given integer is hard, there is a way to get a random integer within [1..N] and its factorization quickly. See Adam Kalai’s paper Generating Random Factored Numbers, Easily (PDF).)
This is mostly irrelevant, but I think complexity theorists use a weird definition of exponential according to which GNFS might still be considered exponential. I know when they say “at most exponential” they mean O(e^(n^k)) rather than O(e^n), so it seems plausible that by “at least exponential” they might mean Omega(e^(n^k)) where now k can be less than 1.
They like keeping things invariant under polynomial transformations of the input, since that has been observed to be a somewhat “natural” class. This is one of the areas where it seems to not quite work.
Hmm, interesting: in the notation that Scott says is standard in complexity theory, my earlier statement that factoring is “subexponential” is wrong, even though the running time grows more slowly than exponential. But apparently Greg Kuperberg is perfectly happy labeling something like 2^(n^(1/2)) as subexponential.
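For reference, the running times being argued about are usually written in L-notation, which interpolates between polynomial and exponential in the bit length of N. The sketch below states the standard definition and the commonly quoted heuristic running times of the two sieves mentioned earlier in the thread.

```latex
% L-notation for an integer N being factored, with 0 <= alpha <= 1
L_N[\alpha, c] = \exp\!\Big( (c + o(1)) \, (\ln N)^{\alpha} \, (\ln\ln N)^{1-\alpha} \Big)

% alpha = 0: polynomial in the bit length of N
L_N[0, c] = (\ln N)^{c + o(1)}

% alpha = 1: fully exponential in the bit length of N
L_N[1, c] = N^{c + o(1)}

% quadratic sieve (heuristic running time)
L_N\big[\tfrac{1}{2},\, 1\big]

% general number field sieve (heuristic running time)
L_N\big[\tfrac{1}{3},\, (64/9)^{1/3}\big]
```

Both sieves have 0 < alpha < 1, so they are slower than any polynomial in the bit length but faster than e^(cn) for every constant c > 0. Whether that gets called "subexponential" depends on which of the definitions above one adopts.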
If we have many generations of rapid improvement of the algorithms this will be much easier if one doesn’t need to make new hardware each time.
Yes, and this tradeoff exists today with some rough mix between general processors and more specialized ASICs.
I think this will hold true for a while, but it is important to point out a few subpoints:
If Moore’s law slows down this will shift the balance further towards specialized processors.
Even most ‘general’ processors today are actually a mix of CISC and vector processing, with more and more performance coming from the less-general vector portion of the chip.
For most complex real-world problems, algorithms eventually tend to have much less room for improvement than hardware, even if algorithmic improvements initially dominate. After a while the algorithms reach the best achievable complexity class, and from then on further algorithmic improvements only change constant factors and are swamped by hardware improvement.
Modern GPUs for example have 16 or more vector processors for every general logic processor.
The brain is like a very slow processor with massively wide dedicated statistical inference circuitry.
As a result of all this (and the point at the end of my last post) I expect that future AGIs will be built out of a heterogeneous mix of processors, but with the bulk being something like a wide-vector processor with a lot of very specialized statistical inference circuitry.
This type of design will still have huge flexibility by having programmability at the network architecture level: it could for example simulate humanish and various types of mammalian brains as well as a whole range of radically different mind architectures, all built out of the same building blocks.
The cortex doesn’t have a specialized vision circuit; there appears to be just one general purpose circuit it uses for everything. The visual regions become visual regions on account of ... processing visual input data.
That seems plausibly correct but we don’t actually know that.
We have pretty good maps of the low-level circuitry in the cortex at this point and it’s clearly built out of a highly repetitive base circuit pattern, similar to how everything is built out of cells at a lower level. I don’t have a single good introductory link, but it’s called the laminar cortical pattern.
Given how much humans rely on vision it isn’t at all implausible that there have been subtle genetic tweaks that make our visual regions more effective in processing visual data (I don’t know the literature in this area at all).
Yes, there are slight variations, but slight is the keyword. The cortex is highly general: the ‘visual’ region develops very differently in blind people, for example, creating entirely different audio-processing networks much more powerful than what most people have.
The flexibility is remarkable: if you hook up electrodes to the tongue that send a rough visual signal from a camera, in time the cortical regions connected to the tongue start becoming rough visual regions, and limited tongue-based vision is the result.
Incorrect, the best factoring algorithms are subexponential.
I stand corrected on prime factorization—I saw the exp(....) part and assumed exponential before reading into it more.
Vision is a good example of this (we have some of the best vision of any mammal.) The rest are tasks which no other entities can do very well, and we don’t have any good reason to think humans are anywhere near good at them in an absolute sense.
This is a good point, but note the huge difference between the abilities or efficiency of an entire human mind vs the efficiency of the brain’s architecture or the efficiency of the lower level components from which it is built—such as the laminar cortical circuit.
I think this discussion started concerning your original point:
It isn’t at all obvious that the brain is even highly efficient at solving AI-type problems, especially given that humans have only needed to solve much of what we consider standard problems for a very short span of evolutionary history (and note that general mammal brain architecture looks very similar to ours).
The cortical algorithm appears to be a pretty powerful and efficient low level building block. In evolutionary terms it has been around for much longer than human brains and naturally we can expect that it is much closer to optimality in the design configuration space in terms of the components it is built from.
As we go up a level to higher level brain architectures that are more recent in evolutionary terms we should expect there to be more room for improvement.
A lot of the tasks that humans have specialized in are not generally bottlenecks for useful computation.
The mammalian cortex is not specialized for particular tasks; this is the primary advantage of its architecture over its predecessors (at the cost of a much larger size than more specialized circuitry).
The mammalian cortex is not specialized for particular tasks; this is the primary advantage of its architecture over its predecessors (at the cost of a much larger size than more specialized circuitry).
How do you reconcile this claim with the fact that some people are faceblind from an early age and never develop the ability to recognize faces? This would suggest that there’s at least one aspect of humans that is normally somewhat hard-wired.
I’ve read a great deal about the cortex, and my immediate reaction to your statement was “no, that’s just not how it works”. (strong priors)
About one minute later on the Prosopagnosia wikipedia article, I find the first reference to this idea (that of congenital Prosopagnosia):
The idea of congenital prosopagnosia appears to be a new theory supported by one researcher and one (?) study:
Dr Jane Whittaker, writing in 1999, described the case of a Mr. C. and referred to other similar cases (De Haan & Campbell, 1991, McConachie, 1976 and Temple, 1992).[7] The reported cases suggest that this form of the disorder may be heritable and much more common than previously thought (about 2.5% of the population may be affected), although this congenital disorder is commonly accompanied by other forms of visual agnosia, and may not be “pure” prosopagnosia
The last part about it being “commonly accompanied by other forms of visual agnosia” gives it away—this is not anything close to what you originally thought/claimed, even if this new research is actually correct.
Known cases of true prosopagnosia are caused by brain damage—what this research is describing is probably a disorder of the higher region (V4 I believe) which typically learns to recognize faces and other complex objects.
However, there is an easy way to cause prosopagnosia during development—prevent the creature from ever seeing faces.
I don’t have the link on hand, but there have been experiments on cats where you mess with their vision, by using grating patterns or carefully controlled visual environments, and you can create cats that literally can’t even see vertical lines.
So even the simplest, most basic thing which nature could hard-code, a vertical line feature detector, actually develops from the same extremely flexible general cortical circuit: the same circuit which can learn to represent everything from sounds to quantum mechanics.
Humans can represent a massive number of faces, and in general the brain’s vast information storage capacity relative to the genome (10^15-ish vs 10^9-ish) more or less requires a generalized learning circuit.
The cortical circuits do basically nothing but fire randomly when you are born—you really are a blank slate in that respect (although obviously the rest of the brain has plenty of genetically fixed functionality).
Of course the arrangement of the brain’s regions with respect to sensory organs and its overall wiring architecture do naturally lead to the familiar specializations of brain regions, but really one should consider this a developmental attractor: information is colonizing each cortex anew, but the similar architecture and similarity of information ensure that two brains end up having largely overlapping colonizations.
How do you reconcile this claim with the fact that some people are faceblind from an early age and never develop the ability to recognize faces? This would suggest that there’s at least one aspect of humans that is normally somewhat hard-wired.
There are all sorts of aspects of humans that are normally somewhat—or nearly entirely—hard-wired. The cortex just doesn’t tend to be. Even the parts of the cortex that are similarly specialised in most humans seem to be so due to what they are connected to. (As can be seen by looking at how the atypical cases have adapted differently.) It would surprise me if the inability to recognise faces was caused by a dysfunction in the cortex specifically.
Disclaimer: I disagree with nearly everything else Jacob has said in this thread. This position specifically appears to be well researched.
This seems to be a good point.
To clarify, subexponential does not mean polynomial, but super-polynomial.
(Interestingly, while factoring a given integer is hard, there is a way to get a random integer within [1..N] and its factorization quickly. See Adam Kalai’s paper Generating Random Factored Numbers, Easily (PDF).)
Interesting. I had not seen that paper before. That’s very cute.
This is mostly irrelevant, but I think complexity theorists use a weird definition of exponential according to which GNFS might still be considered exponential. I know when they say “at most exponential” they mean O(e^(n^k)) rather than O(e^n), so it seems plausible that by “at least exponential” they might mean Omega(e^(n^k)) where now k can be less than 1.
EDIT: Nope, I’m wrong about this. That seems kind of inconsistent.