One really does wonder whether the ongoing collapse of American finance, the systemic underestimation of risk, and the overconfidence in being able to NEGOTIATE risk in the face of enormous complexity should figure into these conversations more than a couple of sarcastic posts about short selling.
Aron
Of course you can make an inference about the evidenced skill of the scientists. Scientist 1 was capable of picking, out of the large set of models that covered the first 10 variables, the considerably smaller set of models that also covered the second 10. He did that by reference to principles and knowledge he brought to the table about the nature of inference and the problem domain. The second scientist has not shown any of this capability. I think our prior expectation for the skill of the scientists would be irrelevant, assuming that the prior was at least equal for both of them.
Peter: “The first theorist had less data to work with, and so had less data available to insert into the theory as parameters. This is evidence that the first theory will be smaller than the second theory”
The data are not equivalent to the model parameters. A linear prediction model of [PREDICT_VALUE = CONSTANT * DATA_POINT_SEQUENCE_NUMBER] can model an infinite number of data points. Adding more data points does not increase the number of model parameters. If there is a model that predicts 10 variables, and subsequently predicts another 10, there is no reason to add complexity unless one prefers complexity.
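To make that concrete, here is a toy sketch (made-up numbers, plain NumPy, not the thought experiment's actual data): the model below has exactly one parameter no matter how many data points it is asked to cover.

```python
import numpy as np

# Hypothetical data generated by y = 3 * sequence_number.
first_10 = 3.0 * np.arange(1, 11)      # points 1..10
next_10 = 3.0 * np.arange(11, 21)      # points 11..20, revealed later

# Fit the single parameter CONSTANT by least squares on the first 10 points.
idx = np.arange(1, 11)
constant = (idx @ first_10) / (idx @ idx)   # still exactly one number

# The same one-parameter model covers the next 10 points with nothing added.
pred_next = constant * np.arange(11, 21)
print(constant)                          # 3.0
print(np.allclose(pred_next, next_10))   # True
```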
So reviewing the other comments now I see that I am essentially in agreement with M@ (on David’s blog) who posted prior to Eli. Therefore, Eli disagrees with that. Count me curious.
There are an infinite number of models that can predict 10 variables, or 20 for that matter. The only plausible way for scientist A to pick the right model out of the infinitely many possible ones is to bring prior knowledge to the table about the nature of that model and the data. This is also true of the second scientist, just to a lesser degree.
Therefore, scientist A has demonstrated a higher probability of having valuable prior knowledge.
I don’t think there is much more to this than that. If the two scientists have equal knowledge, there is no reason the second model need be more complicated than the first, since the first already fully described the additional data revealed to the second.
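To put a number on "no reason to add complexity", here is a hedged toy comparison (again made-up data): a nine-parameter polynomial fits the first 10 points at least as well as the simple line, but only the simple model keeps describing the points revealed later.

```python
import numpy as np

rng = np.random.default_rng(0)
i = np.arange(1, 11)
y_first = 3.0 * i + rng.normal(0, 0.1, 10)   # the first 10 (slightly noisy) values
i_next = np.arange(11, 21)
y_next = 3.0 * i_next                        # the 10 values revealed later

# Scientist A's choice: a simple linear model fit to the first 10 points.
line = np.polyfit(i, y_first, 1)

# A needlessly complex alternative: a degree-9 polynomial that also nails the
# first 10 points (NumPy may warn that it is poorly conditioned -- fittingly).
wiggly = np.polyfit(i, y_first, 9)

print(np.max(np.abs(np.polyval(line, i_next) - y_next)))    # small: the line extrapolates
print(np.max(np.abs(np.polyval(wiggly, i_next) - y_next)))  # typically enormous off the fitted range
```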
If it was the same scientist with both sets of data then you would pick the second model.
I agree there should be a strong prior belief that anyone pursuing AGI at our current level of overall human knowledge is likely quite ordinary, or is at least failing to draw reasonably obvious conclusions.
Let me give a shout-out to my 1:50 peeps! I can’t even summarize what EY has notably accomplished, beyond highlighting how much more likely he is to accomplish something. All I really want is for Google to stop returning pages that are obviously unhelpful to me, or for a machine to disentangle how the genetic code works, or a system that can give absolutely top-notch medical advice, or something better than the bumbling jackasses [choose any] that manage to make policy in our country. Give me one of those things and you will be one in a million, baby.
It would seem you have to take away pure rationality, and add natural selection, before seeing the emergence of the decision-making standards of humanity!
It’s likely deliberate that prisoners were selected in the visualization to imply a relative lack of unselfish motivations.
“You want to hack evolution’s spaghetti code? Good luck with that. Let us know if you get FDA approval.”
I think I’ve seen Eli make this same point. How can you be certain at this point, when we are nowhere near achieving it, that AI won’t be in the same league of complexity as the spaghetti brain? I would admit that there are likely artifacts of the brain that are unnecessarily kludgy (or plain irrelevant), but not necessarily in a manner that excessively obfuscates the primary design. It’s always tempting for programmers to want to throw away a huge tangled code set when they first have to start working on it, but it is almost never the right approach.
I expect advances in understanding how to build intelligence to serve as the groundwork for hypotheses about how the brain functions, and vice versa.
On the friendliness issue, isn’t the primary logical way to avoid problems to create a network of competitive systems and goals? If one system wants to tile the universe with smileys, that is almost certainly going to get in the way of the goal sets of the millions of other intelligences out there. They logically should then see value in reporting or acting upon their belief that a rival AI is making their jobs harder. I’d be surprised if humans don’t have half their cognitive power devoted to anticipating and manipulating their expectations of rivals’ actions.
Breaks are good for perspective. Have a margarita. Those are good for a single tiled smiley. I look forward to the talk on AIXI.
intelligence:compression::friendliness:???.
What can a human do to a superhuman AI that a human + infrahuman AI can’t do?
“This is why, when you encounter the AGI wannabe who hasn’t planned out a whole technical approach to FAI, and confront them with the problem for the first time, and they say, “Oh, we’ll test it to make sure that doesn’t happen, and if any problem like that turns up we’ll correct it, now let me get back to the part of the problem that really interests me,” know then that this one has not yet leveled up high enough to have interesting opinions.”
There is an overwhelming assumption here of a Terminator-series hard takeoff. The plausible reality, IMO, seems to be more like an ecosystem of groups of intelligences of varying degrees, all of which will likely have survival rationale for disallowing a peer to hit nutso escape velocity. And at any rate, someone in 2025 with an infrahuman intelligent AI is likely to be much better off at solving the 100th meta-derivative of these toy problems than someone working with 200 Hz neurons alone. Now I gotta go, I think Windows needs to reboot or something...
I blinked and almost missed this.
Eliezer, I suspect that was rhetorical. However... top algorithms that avoid overtraining can benefit from adding model parameters (though with massively decreasing returns to scale). There are top-tier Monte Carlo algorithms that take weeks to converge, and if you gave them years and more parameters they’d do better (if only slightly). It may ultimately prove to be a non-zero advantage for those who have the algorithmic expertise and the hardware advantage, particularly in a contest where people are fighting for very small quantitative differences. I mentioned this for Dan’s benefit and didn’t intend to connect it directly to strong AI.
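To illustrate what I mean by decreasing returns, here is a toy sketch (fabricated regression data, nothing from the actual contest): with a ridge penalty standing in for “avoiding overtraining,” adding parameters keeps buying accuracy, just less and less of it.

```python
import numpy as np

# Made-up regression problem: signal strength fades across the feature list.
rng = np.random.default_rng(1)
n_train, n_valid, n_feat = 1000, 1000, 50
X = rng.normal(size=(n_train + n_valid, n_feat))
true_w = rng.normal(size=n_feat) / np.arange(1, n_feat + 1)
y = X @ true_w + rng.normal(scale=0.5, size=n_train + n_valid)
Xtr, ytr, Xva, yva = X[:n_train], y[:n_train], X[n_train:], y[n_train:]

lam = 1.0  # a ridge penalty stands in for "avoiding overtraining"
for k in (5, 10, 25, 50):  # number of parameters the model is allowed to use
    A = Xtr[:, :k]
    w = np.linalg.solve(A.T @ A + lam * np.eye(k), A.T @ ytr)
    rmse = np.sqrt(np.mean((Xva[:, :k] @ w - yva) ** 2))
    # accuracy keeps improving, but each extra batch of parameters buys less
    print(k, round(rmse, 3))
```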
I’m not imagining a scenario where someone in a lab is handed a computer that runs at 1 exaflop and this person throws a stacked RBM on there and then finally has a friend. However, I am encouraged by the steps that Nvidia and AMD have taken towards scientific computing, and Intel (though behind) is simultaneously headed in the same direction. Suddenly we may have a situation where, for commodity prices, applications can be built that do phenomenally interesting things in video and audio processing (and other areas I’m unaware of). These applications aren’t semantic powerhouses of abstraction, but they are undeniably more AI-like than what came before, utilizing statistical inference and deep parallelization. Along the way we learn the nuts-and-bolts engineering of how to distribute work among different hardware architectures, code in parallel, develop reusable libraries and frameworks, etc.
If we take for granted that strong AI is so fricking hard we can’t get there in one step, we have to start looking at what steps we can take today that are productive. That’s what I’d really love to see your brain examine: the logical path to take. If we find a killer application today along the lines above, then we’ll have a lot more people talking about activation functions and log probabilities. In contrast, the progress of hardware from 2001-2006 was pretty disappointing (to me at least) outside of the graphics domain.
Dan, I’ve implemented RBMs and assorted statistical machine learning algorithms in the context of the Netflix Prize. I’ve also recently adapted some of these to work on Nvidia cards via their CUDA platform. Performance improvements have been 20-100x, and this is hardware that has only taken a few steps away from pure graphics specialization. Fine-grained parallelization, improved memory bandwidth, less chip logic devoted to branch prediction, user-controlled shared memory, etc. all help.
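For anyone wondering what “implementing an RBM” boils down to, here is a bare-bones CD-1 training step in plain NumPy. It’s a sketch with made-up layer sizes and learning rate, not my Netflix or CUDA code.

```python
import numpy as np

rng = np.random.default_rng(0)

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

# Tiny binary RBM: 6 visible units, 4 hidden units (hypothetical sizes).
n_vis, n_hid, lr = 6, 4, 0.1
W = 0.01 * rng.normal(size=(n_vis, n_hid))
b_vis = np.zeros(n_vis)
b_hid = np.zeros(n_hid)

def cd1_update(v0):
    """One contrastive-divergence (CD-1) step on a batch of visible vectors."""
    global W, b_vis, b_hid
    # Positive phase: hidden activations given the data.
    h0_prob = sigmoid(v0 @ W + b_hid)
    h0 = (rng.random(h0_prob.shape) < h0_prob).astype(float)
    # Negative phase: one Gibbs step back down to the visible layer and up again.
    v1_prob = sigmoid(h0 @ W.T + b_vis)
    h1_prob = sigmoid(v1_prob @ W + b_hid)
    # Parameter updates from the difference of data and reconstruction correlations.
    batch = v0.shape[0]
    W     += lr * (v0.T @ h0_prob - v1_prob.T @ h1_prob) / batch
    b_vis += lr * (v0 - v1_prob).mean(axis=0)
    b_hid += lr * (h0_prob - h1_prob).mean(axis=0)

# Train on a toy batch of binary vectors.
data = (rng.random((20, n_vis)) < 0.3).astype(float)
for _ in range(100):
    cd1_update(data)
print(sigmoid(data @ W + b_hid)[0])   # hidden-unit probabilities for the first example
```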
I’m seeing a lot of interesting applications in multimedia processing, many of which have statistical learning elements. One project at SIGGRAPH allowed users to modify a single frame of video and have that modification automatically propagate across the entire video. Magic stuff. If we are heading towards hardware that is closer to what we’d expect as the proper substrate for AI, and we are finding commercial applications that promote this development, then I think we are building towards this fricking hard problem the only way possible: in small steps. It’s not the conjugate gradient, but we’ll get there.
Eliezer, do you work on coding AI? What is the ideal project that intersects practical value and progress towards AGI? How constrained is the pursuit of AGI by a lack of hardware optimized for its general requirements? I’d love to hear more nuts-and-bolts stuff.
Well observed, Seinfeldski.
re: Debugging is twice as hard as writing the code in the first place.
Yeah—still not a fan of this quote. It relies on ‘writing the code in the first place’ meaning something of any importance. I could say that my 12-line strong AI program is written and just needs to be debugged a little... given these terms.
Sounds more like an entropy argument, where incorrect code is much more probable than correct code, and while our caffeine- and Cool Ranch Dorito-fueled minds may get into the neighborhood of order, it only comes at great cost: every bug we move out of the code ends up munching in the Dorito bag we left under the table several weeks before.
It seems the fundamental point, or a better way to define ‘writing code’, is to say that you can write code that works for some inputs/outputs much more easily than code that works for ALL inputs/outputs. Therefore, Skilling et al. can create a house of cards that works for some time, but once reality random-walks to something unanticipated, someone is left shuffling out the door.
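A trivial, purely hypothetical illustration of “works for some inputs” versus “works for all inputs”:

```python
# A hypothetical helper that passed every test anyone bothered to write...
def annualize(monthly_returns):
    avg = sum(monthly_returns) / len(monthly_returns)
    return (1 + avg) ** 12 - 1

print(annualize([0.01, 0.02, 0.015]))    # fine on the anticipated inputs

try:
    print(annualize([]))                 # reality random-walks somewhere unanticipated
except ZeroDivisionError:
    print("nobody wrote the code for this case")
```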
The debugging quote is silly. Unless I missed the point, it is considerably clearer and just as instructive to say ‘it’s easier to write broken code than correct code; therefore, if broken code is the best you can do...’
Not overwhelmingly on-topic?
So we have a small minority of financial wizards and their supporting frameworks convincing everyone that they can take an overwhelmingly, inhumanly complex system and quantify the risk of ALL scenarios. Then, when this is proven out as hubris, the broader system exhibits cascading failures that impact direct and non-direct participants. The leaders given vast resources could essentially flip a coin on the post-hoc solution, biased by the proximity of their next election.
So yes, MBSs aren’t active agents with goals, but their caretakers with profit-maximizing motives are. Should we have better engineered the macro-system or the mortgage-backed securities?
Maybe we need a Friendly Mortgage Securitization project.