I’d like to offer some responses.
[one argument of people who think that the superintelligence alignment problem is incredibly important] is that just as a chimpanzee would be unable to predict what a human intelligence would do or how we would make decisions (aside: how would we know? Were any chimps consulted?), we would be equally inept in the face of a super-intelligence… The human capacity for abstract reasoning over mathematical models is in principle a fully general intelligent behaviour, as the scientific revolution has shown: there is no aspect of the natural world which has remained beyond the reach of human understanding, once a sufficient amount of evidence is available.
Yes, when you wrote this, I had a funny feeling that this argument was much weaker than I had previously thought. I think the relevant points are as follows: the people interested in Superintelligence Alignment tend to use this as an example of agents with a given optimization power being outperformed by agents with greater optimization power, and I think the example is apt. Perhaps you would like to argue that optimization power is not so coherent a concept, because humans’ mathematical modelling and scientific reasoning is qualitatively different from anything a chimpanzee can do, rather than just a proportional increase in this notion of ‘optimization power’. I think that science is a powerful insight, and perhaps there is not another insight of equivalent power, but I don’t think this invalidates the concept of optimization power. You can still see the semblance of a scale on which you place mice, humans, and Superintelligent AI, according to how well each agent can effect its goals in the world. Humans being able to reason about AI does not invalidate this.
Then you go on to say that this is a specific example of a general problem. As to the general problem of reasoning by analogy rather than forming predictive models… I assign a high probability to the statement that most Superintelligence Alignment folk use analogies to communicate intuition to the public, as opposed to those analogies being their main reasons for holding their beliefs.
There’s lip service done to empiricism throughout, but in all the “applied” sequences relating to quantum physics and artificial intelligence it appears to be forgotten. We get instead definitive conclusions drawn from thought experiments only. It is perhaps not surprising that these sequences seem the most controversial.
I think that, between nuanced abstract reasoning and using clear empirical evidence to resolve disputes, humanity is better at the second. I agree that many people on this planet need to learn to do it much better, but there are other places to learn to do that. Eliezer is here trying to write about the more complex problem of using your reasoning to ‘not shoot yourself in the foot’, repeatedly. However, to say he does little more than ‘pay lip service’ to empiricism seems to miss all of his posts that simply explain a particular psychological study.
There is work that could be done now to alleviate both of these issues. But within the LessWrong community there is actually outright hostility to work that has a reasonable chance of alleviating suffering (e.g. artificial general intelligence applied to molecular manufacturing and life-science research) due to concerns arrived at by flawed reasoning.
I think the first point here is that most people in this world are probably hostile to important work due to their flawed epistemologies. Or at least, they have a very flawed set of priorities. Arbitrary political issues dominate discussion. I think that if you were to offer a post with considerable substance, focusing on just this claim, or any of your above claims, most good rationalists would change their minds (‘good rationalists’ here defined as people who have gone out of their way in their life to do EA work, like going to meetings or doing research). You’d get a better chance of changing people’s minds by posting here than on Reddit, I bet.
...much of our uncertainty about the actions of an unfriendly AI could be resolved if we were to know more about how such agents construct their thought models, and relatedly what language was used to construct their goal systems. We could also stand to benefit from knowing more practical information (experimental data) about in what ways AI boxing works and in what ways it does not, and how much that is dependent on the structure of the AI itself. Thankfully there is an institution that is doing that kind of work: the Future of Life Institute (not MIRI).
If you are not familiar with MIRI’s current technical agenda, then you may wish to retract this claim, specifically regarding the point about how agents construct their thought models: this is currently their entire research project, loosely speaking. Figuring out how agents model logical uncertainty in the environment, and how they model the environment with themselves as a subset of it, are two of its main focus areas. If you are familiar with their research, you may wish to flesh out that argument a little more. I personally didn’t think that FLI currently did any research, just that it worked on making the cause better appreciated in the public eye, so I’ll go read up on that at some point.
I will no longer be running the Rationality: From AI to Zombies reading group
Alas. I was enjoying the series.
I hope this comment is taken in good faith; your post suggested you had good reasons behind your claims, and I’d be interested in hearing them.
If you are not familiar with MIRI’s current technical agenda, then you may wish to retract this claim.
I am familiar with MIRI’s technical agenda and I stand by my words. The work MIRI is choosing for itself is self-isolating and not relevant to the problems at hand in practical AGI work.
The work MIRI is choosing for itself is self-isolating
AFAIK, part of why the technical agenda contains the questions it does is that they’re problems that are of interest to mathematicians and logicians even if those people aren’t interested in AI risk. (Though of course, that doesn’t mean that AI researchers would be interested in that work, but it still connects more with the academic community than “self-isolating” would imply.)
AFAIK, part of why the technical agenda contains the questions it does is that they’re problems that are of interest to mathematicians and logicians even if those people aren’t interested in AI risk.
This is concerning if true: the goal of the technical agenda should be to solve AI risk, not to appeal to mathematicians and logicians (by, say, making them feel important).
That sounds like an odd position to me. IMO, getting as many academics from other fields as possible working on the problems is essential if one wants to make maximal progress on them.
The academic field which is most conspicuously missing is artificial intelligence. I agree with Jacob that it is and should be concerning that the machine intelligence research institute has adopted a technical agenda which is non-inclusive of machine intelligence researchers.
I agree with Jacob that it is and should be concerning
That depends on whether you believe that machine intelligence researchers are the people who are currently the most likely to produce valuable progress on the relevant research questions.
One can reasonably disagree with MIRI’s current choices about their research program, but I certainly don’t think that their choices are concerning in the sense of suggesting irrationality on their part. (Rather, the choices only suggest differing empirical beliefs, which are arguable but still well within the range of non-insane beliefs.)
On the contrary, my core thesis is that AI risk advocates are being irrational. It’s implied in the title of the post ;)
Specifically I think they are arriving at their beliefs via philosophical arguments about the nature of intelligence which are severely lacking in empirical data, and then further shooting themselves in the foot by rationalizing reasons to not pursue empirical tests. Taking a belief without evidence, and then refusing to test that belief empirically—I’m willing to call a spade a spade: that is most certainly irrational.
That’s a good summary of your post.
I largely agree, but to be fair we should consider that MIRI started working on AI safety theory long before the technology required for practical experimentation with human-level AGI existed; to do that kind of experimentation, you need to be close to AGI in the first place.
Now that we are getting closer, the argument for prioritizing experiments over theory becomes stronger.
There are many types of academics—does your argument extend to French literature experts?
Clearly, if there is a goal behind the technical agenda, changing the technical agenda to appeal to certain groups detracts from that goal. You could argue that enlisting the help of mathematicians and logicians is so important it justifies changing the agenda … but I doubt there is much historical support for such a strategy.
I suspect part of the problem is that the types of researchers/academics which could most help (machine learning, statistics, comp sci types) are far too valuable to industry and thus are too expensive for non-profits such as MIRI.
There are many types of academics—does your argument extend to French literature experts?
Well, if MIRI happened to know of technical problems they thought were relevant for AI safety and which they thought French literature experts could usefully contribute to, sure.
I’m not suggesting that they would have taken otherwise uninteresting problems and written those up simply because they might be of interest to mathematicians. Rather, my understanding is that they had a set of problems that seemed about equally important, and then from that set, used “which ones could we best recruit outsiders to help with” as an additional criterion. (Though I wasn’t there, so anything I say about this is at best a combination of hearsay and informed speculation.)