(I am not satisfied with the state of discourse on this question (to be specific, I don’t think MIRI proponents have adequately addressed concerns like those expressed by Wei Dai here and elsewhere), so I don’t want to be seen as endorsing what might naively seem to be the immediate policy-relevant implications of this argument, but:) Bla bla philosophical problems once solved are no longer considered philosophical bla bla [this part of the argument is repeated ad nauseam and is universally overstated], and then this. Steve’s comment also links to quite similar arguments made by Luke Muehlhauser, as well as to one of Wei Dai’s previous posts closely related to this question.
Wei Dai, I noticed on the MIRI website that you’re slated to appear at some future MIRI workshop. I find this a little strange: given your reservations, aren’t you worried about throwing fuel on the fire?
Thanks for the link. I don’t think I’ve seen that comment before. Steve raises the examples of Bayesian decision theory and Solomonoff induction to support his position, but to me both of these are examples of philosophical ideas that looked really good at some point but then turned out to be incomplete / not quite right. If the FAI team comes up with new ideas that are in the same reference class as Bayesian decision theory and Solomonoff induction, then I don’t know how they can gain enough confidence that those ideas can be the last words in their respective subjects.
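To make “incomplete / not quite right” a bit more concrete (this is my own illustrative gloss, not something from Steve’s comment): the Solomonoff prior assigns to a finite observation string x the probability M(x) = \sum_{p : U(p) = x*} 2^{-\ell(p)}, where U is a fixed universal prefix machine, the sum ranges over programs p whose output begins with x, and \ell(p) is the length of p in bits. On paper this looks like a final answer to the problem of induction, but it is uncomputable, it depends on the choice of U, and it treats the predictor as standing outside the environment it is predicting, which is roughly the sense in which I mean that such ideas turned out not to be the last words on their subjects.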
Well, I’m human, which means I have multiple conflicting motivations. I’m going because I’m really curious what direction the participants will take decision theory in.
I don’t see why that would be strange. Maybe Wei Dai thinks he can improve the way MIRI handles their FAI endgame from the inside, or that helping MIRI make progress on decision theory will not make MIRI more likely to screw up and develop UFAI, or both.
Showing up at a workshop is probably a good way to check the general sanity level/technical proficiency of MIRI. I would anticipate having a largish update (although I don’t know in which direction, obviously).
That’s not my main motivation for attending. I think I already have a rough idea of their sanity level/technical proficiency from online discussions and their technical writings, so I’m not anticipating a particularly large update. (I also attended a SIAI decision theory workshop a few years ago and met some SIAI people at that time.)
I do wonder why MIRI people often do not respond to my criticisms of their strategy. For example, the only MIRI-affiliated person who has responded to this post so far is Paul Christiano (but given his disagreements with Eliezer, he isn’t actually part of my intended audience for this post). The upcoming workshop might be a good opportunity to see if I can get MIRI people to take my concerns more seriously by talking to them face to face. If you or anyone else has ideas on what else I should try, please let me know.
Speaking for myself...
Explaining strategic choices, and replying to criticisms, takes enormous amounts of time. For example, Nick Bostrom set out to explain what MIRI/FHI insiders might consider to be “10% of the basics about AI risk” in a clear and organized way, and by the time he’s done with the Superintelligence book it will have taken him something like 2.5 years of work just to do that, with hundreds of hours of help from other people — and he was already an incredibly smart, productive academic writer who had a strong comparative advantage writing exactly that book. It would’ve taken me, or Carl, or anybody else besides Nick a lot more time and effort to write that book at a similar level of quality.
Which of your many discussion threads on AI risk strategy do you most wish would be engaged further by somebody on staff at MIRI?
This seems like a generic excuse you’ve developed, and it’s not a bad one to use when waving off random comments from people who have little idea what they’re talking about. But my particular arguments already share most of the same assumptions as MIRI, with each post focusing only on one or two key points of disagreement. If it’s not worth your time to reply to my criticisms, then I don’t see whose criticisms you could possibly find it worthwhile to respond to.
I was going to say “this post,” but now that Eliezer has responded, I’m satisfied with MIRI’s level of engagement (assuming he doesn’t abruptly disappear at some point, as he has occasionally done in our previous discussions).
Looking at my other top-level posts, I’d be interested to know what MIRI thinks about this and this.
My initial replies are here and here.
Which other writings of yours would you most like at least an initial reply to? Or, if there were discussions that were dropped by the MIRI party too soon (from your perspective), I could try to continue them, at least from my own perspective.
Thanks, I don’t currently have a list of “writings that I wish MIRI would respond to”, but I’ll certainly keep your offer in mind in the future.
It’s also enormously important.
Personally, I didn’t respond to this post because my reaction to it was mostly “yes, this is a problem, but I don’t see a way by which talking about it will help at this point; we’ll just have to wait and see”. In other words, I feel that MIRI will just have to experiment with a lot of different strategies and see which ones look promising; maybe that experimentation will reveal a way to solve issues like this one, or maybe MIRI will end up pursuing an entirely different strategy. But I expect that we’ll actually have to try out the different strategies before we can know.
I’m not sure what kind of strategies you are referring to. Can you give some examples of strategies that you think MIRI should experiment with?
For instance, MIRI’s 2013 strategy mostly involves making math progress and trying to get mathematicians in academia interested in these kinds of problems, which is a different approach from the “small FAI team” one that you focus on in your post. As another kind of approach, the considerations outlined in AGI Impact Experts and Friendly AI Experts would suggest a program of generally training people with expertise in AI safety questions, in order to have safety experts involved in many different AI projects. There have also been various proposals about eventually pushing for regulation of AI, though MIRI’s comparative advantage is probably more on the side of technical research.
I thought “making math progress and trying to get mathematicians in academia interested in these kinds of problems” was intended to be preparation for eventually doing the “small FAI team” approach, by 1) enlarging the talent pool that MIRI can eventually hire from, and 2) offloading the subset of problems that Eliezer thinks are safe onto the academic community. If “small FAI team” is not a good idea, then I don’t see what purpose “making math progress and trying to get mathematicians in academia interested in these kinds of problems” serves, or how experimenting with it is useful. The experiment could be very “successful” in making lots of math progress and getting a lot of mathematicians interested, but that doesn’t help with the endgame problem that I point out in the OP.
Generally training people with expertise in AI safety questions and pushing for regulation of AI both sound good to me, and I’d be happy to see MIRI try them. You could consider my post as an argument for redirecting resources away from preparing for “small FAI team” and into such experiments.
Yes, I believe that is indeed the intention, but it’s worth noting that the things MIRI is currently doing really allow them to pursue either strategy in the future. So if they give up on the “small FAI team” strategy because it turns out to be too hard, they may still pursue the “big academic research” strategy, based on the information collected at this and other steps.
“Small FAI team” might turn out to be a bad idea because the problem is too difficult for a small team to solve alone. In that case, it may be useful to actually offload most of the problems to a broader academic community. Of course, this may or may not be safe, but there may come a time when it turns out that it is the least risky alternative.
I think “big academic research” is almost certainly not safe, for reasons similar to my argument to Paul here. There are people who do not care about AI safety, due to short planning horizons or because they think they have simple, easy-to-transmit values, and who will deploy the results of such research before the AI safety work is complete.
This would be a fine argument if there weren’t immediate downsides to what MIRI is currently doing, namely shortening AI timelines and making it harder to create a singleton (or get significant human intelligence enhancement, which could help somewhat in the absence of a singleton) before AGI work starts ramping up.
To be clear, based on what I’ve seen you write elsewhere, you think they are shortening AI timelines because the mathematical work on reflection and decision theory would be useful for AIs in general and is not specific to the problem of friendliness. Is that right?
This isn’t obvious to me. In particular, the reflection work seems much more relevant to creating stable goal structures than to engineering intelligence / optimization power.
I’m not sure why you’re wondering, when in the history of MIRI and its predecessor they’ve only ever responded to about two criticisms of their strategy.