Alice: So… you’re saying that the reasoning doesn’t actually make sense to you?
Bob: I guess so.
Alice: Wait, so then why would you believe in this sea monster stuff?
I kind of think Alice is actually correct here? If you have time to evaluate the actual arguments, you pick the ones that make the most sense, taking into account all the empirical evidence you have. If you don’t have time, you pick the one that’s made the best predictions so far. And if there’s insufficient data for even that, you go with the “default mainstream consensus” position. If the reasoning about AI ruin didn’t make sense to me, I wouldn’t believe it anyway; I’d believe something like “predicting the outcome of a future technology is nearly impossible. Though we can’t rule out human extinction from AI, it’s merely one of a large number of possibilities.”
To steelman what Bob’s saying here: maybe he can’t understand the detailed mechanisms that make machine learning systems hard to point at subtle goals, but he can verify that empirically by observing how they sometimes misbehave. And he can empirically observe ML systems getting progressively more powerful over time. So he can understand the argument that if the systems keep getting more powerful, as they have in the past, eventually the fact that we can’t point them at subtle things like “not ruining everything humans care about” becomes a problem. That would be fine; you don’t have to understand every last detail if you can check the predictions of the parts you don’t understand. But if Bob’s just like, “I dunno, these Less Wrong guys seem smart”, that seems like a bad way to form beliefs.
I disagree. I think it’s hard to talk about this in the abstract, so how about I propose a few different concrete situations?
I don’t know much about tennis. Suppose Bill is playing Joe. I spend some time watching Bill train and then some more time watching Joe train. Bill looks a bit better to me — a stronger swing and whatnot — but 80% of the experts think Joe would win. In this situation, I would adopt the belief of the experts and predict that Joe will win. My instinct that Bill was better counts for something, but it isn’t stronger than the evidence of knowing that 80% of experts think Joe would win.
I know a lot about basketball. Suppose the Aces are playing the Spades. It’s similar to Bill vs. Joe: the Aces look better to me when I watch them practice, but 80% of experts think the Spades are better and would win. Since I know more about basketball than tennis, I’d lean closer to the Aces than I did to Bill, but ultimately I don’t put much weight behind my feelings.
I’m one of two developers working in a codebase that we built from the ground up at work. I feel good about my skill level and ability here. Suppose there’s a question of how to write the code for a certain feature: approach A or approach B. To me, A seems better by a fair margin. We run it by a few expert programmers, but those experts don’t have much context; they just get a quick two-minute overview of the situation. The experts think B is better. Here, I think I’d still lean towards A over B, despite that going against what the experts think. Given that they don’t have the full context, I don’t place too much weight behind what they say.

How do you feel about each of those situations?
It seems like hypothetical-you is making a reasonable decision in all of those situations. I guess my point was that those of us who worry about AI ruin don’t necessarily get to be seen as the group of experts who are calling these sports matches. Maybe the person most analogous to the tennis expert predicting that Joe will win the match is Yann LeCun, not Eliezer Yudkowsky.
Consider two tennis pundits, one of whom has won prestigious awards for tennis punditry and is currently employed by a large TV network to comment on matches. The other is a somewhat popular tennis YouTuber. If these two disagree about who’s going to win the match, with the famous TV commentator favouring Joe and the YouTuber favouring Bill, a total tennis outsider would probably do best by going with the opinion of the famous guy. In order to figure out that you should instead go with the opinion of the YouTuber, you’d need to know at least a little bit about tennis, enough to determine that the YouTuber is actually more knowledgeable.
Gotcha, that makes sense. It sounds like we agree that the main question is who the experts are and how much weight one should give to each of them. It also sounds like we disagree about the answer to that question: I give a lot of weight to Yudkowsky and adjacent people, whereas it sounds like you don’t.
You said, “a total tennis outsider would probably do best by going with the opinion of the famous guy.”
Agreed. But I think the AI safety situation is more analogous to a case where the YouTuber has studied some specific aspect of tennis very extensively, one that traditional tennis players don’t really pay much attention to. Even so, it still might be difficult for an outsider to place much weight behind the YouTuber. It’s probably pretty important that they can judge for themselves that the YouTuber is smart and qualified and so on. It also helps that the YouTuber has received support and funding and endorsements from various high-prestige people.
You said, “I give a lot of weight to Yudkowsky and adjacent people, whereas it sounds like you don’t.”
To be clear, I do give a lot of weight to Yudkowsky in the sense that I think his arguments make sense and I mostly believe them. Similarly, I don’t give much weight to Yann LeCun on this topic. But that’s because I can read what Yudkowsky has said and what LeCun has said and think about whether it makes sense. If I didn’t speak English, so that their words appeared as meaningless noise to me, then I’d be much more uncertain about who to trust, and would probably defer to an average of the opinions of the top ML names, e.g. Sutskever, Goodfellow, Hinton, LeCun, Karpathy, Bengio, etc. The fact that Yudkowsky and Christiano have closely studied a specific aspect of AI (namely alignment) would probably get their names onto that list, but it wouldn’t necessarily give Yudkowsky more weight than everyone else combined. (I’m guessing here on behalf of a hypothetical non-English-speaking me who somehow has translations of everyone’s bottom-line position on the topic, but not of their arguments. Basically, the intuition is that difficult technical achievements like AlexNet, GANs, etc. are some of the easiest things to verify from the outside. It’s hard to tell which philosopher is right, but easy to tell which scientist can build a thing for you that will automatically generate amusing new animal pictures.)
You said you’d “probably defer to an average of the opinions of the top ML names.”
Do you think it’d make sense to give more weight to people in the field of AI safety than to people in the field of AI more broadly?
I would, and I think it’s something that generally makes sense. I.e., I don’t know much about food science, but on a question involving dairy, I’d trust food scientists who specialize in dairy more than I would trust food scientists in general. But I would give some weight to non-specialist food scientists, as well as to chemists in general, as well as to physical scientists in general, with the weight decreasing as the person gets less specialized.
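To make that concrete, here’s a toy sketch of the kind of specialization-weighted averaging I have in mind. The groups and the numbers are made up purely for illustration; they aren’t an actual survey of anyone’s views.

```python
# Toy sketch of specialization-weighted deference (all numbers are hypothetical).
# Each entry is (credence in some dairy-related claim, weight), with the weight
# shrinking as the group gets less specialized in the relevant sub-field.
expert_groups = {
    "dairy scientists":    (0.80, 1.0),  # most specialized, most weight
    "food scientists":     (0.65, 0.5),
    "chemists":            (0.55, 0.2),
    "physical scientists": (0.50, 0.1),  # least specialized, least weight
}

def weighted_credence(groups):
    """Weighted average of each group's credence."""
    total_weight = sum(w for _, w in groups.values())
    return sum(p * w for p, w in groups.values()) / total_weight

print(f"aggregate credence: {weighted_credence(expert_groups):.2f}")  # ~0.71
```

The point of the sketch is just that the specialists dominate the aggregate without the less specialized groups’ opinions being ignored entirely.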