This got over 800 points on HN. Having a good reply seems important, even if a large portion of the post is scattergun, admittedly and intentionally trying to push a point rather than reason carefully.
The core argument is correct: the reference classes that Superintelligence and AI safety ideas fall into (promise of potential immortality, impending apocalypse, etc.) are full of risks of triggering biases, other sets of ideas in this area don't hold up to scrutiny, and the cluster has other properties that should make you wary. It is entirely reasonable to take this as Bayesian evidence against the ideas. I have run into this as the core reason for rejecting this cluster of beliefs several times, from people with otherwise good reasoning skills.
Given limited time to evaluate claims, I can see how relying on this kind of reference class heuristic seems like a pretty good strategy, especially if you don’t think black swans are something you should try hard to look for.
My reply is that:
1. This only provides some evidence. In particular, there is a one-time update available from being in a suspect reference class, not an endless stream from an experiment you can repeat to gain increasing confidence. Make it clear you have made this update (and actually make it); a toy numeric sketch at the end of this comment illustrates the point.
2. There are enough outside-view signs that it is different from the other members of the suspect reference classes that strongly rejecting it seems unreasonable. Support from a large number of visibly intellectually impressive people is the core thing to point to here (not as an attempt to prove it or argue from authority, just to show it's different from e.g. the 2012 stuff).
3. (Only applicable if you personally have a reasonably strong model of AI safety.) I let people zoom in on my map of the space and attempt to break the ideas with nitpicks. If you don't personally have a clear model, that's fine, but be honest about where your confidence comes from.
To summarize: yes, it pattern-matches to some sketchy things. It also has characteristics they don't, like being unusually appealing to smart, thoughtful people who seem to be trying to seek truth and abandon wrong beliefs. Having a moderately strong prior against it based on this is reasonable, as is having a prior for it, depending on how strongly you weight overtly impressive people publicly supporting it. If you don't want to look into it based on that, fair enough, but I have, and doing so (including looking for criticism) is what brought me to my current credence.
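To make the one-time-update point from item 1 concrete, here is a minimal numeric sketch. All of the numbers (the prior odds and the likelihood ratio for sitting in an apocalypse-shaped reference class) are invented purely for illustration:

```python
# Toy illustration of the "one-time update" point: membership in a
# suspect reference class is a single observation, so its likelihood
# ratio should be applied to your odds exactly once.

def update_odds(prior_odds: float, likelihood_ratio: float) -> float:
    """Bayesian update in odds form: posterior odds = prior odds * LR."""
    return prior_odds * likelihood_ratio

prior_odds = 1.0         # 1:1 odds, i.e. 50% credence the ideas are sound
lr_suspect_class = 0.25  # invented: landing in this reference class is
                         # 4x more likely if the ideas are bogus

odds = update_odds(prior_odds, lr_suspect_class)
print(f"after one update: {odds / (1 + odds):.3f}")  # 0.200

# The mistake being warned against: re-applying the same observation
# as if it were fresh evidence drives credence arbitrarily low.
for _ in range(5):
    odds = update_odds(odds, lr_suspect_class)
print(f"after double-counting: {odds / (1 + odds):.6f}")  # ~0.000244
```

The particular numbers don't matter; the point is that the reference-class observation licenses one multiplication, not a repeatable experiment's worth of them.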
> This got over 800 points on HN. Having a good reply seems important, even if a large portion of the post is scattergun, admittedly and intentionally trying to push a point rather than reason carefully.
The discussion at HN seems mostly critical of it, so it’s not clear to me how much else needs to be added.
> I have run into this as the core reason for rejecting this cluster of beliefs several times, from people with otherwise good reasoning skills.
Sure, but… what can you do to convince someone who doesn’t evaluate arguments? You can’t use the inside view to convince someone else that they should abandon the outside view, because the outside view specifically ignores inside view arguments.
> The discussion at HN seems mostly critical of it, so it’s not clear to me how much else needs to be added.
The memes got spread far and wide. A lot of AI safety people will run into arguments of this general form, and they mostly won’t have read enough of the comments to form a good reply (also, most criticism does not target the heart of the argument, since the other parts are so much weaker, so it will be unconvincing where it’s needed most). Some can come up with a reply to the heart on the fly, but it seems fairly valuable to have this on LW to spread the antibody memes.
> Sure, but… what can you do to convince someone who doesn’t evaluate arguments? You can’t use the inside view to convince someone else that they should abandon the outside view, because the outside view specifically ignores inside view arguments.
Show them outside-view-style arguments? People are bounded agents, and there are a bunch of things in the direction of epistemic learned helplessness which make them not want to load arbitrarily complex arguments into their brain. This should not lead them to reject reference-class comparisons as evidence that it’s worth a closer look, or as a reason not to hold an extreme prior against it (though maybe in actual humans this mostly fails anyway).
Admittedly, this does not have an awesome hit rate for me, maybe 1/4? I am interested in ideas for better replies.
> Show them evidence that is inconsistent with their world view?
That a piece of evidence is consistent or inconsistent with their world view relies on arguments. Remember, standard practice among pundits is to observe evidence, then fit it to their theory, rather than using theory to predict evidence, observing evidence, and then updating. If someone is in the first mode, where’s the step where they notice that they made a wrong prediction?
> Show them how with your view of the world they can predict the world better. Otherwise you are expecting people to get on board with an abstract philosophical argument, which I think people are inured against.
Relatedly, that predictive accuracy is the thing to optimize for relies on arguments.
Pundits are probably not worth bothering with. But I think there are hardcore engineers that would be useful to convince.
I think that Andrew Ng probably optimizes for predictive accuracy (at least he has to whilst creating machine learning systems).
This was his answer, here, to whether AI is an existential threat. I don’t know why he objects to this line of thought, but the things I suggested above would be useful in his case.
> AI has made tremendous progress, and I’m wildly optimistic about building a better society that is embedded up and down with machine intelligence. But AI today is still very limited. Almost all the economic and social value of deep learning is still through supervised learning, which is limited by the amount of suitably formatted (i.e., labeled) data. Even though AI is helping hundreds of millions of people already, and is well poised to help hundreds of millions more, I don’t see any realistic path to AI threatening humanity.
If the theories from MIRI about AI can help him make better machine learning systems, I think he would take note.
I think the fact that the famous people shaping what people think of AI now are not the same people as the ones warning about the dangers is a red flag for people.
> But I think there are hardcore engineers that would be useful to convince.
Sure, because it would be nice if there were 0 instead of 2 prominent ML experts who were unconvinced. But 2 people is not a consensus, and the actual difference of opinion between Ng, LeCun, and everyone else is very small, mostly dealing with emphasis instead of content.
From a survey linked from that article (which that article cherry-picks a single number from… sigh): it looks like there is a disconnect between theorists and practitioners, with theorists more likely to believe in a hard takeoff (theorists give a 15% chance that we will get superintelligence within 2 years of reaching human-level intelligence; practitioners give 5%).
I think you would find nuclear physicists assigning a higher probability to chain reactions pretty quickly once a realistic pathway that released 2 neutrons per fission was shown.
> mostly dealing with emphasis instead of content.
MIRI/FHI has captured the market for worrying about AI. If they are worrying about the wrong things, that could be pretty bad.