1. Past experience has shown that even when particular AI risk arguments don’t apply, an AI design is often still risky; we just haven’t thought of the reasons why yet. So we should make a pessimistic meta-induction and conclude that even if our standard arguments for risk don’t apply, the system might still be risky, and we should think more about it.
I’ve heard this sentiment before, but I’m not aware of a standard reference supporting this claim (let me know if there’s something I’m not remembering), and I haven’t been totally satisfied when I’ve probed people on it in the past.
I agree we should think a lot because so much is at stake, but sometimes the fact that so much is at stake means that it’s better to act quickly.
People are great at rationalizing, coming up with reasons to get to the conclusion they wanted. If the conclusion they want is “We finally did it and made a super powerful impressive AI, come on come on let’s take it for a spin!” then it’ll be easy to fool yourself into thinking your architecture is different enough not to be problematic, even when your architecture is just a special case of the architecture in the standard arguments.
Agreed, I just don’t want people to fall into the trap of rationalizing the opposite conclusion either.
I’m not operating under the assumption that I know more about the AI system someone is creating than the person who’s creating it knows. The fact that you said this dismays me, because it is such an obvious straw man. It makes me wonder if I touched a nerve somehow, or had the wrong tone or something, to raise your hackles.
It did. Part of me thought it was better not to comment, but then I figured the entire point of the post was how to do outreach to people we don’t agree with, so I decided it was better to express my frustration.
5. I agree that this is a possibility. This is why I said “say it buys us a month;” I meant that to be an average of the various possibilities. In retrospect I was unclear; I should have clarified that it might not be a good idea to delay at all, for the reasons you mention. I agree we have to learn more about the situation; in retrospect I shouldn’t have said “I think it would be better for these conversations to end X way” (even though that is what I think is most likely) but rather found some way to express the more nuanced position.
Thanks for clarifying.
Well said. I’m glad you spoke up. Yeah, I don’t want people to rationalize their way into thinking AI should never be developed or released either. Currently I think people are much more likely to make the opposite error, but I agree both errors are worth watching out for.
I don’t know of a standard reference for that claim either. Here is what I’d say in defense of it:
--AIXItl was a serious proposal for an “ideal” intelligent agent. I heard the people who came up with it took convincing, but eventually agreed that yes, AIXItl would seize control of its reward function and kill all humans.
--People proposed Oracle AI, thinking that it would be safe. Now AFAICT people mostly agree that there are various dangers associated with Oracle AI as well.
--People sometimes said that AI risk arguments were founded on these ideal models of AI as utility maximizers or something, and that they wouldn’t apply to modern ML systems. Well, now we have arguments for why modern ML systems are potentially dangerous too. (Whether these are the same arguments rephrased, or new arguments, is not relevant for this point.)
--In my personal experience at least, I keep discovering entirely new ways that AI designs could fail, which I hadn’t thought of before. For example, Paul’s “The Universal Prior is Malign.” Or oracles outputting self-fulfilling prophecies. Or some false philosophical view on consciousness or something being baked into the AI. This makes me think maybe there are more which I haven’t yet thought of.