It did. Part of me thought it was better not to comment, but then I figured the entire point of the post was how to do outreach to people we don’t agree with, so I decided it was better to express my frustration.
Well said. I’m glad you spoke up. Yeah, I don’t want people to rationalize their way into thinking AI should never be developed or released either. Currently I think people are much more likely to make the opposite error, but I agree both errors are worth watching out for.
I don’t know of a standard reference for that claim either. Here is what I’d say in defense of it:
--AIXItl was a serious proposal for an “ideal” intelligent agent. I’ve heard that the people who came up with it took some convincing, but eventually agreed that yes, AIXItl would seize control of its reward function and kill all humans.
--People proposed Oracle AI, thinking that it would be safe. Now AFAICT people mostly agree that there are various dangers associated with Oracle AI as well.
--People sometimes said that AI risk arguments were founded on these ideal models of AI as utility maximizers or something, and that they wouldn’t apply to modern ML systems. Well, now we have arguments for why modern ML systems are potentially dangerous too. (Whether these are the same arguments rephrased, or new arguments, is not relevant for this point.)
--In my personal experience at least, I keep discovering entirely new ways that AI designs could fail, which I hadn’t thought of before. For example, Paul’s “The Universal Prior is Malign.” Or oracles outputting self-fulfilling prophecies. Or some false philosophical view on consciousness or something being baked into the AI. This makes me think there are probably more failure modes I haven’t yet thought of.