Yeah, I agree that I am doing reasoning about people’s motivations here, which is iffy, and given the pushback I will be a bit more hesitant to do so, but also, in this case reasoning about people’s motivations is really important, because what I care about is what the people working at OpenAI will actually do when they have extremely powerful AI in their hands, and that will depend a bunch on their motivations.
I am honestly a bit surprised to see that WebGPT was as much driven by people who I do know reasonably well and who seem to be driven primarily by safety concerns, since the case for it strikes me as so weak and the risk seems somewhat obviously high, so I am still trying to process that and will probably make some kind of underlying update.
I do think overall I’ve had much better success at predicting the actions of the vast majority of people at OpenAI, including a lot of safety work, by thinking of them as being motivated by doing cool capability things, sometimes with a thin safety veneer on top, rather than being motivated primarily by safety. For example, I currently think that the release strategy for OpenAI’s GPT models is much better explained by OpenAI wanting a moat around their language model product than by safety concerns. I spent many hours trying to puzzle over why they chose this release strategy, and ultimately concluded that the motivation was primarily financial/competitive-advantage related, and not related to safety (despite people at OpenAI claiming otherwise).
I also overall agree that trying to analyze people’s motivations is kind of fraught and difficult, but I also feel pretty strongly that it’s now been many years during which people have been trying to tell a story of OpenAI leadership being motivated by safety, with very little action to actually back that up (and a massive amount of harm in terms of capability gains), and I do want to be transparent that I no longer really believe the stated intentions of many people working there.