I imagine that publishing any X-risk-related safety work draws attention to the whole X-risk thing, which is something OpenAI in particular (and the other labs as well to a degree) have been working hard to avoid doing. This doesn’t explain why they don’t publish mundane safety work though, and in fact it would predict more mundane publishing as part of their obfuscation strategy.
i have never experienced pushback when publishing research that draws attention to xrisk. it’s more that people are not incentivized to work on xrisk research in the first place. also, for mundane safety work, my guess is that modern openai just values shipping things into prod a lot more than writing papers.
(I did experience this at OpenAI in a few different projects and contexts unfortunately. I’m glad that Leo isn’t experiencing it and that he continues to be there)
I acknowledge that I probably have an unusual experience among people working on xrisk things at openai. From what I’ve heard from other people I trust, there probably have been a bunch of cases where someone was genuinely blocked from publishing something about xrisk, and I just happen to have gotten lucky so far.
it’s also worth noting that I am far out in the tails of the distribution of people willing to ignore incentive gradients if I believe it’s correct not to follow them. (I’ve gotten somewhat more pragmatic about this over time, because sometimes not following the gradient is just dumb. and as a human being it’s impossible not to care a little bit about status and money and such. but I still have a very strong tendency to ignore local incentives if I believe something is right in the long run.) like I’m aware I’ll get promoed less and be viewed as less cool and not get as much respect and so on if I do the alignment work I think is genuinely important in the long run.
I’d guess for most people, the disincentives for working on xrisk alignment make openai a vastly less pleasant place. so whenever I say I don’t feel like I’m pressured not to do what I’m doing, this does not necessarily mean the average person at openai would agree if they tried to work on my stuff.
Could you elaborate on what you mean by “mundane” safety work?