it’s also worth noting that I am far out in the tail of the distribution of people willing to ignore incentive gradients when I believe it’s correct not to follow them. (I’ve gotten somewhat more pragmatic about this over time, because sometimes not following the gradient is just dumb. and as a human being it’s impossible not to care a little bit about status and money and such. but I still have a very strong tendency to ignore local incentives if I believe something is right in the long run.) like, I’m aware I’ll get promoed less, be viewed as less cool, and not get as much respect if I do the alignment work I think is genuinely important in the long run.
I’d guess for most people, the disincentives for working on xrisk alignment make openai a vastly less pleasant place. so whenever I say I don’t feel like I’m pressured not to do what I’m doing, this does not necessarily mean the average person at openai would agree if they tried to work on my stuff.