Incorrect: OpenAI leadership is dismissive of existential risk from AI.
So the reason I think this is that very high-level people have made claims like, “the orthogonality thesis is probably false”, and someone I know who talked to a very, very, very high-level person at OpenAI had to explain to them that inner alignment is a thing. If they actually cared, I would expect the leadership to be more familiar with their critics’ arguments.
No one remembers now, but the founding rhetoric was also pretty bad, though it was walked back, I suppose.
Also, I often see them claim their AI ethics work (train a model not to offend the average Berkeley humanities grad—possibly not useless, I suppose, but not exactly going to save our lightcone) is important alignment work. Obviously, what is going on inside is not legible to me, but what I see from the outside has mostly been disheartening. Their recent blog on alignment was an exception to this.
Though there are people with their priorities straight at OpenAI, I see little evidence that this is true of their leadership. I’m not confident an organization can be net beneficial when this is the case.
If we’re thinking about the same “very, very, very high-level person at OpenAI”, it does seem like this person now buys that inner alignment is a thing and is concerned about it (or says he’s concerned). It is scary that people at these AI labs don’t know all that much about AI alignment, but also hopeful that they don’t seem to disagree with it, and maybe they just need to be given the arguments in a good way by someone they would listen to?
Also, I often see them claim their AI ethics work (train a model not to offend the average Berkeley humanities grad—possibly not useless, I suppose, but not exactly going to save our lightcone) is important alignment work.
Wait, you don’t think this (I mean the training, not the offending) is a safety problem in and of itself? (See also my previous comment about this.)
I suspect we are thinking about the same person, and it is heartening that they changed their mind.