I found this extra information very useful, thanks for revealing what you did.
Of course, to me this makes OpenAI look quite bad. This seems like an incredibly obvious conflict of interest.
I’m surprised that the contract prevented Epoch from releasing this information until recently, yet allows them to release it now. That seems really sloppy on OpenAI’s part. I guess they got a bit of extra publicity when o3 was released (even though the model wasn’t even available), but now it winds up looking worse (at least to those paying attention). I’m curious whether this discrepancy was malice or carelessness.
Withholding this information seems very similar to lying to the public. So, at the very least, from what I’ve seen, I don’t feel we have many reasons to trust their communications, especially “tweets from various employees.”
> However, we have a verbal agreement that these materials will not be used in model training.
I imagine I can speak for a bunch of people here when I say I’m pretty skeptical. At the very least, it’s easy for me to imagine situations where the data wasn’t technically used directly in training, but was used by researchers when iterating on versions to make sure the system was going in the right direction. This could create a very blurry line where they do things that aren’t [literal LLM training] but achieve basically the same outcome.