If you want to ban or monopolize such models, push for that directly. Indirectly banning them is evil.
They’re already illegal. GPT-3 is based in large part on what appear to be pirated books. (I wonder if Google’s models are covered by its settlements with publishers.)
Which is argued to be “transformative” and thus not illegal.
Even if it’s transformative they should still have to buy one license for each book to be able to use it.
Why?
The basic idea of copyright is that if you want to acquire a copy of a book you need to buy that copy. If they just downloaded LibGen, they didn’t buy the copies of the books they used, and that would be a copyright violation.
That’s true whether or not you afterward do something transformative.
What a bizarre normative assertion. That copyright violation would be true whether or not they used it to train a model, or indeed deleted it immediately after downloading it. The copyright violation is one thing, and the model is another thing. The license that one would buy has nothing to do with any transformative ML use, and would deny that use if possible (and likely already contains language to the effect of denying as much as possible). There is no more connection than there is in the claim “if you rob a Starbucks, you should buy a pastry first”.
Yes, the copyright violation is true whether or not they used it to train a model. Douglas_Knight’s claim is that the copyright violation occurred. If that’s true, that makes it possible to sue them over it.
No, OpenAI is not arguing this. They are not arguing anything, but just hiding their sources. Maybe they’re arguing this about using the public web as training data, but that doesn’t cover pirated books.
Yes, a model is transformative, not infringement. But the question was about the training data. Is that infringement? Distributing the Pile is a tort and, given the quantity, probably a crime. Acquiring the training data was a tort and probably a crime. I’m not sure about possessing it. Even if OpenAI is shielded from criminal responsibility, a crime was necessary for the creation and that was not enough to deter it.
OpenAI is in fact arguing this and wrote one of the primary position papers on the transformative position.
Does this link say anything about their illegal acquisition of the sources?
It sure looks to me like you and they are lying to distract. I condemn this lying, just as I condemned Christian’s proposed lies.
How would you do that? How would you write the laws?