Replying to: pratik @pratik

@pratik @stupendousman It’s at least worth noting that it hasn’t been established as a matter of law that using copyrighted material for LLM training is automatically fair use. It’s a plausible interpretation, but one of the factors used to determine whether a given reproduction is fair use is “the effect of the use upon the potential market for or value of the copyrighted work”; generative AI can clearly create works that compete with the copyrighted works it’s trained on.

I publish a lot of things on the web and explicitly specify the Creative Commons “Attribution-NonCommercial-ShareAlike” license, which means the material cannot be used for commercial purposes without my explicit permission; if my work is being used to train an LLM whose owners charge for access to it, that seems on its face to violate the terms of the license. “But you made it available on the internet, too bad so sad” is not and has never been a valid defense against copyright infringement, legally speaking. :)

Watts Martin @chipotle