The rise of generative artificial intelligence has triggered a debate about the appropriate protections for copyrighted data. This column examines the economic incentives and social welfare implications of different copyright approaches.

 

Copyright: cepr.org – “Copyright Policy Options For Generative Artificial Intelligence”


 

For ‘small’ AI models (trained using an identifiable corpus of content), the column shows that giving content owners full copyright protection leads to higher investments in both content quality and AI model quality. For larger models, there is a trade-off between the benefits of training data access and the risk of harm to content owners. Policymakers should take these factors into account and craft copyright rules that promote both flourishing creative ecosystems and cutting-edge artificial intelligence.

In recent years, powerful generative artificial intelligence (AI) models have emerged, including large language models like ChatGPT, which can produce human-like text outputs from prompts, and image generation models like DALL-E, which create images from text descriptions. While there is a broad debate regarding various economic issues associated with such models, from the environmental impact of power consumption (Abeliansky et al. 2023) to more standard liability issues (Kretschmer et al. 2023), a more recent flashpoint for discussion is copyright protection. This is because the training data used to build these AI models often include copyrighted content like books, articles, and online media. Should AI companies have to license and pay for the copyrighted data used to train their models? Or does such usage fall under fair use provisions? (See Samuelson 2023 for an excellent overview.)

This issue gained new prominence when a leading content provider sued a leading AI provider. In 2023, the New York Times (NYT) filed a lawsuit alleging that OpenAI had used the newspaper’s copyrighted content to train its GPT large language models without permission. As evidence, the NYT demonstrated that both ChatGPT (created by OpenAI) and Bing Chat (which licenses GPT from OpenAI) were able to reproduce some NYT articles nearly verbatim when prompted in certain ways.

The NYT argued this showed the models were trained on its copyrighted material. It asked the court to prevent OpenAI from using models trained on NYT content and requested statutory damages for the alleged copyright infringement.[…]

Read more: www.cepr.org

