IPR Newsletter – A Collision Course – Understanding Tussle Between Copyright Laws And OpenAI : September 2024

Introduction

As generative AI technologies enter the market, their use raises substantial legal concerns under current copyright laws. Courts are now deciding how to apply these standards to AI-generated content, including issues like infringement, usage rights, and the ambiguity of AI-generated work ownership. A key issue is whether users should be able to direct AI systems by referencing other writers’ copyrighted and trademarked works without their permission.

The Impact of AI on Copyright Law

The rise of generative AI, which often requires little human input, calls into question the conventional content creation framework. This raises concerns about who owns the copyright and whether AI-generated content can be protected by it. 

Should the developer of the AI model be considered the owner, given their role in enabling the ‘creative’ process? Or does the owner of the AI model have any rights? Alternatively, it may be argued that the person who provided the prompts that guided the AI tool in producing the material should be recognized as the copyright holder. Finally, there is the threshold question of whether the human contribution was significant enough to deserve copyright protection in the first place.

Given the advent of generative AI, courts and intellectual property authorities may need to reexamine traditional copyright law. This could result in changes to the rules governing ownership of computer-created content. In the future, copyright law may need to evolve to recognize machines as independent entities capable of owning content, but this is unlikely to occur anytime soon. Even in that case, liability for harm caused by generative AI outputs would remain unclear.

Does Training AI Constitute Copyright Infringement?

Safe Harbor for Data Mining

A safe harbor for data mining is warranted because using data to build effective AI technology is not, in itself, an act of infringement. Baker v. Selden, a notable United States Supreme Court copyright case, distinguished a copyrighted work from its material form and showed that not all uses of a work’s material form constitute copyright infringement.

Copyright infringement requires more than copying a work’s material form; it requires unauthorized use of the work for expressive purposes. Purely technical or non-communicative uses of a work do not constitute copyright infringement because they are not intended to communicate the work’s expression. Similarly, downloading copyrighted photos and text for data mining involves making copies for a different, non-expressive purpose.

On this view, training a machine learning model on such copyrighted data does not constitute infringement because the data are not redistributed or transmitted to the public. Copyright protects creative expression, whereas model training extracts unprotectable ideas and patterns from data. As a result, data-mining uses of copyrighted works would not require a fair use analysis.


Fair Use Principle  

The fair use principle is reflected in Section 52 of the Indian Copyright Act, which provides for fair dealing, and in Section 107 of the US Copyright Act. The fair use doctrine seeks to strike a balance between the rights granted by copyright to its owners and the broader societal good, while also encouraging creativity, education, and free expression. Fair use is an exception to copyright that allows the use of copyrighted content without the owner’s permission for purposes such as criticism, comment, news reporting, teaching, scholarship, or research. Fair use is a mixed question of law and fact, so whether a particular use is fair must be determined case by case. Fair use is never presumed. It is an affirmative defense in a copyright infringement lawsuit, and the burden of proving it rests on the defendant.

Fair Use Criteria

In fair use cases, courts weigh the following four factors together:

  1. The purpose and character of the use, including whether it is commercial, transformative, or non-expressive;
  2. The nature of the copyrighted work;
  3. The amount and substantiality of the portion used;
  4. The effect of the use on the copyrighted work’s potential market or value.

India’s Stance on Copyright Infringement by Generative AI

Indian law currently lacks provisions to address the complexities of AI-related copyright infringement. India’s copyright laws do not explicitly recognize AI as an author or creator. Because the Indian Copyright Act does not expressly address these questions, courts have approached them through judicial precedent. In any case, the legal framework in India may need to evolve to handle generative AI, and legislation governing such technologies needs to be clarified.

For example, in Navigators Logistics Ltd. v. Kashif Qureshi, a copyright claim over a list compiled by a computer was rejected for lack of human intervention. The court held that human involvement in the creation process is essential for the grant of copyright protection in India.

Lawsuits Against OpenAI

In recent times, numerous cases raising these issues have been filed in the USA. As of now, most remain pending without a final decision. Some of them are discussed below.

1)  Class Action Lawsuits

Recently, David Millette and the Authors Guild filed separate lawsuits against OpenAI. (The Authors Guild is America’s oldest and largest professional organization for writers and provides advocacy on issues of free expression and copyright protection.) David Millette filed a complaint in federal court in San Francisco, stating that OpenAI breached its terms of service by utilizing its speech recognition engine, Whisper, to transcribe over 1 million hours of YouTube footage without approval. Millette is seeking at least $5 million in damages and a court order prohibiting OpenAI from using his content again.

Separately, the Authors Guild, which represents prominent authors including George R.R. Martin, Jodi Picoult, and John Grisham, has launched a class-action complaint against OpenAI. The complaint accuses OpenAI of using copyrighted works to train its AI models, notably ChatGPT, by downloading full novels from pirate websites.

2) The New York Times v. OpenAI

The ongoing legal dispute between The New York Times (NYT) and OpenAI revolves around allegations of copyright infringement. According to the New York Times, OpenAI used its content to train AI models such as ChatGPT without authorization or compensation. The case highlights the broader legal and ethical concerns surrounding the use of copyrighted information to develop AI technologies.

According to the New York Times, OpenAI’s AI models may generate material that strongly resembles or even directly reproduces NYT content, possibly evading paywalls and affecting the newspaper’s subscription and advertising revenue. 

Furthermore, the NYT claims that AI-generated material could mislead readers and harm its brand’s reputation, notably through inaccuracies or fabricated text attributed to the NYT.

OpenAI is expected to defend its conduct by invoking the “fair use” doctrine, claiming that its use of New York Times content is transformative and falls within the legal exceptions. However, this position is contested, as the New York Times says that OpenAI’s actions directly harm its business by diverting readers away from its platforms and potentially undermining its financial model.

The outcome of this lawsuit could significantly affect how AI businesses use copyrighted materials in the future, potentially changing the legal environment surrounding AI and intellectual property. The case is being closely watched because of its potential consequences for both the AI industry and copyright law.

3)  GitHub Copilot Lawsuit

A California judge recently dismissed the majority of the claims in a high-profile lawsuit brought against GitHub, Microsoft, and OpenAI. A group of developers filed the case, accusing these companies of unlawfully duplicating their work through GitHub Copilot, an AI-powered code completion tool. The plaintiffs claimed that Copilot was producing code snippets that were too close to their original work, thereby violating the Digital Millennium Copyright Act (DMCA) and numerous open-source licenses.

However, the judge determined that the code produced by Copilot was not sufficiently identical to the developers’ original work to constitute a violation under the DMCA. This ruling is regarded as a significant win for the generative AI industry, as it may influence how AI-generated content is treated under copyright law. Despite this, the matter is not completely concluded; two claims remain: one for alleged violation of open-source licenses and another for breach of contract. The resolution of the remaining claims may still have an impact on the future use and development of AI technologies such as Copilot.

4)  RIAA vs. Suno and Udio Lawsuit

On June 24, 2024, the Recording Industry Association of America (RIAA), which represents major music companies Universal Music Group, Sony Music Entertainment, and Warner Music Group, filed two major copyright infringement cases against AI music providers Suno and Udio. The cases, filed in federal court in Boston and New York, claim that Suno and Udio exploited copyrighted sound recordings without authorization to train their generative AI models.

The plaintiffs allege that these AI services duplicated and exploited a large number of sound recordings from various genres and eras, breaking copyright laws on a broad scale. The lawsuits seek declarations of infringement, injunctions to prevent future unlawful use, and damages for prior infringements. The RIAA highlighted the significance of these lawsuits in ensuring that AI development respects artists’ rights and encourages ethical practices in the music industry.

Conclusion

Whether using copyrighted content to train AI violates the owner’s copyrights is a complex issue that hinges on legal interpretations of “fair use,” the transformative nature of AI, and the economic impact on the content’s market. Courts may allow such use if it is deemed transformative and doesn’t harm the market value of the original work. However, when AI outputs mimic or reproduce copyrighted content in ways that affect the content owner’s revenue or reputation, it could be considered an infringement. The balance between fostering innovation in AI and protecting intellectual property rights remains a contentious legal battleground, and ongoing cases like those involving OpenAI will likely shape future precedents.
