
Syed Suhaib is a final-year law student at the University Institute of Legal Studies, Chandigarh University, specialising in Corporate Law. He presently serves as an Editor at the All India Commercial Law Review and is keenly engaged in legal research, editorial writing, and academic discourse.
Artificial intelligence is only as good as the data it learns from.
Harvard FAS Mignone Center for Career Success. (2025, January 23). What is AI: The pros and cons of artificial intelligence, and what its future holds. Harvard University.
Generative Artificial Intelligence (AI) chatbots have attracted considerable commercial attention, as industries across sectors capitalise on the practical applications of this technology and seek to integrate it into their existing business structures. Such integration is expected to help generate substantial profits while reducing operating costs. In terms of innovation, generative AI technology is fluid: the early dominance of the ChatGPT model series has been challenged by newer models like Perplexity and Gemini, making it reasonable to expect that the generative AI industry will keep producing newer and better models. However, this has presented policymakers with a predicament because, as technology evolves, it arrives at an impasse with the prevailing legal framework. Within the Indian Intellectual Property law framework, this predicament concerns the Indian Copyright Act of 1957 and the Doctrine of Fair Use, which at present are ill-equipped to address generative AI models, specifically with regard to the use of data for training them.
The Predicament of Generative AI Models
To understand the intricate nature of this predicament between generative AI models and copyright law, it is first necessary to consider how these models are trained. Generative AI technology can be described as an amalgamation of Large Language Models (LLMs) and Deep Learning[1], which constitute a subset of Machine Learning (ML) algorithms and consequently require a considerable amount of data and coding for their training. However, this data is largely drawn from publicly accessible sources, and material that is freely accessible often remains protected under copyright law. This is the root of the prevailing impasse, in which generative AI models are alleged to infringe the intellectual property rights of creators.
Within the Indian copyright framework, such a standoff was witnessed in ANI v. OpenAI, 2024 SCC OnLine Del 8120, wherein the Plaintiff ANI, a prominent news agency, sued the Defendant OpenAI for using its content to train its LLMs. On 19th November 2024, the matter came up for interim adjudication before the Delhi High Court (Justice Amit Bansal), which upheld the infringement claim and issued an ad interim injunction against the Defendant. The Defendant's counsel also informed the court that OpenAI had blocklisted the Plaintiff agency's content from further training of its model. However, disputes over the use of data to train generative AI models are not limited to India; similar cases have arisen in the United States, Europe, Canada and the United Kingdom.[2]
Close on the heels of the ANI order, in Canada, on 24th November 2024, five Canadian media and newspaper publishers jointly filed a copyright infringement lawsuit against OpenAI in the Ontario Superior Court of Justice, seeking damages and a permanent injunction against OpenAI. In the U.S., a similar suit was filed by Dow Jones and NYP Holdings against Perplexity AI in the United States District Court for the Southern District of New York. Another such lawsuit in the U.S. was filed against Meta in the United States District Court for the Northern District of California, San Francisco Division, by the Plaintiff Christopher Farnsworth, who contended that Meta had used his work to train its LLaMA model.[3]
However, a significant ruling that may strongly influence the outcome of such cases has emerged from Raw Story Media Inc. v. OpenAI Inc., S.D.N.Y., No. 24-cv-01514, 11/7/24, in the United States District Court for the Southern District of New York. Herein, interpreting Section 1202(b)(1) of the Digital Millennium Copyright Act[4] in light of the standing requirement under Article III of the U.S. Constitution, the District Judge held that OpenAI had not misused the articles from news outlets to train its LLMs. The District Judge reasoned that the Defendant can create and reproduce derivations of the Plaintiff's work without incurring liability under the Digital Millennium Copyright Act, and that, in order to seek monetary or injunctive relief or both, as the case may be, the Plaintiff has to show that the alleged misuse of their work has caused them actual harm. Consequently, in response to this ruling, the Plaintiff sought a jury trial.
In Search of a Harmony
The real-world applications of generative AI models are now ubiquitous and undeniable, both in driving innovation and in providing assistance. Even in the face of substantial criticism, these models have continued to simplify complex tasks. However, whether the training of these models can cause a cognizable injury is an intricate question, and the competing arguments only add to the ambiguity. “Journalism is in the public interest. OpenAI using other companies’ journalism for their own commercial gain is not. It’s illegal”[5], said Torstar, Postmedia, The Globe and Mail, The Canadian Press and CBC/Radio-Canada, as reported by Reuters.
On the other hand, the companies leveraging generative AI models argue ‘Fair Use’ under the copyright law of the relevant jurisdiction. However, such arguments raise a reasonable concern about the training of these models: since AI is becoming a billion-dollar industry, these companies should compensate the copyright holders whose work is being used to train the models. The fact that some of these models are open source poses a further problem, since they are free to use publicly and can be adapted by anyone for custom use. It is also pertinent to note that some jurisdictions are still struggling to come to terms with the legal questions that AI has raised. Others have taken arguably strict measures, such as the European Union’s AI Act, a landmark piece of contemporary legislation intended to address the problems arising from the adoption of generative AI models.
Nonetheless, the protection of every copyright holder’s work is a sine qua non and cannot be disregarded. A middle ground between fair use and artistic freedom should be deliberated upon, by considering fair compensation for copyright holders where such models are trained for commercial gain. This would be a prudent way of resolving the prevailing predicament whilst protecting creative and artistic freedom, because an individual should be entitled to the fruits of their labour.[6]
[1] Ian Goodfellow, Yoshua Bengio and Aaron Courville, Deep Learning (MIT Press 2016) Pg. 6–12.
[2] ‘Generative AI – Intellectual Property Cases and Policy Tracker’ (Mishcon de Reya, 12 August 2024). https://www.mishcon.com/generative-ai-intellectual-property-cases-and-policy-tracker
[3] Christopher Farnsworth v. Meta Platforms Inc., 3:24-cv-6893 https://storage.courtlistener.com/recap/gov.uscourts.cand.437440/gov.uscourts.cand.437440.1.0_5.pdf
[4] Digital Millennium Copyright Act, Pub. L. No. 105-304, 112 Stat. 2860 (1998), https://www.copyright.gov/legislation/dmca.pdf
[5] Honderich, H. (2024, November 29). Major Canadian news outlets sue OpenAI. BBC News. https://www.bbc.com/news/articles/cm27247j6gno
[6] John Locke, Two Treatises of Government (A New Edition, Corrected, in Ten Volumes, vol V, Printed for Thomas Tegg; W Sharpe and Son; G Offor; G and J Robinson; J Evans and Co; Also R Griffin and Co Glasgow; and J Cumming, Dublin 1823) 115-25. https://www.yorku.ca/comninel/courses/3025pdf/Locke.pdf Also, John Locke, Two Treatises of Government (Whitmore and Fenn and C Brown 1821) Chapter V ‘Of Property’ Pg 208-229.