The illegal use of copyrighted material by generative AI
29 Nov 2025
Generative AI companies are not hiding that they have illegally used copyrighted material to train their programs. They know that they could not exist without it. They do not care, but you should.
Using copyrighted material without permission
Generative artificial intelligence programs, such as ChatGPT, Gemini and Midjourney, do not have the creative abilities of humans. In order to generate anything, they need examples on which to base their generated content. Instead of training these Large Language Models (LLMs) on material that is no longer protected by copyright, the programmers decided that copyright law does not apply to them. Courts around the world are disagreeing with them, but more on that later.
Creative works that typically fall under copyright laws include literary, artistic, educational and musical works: books, (online) magazines, paintings, drawings, photos, song lyrics, melodies and many other works. But in most cases copyright expires somewhere between 50 and 100 years after the creator has died, depending on the country or jurisdiction. That means that, broadly speaking, a large body of creative work from before 1925 is no longer covered by copyright. You would think training their programs on these works would be enough for the LLMs, but no.
Instead, the programmers decided that they also needed the modern creative works. I do not know why, though I can speculate. Maybe they thought the programs would be incapable of generating works in modern styles without being trained on them. There were no digitally created works in 1925 the way we know them today, and languages constantly evolve as well. So the programmers needed to train their models on modern works as well as older pieces. They just did not want to pay for it or even ask for permission.
Asking for permission to use copyrighted material is just too difficult for the tech CEOs.
In 2025, Nick Clegg, former president of global affairs at Meta, stated that asking creatives for permission to use their works to train gen-AI would be too difficult. He said, “I just don’t know how you go around, asking everyone first. I just don’t see how that would work.” So these programmers can write programs that can do incredible things, but asking for permission is just too hard or too much work.
That attitude just shows how laziness is one of the problems of gen-AI, among both the programmers and the users. Clegg also believes that if a government made it mandatory to ask for permission in advance, it would “kill the AI industry in [that] country overnight”. This clearly indicates that the companies behind these programs believe it is a problem for the copyright owners, not for them.
A form of plagiarism
Anyone who has ever written an academic article, essay or thesis knows you have to provide your sources. That applies whether you quote someone else’s work or paraphrase it. If you do not cite your sources, you are committing plagiarism.
Gen-AI programs may tell you where they found their information, and may even supply you with a direct link, if you ask an informational question. But they won’t tell you whose work they are using or adapting when you ask them to create content, such as a caption or a visual.
LLMs see no issue with committing plagiarism.
It gets even worse when the user asks the program to mimic someone else’s work, and the program does so without complaint. That is committing plagiarism on purpose. So, if you want to avoid plagiarism, you had best avoid the use of generative AI tools when you are creating content of any sort.
In March 2025, ChatGPT’s ability to mimic the style of Studio Ghibli went viral, even though the studio’s co-founder, Hayao Miyazaki, has openly voiced how very opposed he is to the very existence of gen-AI. Now, the representatives of Studio Ghibli et al. have officially opted out of their works being used for training machine learning models. Whether OpenAI will honour that remains to be seen.
And in May 2025, several “authors” were caught having used gen-AI in their books, as they had left parts of the prompts in the published texts, including a prompt that asked the gen-AI to generate a text in the style of another author. Readers were angry and hurt, and lost trust in the self-publishing industry.
Above, I have put authors between quotation marks because I believe people who use gen-AI to create content are not actual authors, or actual artists. They are prompters, who may or may not do some editing or “fine tuning” to the work generated by the gen-AI.
Copyright lawsuits against gen-AI
There is an abundance of lawsuits against gen-AI companies around the world concerning copyright law and other issues. Most are ongoing or likely to be appealed. As laws differ per jurisdiction, so have the results of lawsuits.
Anthropic PBC, the company behind the gen-AI program Claude, settled a class action lawsuit after a judge in the United States found that it had illegally acquired millions of copyrighted books to train the program. The settlement of $1.5 billion is the first of its kind according to Reuters.
In the United States, copyright law contains a fair use clause that does not exist in most other countries. According to the judge, this clause allowed Anthropic to train its AI on copyrighted material in the United States, but the company broke the law by acquiring the copyrighted works as pirated downloads. That indicates to me that in other countries, gen-AI companies might be liable for both theft and copyright breaches.
Insurers will not fully cover AI risks.
In Germany, OpenAI, the company behind ChatGPT, lost a lawsuit concerning copyright in music in November 2025. While the case only referenced nine specific songs, the German music rights organisation which filed the lawsuit represents more than 100,000 composers, songwriters and publishers. It has also filed similar lawsuits against several other gen-AI companies.
Most gen-AI companies are not making a profit at this time. Some of them are already considering using money from investors to settle these lawsuits, as insurers are refusing to cover them and politicians are already protesting against future bailouts. Meanwhile, Australia has confirmed that according to its copyright laws it is illegal to train gen-AI on copyrighted work without permission from the copyright holder.
Why you should care
So why should you care if these generative artificial intelligence programs are based on copyrighted material? There are several reasons. First, if you ever create something yourself, they will not hesitate to scrape that too. It will take them seconds to scrape what it took you hours to create.
Second, if you use one of these gen-AI tools, you will be complicit in the crime. That was actually the defence of OpenAI in the German court case. They argued that “since the chatbot’s outputs are dependent on prompts put in by users, it is not their responsibility, but rather the users’, who generate reproduced outputs.” They will throw you under the bus while they try to make a profit.
The Impactful Quill
© Copyright 2025-2026 The Impactful Quill, All Rights Reserved

