Introduction
Background Information
Generative AI has significantly changed the creative industries. Platforms
such as Midjourney, Stable Diffusion, and DALL-E 2 allow realistic works of
art, literature, and other forms of media to be created at remarkable
speed. These AI systems learn patterns, styles, and relationships from
large datasets that are very often filled with copyrighted material such
as images, texts, and music. By processing these datasets, generative AI
systems produce new content that often mimics the styles and forms of
works by human creators.
This technology provides numerous opportunities, but it also raises many
legal and ethical issues. One of the most pressing concerns how AI systems
access and use copyrighted material during training. The large-scale data
lakes of publicly available content that are commonly used to train such
models often include unlicensed works. The AI recognizes patterns in that
content and creates new material that might infringe on the copyright of
the original creators.
The legal landscape for AI-generated content is constantly changing. This
is evident from lawsuits such as Andersen v. Stability AI and Getty's
lawsuit against Stability AI that are challenging the use of copyrighted
works in training datasets for AI. Cases such as these bring forward the
tension between the rapid development and advancement of these AI
technologies and the protection that intellectual property law provides.
This tension exists because intellectual property laws were never designed
to account for the ways this sort of AI learns from and replicates
pre-existing creative works.
As generative AI continues to evolve, a very relevant question remains:
how do we prevent AI from using copyrighted works without permission?
What are some of the solutions that can reduce or, better yet, stop AI
from using copyrighted works altogether?
Defining the Problem
Now, let’s define the problem at hand. The central issue with generative
AI is that these models are trained on large datasets that very
frequently include copyrighted material whose owners have given no
explicit consent. As a result, AI systems can generate works that closely
resemble copyrighted works, or even reproduce them directly, and thereby
infringe copyright.
We must also discuss one of the key legal questions: whether AI-generated
content is a derivative work of the copyrighted material used to train
it. The cases addressing this question turn on whether the content should
be considered infringing or transformative under existing copyright law.
For example, in Andersen v. Stability AI, artists contend that the use of
their works to train AI constitutes unauthorized derivative use because
the AI's output is not sufficiently transformative.
Furthermore, aside from the issue of infringement, there are also
concerns around ownership rights. AI builds on patterns it has learned
from copyrighted works to create new content, so who owns the output: the
user who prompted the AI, the AI developers, or the original creators
whose works were used to train the model?
This uncertainty presents major risks for enterprises and individuals who
use generative AI. Without clear answers to these questions, companies
risk legal exposure when their AI models generate content that violates
copyright law. Moreover, current AI systems cannot ensure that
copyrighted works used in training are properly licensed or that
generated content won't inadvertently infringe IP laws.
In sum, the use of copyrighted content by AI poses a major obstacle not
just for enterprises that seek to use generative AI, but also for the
owners of the copyrighted material, whose creative works are not being
protected from unlicensed use. As generative AI moves forward, the issues
described above must be resolved in order to avoid harm to both creators
and companies.