Google is advocating for changes to copyright laws that would permit the use of copyrighted content to train its AI models. The company intends to argue “fair use” when faced with objections and plans to offer opt-outs for entities that do not want their data used to train AI systems. Google expressed these intentions to the Australian government, which is deliberating new AI laws. The search giant seeks copyright systems that allow AI models to be trained on a wide range of data while respecting the rights of content owners who prefer that their data not be used in AI systems.

Whether current “fair use” doctrines already encompass machine learning is uncertain. Publishers and authors have filed several lawsuits against Google and OpenAI, claiming that scraping copyrighted content for training purposes is illegal. Fair use or fair dealing doctrines allow limited use of copyrighted materials without permission for purposes like criticism, commentary, or research. American courts weigh four factors to determine whether a use is fair: the purpose and character of the use, the nature of the original work, the amount of the work reproduced, and the effect on the market for the original work.

The debate centers on whether training AI on copyrighted materials qualifies as “fair use” under the law. The answer could depend on whether the content being used is creative or factual, with creative works enjoying more protection. Emory Professor Matthew Sag testified that training generative AI on copyrighted works is generally fair use because it falls into the category of non-expressive use. However, the output of AI models also plays a role in determining fair use. If the output closely resembles the works used for training, the fair use argument may be weakened.

Google’s Search Generative Experience (SGE) is an example of AI that often copies text verbatim from its training data. The SGE displays extracts from different websites without proper citation or direct attribution. While defenders of Google’s practices may argue that these extracts function as citations or valuable backlinks, critics contend that they are neither. Even proper citation does not by itself excuse copyright infringement. Conversely, plagiarism as such is not explicitly covered by copyright laws.

It remains to be seen whether Google will succeed in changing copyright laws to accommodate its use of copyrighted content for AI training. Lawsuits and legal interpretations in different jurisdictions will play a significant role in shaping the future of AI training practices.