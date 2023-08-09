Google has proposed an opt-out system that publishers could use to keep their content from being used for AI training. The tech giant made the suggestion in response to Australia’s proposal to ban “high-risk” AI applications. Under the opt-out system, publishers could stop Google from scraping their data without consent.

Google’s Bard chatbot, released in Australia in May, has been the focal point of the company’s efforts to collect more data. The company has urged the Australian government to relax copyright laws to allow for more AI training. Now, Google is advocating for an AI-friendly internet that allows scraping by default. Under this proposal, publishers would have to implement the opt-out mechanism on their own websites.

Google has not disclosed how the opt-out function would work. In a blog post, the company called for new “standards and protocols” to govern web publishers’ participation in the internet. As an example, Google pointed to the robots.txt protocol, which tells web crawlers and bots which parts of a site they are allowed to access. However, the protocol is honored only by compliant bots, and it does not remove data that has already been scraped without consent.
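
To illustrate how a robots.txt-style opt-out works in practice, here is a minimal sketch using Python's standard `urllib.robotparser` module. The bot name `ExampleAIBot`, the paths, and the `is_allowed` helper are hypothetical and chosen only for illustration; Google has not said what its mechanism would look like.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt a publisher might serve. "ExampleAIBot" is an
# invented crawler name, not one announced by Google or anyone else.
ROBOTS_TXT = """\
User-agent: ExampleAIBot
Disallow: /

User-agent: *
Disallow: /private/
"""


def is_allowed(user_agent: str, url: str, robots_txt: str = ROBOTS_TXT) -> bool:
    """Return True if the given robots.txt rules permit user_agent to fetch url."""
    parser = RobotFileParser()
    # parse() takes the file's lines, so no network request is made here.
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    # The AI crawler is opted out of the whole site.
    print(is_allowed("ExampleAIBot", "https://example.com/articles/1"))
    # Other crawlers are blocked only from /private/.
    print(is_allowed("OtherBot", "https://example.com/articles/1"))
    print(is_allowed("OtherBot", "https://example.com/private/x"))
```

The check runs on the crawler's side: a bot that never consults the file is not stopped by it, which is why the protocol only binds compliant bots.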

Google’s Bard chatbot initially used the LaMDA AI model, but it was later upgraded to the PaLM 2 model. The content used to train these models includes data from public forums, as well as scraped content from Wikipedia and other websites.

Google’s proposal for an opt-out system is not limited to Australia and extends to the entire internet. Recently, Google updated its privacy policy to explicitly state that it can use users’ online posts for developing its AI tools, leading to a class action lawsuit alleging the unauthorized use of copyrighted material. OpenAI, the creator of ChatGPT, has faced a similar lawsuit. Both companies have gathered vast amounts of data from the internet to train their AI models, including articles, books, and online text.

This opt-out proposal is part of Google’s broader efforts to collaborate with large news organizations and provide them with AI tools. However, it also implies that Google considers it acceptable to scrape these organizations’ published articles for AI training unless they opt out.