Google Launches Google-Extended, Letting Publishers Opt Out of AI Training Without Leaving Search
Google has introduced a new control called Google-Extended, which allows website publishers to prevent their content from being used to develop Google’s AI models. Sites that opt out will continue to be accessible via Google Search. The tool gives publishers more authority over how their content is used for AI training: if a publisher opts out, Google stops using that publisher’s data for this purpose.
AI input management
This move by Google addresses the concerns of online publishers who want to keep their data out of AI model training. With Google-Extended, publishers can control whether their websites help improve Google’s generative AI products, such as Bard and the Vertex AI generative APIs. According to The Verge, this gives publishers a direct say in how content on their sites is used.
Balancing visibility and data protection
Earlier this year, Google confirmed that it was training its AI chatbot, Bard, using publicly available data collected from the web. The confirmation raised concerns and prompted publishers to look for ways to keep their content out of AI training, a step already taken by major publishers such as the New York Times, CNN, Reuters and Medium.
Unlike other crawlers, Google’s crawler is integral to a website’s discoverability in search results, so blocking it completely can hurt a website’s online presence. To get around this problem, some publishers have turned to legal measures, such as updating their terms of service to prohibit companies from using their content for AI training.
Google-Extended is controlled through a site’s robots.txt file, which tells crawlers which parts of the site they may access. As AI applications expand, Google says it will explore additional machine-readable options that give online publishers more choice and control. Further developments are expected in the near future.
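To make the robots.txt mechanism concrete, here is a minimal sketch in Python. It assumes a hypothetical site at example.com that opts out of AI training everywhere by disallowing the Google-Extended token, and it uses Python's standard-library robots.txt parser to show how such a rule would be read. The site, the URL and the rule set are illustrative assumptions, and the parser only approximates how a compliant crawler interprets the file; it is not Google's own implementation.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt for an example site (an assumption for illustration).
# It blocks the Google-Extended token site-wide and says nothing about Googlebot.
ROBOTS_TXT = """\
User-agent: Google-Extended
Disallow: /
"""


def build_parser(robots_txt: str) -> RobotFileParser:
    """Parse robots.txt text held in memory instead of fetching it over HTTP."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser


parser = build_parser(ROBOTS_TXT)
url = "https://example.com/news/story.html"  # hypothetical article URL

# The rule group for Google-Extended disallows every path on the site.
ai_training_allowed = parser.can_fetch("Google-Extended", url)

# No rule group names Googlebot, so search crawling stays allowed by default.
search_crawl_allowed = parser.can_fetch("Googlebot", url)

print("Google-Extended allowed:", ai_training_allowed)
print("Googlebot allowed:", search_crawl_allowed)
```

A publisher who wants to exclude only part of a site could replace `Disallow: /` with a narrower path, such as a hypothetical `/archive/` directory.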
In short, Google-Extended gives publishers a tool to keep their content out of AI model training while still benefiting from indexing in Google Search. The release is a significant step towards addressing concerns about the use of online content in AI training and towards giving publishers greater transparency and control.