Google Utilizes Publicly Accessible Data to Train AI Models Such as Bard
Google has recently announced changes to its privacy policy and AI training framework that allow it to use publicly accessible data to train its models.
This is in line with the trend of top brands, including Google itself, releasing AI-focused products that could change the way we work. To build them, Google and other companies such as OpenAI need large data sets to train chatbots and large language models in general.
As ReturnByte discovered, Google has changed the wording of its privacy policy, replacing “language models” with “AI models,” and can now use publicly available data to create feature sets and even full products such as Google Bard. Simply put, anything and everything available online can now be used to train AI models like PaLM 2 and, in the future, even Gemini.
Google said it “uses the data to improve our services and develop new products, features and technologies that benefit our users and the public.” Citing examples, Google states that “we use publicly available data to help train Google’s AI models and build products and features like Google Translate, Bard, and Cloud AI capabilities.”
Search Engine Journal notes that AI chatbots, including Google’s Bard and OpenAI’s ChatGPT, can rate and “reuse human posts, reviews and other online content.” The publication also predicts that, as more generative AI products become publicly available, lawsuits over the unauthorized training of AI models and the use of original content will increase.