Google to use any public info online to train AI chatbot Bard

4 Jul 2023

Although Google had to delay the launch of Bard in the EU over privacy concerns, its latest policy update gives the company more power than before.

The latest update to Google’s privacy policy includes a clause that allows the search giant to collect any information that’s publicly available online to help train its AI models.

In its updated privacy policy, which came into effect on 1 July, the company said the information may be collected to build products and features such as Google Translate, Bard and Cloud AI capabilities.

A previous version of the policy, which came into effect in December, said the company could collect the information only to “help train Google’s language models and build features like Google Translate”.

This means that Google is being open about scraping the internet for any information posted online and using it to train its AI tools, most notably its AI chatbot Bard. However, this does not include private data linked to individual accounts.

Bard is Google’s competitor to OpenAI’s ChatGPT. However, the chatbot has not had good luck in the EU. Although it was scheduled to launch in the bloc earlier this year, Ireland’s Data Protection Commission (DPC) flagged privacy concerns last month, prompting Google to halt the launch of its flagship chatbot.

According to the Irish DPC, Google has not provided the watchdog with sufficient information on how Bard will meet GDPR requirements within the bloc.

“Google recently informed the commission of its intention to launch Bard in the EU this week,” said deputy commissioner Graham Doyle on 13 June. “[We have] not had any detailed briefing nor sight of a data protection impact assessment or any supporting documentation at this point.”

Bard has been available in both the US and the UK since March.

In response to the Irish DPC’s latest decision, a Google spokesperson told Politico at the time that the company wanted to make Bard available in the EU “responsibly, after engagement with experts, regulators and policymakers”.

“As part of that process, we’ve been talking with privacy regulators to address their questions and hear feedback,” the spokesperson said.

In its latest privacy policy, Google also mentioned that, in some circumstances, it collects information about individuals from publicly accessible sources.

“For example, if your name appears in your local newspaper, Google’s search engine may index that article and display it to other people if they search for your name.”

In April 2022, the company expanded the list of personal details that can be removed upon request from search results, such as phone numbers and addresses, giving users more control over how sensitive or personally identifiable information can be found.

Meanwhile, OpenAI was slammed with a major class-action lawsuit from a US law firm on the grounds that it scraped the internet to train its generative AI chatbot, potentially violating the rights of millions. The complaint also targets Microsoft, which has invested billions in OpenAI.
