Google has a new approach to working out the kinks in its artificial intelligence (AI): asking scores of its employees to spend several hours poking and prodding the poor technology until it can’t make the company look bad once it’s finally released.
According to a leaked company-wide email reported by Business Insider, Google is asking all of its staff to set aside between two and four hours to test “Bard,” the conversational AI the firm plans to integrate into its chat feature.
It’s unclear whether the request went out to Google workers worldwide. Although Google recently announced layoffs of 12,000 employees, the firm still has approximately 170,000 workers globally, not counting the rest of parent company Alphabet.

Sundar Pichai, Google’s CEO, stated in the memo that he would “appreciate” it if all workers “contributed in a deeper way” and took two to four hours to stress test Bard. Anybody who has ever received a boss’s “recommendation” email knows it’s essentially an order.
The email doesn’t indicate whether those two to four hours are expected daily or spread out over a longer period.
In an effort to stay ahead of Microsoft, which has integrated its own AI chatbot into Bing search, Google launched Bard this week.
During Bard’s introductory demo, the AI gave an inaccurate answer about the James Webb Space Telescope, a flub that reportedly cost the firm around $100 billion in market value.
Google reportedly began internal testing, or “dogfooding,” on Tuesday. Pichai apparently stated that the company already has “thousands” of internal and external testers playing around with Bard.
According to sources, those testers are probing the search AI’s quality and safety as well as its “groundedness,” which likely refers to whether the AI’s responses are anchored in factual, verifiable information.
A Google spokesperson told Gizmodo in an email: “Testing and feedback, from Googlers and external trusted testers, are critical components of enhancing Bard to ensure that it’s ready for our consumers. We frequently ask Googlers for feedback to help us improve our products, and it’s a key aspect of our internal culture.” The company did not answer questions about how often, or for how long, employees are expected to stress test the AI.
The contrast between Bard’s underwhelming debut and the growing range of capabilities shown off in Bing’s search AI has put Google on the back foot ever since it first revealed its AI.
Unlike Bing’s search, Google’s Bard demos did not include citations for the information they presented. Citations, though, are not the only way to lend authority to AI-generated answers.
According to Margaret Mitchell, chief ethics scientist at Hugging Face and a former co-lead of Google’s AI ethics team, “a lot of people don’t check citations,” and displaying them could simply lend misleading information an air of credibility.
Bing’s search AI launched with far more capabilities than Google’s, but it is now running into the same problems that have long plagued AI chatbots: namely, that they are riddled with errors and, well, give odd answers to users’ questions.
What’s more, keeping the AI from spewing offensive material, such as xenophobic, racist, or antisemitic content, as chatbots have long been prone to do, may take many people working very long shifts.
OpenAI, the company that created ChatGPT and worked with Microsoft to develop its Bing AI, hired low-paid workers in Kenya to sift through thousands of examples of horrific content, including descriptions of murder, torture, suicide, and violence against children.
While it’s unknown whether Google employees will encounter anything similar, it’s safe to assume that stress testing the AI alongside thousands of colleagues won’t be particularly fun.
Google recently invested roughly $400 million in OpenAI competitor Anthropic, which is actively recruiting a “prompt engineer” to find ways of teaching large language models to complete specific tasks.