Gemini, Google’s highly anticipated next-generation AI model, has hit development setbacks that have pushed its public launch to January 2024. Google first introduced Gemini to the public during the I/O 2023 conference in May and had initially scheduled a grand reveal for next week. Because of the roadblocks, launch events planned in California, New York, and Washington have been postponed. The delay comes as Google grapples with the model’s handling of non-English queries, according to two anonymous insiders.
Language Struggles and Competitive Edge
The primary reason behind the postponement appears to be the AI’s difficulty in reliably handling specific non-English queries. In November, Sundar Pichai, Google’s CEO, emphasized the importance of global language support. He expressed the company’s commitment to ensuring the competitiveness of Gemini 1.0.
Despite these language-related challenges, insiders say that Gemini has met or surpassed the standards set by OpenAI’s GPT-4 in some areas. With its advanced text and image generation capabilities, Gemini aims to outshine its competitors, particularly OpenAI’s models.
Multimodal Prowess Unveiled
During the I/O 2023 conference, Google showcased Gemini’s multimodal capabilities, which it presented as setting the model apart from existing ones. This next-gen AI not only comprehends text and images but also integrates various tools and APIs. Sissie Hsiao, Google’s Vice President and general manager of Bard and Google Assistant, illustrated these capabilities with an example request: “draw me three pictures of the steps to how to ice a three-layer cake.” Despite these demonstrations, Google appears to be still finalizing the primary version of Gemini.
Handling Diverse Data Types
Gemini’s standout feature lies in its ability to handle various types of data. The AI model reportedly excels in comprehending and generating text, images, and other forms of content. It can also generate websites and other content based on sketches or written descriptions.
Gemini aims to deliver superior performance by leveraging significantly greater computing power than its rival, OpenAI’s GPT-4. While Google already offers a generative AI chatbot named Bard, OpenAI’s ChatGPT is currently better known among consumers. Analysts, however, speculate that this may change once Gemini officially hits the market.
Integration and Future Innovations
The timeline for incorporating Gemini into Google services such as Bard, Search, and Workspace remains uncertain. Google has ambitious plans for Gemini and intends to offer it in various sizes, including a lightweight version known as “Gecko” designed for mobile devices. The company also envisions Gemini as a platform that will “enable future innovations, like memory and planning.” As the launch approaches, the tech community is keen to explore how Gemini’s advanced features might affect various AI applications. Although the delay is a setback, it reflects Google’s commitment to ensuring the model meets the highest standards before its official release.