Google Launched its Gemini AI to Rival OpenAI’s GPT-4

–Google Launched its Gemini AI to Rival OpenAI’s GPT-4 | Image by The Alphabet

Key Points

  • Google reports that Gemini Ultra scores 90 percent on the MMLU benchmark, ahead of GPT-4 and the estimated human-expert level.

  • Trained on text, images, and sound, Gemini is designed to handle more than text-only interactions.

  • Google plans to integrate Gemini into its Bard chatbot and is targeting coding ability above that of 85 percent of human coders.

Google has unveiled Gemini, an artificial intelligence model that the company says outperforms both OpenAI’s GPT-4 and human experts on a range of standard evaluations.

Gemini comes in three tiers, Nano, Pro, and Ultra, each larger and more capable than the last. Google has not disclosed the sizes of Pro and Ultra, but Nano ships as two models tailored for smartphones: one with 1.8 billion parameters for slower devices and another with 3.25 billion parameters for higher-performance ones. By comparison, GPT-4 is reported to have up to 1.7 trillion parameters, and the largest version of Meta’s Llama 2 has 70 billion.
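To put those sizes in perspective, the short Python sketch below computes how many times larger one model is than another. It uses only the parameter counts quoted above; the dictionary labels and the `size_ratio` helper are illustrative names, and the GPT-4 figure is an unconfirmed report rather than an official number.

```python
# Parameter counts as quoted in the article.
# The GPT-4 figure is an unconfirmed report, not an official number.
PARAMS = {
    "Gemini Nano (smaller)": 1.8e9,
    "Gemini Nano (larger)": 3.25e9,
    "Llama 2 (largest)": 70e9,
    "GPT-4 (reported)": 1.7e12,
}


def size_ratio(larger: str, smaller: str) -> float:
    """Return how many times more parameters `larger` has than `smaller`."""
    return PARAMS[larger] / PARAMS[smaller]


if __name__ == "__main__":
    for name in ("Llama 2 (largest)", "GPT-4 (reported)"):
        ratio = size_ratio(name, "Gemini Nano (larger)")
        print(f"{name} is about {ratio:.0f}x the larger Nano model")
```

On these figures, the reported GPT-4 size is roughly 500 times that of the larger Nano model, which shows how far on-device models are from the largest systems.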

Measuring AI capability precisely remains difficult. Even so, Google says the mid-level Pro version of Gemini beats models such as OpenAI’s GPT-3.5, and that the Ultra version beats all existing AI models. Google reports a score of 90 percent for Ultra on the MMLU benchmark, above the estimated “expert level” human performance of 89.8 percent.

On the same test, Gemini Ultra finishes ahead of GPT-4 (87 percent), Anthropic’s Claude 2 (78.5 percent), and Llama 2 (68 percent). Google says it also leads on eight of nine other standard benchmarks.
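The gaps between these scores are easier to judge side by side. The sketch below computes each margin in percentage points from the MMLU figures reported above; the `margins` helper and the dictionary labels are illustrative names, and the numbers are Google's reported results, not independent measurements.

```python
# MMLU scores (percent) as reported in the article; not independently verified.
MMLU = {
    "Gemini Ultra": 90.0,
    "Human expert (estimate)": 89.8,
    "GPT-4": 87.0,
    "Claude 2": 78.5,
    "Llama 2": 68.0,
}


def margins(reference: str) -> dict[str, float]:
    """Return the lead of `reference` over every other entry, in percentage points."""
    ref_score = MMLU[reference]
    return {
        name: round(ref_score - score, 1)
        for name, score in MMLU.items()
        if name != reference
    }


if __name__ == "__main__":
    for name, gap in margins("Gemini Ultra").items():
        print(f"Gemini Ultra leads {name} by {gap} points")
```

The lead over the human-expert estimate is only 0.2 points, far smaller than the 3-point lead over GPT-4.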

According to Google, this is the first time an AI model has outperformed human experts on this test, and it is the highest score among current models. The test covers a wide range of challenging questions spanning logical fallacies, ethical dilemmas, medical scenarios, economics, and geography.

The Pro version is being integrated into Bard, the online chatbot Google introduced in March 2023. A forthcoming Bard Advanced edition is scheduled to incorporate the larger Gemini Ultra model in early 2024.

Although the English version of Bard has launched in more than 170 countries, availability still varies by language and region. Sissie Hsiao of Google says these limits stem from the need to comply with local regulations, not from technological constraints, and that Bard’s expansion must align with local laws before it becomes more widely available.

Eli Collins of Google DeepMind describes Gemini as the company’s most versatile model. Unlike earlier text-centric models, it was trained on text, images, and sound, which is said to let it accept inputs and produce outputs in several formats. For now, Bard is limited to text interactions, but Google plans to extend it to audio and images.

Collins says Gemini is state of the art across many domains and that evaluations of its proficiency in different media, languages, and applications are ongoing. “We’re still working to understand all of Ultra’s novel capabilities,” he notes.

Although Gemini models were not available for testing at launch, Google showed demonstrations of the AI’s problem-solving and video-processing abilities. Google also claims that Gemini is better at software development than previous models. DeepMind’s earlier AlphaCode system outperformed more than half of human developers; a version improved with Gemini aims to surpass the coding abilities of 85 percent of human coders.

–All you need to know about Gemini AI in 90 seconds | Video by Google
