Can Google Gemini’s Multimodal AI Model Compete with OpenAI’s GPT-4 and ChatGPT?
After nearly a year of secrecy, Google has unveiled its Gemini project, allowing the world to see its capabilities. Gemini is Google’s largest AI model, able to generate outputs in multiple formats, including images, video, and audio. This powerful AI system will compete directly with OpenAI’s GPT-4, and Google has wasted no time in asserting its superiority. During the launch, Google confidently claimed that Gemini outperforms other models on numerous benchmarks. Now, the question arises: how does Gemini differ from GPT-4, and can it surpass the creator of ChatGPT? Let’s delve into the details.
Google Gemini
Google highlights the Gemini model’s problem-solving skills, calling it particularly adept at math and physics, fueling hopes among AI optimists that it could lead to scientific breakthroughs that improve people’s lives.
“This is a major milestone in the development of artificial intelligence and the beginning of a new era for us at Google,” said Demis Hassabis, CEO of Google DeepMind, the AI division behind Gemini.
Google claims that Gemini is its most flexible model to date, capable of running efficiently on everything from data centers to mobile devices. Its state-of-the-art capabilities, the company says, significantly improve the way developers and enterprise customers build and scale with AI. It comes in three versions: Gemini Nano, the basic model; Gemini Pro, the mid-range model; and Gemini Ultra, its most advanced model, which can deliver results across images, video, and audio.
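For developers, Google exposes the mid-tier Gemini Pro model through its generative AI Python SDK. The sketch below shows what a minimal text call might look like; the API key and prompt are placeholders, and the exact SDK surface is an assumption based on the package available at launch, not something detailed in Google’s announcement.

```python
# Minimal sketch of calling Gemini Pro via Google's generative AI SDK
# (pip install google-generativeai). API key and prompt are placeholders.
import google.generativeai as genai

genai.configure(api_key="YOUR_API_KEY")  # placeholder credential

# "gemini-pro" is the text-oriented mid-tier model exposed through the API.
model = genai.GenerativeModel("gemini-pro")

response = model.generate_content(
    "Explain the photoelectric effect in two sentences."
)
print(response.text)
```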
Gemini vs GPT-4
Google has also tested Gemini against GPT-4 on a set of benchmarks, and the company claims its AI model beat OpenAI’s LLM in 30 out of 32 tests. A blog post said: “We have rigorously tested our Gemini models and evaluated their performance on a wide variety of tasks. From natural image, audio and video understanding to mathematical reasoning, Gemini Ultra’s performance exceeds current state-of-the-art results in 30 of 32 widely used academic benchmarks used for large language models (LLM) in research and development.”
So what were these benchmarks where Google Gemini took the lead? The first and most notable was MMLU (massive multitask language understanding), which uses a combination of 57 subjects, including mathematics, physics, history, law, medicine, and ethics, to test both world knowledge and problem-solving abilities. According to the company, Gemini became the first model to outperform human experts on this benchmark, scoring 90.0 percent. By comparison, GPT-4 scored 86.4 percent.
Gemini was also ahead in the Big-Bench Hard (multi-step reasoning) and DROP (reading comprehension) benchmarks under the reasoning umbrella, scoring 83.6 percent and 82.4 percent against GPT-4’s 83.1 percent and 80.9 percent. It also beat OpenAI’s LLM in the coding and math comparisons. However, GPT-4 scored a massive 95.3 percent on HellaSwag (commonsense reasoning for everyday tasks), beating Gemini, which scored 87.8 percent.
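For readers who want the head-to-head numbers in one place, the short Python snippet below simply tabulates the scores quoted above; it runs no benchmarks itself, and the figures are the company-reported results cited in this article.

```python
# Reported benchmark scores (percent) as quoted in this article.
scores = {
    # benchmark: (Gemini Ultra, GPT-4)
    "MMLU": (90.0, 86.4),
    "Big-Bench Hard": (83.6, 83.1),
    "DROP": (82.4, 80.9),
    "HellaSwag": (87.8, 95.3),
}

for benchmark, (gemini, gpt4) in scores.items():
    leader = "Gemini Ultra" if gemini > gpt4 else "GPT-4"
    print(f"{benchmark:15s} Gemini Ultra {gemini:5.1f}%  GPT-4 {gpt4:5.1f}%  -> {leader}")
```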