AI news
March 26, 2024

Google’s Gemini 1.5 Shocks AI World with 1M-Token Context Window and Dramatically Improved Performance

Is Gemini 1.5 worth the $20 per month subscription?

by Jim Clyde Monge

Google’s AI division has been under intense pressure to keep pace with OpenAI’s groundbreaking GPT-4 language model. Their initial release of Gemini Ultra, positioned as their best offering yet, left many users, including me, underwhelmed.

Today, Google dropped a bombshell — Gemini 1.5 — a dramatically improved version of their flagship language model.

What’s New in Gemini 1.5?

Gemini 1.5 delivers a host of major enhancements that address the shortcomings of the initial version:

  • Mixture of Experts (MoE) Architecture: Google is adopting the MoE architecture that reportedly powers GPT-4. Instead of running the entire network for every input, an MoE model routes each piece of the input to a small set of specialized “expert” subnetworks and activates only those, dramatically boosting efficiency and performance.
  • Performance Breakthrough: Gemini 1.5 Pro, the mid-tier version, now allegedly rivals the top-of-the-line Gemini 1.0 Ultra. That’s a massive performance increase in a very short amount of time. We need benchmarks of course, but this is incredibly promising.
  • “Needle-in-a-Haystack” Information Retrieval: The new model demonstrates significantly improved ability to pinpoint specific details within an enormous volume of text, video, or audio data.
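
To make the Mixture of Experts idea more concrete, here is a minimal sketch of top-k expert routing in plain Python. It is a toy illustration only, not Google's implementation: the names `moe_layer`, `make_expert`, and `gate_weights` are hypothetical, and the "experts" are simple scaling functions standing in for real neural subnetworks.

```python
import math


def softmax(scores):
    """Turn raw scores into weights that sum to 1."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def make_expert(scale):
    """Hypothetical expert: multiplies every input value by `scale`."""
    return lambda vector: [scale * v for v in vector]


def moe_layer(token, gate_weights, experts, top_k=2):
    """Route one token vector to its top_k experts and blend their outputs."""
    # The gate scores each expert with a dot product against the token.
    scores = [sum(w * v for w, v in zip(row, token)) for row in gate_weights]
    # Only the top_k highest-scoring experts are run; the rest stay idle.
    chosen = sorted(range(len(experts)), key=lambda i: scores[i], reverse=True)[:top_k]
    weights = softmax([scores[i] for i in chosen])
    output = [0.0] * len(token)
    for weight, index in zip(weights, chosen):
        expert_out = experts[index](token)
        output = [o + weight * e for o, e in zip(output, expert_out)]
    return output, chosen


# Toy setup (all values are made up for illustration).
experts = [make_expert(s) for s in (1.0, 2.0, 3.0, 4.0)]
gate_weights = [[1.0, 0.0], [0.0, 1.0], [2.0, 0.0], [-1.0, 0.0]]
token = [1.0, 0.0]

output, chosen = moe_layer(token, gate_weights, experts, top_k=2)
print("experts used:", chosen)
print("blended output:", output)
```

The point of the design is that only two of the four experts do any work for this token, which is where the efficiency gain comes from in a real model with far larger experts.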

The Million-Token Context Upgrade

Perhaps most shocking is the upgrade in context window size. While many large language models (LLMs), including Gemini 1.0, max out at around 32,000 tokens, Gemini 1.5 Pro, in its experimental build, can process a mind-boggling 1 million tokens.

1 million tokens can handle:

  • 1 hour of video
  • 11 hours of audio
  • More than 30K lines of code
  • More than 700K words
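
For a rough sense of scale, the figures above imply about 700K words per 1 million tokens. The sketch below uses that ratio to estimate whether a piece of text would fit in a given context window. The ratio is an assumption taken from the numbers in this article; real tokenizers vary by language and content, and the helper names `estimate_tokens` and `fits_in_context` are hypothetical.

```python
# Assumed ratio from the article's figures: 1,000,000 tokens ~ 700,000 words.
REFERENCE_TOKENS = 1_000_000
REFERENCE_WORDS = 700_000


def estimate_tokens(text):
    """Estimate the token count from a whitespace word count (rounded up)."""
    words = len(text.split())
    # Integer ceiling division avoids floating-point rounding surprises.
    return (words * REFERENCE_TOKENS + REFERENCE_WORDS - 1) // REFERENCE_WORDS


def fits_in_context(text, context_window=1_000_000):
    """Return True if the estimated token count fits in the context window."""
    return estimate_tokens(text) <= context_window


sample = "word " * 100_000  # stand-in for a roughly 100K-word manuscript
print("estimated tokens:", estimate_tokens(sample))
print("fits in 1M window:", fits_in_context(sample))
print("fits in 32K window:", fits_in_context(sample, context_window=32_000))
```

Treat the result as a ballpark figure only; an exact count requires the model's own tokenizer.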

This could be a game-changer — let me unpack what this means.

Imagine providing the entire script of a feature-length movie, thousands of lines of complex code, or pages upon pages of a book. The LLM will have enough context to analyze nuanced interactions, track characters over extended plots, or find code errors on a massive scale.

It’s the difference between asking a chatbot about a 30-second conversation versus asking it to dissect the motivations of characters across the entire Lord of the Rings trilogy.

What Does This Mean For Users?

These aren’t just flashy numbers; Gemini 1.5’s new abilities translate to tangible benefits:

  • Understanding Complexity: Toss that dense technical report or multi-page legal contract into Gemini; it’ll extract summaries, answer detailed questions, and even flag unclear or contradictory points.
  • Enhanced Creativity: Longer context means broader inspiration. Feed Gemini 1.5 a few paragraphs of your story, and it might suggest plot twists, dialogue, or even fresh character angles you’d never considered.
  • Streamlined Processes: Need to analyze hours of customer feedback? Gemini could parse it for sentiment, extract key complaints, and generate summaries effortlessly. This has serious potential for saving time and money in all types of businesses.

Is Gemini Advanced Now Worth It?

On paper, Gemini 1.5 is definitely worth the upgrade.

Gemini 1.0 had enough issues that switching was pointless for users with an existing ChatGPT Plus subscription.

But seeing these leaps in capability, I’m inclined to say Gemini Advanced looks incredibly promising.

Does it hold up to real-world usage? I don’t know yet. I understand some hesitation to spend money right away, but considering the rapid pace of progress and the free trial offerings, I’d say give it a shot.

Final Words

Google surprised me. I was initially disappointed with Gemini’s debut, but 1.5 represents an ambitious about-face. While independent benchmarks are still needed, there’s no denying — Google is back in the game and smelling blood.

The pressure is on OpenAI to raise the bar again.

Google likely felt that heat; if GPT-4 was a wake-up call, we’re witnessing a company scrambling to get competitive. The fact that their response is this significant — with real-world implications — makes this such an exciting chapter in the escalating AI race.
