🤖 Ghostwritten by Claude · Curated by Tom Hundley
The first model to cross 1500 Elo. The birth of generative interfaces. And a $2.4 billion bet that changed how we code. November 2025 will be remembered.
Three years ago, Google declared Code Red over ChatGPT. On November 18, 2025, Sam Altman declared his own Code Red—this time over Gemini 3.
The tables haven't just turned. They've been flipped.
Gemini 3 represents the most significant single release in Google's AI history. It's now integrated into products reaching over 2.6 billion monthly users (650 million on the Gemini app, 2 billion on AI Overviews in Search). It achieved the first-ever 1500+ Elo score on LMArena. And it introduced an entirely new paradigm for how AI presents information.
This isn't hype. This is the moment Google stopped catching up and started leading.
Before diving into capabilities, let's establish the competitive landscape Gemini 3 entered:
| Benchmark | Gemini 3 Pro | Claude Opus 4.5 | GPT-5.2 |
|---|---|---|---|
| LMArena Elo | 1501 (first 1500) | 1478 | 1485 |
| Humanity's Last Exam | 41.0% (Deep Think) | 35.2% | 38.4% |
| GPQA Diamond (PhD science) | 93.8% | 91.2% | 93.2% |
| SWE-bench Verified (coding) | 76.2% | 80.9% | 80.0% |
| MMMU-Pro (multimodal) | 81.0% | 72.4% | 68.9% |
| ScreenSpot-Pro (screenshot understanding) | 72.7% | 36.1% | 3.6% |
The pattern: Gemini 3 dominates reasoning and multimodal understanding. Claude leads production coding. GPT-5.2 offers strong all-around performance. But that ScreenSpot-Pro gap—Gemini scoring roughly 20x higher than GPT—signals something deeper about where Google's model excels.
The most transformative feature in Gemini 3 isn't faster reasoning or longer context windows. It's Generative UI—the ability for the model to decide what kind of output best fits your prompt.
Traditional AI: You ask a question, you get text back.
Gemini 3: You ask a question, and the model decides how best to answer it: plain text, a magazine-style visual layout, or a fully interactive module, depending on the query.
Ask Gemini to explain Van Gogh's gallery with context for each piece, and you don't get paragraphs of text. You get an immersive, image-rich layout resembling a digital museum exhibition.
Ask about mortgage rates, and you might get a functional loan calculator generated in real-time.
Generative UI operates through two key mechanisms:
Visual Layout: Gemini assembles magazine-style responses with modules, images, and structured visual hierarchies. The model determines the optimal presentation based on the query type.
Dynamic View: Using what Google calls agentic coding, Gemini can build interactive modules—calculators, galleries, simulations—within the response itself.
This is now live in the Gemini app and Google Search's AI Mode. A GenUI SDK for Flutter is available in alpha for developers who want to build similar experiences.
Generative UI eliminates the abstraction layer between intent and output. Instead of asking an AI for information and then formatting it yourself, the AI delivers the complete experience. This is closer to how humans actually want to consume information—not as walls of text, but as purposeful presentations matched to the content.
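Conceptually, the model makes a routing decision before rendering anything. Here is a toy sketch of that decision, not Google's implementation; the mode names and keyword heuristic are purely illustrative stand-ins for what the model learns end to end:

```python
# Illustrative response modes mirroring the article's description
TEXT, VISUAL_LAYOUT, DYNAMIC_VIEW = "text", "visual_layout", "dynamic_view"

def choose_response_mode(prompt: str) -> str:
    """Toy heuristic: pick the output format that best fits the query.
    The real model learns this routing; this sketch only illustrates that
    the format decision happens before any content is generated."""
    p = prompt.lower()
    # Queries that benefit from interaction get a generated module
    if any(w in p for w in ("calculator", "simulate", "compare rates")):
        return DYNAMIC_VIEW
    # Visually rich subjects get a magazine-style layout
    if any(w in p for w in ("gallery", "timeline", "tour")):
        return VISUAL_LAYOUT
    return TEXT

print(choose_response_mode("Build me a mortgage calculator"))  # dynamic_view
print(choose_response_mode("Van Gogh gallery with context"))   # visual_layout
```

The point of the sketch: formatting stops being the user's job and becomes part of the model's answer.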
Gemini 3 Deep Think isn't just longer thinking. It's a fundamentally different approach to complex problem-solving.
Deep Think uses parallel reasoning to explore multiple hypotheses simultaneously. Rather than pursuing a single chain of thought, it branches into iterative rounds of reasoning, evaluating different approaches before converging on an answer.
According to Google: "Unlike previous versions where Gemini would respond hastily, Deep Think ensures the model first deeply understands the question, considers its various aspects, and then responds with greater thoughtfulness."
Deep Think builds on research variants that achieved gold-medal-level results in elite mathematics and programming competitions.
Deep Think excels where problems require sustained multi-step reasoning, weighing competing hypotheses, and verifying intermediate steps before committing to an answer.
Deep Think is available to Google AI Ultra subscribers ($250/month). Select Deep Think in the prompt bar with Gemini 3 Pro selected in the model dropdown.
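The mechanics can be illustrated with a best-of-N search: spawn several candidate reasoning branches, score each, and keep the strongest across iterative rounds before converging. This is a hedged sketch of the general pattern, not Google's algorithm; `reason_once` is a stand-in for sampling the model with different seeds:

```python
import random

def reason_once(question: str, seed: int) -> tuple[str, float]:
    """Stand-in for one reasoning branch: returns (answer, self-assessed
    score). A real system would sample the model with varied seeds."""
    rng = random.Random(seed)
    return f"candidate-{seed}", rng.random()

def deep_think(question: str, branches: int = 4, rounds: int = 3) -> str:
    """Parallel-then-iterative search: explore several hypotheses per
    round, keep the best seen so far, and converge across rounds."""
    best_answer, best_score = "", -1.0
    for r in range(rounds):
        candidates = [reason_once(question, seed=r * branches + i)
                      for i in range(branches)]
        answer, score = max(candidates, key=lambda c: c[1])
        if score > best_score:
            best_answer, best_score = answer, score
    return best_answer

print(deep_think("Prove the statement"))
```

The trade-off is compute for reliability: more branches and rounds cost more tokens but make a single unlucky chain of thought far less likely to decide the final answer.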
The story behind Google Antigravity is as significant as the product itself.
In May 2025, OpenAI reached a $3 billion agreement to acquire Windsurf (formerly Codeium). Microsoft blocked the deal over exclusivity concerns. The agreement expired in July.
Google moved fast. For $2.4 billion, it secured Windsurf's leadership, key engineering talent, and a license to its technology.
Four months later, that team shipped Antigravity.
Antigravity is a VS Code fork rebuilt around agent-first architecture. It's not just AI-assisted coding—it's AI-managed coding.
Two Modes:
Editor View: Traditional hands-on coding with an agent sidebar. Claude Code and Cursor users will feel at home.
Manager View: Mission control for orchestrating multiple agents and workspaces asynchronously. You architect; agents execute.
Key capabilities: agents that plan tasks, write code, run terminal commands, and verify results in a browser with minimal supervision.
Traditional AI coding tools are reactive—you prompt, they respond. Antigravity is proactive. You describe a feature; multiple agents work different aspects in parallel, coordinating through the Manager View.
As Antigravity's documentation states: "You act as the architect, collaborating with intelligent agents that operate autonomously across the editor, terminal, and browser."
Antigravity is free during public preview. Available for macOS, Windows, and Linux.
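The Manager View pattern, in which one architect fans a feature out to several autonomous agents working in parallel, can be sketched with plain asyncio. The agent names and assignments here are illustrative, not Antigravity's API:

```python
import asyncio

async def agent(name: str, task: str) -> str:
    """Stand-in for an autonomous agent working on one aspect of a
    feature; real agents would edit code, run tests, and browse."""
    await asyncio.sleep(0)
    return f"{name}: {task} done"

async def manager(feature: str) -> list[str]:
    """Manager View sketch: assign aspects of a feature to agents
    running concurrently, then gather their reports."""
    assignments = {
        "frontend-agent": f"build UI for {feature}",
        "backend-agent": f"implement API for {feature}",
        "test-agent": f"write tests for {feature}",
    }
    results = await asyncio.gather(
        *(agent(name, task) for name, task in assignments.items())
    )
    return list(results)

print(asyncio.run(manager("user login")))
```

The design point is the inversion of control: the human specifies the feature once, and coordination across workstreams happens in the manager loop rather than in the developer's head.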
With Gemini 3, Google introduced Gemini Agent—an experimental feature for multi-step task execution directly within the Gemini app.
Gemini Agent decomposes user intent into discrete operations, then calls services like Gmail, Calendar, Drive, Maps, or YouTube as needed to complete the task.
Example workflow: ask Gemini to organize a team dinner. The agent checks Calendar for a free evening, searches Maps for nearby restaurants, and drafts a Gmail invite, pausing for your approval before anything is sent.
The key word: confirmation. Gemini Agent asks before making important changes—purchases, deletions, sensitive operations.
| Service | Actions |
|---|---|
| Gmail | Triage, summarize threads, draft responses |
| Calendar | Create/modify events, resolve conflicts, set reminders |
| Drive | Search files, organize folders, prepare documents |
| Deep Research | Synthesize information across web sources |
| Canvas | Generate and iterate on content |
Gemini Agent is experimental and currently limited to Google AI Ultra subscribers.
Google explicitly states: "Your supervision is important to help prevent unintended and potentially harmful actions."
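The decompose-then-dispatch loop, including a confirmation gate for sensitive actions, can be sketched as follows. The service names mirror the integrations described above, but the step structure and `confirm` callback are assumptions for illustration, not Google's implementation:

```python
# Action types that require explicit user approval before execution
SENSITIVE = {"purchase", "delete", "send"}

def execute_plan(steps, confirm):
    """Run a list of (service, action, detail) steps, pausing for user
    confirmation before any sensitive action. `confirm` is a callable
    returning True/False, standing in for the UI prompt."""
    log = []
    for service, action, detail in steps:
        if action in SENSITIVE and not confirm(service, action, detail):
            log.append(f"skipped {action} via {service}")
            continue
        # A real agent would call the service API here
        log.append(f"{action} via {service}: {detail}")
    return log

plan = [
    ("Calendar", "create", "team dinner Friday 7pm"),
    ("Gmail", "send", "invite to team"),
]
# Decline every sensitive action: the event is created, the email is not
print(execute_plan(plan, confirm=lambda s, a, d: False))
```

Separating the plan from the gate is what makes "supervision" enforceable: the agent can propose anything, but irreversible operations cannot run without an affirmative answer.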
According to Googles internal testing, agent-assisted meeting scheduling increased success rates from 62% (manual) to 78% (with structured follow-ups). The compound effect of reliable small automations adds up.
Gemini 3's capabilities extend beyond screens. Project Astra brings these models into the physical world.
At The Android Show: XR Edition, Google confirmed two smart glasses form factors shipping in 2026:
- Screen-free glasses (with Gentle Monster and Warby Parker)
- Display AI glasses
Project Astra integrates Gemini's multimodal understanding into continuous real-world interaction. The glasses see and hear what you do, maintaining context across your day.
According to Google: "Android XR is the first Android platform built in the Gemini era."
This represents Google's long-term vision: AI that isn't confined to devices you hold, but ambient intelligence that accompanies you.
Workspace Studio, launched December 3, 2025, brings Gemini 3's agentic capabilities to business users without code.
Transforms Gmail, Drive, Docs, Sheets, Chat, and Meet into AI automation platforms. Users with no coding experience can create agents in minutes.
Example agents: an inbox triager that labels and drafts Gmail replies, a reporting agent that summarizes Sheets data into Docs, and a meeting follow-up agent that posts Meet recaps to Chat.
Google reports over 20 million tasks processed by Workspace Studio agents in the first 30 days among alpha program participants.
The platform is included at no extra cost in all Google Workspace business and enterprise plans.
Google Cloud showed 35% revenue growth with a $155 billion backlog. CEO Sundar Pichai noted they signed more billion-dollar deals through Q3 2025 than in the previous two years combined.
Gemini 3's enterprise integration strategy embeds AI within existing workflows rather than requiring new ones—a critical factor for adoption.
Gemini 3 Pro's API pricing undercuts competitors significantly:
| Model | Input (per 1M tokens) | Output (per 1M tokens) |
|---|---|---|
| Gemini 3 Pro | $2.00 | $12.00 |
| Claude Opus 4.5 | $5.00 | $25.00 |
| GPT-5.2 | $5.00 | $15.00 |
Note that Gemini 3 Pro charges higher rates for contexts over 200K tokens.
| Tier | Price | Features |
|---|---|---|
| Free | $0 | Basic access, rate limited |
| AI Pro | $20/month | Gemini 3 Pro, standard limits |
| AI Ultra | $250/month | Deep Think, highest limits, early features |
Google waived charges during the initial preview weeks—many developers reported $0 bills even after thousands of requests. Standard pricing applies once the model reaches GA (expected Q1 2026).
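The gap is easy to quantify. A small calculator using the list prices from the table above, applied to a hypothetical reasoning-heavy workload (the workload split is an assumption for illustration):

```python
# Dollars per 1M tokens (input, output), from published list prices
PRICES = {
    "gemini-3-pro": (2.00, 12.00),
    "claude-opus-4.5": (5.00, 25.00),
    "gpt-5.2": (5.00, 15.00),
}

def job_cost(model: str, input_mtok: float, output_mtok: float) -> float:
    """Cost of a job given input/output volume in millions of tokens."""
    in_rate, out_rate = PRICES[model]
    return input_mtok * in_rate + output_mtok * out_rate

# Hypothetical workload: 10M input tokens, 2M output tokens
for model in PRICES:
    print(f"{model}: ${job_cost(model, 10, 2):.2f}")
# gemini-3-pro: $44.00, claude-opus-4.5: $100.00, gpt-5.2: $80.00
```

On this mix, Gemini 3 Pro comes in at less than half the cost of either competitor; the exact savings depend on your input/output ratio and on long-context surcharges.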
OpenAI's internal Code Red came two weeks after Gemini 3's launch. The response: GPT-5.2, released three weeks ahead of schedule on December 11, 2025.
To focus resources on ChatGPT, OpenAI paused its advertising push and pulled the GPT-5.2 release forward.
Despite Code Red, ChatGPT maintains its lead in direct consumer usage.
But Google's distribution advantage—Search, Workspace, Android—means Gemini touches more moments in more contexts. The question isn't just users; it's integration depth.
By December, Altman indicated ChatGPT metrics hadn't dropped as feared: "I believe that when a competitive threat happens, you want to focus on it, deal with it quickly." He projected exiting Code Red by January.
For teams evaluating AI infrastructure, Gemini 3 shifts the calculus:
Multimodal understanding is critical: ScreenSpot-Pro's 20x advantage over GPT isn't a typo. If your use case involves parsing images, screenshots, or visual interfaces, Gemini leads.
Cost matters at scale: At $2/$12 per million tokens (vs. $5/$25 for Claude Opus), Gemini cuts input costs by 60% and output costs by more than half on complex reasoning tasks.
You're in the Google ecosystem: Workspace integration, Search deployment, Android XR roadmap—Gemini has native pathways everywhere.
You need generative UI: No other model offers dynamic interface generation at this level of maturity.
Production coding is the priority: Claude Opus 4.5s 80.9% on SWE-bench Verified still leads. For software engineering teams, that gap matters.
You need proven stability: Gemini 3 is new. Claude and GPT have more production miles.
Your workflows depend on specific tool integrations: Evaluate MCP support and existing toolchains before switching.
Gemini 3 is the first model to cross 1500 Elo on LMArena. The reasoning benchmark gap is real, not marketing.
Generative UI changes how we think about AI output. Information delivered as experience, not text dumps. This is a paradigm shift.
Deep Think's parallel reasoning isn't slower answers. It's a fundamentally different problem-solving architecture, achieving gold-medal mathematical and programming performance.
Antigravity represents a $2.4B bet on agent-first development. Google acquired Windsurf's best people and tech, then shipped in four months.
The ecosystem play is Google's real advantage. 2 billion AI Overview users, Workspace Studio, Android XR glasses—Gemini touches more contexts than any competitor.
OpenAI's Code Red validated the threat. When Sam Altman pauses advertising and ships early, the competitive pressure is real.
Pricing is a weapon. At $2/$12 vs. competitors' $5/$25, Gemini 3 makes frontier reasoning economically accessible.
The AI landscape just shifted. Gemini 3 didn't just catch up to the frontier—it expanded it. The question now isn't whether Google is competitive. It's how everyone else responds.
This article is a live example of the AI-enabled content workflow we build for clients.
| Stage | Who | What |
|---|---|---|
| Research | Claude Opus 4.5 | Analyzed current industry data, studies, and expert sources |
| Curation | Tom Hundley | Directed focus, validated relevance, ensured strategic alignment |
| Drafting | Claude Opus 4.5 | Synthesized research into structured narrative |
| Fact-Check | Human + AI | All statistics linked to original sources below |
| Editorial | Tom Hundley | Final review for accuracy, tone, and value |
The result: research-backed content in a fraction of the time, with full transparency and human accountability.
We're an AI enablement company. It would be strange if we didn't use AI to create content. But more importantly, we believe the future of professional content isn't AI vs. Human—it's AI amplifying human expertise.
Every article we publish demonstrates the same workflow we help clients implement: AI handles the heavy lifting of research and drafting, humans provide direction, judgment, and accountability.
Want to build this capability for your team? Let's talk about AI enablement →