
Google has just launched its latest state-of-the-art lightweight language model, Gemma 3. The model looks promising, outperforming Meta’s Llama 3, DeepSeek-V3, and OpenAI’s o3-mini in standard benchmark tests. While Google claims it is the “world’s best single-accelerator model,” let’s see how well it actually performs against other popular models. In this Gemma 3 27B vs DeepSeek-R1 comparison, we will look into the features, benchmarks, and performance of the new model and compare them with those of China’s renowned DeepSeek-R1.
What is Gemma 3?
Gemma 3 is Google’s latest open-source AI model series, designed for seamless deployment across a range of devices, from handhelds to enterprise-level workstations. Gemma 3 introduces multimodal capabilities, powered by a SigLIP-based vision encoder, enabling it to process both textual and visual content, including images and short video clips. In Google AI Studio, it can also take file and folder uploads as contextual input.
While large models like Grok 3 utilize over 100,000 NVIDIA H100 GPUs and DeepSeek-R1 reportedly runs on a cluster of around 32 GPUs, Gemma 3 is designed to run on just a single accelerator. Despite that, and its small size of just 27B parameters, it has been shown to outperform much larger models like DeepSeek-V3, OpenAI’s o3-mini, Llama3-405B, and Mistral Large.
Key Features of Gemma 3
Here are some of the key features of Google’s latest Gemma 3 model:
- Multiple Variations: Gemma 3 is available in various sizes – 1B, 4B, 12B, and 27B – making it efficient and cost-effective for diverse use cases.
- Small Size: The largest variant, Gemma 3 27B, is designed to deliver high performance while maintaining efficiency, owing to its 27B parameter size.
- Single Accelerator Compatibility: The model is optimized to run on a single accelerator, whether an NVIDIA GPU or a Google TPU, making it accessible on devices ranging from smartphones to workstations.
- Multimodality: Gemma 3 can analyze text, images, and short videos, enabling applications such as visual question answering and image-based storytelling.
- Google Integration: Since it’s developed by Google, Gemma 3 lets users upload files directly from Google Drive.
- Multilingual: With out-of-the-box support for over 35 languages and pre-trained support for more than 140, Gemma 3 facilitates tasks like translation and optical character recognition (OCR).
- Large Context Window: It supports 32k tokens in the 1B model and up to 128k tokens in larger models, as opposed to just 8k tokens in Gemma 2.
- ShieldGemma 2: An image safety classifier that filters explicit, dangerous, or violent content, enhancing the safety of generated outputs.
How to Access Gemma 3
Gemma 3 is available for use on Google AI Studio. Here’s how you can access it:
- Open Google AI Studio
Open Google AI Studio at aistudio.google.com in your browser.
- Login or Sign Up
Sign in with your Google account. Sign up for one if you don’t have an account already.
- Select Gemma 3 27B
Once signed in, go to the model selection dropdown list and scroll all the way down to find Gemma 3 27B. Simply select the model and start chatting with it.
Alternatively, you can try Gemma 3 directly in its Hugging Face Space, or run it locally and build with it using Hugging Face Transformers, Keras, JAX, and Ollama, as sketched below.
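If you’d rather run Gemma 3 on your own machine, here is a minimal sketch using Hugging Face Transformers. It uses the text-only `google/gemma-3-1b-it` checkpoint for illustration since it fits on modest hardware; the 27B instruction-tuned checkpoint follows the same pattern but is multimodal and needs far more memory. It also assumes a recent Transformers release and that you have accepted the Gemma license on Hugging Face.

```python
# Minimal sketch: running a Gemma 3 checkpoint with Hugging Face Transformers.
# Assumes transformers >= 4.50, PyTorch, and a Hugging Face login with the
# Gemma license accepted (huggingface-cli login).
import torch
from transformers import pipeline

generator = pipeline(
    "text-generation",
    model="google/gemma-3-1b-it",   # text-only 1B variant; larger variants follow the same pattern
    device_map="auto",              # place the model on the available GPU, if any
    torch_dtype=torch.bfloat16,
)

messages = [{"role": "user", "content": "Explain in two sentences what Gemma 3 is."}]
result = generator(messages, max_new_tokens=128)
print(result[0]["generated_text"][-1]["content"])   # the assistant's reply
```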
Gemma 3 vs DeepSeek-R1: Features Comparison
Now let’s begin the Gemma 3 vs DeepSeek-R1 comparison. We’ll first look at their features and see what each model has to offer.
| Feature | Gemma 3 | DeepSeek-R1 |
|---|---|---|
| Model Sizes | 1B, 4B, 12B, 27B parameters | 671B total (37B active per query) |
| Context Window | Up to 128K tokens in 27B model, 32K in 1B model | Up to 128K tokens |
| GPU Needs | Runs on a single GPU/TPU | Needs high-end GPUs (H800/H100) |
| Image Generation | ❌ No | ❌ No |
| Image Analysis | ✅ Yes (via SigLIP) | ❌ No |
| Video Analysis | ✅ Yes (short clips) | ❌ No |
| Multimodality | ✅ Text, images, videos | ❌ Mainly text-based; can do text-extraction from images |
| File Uploads | ✅ Text, images, videos | ❌ Mostly text input |
| Web Search | ❌ No | ✅ Yes |
| Languages | 35+ supported, trained in 140+ | Best for English & Chinese |
| Safety | ✅ Strong safety by ShieldGemma 2 | ❌ Weaker safety, jailbreak risks |
Also Read: QwQ-32B vs DeepSeek-R1: Can a 32B Model Challenge a 671B Parameter Model?
Gemma 3 vs DeepSeek-R1: Performance Comparison
Now that we know what Gemma 3 and DeepSeek-R1 are capable of doing, let’s test out some of their common features and compare their performance. For this comparison, we’ll be testing the models’ performance on the following three tasks:
- Coding: creating an animation
- Logical Reasoning: solving a puzzle
- STEM Problem-solving: solving a Physics problem
For each task, we’ll try out the same prompt on both the models and evaluate their responses based on the speed of generation and quality of the output.
If you’d like to follow along and try the prompts yourself, you can access DeepSeek-R1 by enabling the ‘DeepThink (R1)’ toggle on DeepSeek’s chat interface.
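If you’d rather script the comparison than use the chat UI, DeepSeek also exposes an OpenAI-compatible API. Below is a minimal sketch; it assumes the `deepseek-reasoner` model name and an API key from DeepSeek’s platform, so check DeepSeek’s documentation for the current details.

```python
# Minimal sketch: calling DeepSeek-R1 through DeepSeek's OpenAI-compatible API.
# The API key below is a placeholder; "deepseek-reasoner" is assumed to map to R1.
from openai import OpenAI

client = OpenAI(
    api_key="YOUR_DEEPSEEK_API_KEY",     # placeholder, not a real key
    base_url="https://api.deepseek.com",
)

response = client.chat.completions.create(
    model="deepseek-reasoner",           # DeepSeek's reasoning (R1) model
    messages=[{"role": "user", "content": "In one line: what is orbital velocity?"}],
)
print(response.choices[0].message.content)
```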
Task 1: Coding
Let’s start off by testing the coding capabilities of both models. For this task, I am going to ask Gemma 3 and DeepSeek-R1 to write a Python program for a physics-based animation. We’ll run the code generated by both models on Google Colab and compare the outputs.
Prompt: ”Write a python program that shows a ball bouncing inside a spinning pentagon, following the laws of Physics, increasing its speed every time it bounces off an edge.”
Output by Gemma 3’s Code

Output by DeepSeek-R1’s Code
Comparative Analysis
Gemma 3 starts writing the code almost immediately after receiving the prompt. DeepSeek-R1, on the other hand, begins by restating the problem and walking us through its thought process. Both models provide instructions on how to run the code. Gemma also lists some key improvements and explanations, while DeepSeek explains the components of the animation and mentions its adjustable parameters.
That being said, what Gemma created was a sequence of identical static frames of a pentagon rather than an actual animation, which was quite disappointing. DeepSeek-R1, meanwhile, did a great job of creating a simulation as per the prompt, although the ball eventually flew off the screen once its speed grew too high. Hence, quite evidently, DeepSeek-R1 wins this round.
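For reference, here is a rough sketch of one way the prompt could be implemented, written with matplotlib so it also runs in Colab. This is not either model’s output, and the physics is simplified: the ball mirrors off each edge’s normal and the wall’s own rotational motion is ignored, so treat it as an approximation rather than a full simulation.

```python
# Sketch: ball bouncing inside a spinning pentagon, speeding up on each bounce.
# Simplified physics: gravity + mirror reflection off edges; wall motion ignored.
import numpy as np
import matplotlib.pyplot as plt
from matplotlib.animation import FuncAnimation
from matplotlib.patches import Circle, Polygon

R, BALL_R = 5.0, 0.25                  # pentagon circumradius, ball radius
SPIN = 0.8                             # pentagon angular speed (rad/s)
GRAVITY = np.array([0.0, -9.8])        # gravity in plot coordinates
SPEEDUP = 1.05                         # speed multiplier on every bounce
DT = 0.02

pos = np.array([0.0, 1.0])             # ball position
vel = np.array([2.0, 0.0])             # ball velocity

def pentagon(theta):
    """Vertices of a regular pentagon rotated by angle theta."""
    angles = theta + 2 * np.pi * np.arange(5) / 5
    return np.stack([R * np.cos(angles), R * np.sin(angles)], axis=1)

fig, ax = plt.subplots(figsize=(6, 6))
ax.set_xlim(-R - 1, R + 1); ax.set_ylim(-R - 1, R + 1); ax.set_aspect("equal")
wall = Polygon(pentagon(0.0), fill=False); ax.add_patch(wall)
ball = Circle(pos, BALL_R, color="tab:red"); ax.add_patch(ball)

def update(frame):
    global pos, vel
    verts = pentagon(SPIN * frame * DT)
    vel = vel + GRAVITY * DT
    pos = pos + vel * DT
    for i in range(5):                          # check every edge for a bounce
        a, b = verts[i], verts[(i + 1) % 5]
        edge = b - a
        n = np.array([-edge[1], edge[0]])
        n = n / np.linalg.norm(n)
        if np.dot(n, -a) < 0:                   # make the normal point inward
            n = -n
        dist = np.dot(pos - a, n)               # signed distance from the edge
        if dist < BALL_R and np.dot(vel, n) < 0:
            vel = (vel - 2 * np.dot(vel, n) * n) * SPEEDUP   # reflect and speed up
            pos = pos + (BALL_R - dist) * n                  # push back inside
    wall.set_xy(verts)
    ball.center = pos
    return wall, ball

anim = FuncAnimation(fig, update, frames=600, interval=20)
plt.show()   # in Colab, use: from IPython.display import HTML; HTML(anim.to_jshtml())
```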
Score: Gemma 3: 0 | DeepSeek-R1: 1
Also Read: Google Gemini 2.0 Pro vs DeepSeek-R1: Who Does Coding Better?
Task 2: Logical Reasoning
In this task, we’ll give the models a logical puzzle to solve and compare their responses.
Prompt: “A solid, four-inch cube of wood is coated with blue paint on all six sides.
Then the cube is cut into smaller one-inch cubes.
These new one-inch cubes will have either three blue sides, two blue sides, one blue side, or no blue sides. How many of each will there be?”
Response by Gemma 3


Response by DeepSeek-R1


Comparative Analysis
Both models solved the puzzle accurately using logical reasoning. However, Gemma 3 took only about 30 seconds to reach the right answer, while DeepSeek-R1 took almost twice as long. Gemma incorporates its thought process directly into the answer, while DeepSeek walks us through its reasoning in detail before generating the response. Although that transparency helps us understand how the model thinks, I found it unnecessarily long for this task. For a simpler, quicker answer, I give this point to Gemma 3!
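Incidentally, the correct answer (8 small cubes with three painted sides, 24 with two, 24 with one, and 8 unpainted) is easy to verify with a quick brute-force count, as in the short sketch below.

```python
# Brute-force check of the painted-cube puzzle: for every 1-inch cube in a
# 4x4x4 block, count how many of its coordinates lie on the outer surface.
from collections import Counter

n = 4
counts = Counter()
for x in range(n):
    for y in range(n):
        for z in range(n):
            painted_faces = sum(c in (0, n - 1) for c in (x, y, z))
            counts[painted_faces] += 1

print(dict(counts))   # {3: 8, 2: 24, 1: 24, 0: 8}
```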
Score: Gemma 3: 1 | DeepSeek-R1: 1
Also Read: Can o3-mini Replace DeepSeek-R1 for Logical Reasoning?
Task 3: STEM Problem-solving
In this task, we’ll test how well these models solve problems related to science and math. I’ll give them a physics problem involving calculations and see how well they can solve it.
Prompt: “A satellite of mass 500 kg is in a circular orbit around Earth at an altitude of 500 km. Given:
- Mass of Earth = 5.97 × 10²⁴ kg
- Radius of Earth = 6,371 km
- Gravitational constant (G) = 6.674 × 10⁻¹¹ N·m²/kg²
Calculate:
a) The orbital velocity of the satellite.
b) The period of revolution around Earth.”
Response by Gemma 3
“Okay, let’s break down the calculations for the satellite’s orbital velocity and period.
a) Orbital Velocity
The orbital velocity (v) of a satellite in a circular orbit is determined by the balance between the gravitational force and the centripetal force. The formula is:
v = √(GM/r)
Where:
- G = Gravitational constant (6.674 × 10⁻¹¹ N·m²/kg²)
- M = Mass of Earth (5.97 × 10²⁴ kg)
- r = Orbital radius (distance from the center of Earth to the satellite)
First, we need to calculate the orbital radius (r):
r = Radius of Earth + Altitude
Plugging in the values,
r = 6,371 km + 500 km
r = 6,871 km = 6,871,000 m (Convert to meters)
Now, plug the values into the velocity formula:
v = √((6.674 × 10⁻¹¹ N·m²/kg² * 5.97 × 10²⁴ kg) / 6,871,000 m)
v = √((3.9857 × 10¹⁴ N·m²/kg) / 6,871,000 m) = √(5.799 × 10⁷ m²/s²) ≈ 7615 m/s
Therefore, the orbital velocity of the satellite is approximately 7.615 km/s.
b) Period of Revolution
The period (T) of revolution is the time it takes for the satellite to complete one orbit. It can be calculated using the following formula:
T = 2πr / v
Where:
- π ≈ 3.14159
- r = Orbital radius (6,871,000 m)
- v = Orbital velocity (7615 m/s)
Plug in the values:
T = (2 * 3.14159 * 6,871,000 m) / 7615 m/s
T = 43,109,886.8 m / 7615 m/s ≈ 5652 seconds
Convert to minutes:
T ≈ 5652 seconds / 60 seconds/minute
T ≈ 94.2 minutes
Therefore, the period of revolution of the satellite around Earth is approximately 94.2 minutes.
Summary:
- Orbital Velocity: 7.615 km/s
- Period of Revolution: 94.2 minutes”
Response by DeepSeek-R1



Comparative Analysis
While both models worked through all the steps correctly, Gemma did it in just 35 seconds, over 6 times faster than DeepSeek-R1! As in the previous tasks, DeepSeek-R1 explains its entire thought process before generating the response, while Gemma 3 starts generating the response right away, explaining the steps as it goes. Gemma gave the velocity in km/s, while DeepSeek gave it in m/s, the SI unit of velocity.
For the second part of the question, although both models used the same formula and values, Gemma 3 miscalculated 2πr, i.e. 2 × 3.14159 × 6,871,000, as 43,109,886.8 instead of the actual value, 43,171,729.78. As a result, its final answer came out roughly 17 seconds short of the correct period of about 5,669 seconds, a significant gap in space-related calculations. Hence, for this task as well, DeepSeek-R1 gets the point.
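As a sanity check, the correct values are easy to reproduce with a few lines of Python using the constants from the prompt (a quick sketch, not either model’s output):

```python
# Sanity check for the satellite problem, using the constants from the prompt.
import math

G = 6.674e-11            # gravitational constant, N·m²/kg²
M = 5.97e24              # mass of Earth, kg
r = (6371 + 500) * 1e3   # orbital radius in metres

v = math.sqrt(G * M / r)      # orbital velocity
T = 2 * math.pi * r / v       # period of revolution

print(f"v = {v:.0f} m/s")                   # ≈ 7615 m/s
print(f"T = {T:.0f} s ≈ {T / 60:.1f} min")  # ≈ 5669 s ≈ 94.5 min
```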
Score: Gemma 3: 1 | DeepSeek-R1: 2
Also Read: Grok 3 vs DeepSeek R1: Which is Better?
Performance Comparison Summary
| Task | Gemma 3 Performance | DeepSeek-R1 Performance | Winner |
|---|---|---|---|
| Coding: Animation | Started generating code quickly but failed to produce a working animation. Provided explanations and improvements but lacked execution. | Took longer but provided a working animation following the prompt. Explained components and included adjustable parameters. | DeepSeek-R1 |
| Logical Reasoning | Solved the puzzle correctly in ~30 seconds, integrating the thought process into the response for a concise answer. | Also solved correctly but took twice as long, providing a detailed step-by-step explanation. | Gemma 3 |
| STEM Problem-solving | Answered quickly (~35 s) with mostly correct steps but made a miscalculation in the final answer. Provided velocity in km/s instead of the SI unit (m/s). | Took significantly longer but followed a structured approach, ensuring correct calculations with proper SI units. | DeepSeek-R1 |
Although Gemma 3 excels in speed and multimodal capabilities, it struggles in execution-heavy tasks like coding and complex problem-solving. On the other hand, DeepSeek-R1, despite being slower, delivers more precise outputs, especially in STEM-related problems.
Gemma 3 vs DeepSeek-R1: Benchmark Comparison
Despite its small size of just 27B parameters, Gemma 3 has been outperforming much larger models like DeepSeek-V3, OpenAI’s o3-mini, Llama3-405B, and Mistral Large, especially in coding tasks. However, it comes second to DeepSeek-R1 in Chatbot Arena Elo scores.

On the real-time Chatbot Arena leaderboard, Gemma 3 is tied at 9th place with Qwen2.5-Max, o1-preview, and o3-mini (high). Meanwhile, DeepSeek-R1 is ranked 6th on the same leaderboard.

When it comes to other standard benchmarks, DeepSeek-R1 outperforms Gemma 3 in almost all categories. Here are some of the test results.
| Benchmark (Metric) | Gemma 3 27B | DeepSeek-R1 |
|---|---|---|
| Bird-SQL | 54.4 | 34 |
| MMLU-Pro (EM) | 67.5 | 84.0 |
| GPQA-Diamond (Pass@1) | 42.4 | 71.5 |
| SimpleQA (Correct) | 10 | 30.1 |
| LiveCodeBench (Pass@1-COT) | 29.7 | 65.9 |
| MATH-500 (Pass@1) | 89 | 97.3 |
Conclusion
This Gemma 3 vs DeepSeek-R1 comparison gives us a clearer picture of how both models perform in real-life applications. While Google’s Gemma 3 is an impressive lightweight model optimized for efficiency, DeepSeek-R1 remains a dominant force in AI, displaying superior performance across multiple benchmarks and tasks.
However, Gemma 3’s ability to run on a single GPU and its integration with Google’s ecosystem make it a viable choice for developers and researchers seeking an efficient and accessible model. Its smaller size also makes it a great choice for handheld devices and smaller projects.
Frequently Asked Questions
Q. What is Gemma 3?
A. Gemma 3 is Google’s latest lightweight AI model designed for efficiency, running on a single GPU. It offers multimodal capabilities like text, image, and video processing.
Q. What is DeepSeek-R1?
A. DeepSeek-R1 is a high-performance Chinese AI model optimized for text-based tasks and web search. It is powered by high-end GPUs and shows great performance in various benchmark tests.
Q. How does Gemma 3 differ from DeepSeek-R1?
A. Gemma 3 is optimized for single-GPU deployment, supports multimodal input, and offers strong safety measures. DeepSeek-R1 excels in reasoning and coding tasks but lacks multimodal capabilities and requires more computational resources.
Q. Is Gemma 3 better than DeepSeek-R1 at coding?
A. No, DeepSeek-R1 outperforms Gemma 3 in coding tasks. While Gemma 3 generates responses quickly, it failed to produce a working animation in our test, whereas DeepSeek-R1 executed even the complex coding task successfully.
Q. Which model ranks higher on benchmarks?
A. DeepSeek-R1 is ranked higher (#6) in Chatbot Arena compared to Gemma 3 (#9). Benchmark results also show that DeepSeek-R1 outperforms Gemma 3 in areas like SQL, math, and general problem-solving.
Q. Can Gemma 3 generate images or videos?
A. No, Gemma 3 cannot generate images or videos. However, it can analyze images and short videos, unlike DeepSeek-R1, which does not support visual input.
Q. How can I access Gemma 3 27B?
A. You can access Gemma 3 27B via Google AI Studio or Hugging Face. You can also use it with frameworks and tools such as Keras, JAX, and Ollama.