# Google DeepMind Introduces Gemma 4 12B Open-Weight Multimodal Model

_Thursday, June 25, 2026 at 9:00 PM EDT · AI · Latest · Tier 2 — Notable_

![Google DeepMind Introduces Gemma 4 12B Open-Weight Multimodal Model — Primary](https://substackcdn.com/image/fetch/$s_!VxmU!,w_1200,h_675,c_fill,f_jpg,q_auto:good,fl_progressive:steep,g_auto/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F24e1ba26-53dc-4f72-a6c1-5cc9bbb551c8_1600x900.jpeg)

Google DeepMind has introduced Gemma 4 12B, a new open-weight multimodal model designed to bring advanced AI capabilities directly to consumer laptops.

The model features a novel encoder-free architecture that processes text, images, and audio through a single unified transformer rather than relying on separate vision and audio encoders, a design that reduces memory requirements and latency.

Google says Gemma 4 12B delivers reasoning and agentic performance approaching that of its larger 26B model while running locally on devices with as little as 16GB of VRAM or unified memory. The company released the model under the permissive Apache 2.0 license, which allows developers and organizations to freely modify and commercialize it.
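To put the 16GB figure in context, the back-of-the-envelope sketch below estimates how much memory the weights alone would occupy at a few common precisions. It assumes the model has exactly 12 billion parameters (inferred from the name, not confirmed by the announcement) and ignores the KV cache, activations, and runtime overhead. The announcement does not say which precision the 16GB claim refers to, so the function name and the precision list here are illustrative only.

```python
# Rough weight-only memory estimate. Assumption: exactly 12e9 parameters,
# inferred from the "12B" in the model name. Real usage is higher because
# the KV cache, activations, and runtime overhead are not counted here.

def weight_memory_gib(num_params: float, bits_per_param: int) -> float:
    """Return the size of the weights in GiB at the given precision."""
    bytes_total = num_params * bits_per_param / 8
    return bytes_total / 2**30


ASSUMED_PARAMS = 12e9  # hypothetical exact count

for label, bits in [("16-bit", 16), ("8-bit", 8), ("4-bit", 4)]:
    size = weight_memory_gib(ASSUMED_PARAMS, bits)
    print(f"{label}: {size:.1f} GiB")
```

Under these assumptions, 16-bit weights come to roughly 22 GiB and would not fit in 16GB, while 8-bit (about 11 GiB) and 4-bit (about 5.6 GiB) weights would, which suggests that running on 16GB hardware involves some form of quantization.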

Gemma 4 12B also includes native audio support and Multi-Token Prediction drafters to improve inference speed. The model is compatible with popular local AI tooling such as Ollama, Hugging Face Transformers, llama.cpp, and vLLM.

## Sources

- [Berkeley RDI](https://berkeleyrdi.substack.com/p/agentic-ai-weekly-berkeley-rdi-june-24a)

---
Canonical: https://techandbusiness.org/newswire/dwShKCC5FBZlnWiQ1HTOB4
Retrieved: 2026-06-26T04:19:02.233Z
Publisher: Tech & Business (techandbusiness.org)
