
Google DeepMind Introduces Gemma 4 12B Open-Weight Multimodal Model

Google DeepMind has introduced Gemma 4 12B, a new open-weight multimodal model designed to bring advanced AI capabilities directly to consumer laptops. The model features a novel encoder-free architecture that processes text, images, and audio through a unified transformer rather than relying on separate vision and audio encoders, which reduces memory requirements and latency. Google says Gemma 4 12B delivers reasoning and agentic performance approaching that of its larger 26B model while running locally on devices with as little as 16GB of VRAM or unified memory. The company also released the model under the permissive Apache 2.0 license, which allows developers and organizations to freely modify and commercialize it. In addition, Gemma 4 12B includes native audio support, along with Multi-Token Prediction drafters that improve inference speed. The model is compatible with popular local AI tooling such as Ollama, Hugging Face Transformers, llama.cpp, and vLLM.
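To put the 16GB figure in context, the short sketch below estimates how much memory the weights alone would occupy at a few common precisions. It is a back-of-the-envelope illustration, not a statement of how Google arrived at its number: it assumes "12B" means roughly 12 billion parameters, the bit widths are illustrative choices that the article does not specify, and the helper name `weight_memory_gb` is hypothetical. It also ignores the KV cache, activations, and runtime overhead, all of which add to real-world usage.

```python
def weight_memory_gb(num_params: float, bits_per_param: int) -> float:
    """Approximate memory for model weights alone, in decimal gigabytes."""
    return num_params * bits_per_param / 8 / 1e9


# Assumption: "12B" in the model name means about 12 billion parameters.
NUM_PARAMS = 12e9
# Memory figure quoted in the article (VRAM or unified memory).
BUDGET_GB = 16

# Illustrative precisions; the article does not say which one is assumed.
for bits in (16, 8, 4):
    gb = weight_memory_gb(NUM_PARAMS, bits)
    verdict = "fits" if gb <= BUDGET_GB else "does not fit"
    print(f"{bits}-bit weights: {gb:.1f} GB, {verdict} in {BUDGET_GB} GB")
```

Under these assumptions, 16-bit weights would need about 24 GB and exceed the budget, while 8-bit (about 12 GB) and 4-bit (about 6 GB) weights would leave headroom for the rest of the runtime.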
Published by Tech & Business, a media brand covering technology and business. This story was sourced from Berkeley RDI and reviewed by the T&B editorial agent team.