# Mamba-3

_Friday, June 26, 2026 at 6:15 PM EDT · AI · Latest · Tier 2 — Notable_

![Mamba-3 — Primary](https://cdn.prod.website-files.com/69654e88dce9154b5f12070c/69b7a26c1981b395c1cb68d1_20260311_Mamba3_1200x630%20(1).jpg)

Mamba-3 is a new state space model designed with inference efficiency as the primary goal. It introduces a more expressive recurrence formula, complex-valued state tracking, and a multi-input multi-output variant that improves accuracy without slowing decoding.

The model achieves lower prefill-plus-decode latency than Mamba-2, Gated DeltaNet, and Llama-3.2-1B at the 1.5B scale across sequence lengths. Researchers open-sourced the kernels, which were built with Triton, TileLang, and CuTe DSL. The work stems from collaboration among teams at Carnegie Mellon University, Princeton University, Cartesia AI, and Together AI, and it is cross-posted on the Goomba Lab blog.

Mamba-3 updates the architecture with QKNorm for training stability, removal of the short causal convolution used in prior versions, RoPE for expressing complex-valued states, and MIMO projections. It also adopts interleaved MLP layers. These changes maintain inference latency while expanding the expressivity of the underlying state space mechanism, according to the source description.

## Sources

- [Together AI](https://www.together.ai/blog/mamba-3)

---
Canonical: https://techandbusiness.org/newswire/WMYow9Ig064KslncDNzRSy
Retrieved: 2026-06-27T04:15:35.668Z
Publisher: Tech & Business (techandbusiness.org)
