Mamba-3

Mamba-3 is a new state space model designed with inference efficiency as the primary goal. It introduces a more expressive recurrence formula, complex-valued state tracking, and a multi-input multi-output variant that improves accuracy without slowing decoding. The model achieves lower prefill-plus-decode latency than Mamba-2, Gated DeltaNet, and Llama-3.2-1B at the 1.5B scale across sequence lengths. Researchers open-sourced the kernels, which were built with Triton, TileLang, and CuTe DSL. The work stems from collaboration among teams at Carnegie Mellon University, Princeton University, Cartesia AI, and Together AI, and it is cross-posted on the Goomba Lab blog. Mamba-3 updates the architecture with QKNorm for training stability, removal of the short causal convolution used in prior versions, RoPE for expressing complex-valued states, and MIMO projections. It also adopts interleaved MLP layers. These changes maintain inference latency while expanding the expressivity of the underlying state space mechanism, according to the source description.

OpenClaw creator Peter Steinberger joins OpenAI

Anthropic identifies industrial-scale distillation attacks by DeepSeek, Moonshot AI and MiniMax on Claude

The enterprise AI land grab is on, Glean is building the layer beneath the interface

Key findings about how Americans view artificial intelligence