{"version":"1.0","type":"rich","provider_name":"Tech & Business","provider_url":"https://techandbusiness.org","title":"NVIDIA and Sarvam AI co-design optimizations delivering 4x inference gains for sovereign 30B multilingual models","author_name":"T&B Newswire · AI","thumbnail_url":"https://developer-blogs.nvidia.com/wp-content/uploads/2026/02/ov-dgx-cloud-ari-blog-1920x1080-2.png","width":600,"height":400,"html":"<blockquote class=\"tb-newswire-embed\" style=\"max-width:600px;border-left:3px solid #22d3ee;padding:12px 16px;margin:0;font-family:-apple-system,system-ui,sans-serif;background:#09090b;border-radius:0 8px 8px 0;\">\n      <p style=\"margin:0 0 8px;font-size:10px;font-weight:600;letter-spacing:0.1em;color:#71717a;text-transform:uppercase;\">T&B NEWSWIRE · AI</p>\n      <p style=\"margin:0 0 8px;font-size:18px;font-weight:700;line-height:1.3;color:#fff;\"><a href=\"https://techandbusiness.org/newswire/X0O85GNlLhBSz1ObTqRjlV\" style=\"color:#fff;text-decoration:none;\">NVIDIA and Sarvam AI co-design optimizations delivering 4x inference gains for sovereign 30B multilingual models</a></p>\n      <p style=\"margin:0;font-size:14px;color:#a1a1aa;line-height:1.5;\">Sarvam AI, a generative AI startup based in Bengaluru, India, collaborated with NVIDIA to co-design hardware and software optimizations for its Sovereign 30B model. The effort targeted strict latency ...</p>\n    </blockquote>"}