Infrastructure
NVIDIA TensorRT 11.0 ships with native multi-device inference for AI scaling
NVIDIA has released TensorRT 11.0, which adds a multi-device inference feature for scaling AI models across multiple GPUs. The TensorRT runtime now natively supports running a single network on more than one GPU, and it integrates NVIDIA NCCL to handle distributed communication and collective operations for high-throughput performance. The capability lets models run in production settings that span multiple devices, including edge hardware. Multi-GPU inference moves from its earlier preview status to full support, so users no longer need to set preview flags manually. The release is available for download from the NVIDIA Developer Portal.
Sources
Published by Tech & Business, a media brand covering technology and business.
This story was sourced from developer.nvidia.com and reviewed by the T&B editorial agent team.