AWS and Cerebras partner for fastest AI inference via Bedrock with CS-3 systems

Amazon Web Services and Cerebras Systems announced a collaboration to deliver AI inference solutions for generative AI applications and large language model workloads. The solution will be deployed on Amazon Bedrock in AWS data centers. It combines AWS Trainium-powered servers, Cerebras CS-3 systems, and Elastic Fabric Adapter (EFA) networking. Later this year, AWS plans to offer leading open-source LLMs and Amazon Nova using Cerebras hardware.

David Brown, Vice President of Compute and ML Services at AWS, said inference speed remains a bottleneck for workloads such as real-time coding assistance. He added that splitting the workload across Trainium and CS-3 connected [...] Andrew Feldman, Founder and CEO of Cerebras Systems, said the partnership will bring fast inference to enterprise customers in their existing AWS environments.

The solution uses inference disaggregation to separate prompt processing (prefill) from output generation (decode). Trainium is optimized for prefill, while CS-3 handles decode, which is memory-bandwidth-intensive and typically accounts for most inference time. The two systems connect through low-latency, high-bandwidth EFA networking. The solution is built on the AWS Nitro System to provide the same security, isolation, and operational consistency as other AWS services.
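The prefill/decode split described above can be illustrated with a toy sketch. This is not AWS or Cerebras code: the names `KVCache`, `prefill`, `transfer`, `decode`, and `generate` are hypothetical, the "model" is a placeholder sum, and the network hop is simulated with a copy. It only shows the shape of the idea: prefill handles the whole prompt in one pass, the resulting state is handed over, and decode then produces tokens one at a time, rereading the full state at every step.

```python
from dataclasses import dataclass, field


@dataclass
class KVCache:
    """Toy stand-in for the per-token state that prefill hands to decode."""
    entries: list = field(default_factory=list)


def prefill(prompt_tokens):
    """Process the whole prompt in a single pass (hypothetical prefill worker)."""
    return KVCache(entries=list(prompt_tokens))


def transfer(cache):
    """Simulate shipping the cache over the network to the decode worker."""
    return KVCache(entries=list(cache.entries))


def decode(cache, max_new_tokens):
    """Generate tokens one at a time (hypothetical decode worker).

    Returns the new tokens and the total number of cache entries read.
    """
    output = []
    reads = 0
    for _ in range(max_new_tokens):
        reads += len(cache.entries)             # each step rereads the full cache
        next_token = sum(cache.entries) % 1000  # placeholder "model", not an LLM
        cache.entries.append(next_token)
        output.append(next_token)
    return output, reads


def generate(prompt_tokens, max_new_tokens):
    """Run prefill and decode as two separate stages joined by a transfer."""
    cache = prefill(prompt_tokens)
    cache = transfer(cache)
    return decode(cache, max_new_tokens)


if __name__ == "__main__":
    tokens, reads = generate([1, 2, 3], max_new_tokens=3)
    print(tokens, reads)
```

In this sketch the read counter grows with every generated token, which is a rough picture of why decode is bound by memory bandwidth rather than raw compute, and why it can make sense to run it on different hardware from prefill.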

US lifts export controls allowing Anthropic to release Mythos 5 model to selected US partners

Anthropic Academy, Free AI Certificates Launched March 2, 2026

Anthropic Introduces Claude Import Memory Feature

Image editing just got smarter with AI in Photoshop and Firefly