ACE-AI delivers a unified fabric across the network for Distributed AI, from Datacenter to Edge to Multi-cloud
ACE-AI employs IP Clos and Virtual Distributed Router (VDR) architectures for scalable GPU connectivity. It delivers high-performance, lossless connectivity through RoCEv2 support, Priority Flow Control (PFC), and Adaptive Routing, while maintaining low latency and high availability.
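To make the adaptive-routing idea concrete, here is a minimal Python sketch of the general pattern, not ACE-AI's implementation: flows are pinned to an uplink by an ECMP-style hash of their 5-tuple, and a congested uplink is skipped in favor of the least-loaded alternative. The uplink names, utilization figures, and threshold are invented for this example.

```python
# A minimal sketch of adaptive routing over an IP Clos fabric (illustrative
# only, not ACE-AI's implementation). Flows are pinned to an uplink by an
# ECMP-style hash; a congested uplink is skipped in favor of the
# least-loaded alternative. All names and numbers here are invented.
import hashlib

UPLINKS = ["spine1", "spine2", "spine3", "spine4"]
CONGESTION_THRESHOLD = 0.8  # fraction of link capacity in use

# Hypothetical telemetry: spine1 is congested, the rest are not.
link_utilization = {"spine1": 0.95, "spine2": 0.40, "spine3": 0.55, "spine4": 0.30}

def ecmp_hash(flow_5tuple: tuple) -> int:
    """Deterministically map a flow to an uplink index."""
    digest = hashlib.sha256(repr(flow_5tuple).encode()).digest()
    return int.from_bytes(digest[:4], "big") % len(UPLINKS)

def select_uplink(flow_5tuple: tuple) -> str:
    """Prefer the ECMP choice; adaptively reroute around congestion."""
    preferred = UPLINKS[ecmp_hash(flow_5tuple)]
    if link_utilization[preferred] < CONGESTION_THRESHOLD:
        return preferred
    # Adaptive step: divert the flow to the least-loaded uplink instead.
    return min(UPLINKS, key=lambda u: link_utilization[u])

flow = ("10.0.0.1", "10.0.1.2", 49152, 4791, "UDP")  # RoCEv2 rides UDP port 4791
print(select_uplink(flow))
```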
ACE-AI supports SmartNICs such as NVIDIA BlueField-3, enhancing inferencing capabilities at the Edge. This support enables security, traffic engineering, and efficient multi-cloud networking for smooth model operations.
ACE-AI provides seamless access to AI workloads across locations. Its Egress Cost Control (ECC) capability reduces the cost of moving large volumes of AI data between clouds, optimizing resource use across them.
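As a back-of-the-envelope illustration of the idea behind ECC (not its actual interface), the sketch below picks the serving location that minimizes total egress cost for a transfer; the per-GB prices and the cheapest_egress() helper are hypothetical.

```python
# A hypothetical illustration of the idea behind Egress Cost Control (ECC):
# pick the serving location that minimizes total egress cost for a transfer.
# The per-GB prices and the cheapest_egress() helper are invented for this
# sketch; they are not ACE-AI interfaces or actual provider rates.

EGRESS_PRICE_PER_GB = {  # illustrative rates, not real provider pricing
    "cloud_a": 0.09,
    "cloud_b": 0.05,
    "on_prem": 0.02,
}

def cheapest_egress(prices: dict, volume_gb: float) -> tuple:
    """Return the source site with the lowest total egress cost."""
    site = min(prices, key=prices.get)
    return site, prices[site] * volume_gb

site, cost = cheapest_egress(EGRESS_PRICE_PER_GB, volume_gb=5_000)
print(f"Serve the 5 TB dataset from {site}: estimated egress cost ${cost:,.2f}")
```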
AI workloads are increasingly distributed. Distributed Model Training spreads the training of AI/ML models across multiple nodes in the network, improving efficiency and performance for large, complex models; the data-parallel pattern behind it is illustrated in the sketch below. Inferencing at the Edge deploys inferencing models as close as possible to end users, reducing latency and improving application performance.

Key requirements for networks supporting distributed AI include high-performance, lossless connectivity; predictable latency; high availability and resiliency with zero-impact failover; and fabric-wide visibility.
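The sketch referenced above is a framework-free illustration of the data-parallel pattern behind distributed model training: each worker computes gradients on its own data shard, and an all-reduce averages them so every replica applies the same update. In production this collective runs over the fabric (for example, NCCL on RoCEv2); the all_reduce_mean() helper here merely stands in for it.

```python
# A framework-free sketch of data-parallel distributed training: each worker
# computes gradients on its own shard, and an all-reduce averages them so
# every replica applies the same update. all_reduce_mean() stands in for the
# collective the network carries (e.g. NCCL over RoCEv2); data is synthetic.
import numpy as np

rng = np.random.default_rng(0)
NUM_WORKERS, DIM, LR = 4, 8, 0.1

weights = rng.normal(size=DIM)  # identical replica on every worker
shards = [rng.normal(size=(16, DIM)) for _ in range(NUM_WORKERS)]
targets = [rng.normal(size=16) for _ in range(NUM_WORKERS)]

def local_gradient(w, x, y):
    """Gradient of mean squared error for a linear model on one shard."""
    err = x @ w - y
    return 2.0 * x.T @ err / len(y)

def all_reduce_mean(grads):
    """Stand-in for the collective the fabric carries between workers."""
    return np.mean(grads, axis=0)

for step in range(3):
    grads = [local_gradient(weights, x, y) for x, y in zip(shards, targets)]
    avg_grad = all_reduce_mean(grads)  # every replica receives the same result
    weights -= LR * avg_grad
    print(f"step {step}: gradient norm {np.linalg.norm(avg_grad):.4f}")
```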
Complementing the powerful capabilities of ACE-AI, the TGAX hardware platform delivers the robust, high-performance foundation essential for deploying your AI/ML and Connected Edge applications at scale. Built on NVIDIA Spectrum-4 and designed to integrate seamlessly with ACE-AI, TGAX provides the purpose-built infrastructure you need to accelerate your edge computing initiatives. Discover how TGAX enhances your connected edge experience by exploring its dedicated features.