What types of interconnects and connectors link accelerator cards in AI data centers?

By Aharon Etengoff | May 7, 2025

Many data centers are packed with racks of high-performance graphics processing units (GPUs) and tensor processing units (TPUs). These accelerators process massive artificial intelligence (AI) and machine learning (ML) datasets, executing complex operations in parallel and exchanging data at high speed. This article explores the interconnects and connectors that link AI accelerator clusters together.

Scaling AI compute with accelerators and clustered architectures

AI accelerators such as GPUs, TPUs, and, in some cases, field-programmable gate arrays (FPGAs) run large language models (LLMs) using parallel processing to handle complex computations at scale. These devices divide complex workloads into smaller tasks and execute billions of operations simultaneously. Most AI models are built on neural networks, which benefit from this massively parallel architecture to accelerate both training and inference.
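
To make the parallel-decomposition idea concrete, the minimal sketch below splits one large matrix multiplication across whatever GPUs PyTorch can see and reassembles the partial results on the host. PyTorch, the device count, and the matrix sizes are illustrative assumptions, not details of any specific accelerator discussed here.

```python
# Minimal sketch: divide one workload into smaller tasks and run them in
# parallel across the available accelerators (falls back to CPU if none).
import torch

def parallel_matmul(a: torch.Tensor, b: torch.Tensor) -> torch.Tensor:
    n_dev = torch.cuda.device_count()
    if n_dev == 0:
        return a @ b  # no accelerators visible: single CPU matmul
    # Split the rows of `a` into one chunk per GPU (a simple data-parallel split).
    chunks = torch.chunk(a, n_dev, dim=0)
    partials = []
    for i, chunk in enumerate(chunks):
        dev = torch.device(f"cuda:{i}")
        # Each GPU computes its slice of the result independently.
        partials.append(chunk.to(dev) @ b.to(dev))
    # Reassemble the full product on the host.
    return torch.cat([p.cpu() for p in partials], dim=0)

if __name__ == "__main__":
    a, b = torch.randn(4096, 1024), torch.randn(1024, 512)
    print(parallel_matmul(a, b).shape)  # expected: torch.Size([4096, 512])
```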

As shown in Figure 1, AI accelerators are typically deployed in tightly coupled clusters to efficiently share data, synchronize computations, and scale training across thousands of processing units.

Figure 1. A Google data center houses racks of tightly coupled AI accelerators used for large-scale machine learning workloads. Shown here is an illustration of the TPU v4 infrastructure. (Image: Google)

This configuration helps meet the low-latency, high-performance demands of AI workloads. It also improves throughput, minimizes bottlenecks, and enables real-time inference for complex, compute-intensive tasks.

High-level interconnect architectures and protocols

Data centers use specialized interconnect technologies to link AI accelerator clusters so they can operate efficiently at scale, enabling high-speed communication within and across nodes. These interconnects support massive data exchange, synchronized processing, and the parallel execution of complex workloads. Common AI accelerator interconnects include:

NVLink — NVIDIA’s proprietary, high-bandwidth interconnect facilitates direct GPU-to-GPU communication with low latency and high energy efficiency. It supports rapid synchronization and data sharing across accelerators using dedicated connectors and NVSwitch technology. NVLink scales efficiently in multi-GPU environments by enabling memory pooling, allowing GPUs to share a unified address space and operate as a single, high-performance compute unit. As shown in Figure 2, NVLink 4.0 delivers up to 900 GB/s of bidirectional bandwidth on the H100 GPU.

Figure 2. NVIDIA’s H100 GPU uses NVLink 4.0 to enable up to 900 GB/s of bidirectional bandwidth for high-speed GPU-to-GPU communication in multi-accelerator clusters. (Image: NVIDIA)
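
As a rough way to see whether two GPUs in a node can exchange data directly, the hedged PyTorch sketch below checks peer access and times a device-to-device tensor copy; such copies ride NVLink when the hardware provides it and fall back to PCIe otherwise. The tensor size and the use of PyTorch are assumptions for illustration.

```python
# Hedged sketch: check GPU-to-GPU peer access and time a direct copy.
import time
import torch

if torch.cuda.device_count() >= 2:
    # True when the driver allows GPU 0 to address GPU 1's memory directly.
    print("GPU0 -> GPU1 peer access:", torch.cuda.can_device_access_peer(0, 1))
    x = torch.randn(64 * 1024 * 1024, device="cuda:0")  # ~256 MB of fp32 data
    torch.cuda.synchronize("cuda:0")
    start = time.perf_counter()
    y = x.to("cuda:1")  # device-to-device copy; no host round-trip when P2P is enabled
    torch.cuda.synchronize("cuda:1")
    elapsed = time.perf_counter() - start
    gb = x.numel() * x.element_size() / 1e9
    print(f"copied {gb:.2f} GB in {elapsed * 1e3:.1f} ms -> {gb / elapsed:.1f} GB/s")
else:
    print("fewer than two GPUs visible; nothing to measure")
```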

UALink — Ultra Accelerator Link is an open interconnect standard designed to scale clusters of up to 1,024 AI accelerators within a single computing pod. The 1.0 specification supports 200G signaling per lane and enables dense, memory-semantic connections with Ethernet-class bandwidth and PCIe-level latency. UALink supports read, write, and atomic transactions across nodes and defines a common protocol stack for scalable multi-node systems. It is positioned as a high-performance alternative for scaling within accelerator pods, targeting lower latency than typical Ethernet for inter-node communication.
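
The back-of-the-envelope sketch below turns the figures quoted above (200G per lane, pods of up to 1,024 accelerators) into aggregate numbers. The lane count per accelerator is a purely illustrative assumption and is not taken from the specification.

```python
# Rough UALink pod math; encoding and protocol overheads are ignored.
LANE_RATE_GBPS = 200        # per-lane signaling rate in the UALink 1.0 specification
MAX_ACCELERATORS = 1024     # maximum accelerators per pod
LANES_PER_ACCELERATOR = 4   # assumed for illustration only

per_device_gbps = LANE_RATE_GBPS * LANES_PER_ACCELERATOR
per_device_gbytes = per_device_gbps / 8  # raw Gb/s to GB/s
pod_aggregate_tbps = per_device_gbps * MAX_ACCELERATORS / 1000

print(f"per accelerator: {per_device_gbps} Gb/s (~{per_device_gbytes:.0f} GB/s raw)")
print(f"pod aggregate:   ~{pod_aggregate_tbps:.0f} Tb/s across {MAX_ACCELERATORS} accelerators")
```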

Compute Express Link (CXL) enables coherent, low-latency communication between CPUs, GPUs, and other accelerators. It improves resource utilization across heterogeneous systems by supporting cache coherency, memory pooling, resource sharing, and memory disaggregation. CXL 1.1 and 2.0 operate over PCIe 5.0, while CXL 3.0 and later leverage PCIe 6.0 or beyond, enabling transfer rates of up to 64 GT/s per lane and roughly 128 GB/s of bandwidth in each direction over a x16 link.
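
The arithmetic behind those PCIe 6.0 numbers can be checked directly: 64 GT/s per lane across a x16 link, before FLIT framing and FEC overhead, gives the raw per-direction figure.

```python
# Raw PCIe 6.0 x16 bandwidth; each transfer carries one bit (64 GT/s = 64 Gb/s per lane).
TRANSFER_RATE_GT_S = 64  # PCIe 6.0 per-lane transfer rate
LANES = 16               # a full x16 slot, typical for accelerator cards

per_direction_gbytes = TRANSFER_RATE_GT_S * LANES / 8
print(f"x{LANES} PCIe 6.0 link: ~{per_direction_gbytes:.0f} GB/s per direction, "
      f"~{2 * per_direction_gbytes:.0f} GB/s bidirectional (raw, before FLIT/FEC overhead)")
# -> ~128 GB/s per direction, ~256 GB/s bidirectional
```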

High-speed Ethernet facilitates data movement between accelerator clusters distributed across servers and nodes. Technologies such as 400 GbE and 800 GbE enable high-throughput communication using network interface cards (NICs) and optical or copper cabling. While Ethernet introduces higher latency than NVLink or UALink, it offers broad interoperability and flexible deployment at the rack and data center levels.
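
To put the 800 GbE and NVLink figures side by side, the sketch below estimates how long an illustrative 175 GB payload (roughly a large model checkpoint) would take to move over each. The payload size is an assumption, and protocol overheads are ignored.

```python
# Transfer-time comparison using the rates quoted in this article.
CHECKPOINT_GB = 175        # illustrative payload size
ETHERNET_GBPS = 800        # 800 GbE line rate
NVLINK_GB_PER_S = 900 / 2  # NVLink 4.0: 900 GB/s bidirectional -> ~450 GB/s per direction

ethernet_seconds = CHECKPOINT_GB / (ETHERNET_GBPS / 8)
nvlink_seconds = CHECKPOINT_GB / NVLINK_GB_PER_S
print(f"800 GbE : ~{ethernet_seconds:.2f} s at line rate")
print(f"NVLink  : ~{nvlink_seconds:.2f} s over a single link, per-direction rate")
```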

Optical interconnects transmit data at high speeds over extended distances, linking accelerator clusters across racks and nodes. Compared to copper-based connections, they consume less power and overcome signal integrity challenges such as attenuation and EMI. These interconnects often rely on standardized form factors, such as Quad Small Form-factor Pluggable (QSFP), Quad Small Form-factor Pluggable Double Density (QSFP-DD), and Octal Small Form-factor Pluggable (OSFP), which serve as the physical interface for both electrical and optical Ethernet connections. The same form factors are also widely used for other high-speed optical interconnects in data centers, such as InfiniBand and proprietary optical links, further extending their role in scalable compute infrastructure.
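
As a quick reference, the sketch below lists the pluggable form factors named above with their commonly cited electrical lane counts (four lanes for QSFP, eight for QSFP-DD and OSFP) and an assumed 100G-per-lane rate.

```python
# Pluggable form factors and the raw module throughput at an assumed lane rate.
FORM_FACTORS = {"QSFP": 4, "QSFP-DD": 8, "OSFP": 8}  # electrical lanes per module
PER_LANE_GBPS = 100                                  # assumed 100G PAM4 lanes

for name, lanes in FORM_FACTORS.items():
    print(f"{name:8s}: {lanes} lanes x {PER_LANE_GBPS}G = {lanes * PER_LANE_GBPS} Gb/s per module")
# Eight 100G lanes are what make 800 GbE ports practical in QSFP-DD and OSFP modules.
```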

Physical connectors and interfaces for AI accelerators

High-performance interconnects rely on various physical-layer components, including connectors, slots, and cabling interfaces. These components help maintain signal integrity, mechanical compatibility, and scalable system design. They transmit electrical and optical signals across boards, devices, and systems, facilitating the reliable operation of clustered AI infrastructure.

Although interconnects define the communication protocols and signaling standards, they rely on these physical interfaces to function effectively at scale. Common connector and interface technologies are described below.

The PCIe interface connects accelerator cards to host systems and other components. Although newer generations such as PCIe 5.0 and 6.0 offer scalable bandwidth, the interface can still become a bottleneck in tightly coupled multi-accelerator environments. Retimers are often used to maintain signal integrity over longer board traces.
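
On a Linux host, one quick way to spot such a bottleneck is to read the negotiated link speed and width that the kernel exposes in sysfs. The sketch below assumes standard sysfs paths and simply prints the values for every PCI device.

```python
# Print the negotiated PCIe link speed and width for each PCI device (Linux only).
from pathlib import Path

root = Path("/sys/bus/pci/devices")
if root.exists():
    for dev in sorted(root.iterdir()):
        speed = dev / "current_link_speed"
        width = dev / "current_link_width"
        if speed.exists() and width.exists():
            print(f"{dev.name}: {speed.read_text().strip()} x{width.read_text().strip()}")
# A card negotiating fewer lanes or a lower generation than expected often points
# to a riser, slot, or retimer limitation rather than the card itself.
```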

Mezzanine connectors are used in the Open Compute Project’s Open Accelerator Infrastructure (OAI). They support high-density module-to-module connections, reduce signal loss, manage impedance, and simplify mechanical integration in modular accelerator designs.

Active electrical cables (AECs) integrate digital signal processors within copper cabling to boost signal strength over longer distances. This enables electrical links to maintain data integrity beyond the reach of passive cables.

High-speed board-to-board connectors enable direct module communication at data rates up to 224 Gbps using PAM4 modulation. They support dense, low-latency communication within AI platforms and tightly integrated accelerator clusters.
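
The 224 Gbps figure follows directly from PAM4 signaling, which encodes two bits per symbol, so the symbol rate on the channel is half the bit rate (overheads such as FEC are ignored here).

```python
# PAM4 symbol-rate arithmetic for a 224 Gb/s lane.
BIT_RATE_GBPS = 224
BITS_PER_PAM4_SYMBOL = 2  # four amplitude levels -> log2(4) = 2 bits per symbol

baud_rate = BIT_RATE_GBPS / BITS_PER_PAM4_SYMBOL
print(f"{BIT_RATE_GBPS} Gb/s PAM4 -> {baud_rate:.0f} GBd on the wire")
# Doubling the bits per symbol lifts per-lane rates from 112G to 224G without
# doubling the bandwidth the channel and connector must support.
```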

Optical connectors — QSFP, QSFP-DD, and OSFP form factors are the physical interface for both optical and short-range electrical Ethernet connections. These transceiver formats are widely deployed across NICs, switch ports, and optical modules and support PAM4 modulation to maintain signal performance across various deployment scenarios.

Liquid-cooled connectors

As shown in Figure 3, an increasing number of high-performance AI accelerator racks rely on liquid cooling. Many of the connectors used in these systems must meet stringent mechanical and thermal requirements to ensure safe, reliable operation.

Figure 3. A liquid-cooled GPU server with integrated quick-disconnect fittings and manifold connections for high-density AI training workloads. These connectors are engineered to support safe, high-throughput cooling in systems such as the NVIDIA HGX H100 platform. (Image: Supermicro)

These connectors typically withstand temperatures up to 50°C (122°F), support coolant flow rates up to 13 liters per minute (LPM), and maintain low pressure drops around 0.25 pounds per square inch (psi). They provide leak-free operation with water-based and dielectric fluids, prevent corrosion, and integrate easily with in-rack manifolds.

Most liquid-cooled connectors incorporate quick-disconnect functionality for dripless maintenance access. Large internal diameters — often around 5/8 inch — support high flow rates across AI racks. Some offer hybrid designs that combine high-speed data transmission with liquid-cooling channels. Others are compatible with three-inch square stainless-steel tubing or feature ruggedized construction to withstand temperature fluctuations, pressure changes, and vibration.
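
To get a feel for what a 13 LPM coolant loop can carry away, the sketch below applies the standard relation (heat = mass flow x specific heat x temperature rise) for water. The 10°C inlet-to-outlet rise is an illustrative assumption, not a figure from any connector datasheet.

```python
# Rough heat-removal estimate for a water-based loop at the cited flow rate.
FLOW_LPM = 13         # coolant flow rate cited above
DELTA_T_C = 10        # assumed inlet-to-outlet temperature rise
RHO_KG_PER_L = 1.0    # approximate density of water
CP_J_PER_KG_K = 4186  # specific heat of water

m_dot_kg_per_s = FLOW_LPM / 60 * RHO_KG_PER_L
heat_watts = m_dot_kg_per_s * CP_J_PER_KG_K * DELTA_T_C
print(f"~{heat_watts / 1000:.1f} kW removed at {FLOW_LPM} LPM with a {DELTA_T_C} degC rise")
# -> roughly 9 kW, on the order of a densely populated accelerator tray
```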

Summary

AI data centers depend on various interconnects and physical connectors to link accelerator cards, enable high-speed data exchange, and facilitate large-scale parallel processing. These components are critical in maintaining performance, signal integrity, and mechanical reliability across tightly coupled clusters.

