What is Fire-Flyer AI-HPC?
Fire-Flyer AI-HPC is an artificial intelligence high-performance computing (AI-HPC) architecture designed and deployed by the DeepSeek-AI team. It was built to meet the growing demand for compute and bandwidth in deep learning, especially large language model (LLM) training. Through software-hardware co-design, Fire-Flyer AI-HPC balances cost-effectiveness against performance, aiming to provide an economical and efficient answer to the challenges of high-performance computing.
Fire-Flyer AI-HPC Architecture Explanation
[Figure: Fire-Flyer AI-HPC architecture diagram and construction process]
Core components
Hardware design
GPU cluster
Fire-Flyer AI-HPC is built around a cluster of 10,000 NVIDIA A100 PCIe GPUs.
What is a PCIe A100 GPU?
A PCIe A100 GPU is an NVIDIA A100 Tensor Core GPU in the PCIe (Peripheral Component Interconnect Express) form factor, as opposed to the SXM form factor used in NVLink-connected systems such as DGX servers.
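A quick way to check which variant a machine actually has is to query the driver. Below is a minimal sketch, assuming the `pynvml` Python bindings (the `nvidia-ml-py` package) are installed; the device names in the comments are typical examples, not guaranteed strings.

```python
# Minimal sketch: list the GPU models a node exposes via NVML.
# On a PCIe A100 node the name typically contains "PCIE",
# e.g. "NVIDIA A100-PCIE-40GB", as opposed to the SXM variants.
import pynvml

pynvml.nvmlInit()
try:
    for i in range(pynvml.nvmlDeviceGetCount()):
        handle = pynvml.nvmlDeviceGetHandleByIndex(i)
        name = pynvml.nvmlDeviceGetName(handle)
        # Older pynvml releases return bytes, newer ones return str.
        if isinstance(name, bytes):
            name = name.decode()
        print(f"GPU {i}: {name}")
finally:
    pynvml.nvmlShutdown()
```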
1. NVIDIA A100 Tensor Core GPU
- Architecture : Based on NVIDIA's Ampere architecture, it is the successor to the Volta and Turing architectures.
- Performance: The A100 is NVIDIA's GPU for high-performance computing (HPC), artificial intelligence (AI), and deep learning workloads, delivering up to 19.5 TFLOPS of FP32 compute and 312 TFLOPS of FP16 Tensor Core throughput alongside very high memory bandwidth.
- Tensor Cores: The A100 ships with third-generation Tensor Cores, which support mixed-precision computation (FP16, TF32, INT8, and more), significantly accelerating AI training and inference; see the sketch after this list.
- Memory: The A100 comes in two memory configurations: 40 GB of HBM2 with about 1.6 TB/s of bandwidth, or 80 GB of HBM2e with roughly 1.9-2.0 TB/s.
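To make the mixed-precision point concrete, here is a minimal sketch, assuming PyTorch with a CUDA-capable GPU (the matrix sizes are arbitrary illustration values), of the FP16 matrix multiply that Tensor Cores accelerate, along with the Ampere-specific TF32 switch:

```python
# Minimal sketch: mixed-precision matmul of the kind A100 Tensor Cores
# accelerate. Inside torch.autocast, eligible ops run in FP16 while
# numerically sensitive ops stay in FP32.
import torch

assert torch.cuda.is_available(), "this sketch assumes a CUDA GPU"

# On Ampere, FP32 matmuls can also be routed through Tensor Cores
# via the TF32 format; enabled here explicitly for illustration.
torch.backends.cuda.matmul.allow_tf32 = True

a = torch.randn(4096, 4096, device="cuda")
b = torch.randn(4096, 4096, device="cuda")

with torch.autocast(device_type="cuda", dtype=torch.float16):
    c = a @ b  # executed in FP16 on Tensor Cores

print(c.dtype)  # torch.float16
```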
2. PCIe interface
- PCIe Standard: PCIe is a high-speed serial computer expansion bus standard used to connect peripheral devices (such as GPUs, network interface cards, and storage devices) to the host system.
- Versions: Common PCIe versions include PCIe 3.0 and PCIe 4.0; each generation doubles the per-lane signaling rate, so PCIe 4.0 runs at 16 GT/s per lane vs. 8 GT/s for PCIe 3.0. The A100 PCIe card uses a PCIe 4.0 x16 link, as the calculation below illustrates.
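Those figures are raw line rates; usable bandwidth also depends on the lane count and encoding overhead (128b/130b for PCIe 3.0 and later). A back-of-the-envelope calculation in plain Python for a x16 slot, the link width an A100 PCIe card uses:

```python
# Back-of-the-envelope theoretical PCIe bandwidth, per direction.
# line_rate_gt_s is in GT/s; encoding accounts for protocol overhead
# (128b/130b for PCIe 3.0 and later generations).
def pcie_bandwidth_gb_s(line_rate_gt_s: float, lanes: int,
                        encoding: float = 128 / 130) -> float:
    """Approximate one-direction bandwidth in GB/s."""
    bits_per_second = line_rate_gt_s * 1e9 * encoding * lanes
    return bits_per_second / 8 / 1e9  # bits -> bytes -> GB

print(f"PCIe 3.0 x16: {pcie_bandwidth_gb_s(8, 16):.1f} GB/s")   # ~15.8
print(f"PCIe 4.0 x16: {pcie_bandwidth_gb_s(16, 16):.1f} GB/s")  # ~31.5
```

The gap between this roughly 31.5 GB/s per direction and NVLink-class GPU-to-GPU bandwidth is the central trade-off that a PCIe-based cluster design such as Fire-Flyer's has to manage.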