In recent years, data science on big data has developed rapidly, and with the spread of 5G, data volumes continue to grow and diversify. In addition, cloud-based applications and joint development are increasing, making stronger data security a pressing need. Processing this swelling volume of data at high speed requires expanding the data center's core resources: computing capacity, power, and installation space. Supporting advanced security functions demands still more resources on top of that. However, data center resources are reaching the limits of expansion.
The DPU (Data Processing Unit) has the potential to solve these issues and accelerate the rollout of 5G, and it will play a very important role in fields such as data centers and enterprises in the future. This article explains what a DPU is and what benefits it offers.
What is the DPU, which appeared alongside the CPU and GPU?
Before the rise of GPUs, the CPU was responsible not only for running applications themselves but also for parallel computation, network processing between servers, remote storage access, and so on. It is only relatively recently that parallel computation could be entrusted to GPUs, and now the DPU has appeared as well.
A DPU (Data Processing Unit) is a device for offloading communication functions that the host CPU previously handled. NICs (Network Interface Cards) that offload network processing already existed and share a similar concept with the DPU. However, because the host CPU still had to issue and manage the offload commands, it was never completely freed. A DPU adds an Arm CPU and peripheral devices to an NVIDIA ConnectX series NIC, allowing it to control the offload functions previously performed by NICs on its own and to provide further functionality. This allows the host CPU to focus on application processing.
Where the CPU resources consumed by data center networks go
Server resources in data centers are consumed for two main purposes. The first is resources for processing applications: essentially the user area, used for what the user wants to achieve on the server. The other, which cannot be ignored, is resources for infrastructure, such as network and storage processing, used mainly by server administrators. With the recent growth in data volumes and communication bandwidth, the share taken by infrastructure is rising rapidly. Furthermore, zero trust security, one of the current trends, requires that even security functions be implemented on the server, putting the server resources available to the user area under increasing pressure.
As applications grow in number and sophistication, users will want to expand the user area accordingly. Administrators, meanwhile, must also expand infrastructure resources to improve SLAs (service level agreements), so the shortage of CPU resources becomes ever more serious.
Using a DPU makes it possible to separate the user area and the administrator area, which previously competed for limited server resources, by moving the administrator's workloads onto the DPU. With parallel processing offloaded to the GPU as well, CPU resources can be devoted entirely to user applications.
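The resource shift described above can be sketched with a back-of-the-envelope calculation. All numbers here (the core count and the infrastructure share) are illustrative assumptions, not measurements from any real system:

```python
# Illustrative model (assumed numbers, not measurements) of how
# offloading infrastructure work to a DPU frees host CPU cores.

TOTAL_CORES = 64     # assumed host CPU core count
INFRA_SHARE = 0.30   # assumed fraction spent on network/storage/security

# Cores lost to infrastructure processing today:
infra_cores = TOTAL_CORES * INFRA_SHARE
# Cores left for applications before and after DPU offload:
app_cores_before = TOTAL_CORES - infra_cores
app_cores_after = TOTAL_CORES  # with a DPU, all host cores serve apps

gain = app_cores_after / app_cores_before - 1
print(f"Application cores: {app_cores_before:.0f} -> {app_cores_after}")
print(f"Capacity gain for user workloads: {gain:.0%}")
```

Under these assumed numbers, reclaiming a 30% infrastructure share translates into roughly 43% more capacity for user applications, which is why the pressure described above compounds as the infrastructure share grows.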
What exactly can you do with a DPU?
We will introduce applications in which DPUs play an active role, from three main perspectives.
Network
Adding a DPU at the entrance of the system gives the CPU and GPU a new accelerator in front of them. This minimizes CPU involvement in packet I/O and enables even faster processing.
Storage
Remote storage is spreading as a way to handle large volumes of data, but accessing it used to require processing on the host OS. By using a DPU-side function called SNAP, storage in a remote environment can be handled as if it were local storage, enabling high-speed data transfer.
Security
By offloading security functions onto the DPU, security processing runs in a state that is physically separated and protected from the server's CPU, memory, and even higher-level applications.
Areas where DPUs are expected to be used
The NVIDIA DPU is an acceleration card that combines the ConnectX series NIC introduced above with an Arm CPU and peripheral devices.
In conventional systems, the CPU is responsible for many processes, such as network, storage, security, and management, which can place a high load on it. By delegating these processes to the DPU, the host CPU can be dedicated to application processing.
The reference material "NVIDIA DGX™ H100 Dissection from a Network Perspective: The Role of NVIDIA ConnectX®-7 in Maximizing GPU Utilization" explains what a DPU is and how it can be used in practice.