The Forefront of IP Video Transmission Episode 2: Softwareization of Broadcast Equipment

This article is Episode 2 of the "Forefront of IP Video Transmission" series.

Episode 1 Evolution of technology in the broadcasting industry

Episode 2 Softwareization of broadcasting equipment

Episode 3 Overview of NVIDIA Rivermax SDK

 

As I mentioned in Episode 1, alongside the shift of broadcasting equipment to IP, there is a worldwide trend toward moving broadcasting equipment to the cloud and implementing it in software.

 

General-purpose servers have made remarkable progress in recent years: faster CPUs, faster and larger memory and disks, enhanced computing performance with GPUs, and faster network interfaces. As a result, the server performance available today has increased dramatically. However, just as in the early days of moving broadcasting to IP, moving to software is not necessarily easy.

 

Various organizations, most notably SMPTE, have been developing the specifications that form the basis for moving broadcasting to IP. These specifications define stringent requirements on a level similar to that of traditional SDI-based broadcast equipment. Until now, these standards have been met by dedicated hardware. As you can imagine, meeting them on a general-purpose server, which is a best-effort platform however remarkable its performance gains, is not easy.

 

The Rivermax SDK solution from NVIDIA Mellanox addresses this challenge.

Issues that must be overcome for softwareization

There are two major issues in implementing broadcasting equipment functions in software on general-purpose servers. One is how to meet strict requirements such as those defined by SMPTE, and the other is how to secure the resources needed to implement the applications (functions) of the broadcasting equipment.

Application implementation resources

Although the performance of general-purpose servers has improved dramatically, their computational resources are still finite. In addition to the applications you want to run on the server, such as video processing, you must also account for the computational resources consumed by sending and receiving data over the network.

Realization of SMPTE requirements

SMPTE defines specifications such as ST2110 and ST2022 for the move of broadcasting to IP. Among these, the following are particularly difficult to achieve on a general-purpose server.

 

ST2110-21: Packet pacing

ST2059: PTP sync

ST2022-7: Redundancy

 

General-purpose servers have traditionally sent and received network data on a best-effort basis using CPU resources. These three functions, on the other hand, sometimes require accuracy on the order of nanoseconds or microseconds, which has been a high hurdle for a software implementation.

A solution that solves these issues at once

As mentioned earlier, two things become major challenges when turning broadcasting equipment into software on a general-purpose server:

1) How to use server resources efficiently

2) How to implement functions that require high accuracy

The solution to these problems is the combination of Rivermax and the ConnectX series NICs (ConnectX-5 or later) provided by NVIDIA Mellanox. I will explain step by step how this Rivermax solution solves them.

Kernel bypass function for efficient server resources

When a server sends and receives IP packets, the largest consumer of resources is the processing in the kernel layer of the OS. Normally, when a user application sends or receives IP packets, various processing steps are performed in the OS kernel. The server resources this requires, such as CPU and memory, grow as the interface speed increases. Recent NICs such as the ConnectX series have ultra-high-speed network interfaces such as 100 Gigabit Ethernet, and the server resources consumed by this I/O have reached a level that can no longer be ignored.
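To get a feel for why this I/O cost grows with interface speed, a rough back-of-the-envelope calculation may help. This is only a sketch under stated assumptions, not a measurement: the 1500-byte IP packet size, the 38 bytes of per-frame Ethernet overhead, and the single 3 GHz core are illustrative values I chose, not figures from NVIDIA.

```python
# Rough per-packet time budget at line rate (illustrative assumptions only).

# Ethernet header + FCS + preamble/SFD + inter-frame gap, in bytes
ETHERNET_OVERHEAD_BYTES = 14 + 4 + 8 + 12


def packets_per_second(link_bps: float, ip_packet_bytes: int) -> float:
    """Maximum packet rate when the link is fully loaded with equal-size packets."""
    wire_bits = (ip_packet_bytes + ETHERNET_OVERHEAD_BYTES) * 8
    return link_bps / wire_bits


def cycles_per_packet(link_bps: float, ip_packet_bytes: int, cpu_hz: float) -> float:
    """CPU cycles a single core has per packet if it must keep up with line rate."""
    return cpu_hz / packets_per_second(link_bps, ip_packet_bytes)


if __name__ == "__main__":
    for gbps in (1, 10, 25, 100):
        pps = packets_per_second(gbps * 1e9, 1500)
        budget = cycles_per_packet(gbps * 1e9, 1500, 3e9)
        print(f"{gbps:>3} GbE: {pps / 1e6:6.2f} Mpps, "
              f"{budget:9.0f} cycles per packet on one 3 GHz core")
```

Under these assumptions, a single core has only a few hundred cycles per packet at 100 GbE, which leaves very little room for per-packet kernel processing on top of the application itself.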

 

What avoids this overhead is kernel bypass, a technology that is a particular strength of NVIDIA Mellanox NICs.

 

When a ConnectX series NIC is combined with the Rivermax software, the user application calls the Rivermax send and receive APIs, and transmission and reception are then handled directly with the NIC, bypassing the OS kernel entirely. This brings benefits such as the following:

- Fewer server resources are used for sending and receiving, leaving more for applications

- Sending and receiving run at low load, which improves overall throughput (eliminating bottlenecks caused by server resources)

- Transmission/reception delay and packet timing fluctuation (jitter) can be reduced

 

Of course, besides traffic such as video streams that you want to send and receive at high speed, there are functions, such as ARP and PTP, for which you want to keep using the existing TCP/UDP stack as is. For this purpose, a selective bypass function (Selective Bypass) is also provided.

Figure: Kernel bypass technology

Hardware offload function for high accuracy

As mentioned earlier, the SMPTE standards for broadcasting specify functions that require high accuracy. Meeting these demands requires the help of hardware. The ConnectX series NIC is a general-purpose network card, but it has built-in hardware engines that satisfy these requirements, which makes them much easier to implement. I will explain them one by one.

ST2110-21 Packet Pacing function

Normally, a general-purpose server in a data center or similar environment sends data as fast as it can for as long as bandwidth is available. SMPTE ST2110-21, on the other hand, requires the packets of each video stream to be transmitted at a constant rate. Images on a screen appear as moving pictures because frames are displayed in quick succession, for example 59.94 frames per second, much like a flipbook. Since the display rate is constant, it is natural to want the video data to arrive as an IP stream at regular intervals as well. In the network/server world, however, this is not the default behavior.
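As a rough illustration of what "a constant rate" means in numbers, the sketch below computes the ideal spacing between packets for one uncompressed video stream. The format (1920x1080, 4:2:2 10-bit, 59.94 frames per second), the 1200-byte payload per packet, and the simplification that packets are spread evenly over the whole frame period are my own assumptions for the example. The actual timing models in ST2110-21 are more detailed than this.

```python
from fractions import Fraction

# Simplified pacing arithmetic for one uncompressed video stream.
# Assumptions (not from the article): 4:2:2 10-bit sampling, which packs
# 2 pixels into 5 bytes, and packets spread evenly over the frame period.


def frame_bytes(width: int, height: int, bytes_per_pixel_pair: int = 5) -> int:
    """Size of one active video frame in bytes."""
    return width * height * bytes_per_pixel_pair // 2


def packets_per_frame(width: int, height: int, payload_bytes: int) -> int:
    """Number of packets needed for one frame (ceiling division)."""
    return -(-frame_bytes(width, height) // payload_bytes)


def packet_interval_ns(width: int, height: int, payload_bytes: int,
                       frame_rate: Fraction) -> float:
    """Ideal time between consecutive packets, in nanoseconds."""
    frame_period_ns = Fraction(1_000_000_000) / frame_rate
    return float(frame_period_ns / packets_per_frame(width, height, payload_bytes))


if __name__ == "__main__":
    rate = Fraction(60000, 1001)  # 59.94 frames per second
    print("bytes per frame  :", frame_bytes(1920, 1080))
    print("packets per frame:", packets_per_frame(1920, 1080, 1200))
    print("packet interval  : %.2f ns" % packet_interval_ns(1920, 1080, 1200, rate))
```

Under these assumptions one frame is 5,184,000 bytes, or 4,320 packets, and the ideal spacing works out to roughly 3.86 microseconds per packet. The sender has to hold that spacing for every packet of every stream.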

 

If the CPU is responsible for sending packets from the server at regular intervals, the timing fluctuates with factors such as other load on the CPU, which makes this extremely difficult to implement. For this reason, the ConnectX series NIC implements this function, called Packet Pacing, in the NIC hardware.
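The following small experiment gives a sense of the fluctuation on the CPU side. It is a minimal sketch, not a benchmark: it simply asks the OS to wake a loop up at a fixed interval with time.sleep() and records how late each wake-up is. The function name and the 200-microsecond interval are hypothetical choices for illustration, and the results will differ from machine to machine and from run to run.

```python
import time


def measure_sleep_pacing(interval_s: float, count: int) -> list:
    """Try to wake up every `interval_s` seconds using time.sleep().

    Returns, for each tick, how late (in microseconds) the loop actually
    woke up compared with the ideal schedule.
    """
    lateness_us = []
    start = time.perf_counter()
    for i in range(1, count + 1):
        target = start + i * interval_s
        remaining = target - time.perf_counter()
        if remaining > 0:
            time.sleep(remaining)
        lateness_us.append((time.perf_counter() - target) * 1e6)
    return lateness_us


if __name__ == "__main__":
    late = measure_sleep_pacing(interval_s=0.0002, count=500)
    print("mean lateness: %.1f us" % (sum(late) / len(late)))
    print("max lateness : %.1f us" % max(late))
```

On a typical general-purpose OS the lateness is neither zero nor constant, and it tends to get worse when the machine is busy. Compared with a packet spacing of a few microseconds, this kind of variation is hard to absorb in software alone.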

 

When the NIC is used in combination with the Rivermax software, Packet Pacing that matches the video format being sent is performed on the NIC hardware, and packets are sent at regular intervals with high accuracy. Because it is implemented in hardware, it is less susceptible to fluctuations caused by external factors such as CPU, OS, and application load, and can achieve packet pacing at a level that complies with the SMPTE requirements.

Figure: Packet Pacing

ST2059 PTP function

SMPTE ST2110 requires timestamping based on PTP (Precision Time Protocol) in conformance with ST2059. This is because ST2110 carries Video, Audio, and Ancillary data as separate streams, which must therefore be synchronized with high precision. As with Packet Pacing, it is difficult to implement this high-precision timestamping stably in software, so this is where hardware offload comes in.
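To illustrate why a common time base lets separate streams be lined up again, here is a minimal sketch of deriving RTP timestamps from PTP time. It assumes, as I understand the ST2110 family, a 90 kHz RTP clock for video and a 48 kHz RTP clock for audio, both counted from the PTP epoch and wrapped to 32 bits. The function name and the sample time value are hypothetical.

```python
RTP_WRAP = 1 << 32  # RTP timestamps are 32-bit values and wrap around


def rtp_timestamp(ptp_time_ns: int, clock_rate_hz: int) -> int:
    """Media-clock ticks elapsed since the PTP epoch, truncated to 32 bits."""
    ticks = ptp_time_ns * clock_rate_hz // 1_000_000_000
    return ticks % RTP_WRAP


if __name__ == "__main__":
    # Hypothetical capture instant, in nanoseconds since the PTP epoch
    capture_ns = 1_700_000_000_123_456_789
    print("video RTP timestamp (90 kHz):", rtp_timestamp(capture_ns, 90_000))
    print("audio RTP timestamp (48 kHz):", rtp_timestamp(capture_ns, 48_000))
```

Because both timestamps are derived from the same PTP time, a receiver that is locked to the same PTP clock can map each of them back to the same instant and realign video and audio, even though they travel as separate streams.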

 

Although it is a general-purpose NIC, the ConnectX series implements hardware logic for PTP timestamp processing. By working together with general-purpose PTP software such as Linux PTP, it can timestamp packets with extremely high accuracy.
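To see why the accuracy of each timestamp matters, consider the standard PTP delay request-response calculation, sketched below. The four timestamps and the numeric values are made-up examples, and the calculation assumes a symmetric network path. This is not code from Linux PTP or Rivermax.

```python
def ptp_offset_and_delay(t1: int, t2: int, t3: int, t4: int) -> tuple:
    """Estimate clock offset and one-way path delay from four timestamps.

    t1: master sends Sync            (master clock)
    t2: slave receives Sync          (slave clock)
    t3: slave sends Delay_Req        (slave clock)
    t4: master receives Delay_Req    (master clock)
    Assumes the path delay is the same in both directions.
    """
    offset = ((t2 - t1) - (t4 - t3)) / 2
    delay = ((t2 - t1) + (t4 - t3)) / 2
    return offset, delay


if __name__ == "__main__":
    # Example in nanoseconds: slave clock is 500 ns ahead, path delay is 1000 ns.
    print(ptp_offset_and_delay(t1=0, t2=1500, t3=2000, t4=2500))

    # Same exchange, but t2 is stamped 200 ns late, for instance because
    # software read the clock some time after the packet actually arrived.
    print(ptp_offset_and_delay(t1=0, t2=1700, t3=2000, t4=2500))
```

In the first case the calculation recovers the 500 ns offset and 1000 ns delay exactly. In the second case a 200 ns error in a single timestamp shifts the estimated offset by 100 ns. Taking the timestamps in NIC hardware, as close to the wire as possible, keeps this kind of error small.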

ST2022-7 Redundancy function

SMPTE has a standard called ST2022-7 for stream redundancy. It provides redundancy for streams such as video and allows switching between them without interruption. A broadcasting device is required to behave as follows:

Transmitting side: Sends two streams, Primary and Secondary, simultaneously

Receiving side: Receives both the Primary and Secondary streams and, when a fault occurs on one path, immediately switches to the healthy one

 

When using Rivermax and ConnectX-6 DX or later NICs, these operations can be implemented in hardware. Specifically, it works as follows.

 

Sending side: The application prepares a memory area for the stream to be sent and stores the stream data there. It prepares only one stream. When sending in ST2022-7 mode, Rivermax duplicates this single stream and sends the copies out of the ports assigned to Primary and Secondary on the ConnectX-6 DX.

 

Receiving side: Streams are received on the physical ports of the ConnectX-6 DX assigned to Primary and Secondary, and are stored in the Rivermax memory area. At that point, based on the information in the RTP headers of the Primary and Secondary packets, the two streams are combined into a single stream in SMPTE ST2110/ST2022 format and stored in one memory area. By reading this memory area, the application can receive the data correctly even if packets are lost on one of the two streams.
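The idea behind the receiving side can be sketched in software as follows. This only illustrates the principle of merging two copies of a stream by RTP sequence number. It is not how Rivermax or the ConnectX hardware implements it. The function name and packet representation are hypothetical, and 16-bit sequence number wraparound is ignored for simplicity.

```python
from typing import Dict, Iterable, List, Tuple

Packet = Tuple[int, bytes]  # (RTP sequence number, payload)


def merge_redundant_streams(primary: Iterable[Packet],
                            secondary: Iterable[Packet]) -> List[Packet]:
    """Combine two copies of one stream, keeping one packet per sequence number."""
    merged: Dict[int, bytes] = {}
    for seq, payload in list(primary) + list(secondary):
        merged.setdefault(seq, payload)  # first copy wins, duplicates are dropped
    return sorted(merged.items())


if __name__ == "__main__":
    full = [(seq, b"payload-%d" % seq) for seq in range(1, 6)]
    primary = [p for p in full if p[0] != 2]    # packet 2 lost on the Primary path
    secondary = [p for p in full if p[0] != 4]  # packet 4 lost on the Secondary path
    merged = merge_redundant_streams(primary, secondary)
    print([seq for seq, _ in merged])
```

As long as each packet arrives on at least one of the two paths, the merged result is complete, so the application sees a single uninterrupted stream.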

Figure: ST2022-7 Redundancy function

To show the benefits of using these functions, I will introduce evidence from two perspectives: performance and accuracy.

 

In terms of transmission and reception performance, information is available on the NVIDIA website.

 

The graph in the section "Achieving Higher Performance & lower CPU Using Rivermax" illustrates this. Four measurement conditions are shown as examples; the green graph shows CPU usage and throughput when Rivermax is used, and the light blue graph shows them when Rivermax is not used. According to this graph, using Rivermax reduces CPU usage to between 1/3 and 1/6 while at the same time improving throughput by 6 to 17 times.

 

It is important to note that the server resource consumption (CPU and so on) shown here covers only packetizing and transmitting data such as video; other processing, such as video processing, is not included. In terms of accuracy, the Rivermax solution is also JT-NM Tested.

 

At Macnica, we have developed a variety of sample applications on top of Rivermax, as well as an adapter that connects them to Rivermax. As part of these efforts, we have also begun measuring performance up to the application layer. We can provide this information separately, so please feel free to contact us.

Coming up next

Next time, I will introduce the Rivermax application interface and its integration with GPUs.

Author profile

Hiroshi Funaki

Macnica CLAVIS Company

Technology Management Department, Technology Department 3, Section 2


 

 

Biography:

After working in communication equipment development at a domestic manufacturer, he joined Macnica. After supporting semiconductors for communication equipment, he has worked on NVIDIA (formerly Mellanox) products for about seven years. In recent years, he has been engaged in promotion and support mainly for the broadcasting industry.