More and more developers are building AI image analysis applications on NVIDIA Jetson. For beginners, however, understanding the software tools required for development before actually writing the application itself is a hurdle, and it is often hard to get started. In this article, we introduce the NVIDIA Vision Programming Interface (VPI), which can be used to run image processing algorithms on NVIDIA Jetson.

What is VPI?

VPI (NVIDIA Vision Programming Interface) is a software library of computer vision and image processing algorithms implemented for NVIDIA Jetson. Previously, accessing the multiple computing resources on a device required using multiple APIs, such as OpenCV and NVIDIA CUDA. Moreover, some computing resources, such as the PVA (Programmable Vision Accelerator) and the VIC (Video and Image Compositor), had no public API at all. VPI provides seamless access to these computing resources through a single set of computer vision and image processing APIs, without sacrificing their processing power.

The compute resources accessible through VPI are the CPU, GPU, PVA, and VIC. Each algorithm in the VPI API has multiple implementations, one per supported backend, so processing can be pipelined to make full use of the hardware. For example, while inference runs on the GPU, preprocessing of the next frame can run on the PVA or VIC, and the CPU can handle the GUI at the same time.

VPI is also designed to avoid unnecessary memory copies when processing spans multiple backends. This zero-copy memory handling maximizes processing throughput.

In addition, VPI provides interoperability with existing projects built on OpenCV and CUDA, so processing implemented with OpenCV can easily be replaced with VPI to improve performance.

Perspective Warp

VPI offers many algorithms; in this article we focus on Perspective Warp.

Perspective Warp is a geometric image correction algorithm used, for example, to correct distortion caused by poor camera positioning. The correction is expressed as a 3x3 transformation matrix applied to pixel coordinates in homogeneous form.
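To make the 3x3 matrix concrete, here is a minimal sketch of how a perspective transformation maps a single point: the point is lifted to homogeneous coordinates, multiplied by the matrix, and divided by the resulting w component. The matrix values below are hypothetical, chosen to illustrate a simple translation.

```python
import numpy as np

def warp_point(H, x, y):
    """Apply a 3x3 perspective (homography) matrix to a 2D point.

    The point (x, y) is lifted to homogeneous coordinates (x, y, 1),
    multiplied by H, then divided by the resulting w component.
    """
    p = H @ np.array([x, y, 1.0])
    return float(p[0] / p[2]), float(p[1] / p[2])

# A pure translation expressed as a homography: shift by (10, 20).
H = np.array([[1.0, 0.0, 10.0],
              [0.0, 1.0, 20.0],
              [0.0, 0.0,  1.0]])
print(warp_point(H, 5.0, 5.0))  # -> (15.0, 25.0)
```

A general homography also has nonzero values in the bottom row, which is what makes the perspective (non-affine) distortion possible.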

Fig. 1 Concept of Perspective Warp

Details of the algorithm can be found at the following page:

VPI Documentation -> Algorithms -> Perspective Warp -> Implementation


First, please see the VPI Perspective Warp sample application we created in action.

Fig. 4 Execution of the sample application

ArUco Marker

The sample application uses OpenCV ArUco markers to generate the transformation matrix for Perspective Warp. The matrix is obtained by specifying four plane coordinates before the transformation and the corresponding four coordinates after it. In the sample application, the four post-transformation coordinates are determined from four ArUco markers, while the four pre-transformation coordinates are the four corners of the video frame. The OpenCV getPerspectiveTransform() function generates the transformation matrix.

Fig. 2 Plane coordinate 4 points determined by ArUco marker

Sample application

Let's take a look at the structure of the sample application.

The application continuously captures a sheet on which ArUco markers are printed with a camera, and displays a video overlaid on the sheet.

Fig. 3 Mechanism of the sample application

The processing flow of the sample application is as follows. Steps 6 to 9 are all handled by the VPI API, so each step can be submitted without waiting for the previous one to complete. This asynchronous execution is provided by a VPI feature called Streams.


  1. Image capture from camera
  2. Detection of ArUco markers from captured images (cv::aruco::detectMarkers)
  3. Determine the coordinates of 4 points from the detected ArUco markers
  4. Get transformation matrix (cv::getPerspectiveTransform)
  5. Video frame decoding
  6. Convert video frame image format to NV12 (vpiSubmitConvertImageFormat)
  7. Resize the video frame to the same size as the image captured by the camera (vpiSubmitRescale)
  8. Perspective Warp processing for video frames (vpiSubmitPerspectiveWarp)
  9. Convert the image format of Perspective Warp processing result of video frame to BGR (vpiSubmitConvertImageFormat)
  10. Wait for VPI processing to complete (vpiStreamSync)
  11. Convert video frame Perspective Warp processing result data to Mat format (vpiImageDataExportOpenCVMat)
  12. Overlay the Perspective Warp processing result of the video frame on the image captured by the camera (cv::add)
  13. Display overlay result (cv::imshow)
  14. Return to step 1

Sample application available on GitHub

We hope this introduction based on our VPI Perspective Warp sample application was useful. We plan to cover more uses of VPI in future articles.

The sample application is published on GitHub; please take a look.

MACNICA-CLAVIS-NV / vpi_perspective_warp_aruco