
This article is recommended for

Those who are keeping up with the latest AI trends

Time needed to finish reading this article

3 minutes

Introduction

Hello! I'm BB from Macnica AI Research Center.

The heat given off by the workstations in our office is really something.
Will we make it through the summer?
If my blog updates slow down, please bear with me...

This is the final part of the report on NVIDIA GTC 2019, a must-see event for AI engineers!
This time we introduce "RAPIDS", a suite of libraries for making full use of GPUs in machine learning and preprocessing, as well as session content on modeling (machine learning / deep learning) and data explainability.

RAPIDS

"RAPIDS" is a suite of libraries that brings GPU computing power to every stage of AI/data science work, such as data loading, preprocessing, analysis, and visualization.
I remember our AI engineering team getting excited when it was announced last year that workloads other than deep learning would also be GPU-accelerated.
Since RAPIDS has already matured to the point of practical use, the RAPIDS-related sessions were a highlight of this event.

As of March 2019, RAPIDS v0.6 has been released.
Although it is implemented in CUDA/C++, Python bindings are provided, so usability is not much different from other standard libraries.
Here I will introduce the characteristic RAPIDS libraries.

■ cuDF / cuIO

Libraries for typical data preprocessing tasks such as reading, joining, aggregating, and filtering data. Data is handled in DataFrame format.
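As a quick illustration, here is a minimal cuDF sketch. It assumes a RAPIDS environment with cudf installed; the file names and column names are purely hypothetical.

```python
import cudf

# Read CSV files directly into GPU memory (file and column names are hypothetical)
transactions = cudf.read_csv("transactions.csv")
customers = cudf.read_csv("customers.csv")

# Filtering, joining, and aggregation use a pandas-like API, but run on the GPU
recent = transactions[transactions["amount"] > 100]
joined = recent.merge(customers, on="customer_id", how="left")
summary = joined.groupby("region")["amount"].sum()

print(summary)
```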

■ cuML

A library of machine learning algorithms and basic numerical routines. v0.6 supports the following algorithms (a minimal usage sketch follows the list).

  • GBDT
  • GLM
  • Random Forest
  • K-Means
  • K-NN
  • DBSCAN
  • UMAP
  • Kalman filter
  • PCA
  • SVD (ARIMA and the Holt-Winters method are planned for v1.0)
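As a reference, here is a minimal K-Means sketch showing the scikit-learn-like interface. The data is synthetic, and cuml plus cupy are assumed to be installed.

```python
import cupy as cp
from cuml.cluster import KMeans

# Synthetic 2-D points generated directly on the GPU (purely illustrative)
X = cp.random.random((10_000, 2)).astype(cp.float32)

# The interface mirrors scikit-learn: construct, fit, then read the results
kmeans = KMeans(n_clusters=5, random_state=0)
kmeans.fit(X)

print(kmeans.cluster_centers_)
print(kmeans.labels_[:10])
```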

■ cuGraph

A library for graph analytics. Its coverage is broad, and v0.6 supports the following (a minimal usage sketch follows the list):

  • Jaccard
  • Weighted Jaccard
  • Louvain
  • SSSP
  • BFS (SSWP, Triangle Counting, and Subgraph Extraction are planned for v1.0)
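For reference, here is a minimal cuGraph BFS sketch over a tiny hand-made edge list. Note that the calls shown follow the current Python API and may differ in detail from v0.6.

```python
import cudf
import cugraph

# Tiny illustrative edge list built on the GPU
edges = cudf.DataFrame({
    "src": [0, 1, 2, 2, 3],
    "dst": [1, 2, 0, 3, 4],
})

G = cugraph.Graph()
G.from_cudf_edgelist(edges, source="src", destination="dst")

# Breadth-first search from vertex 0: returns distance and predecessor per vertex
bfs_result = cugraph.bfs(G, start=0)
print(bfs_result)
```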

■ cuXfilter

A GPU-accelerated library for filtering and visualizing data, based on "Crossfilter"; it was not covered in much depth in the sessions.
Reportedly, operations such as inspecting contents, computing histograms, and groupby can each be executed and visualized in under one second on datasets of 10 to 200 million rows by multiple columns. (*The execution environment is still being confirmed.)

 

I'm planning to post a follow-up verification blog on how comfortable analysis becomes on our in-house NVIDIA DGX-2. I'll do my best.

Modeling x Data x Explainability

There were also several sessions on this topic.
First, there was a session arguing that a model can be considered explainable if it satisfies the following three points.

  • Natural Representations: Can the phenomenon to be modeled be explained logically?
  • Modular and Composable: Can the parameters you want to explain be swapped algebraically (modularized)?
  • Constructive: Can the modules be optimized computationally?

The takeaway is that the parameters to be learned should be derived in a way that explains why they are important.
In other words, feature design matters.

Deep learning certainly has the advantage that you do not need to hand-craft and document detailed feature designs, but if you want to link the data to the model's decisions and understand them, the idea is that it is better not to skip deriving features.
It is a way of thinking that is faithful to the basics, and it convinced me all over again.

I also saw an example in which the PD control parameters of a drone were properly derived and then used as features (training data) for deep learning modeling.
Compared to plain PD control, the motion of the drone controlled by the trained model looked sharper.
It may have been a demo on a simple problem, but it was convincing.
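To make the idea concrete, here is a purely hypothetical toy sketch (not from the session): a 1-D PD altitude loop whose control terms are logged as candidate features and labels for a learned controller. The gains, time step, and dynamics are all made up for illustration.

```python
import numpy as np

KP, KD = 1.2, 0.4   # assumed PD gains
DT = 0.01           # control period [s]

def pd_terms(target, position, velocity):
    """Return the PD error terms and control output for one step."""
    error = target - position
    d_error = -velocity            # derivative of error for a constant target
    u = KP * error + KD * d_error
    return error, d_error, u

# Simulate a short hover with toy unit-mass dynamics and log each step
pos, vel, samples = 0.0, 0.0, []
for _ in range(1000):
    e, de, u = pd_terms(target=1.0, position=pos, velocity=vel)
    samples.append((e, de, u))     # (features ..., label) row for later training
    vel += u * DT
    pos += vel * DT

X = np.array(samples)
print(X.shape)                     # (1000, 3)
```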

 

And with that, all three installments of the NVIDIA GTC 2019 participation report are complete.

Although not covered this time, the sessions and exhibits on autonomous driving development were all eye-opening, and the poster sessions gave us a broad view of the advanced initiatives of various industries and research institutes.
This event brings together every trend related to AI, from hardware to cutting-edge research and case studies, so be sure to visit the venue next year!

The Jetson Nano developer kit introduced in the first installment can be purchased from the link below.
You can also find detailed information on other NVIDIA GPU products, so please take a look!