
This article is recommended for

  • Those who want to know what kinds of papers were accepted at AAAI 2019
  • Those who do not have time to read papers but want to keep up with recent AI research
  • Those who want to know more about AAAI 2019

Time needed to finish reading this article

less than 10 minutes

Picking up and introducing selected AAAI 2019 papers

Hello, I'm Makky from the Macnica AI Women's Club! Summer seems to get longer every year, and there are still plenty of hot days ahead.
Even so, the season will shift to autumn before long and it will gradually get colder.
With the seasons changing, please take good care of your health!

Now, let's get to the point. The theme of this post is “AAAI 2019 Accepted Papers”.
In a recent tech blog post I introduced papers accepted at CVPR 2019, and following on from CVPR, I would like to introduce papers accepted at AAAI 2019.

Recent AI trends can be seen in the AAAI 2019 papers

AAAI is a world-class international conference on artificial intelligence held every year between January and February.
By reading the papers accepted at AAAI, you can get a picture of the latest trends in artificial intelligence as a whole.

This time, from among the papers accepted at AAAI 2019, held January 27 to February 1, 2019, I will introduce three papers from China, where AI research has been especially active in recent years.

  1. “Joint Representation Learning for Multi-Modal Transportation Recommendation”: a transportation recommendation method using multimodal learning
  2. “Selective Refinement Network for High Performance Face Detection”: a high-performance face detection framework
  3. “Deep Interest Evolution Network for Click-Through Rate Prediction”: a CTR prediction architecture whose performance has already been proven in actual operation

1. “Joint Representation Learning for Multi-Modal Transportation Recommendation”

The first paper, “Joint Representation Learning for Multi-Modal Transportation Recommendation,” proposes a transportation recommendation system based on multimodal learning and builds a framework for it.

Multimodal learning is one of the current trends in deep learning and has been the subject of much research recently.
“Multimodal” refers to the use of multiple modalities (types of sensory input), and in machine learning, multimodal learning is a method that learns from multiple types of data and processes them in an integrated way.
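
As a rough, hypothetical illustration of the idea (this is not code from any of the papers), the sketch below encodes two modalities, image features and text features, separately and then fuses the two embeddings for a single prediction. All dimensions and names are made up for the example.

```python
# Minimal sketch of multimodal learning (illustrative only, not from the paper):
# two modalities are encoded separately, then fused and classified jointly.
import torch
import torch.nn as nn

class SimpleMultimodalNet(nn.Module):
    def __init__(self, image_dim=512, text_dim=300, hidden=128, num_classes=10):
        super().__init__()
        self.image_encoder = nn.Linear(image_dim, hidden)    # encodes image features
        self.text_encoder = nn.Linear(text_dim, hidden)      # encodes text features
        self.classifier = nn.Linear(hidden * 2, num_classes) # works on the fused vector

    def forward(self, image_feat, text_feat):
        h_img = torch.relu(self.image_encoder(image_feat))
        h_txt = torch.relu(self.text_encoder(text_feat))
        fused = torch.cat([h_img, h_txt], dim=-1)  # simple fusion by concatenation
        return self.classifier(fused)

# Example usage with random tensors standing in for real data.
model = SimpleMultimodalNet()
logits = model(torch.randn(4, 512), torch.randn(4, 300))
print(logits.shape)  # torch.Size([4, 10])
```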

The transportation recommendation systems discussed in the paper have conventionally focused on a single modality when recommending transportation to users, covering modes such as walking, cycling, driving, and public transit, as well as combinations of these.
However, if the user's preferences (cost, time, and so on) and travel characteristics (purpose, distance, and so on) are also taken into account, recommendations can be made that are better tailored to the user.

To address this, the paper proposes Trans2Vec, a learning framework that enables transportation recommendations through multimodal learning. The figure below is a schematic diagram of the Trans2Vec framework.

Specifically, a multimodal transportation graph covering multiple modalities is built from a map database, user demographic attributes, and data containing origin-destination information, as shown in the image below.
By learning transportation modes over this graph, the system can recommend the optimal mode of transportation based on the user's history and destination.
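
To give a feel for how such a graph could be turned into embeddings, here is a minimal conceptual sketch assuming a node2vec-style approach with random walks and a skip-gram model. It is not the authors' Trans2Vec implementation, and the toy graph and node names are invented.

```python
# Conceptual sketch (not the authors' code): represent users, origin-destination
# pairs, and transport modes as nodes in one graph, then learn embeddings from
# random walks with a skip-gram model, roughly in the spirit of node2vec.
import random
from gensim.models import Word2Vec

# Toy graph: each node is connected to the nodes it co-occurs with in trip logs.
graph = {
    "user_1": ["od_home_office", "mode_bus"],
    "user_2": ["od_home_office", "mode_bike"],
    "od_home_office": ["user_1", "user_2", "mode_bus", "mode_bike"],
    "mode_bus": ["user_1", "od_home_office"],
    "mode_bike": ["user_2", "od_home_office"],
}

def random_walk(start, length=6):
    walk = [start]
    for _ in range(length - 1):
        walk.append(random.choice(graph[walk[-1]]))
    return walk

walks = [random_walk(node) for node in graph for _ in range(50)]
model = Word2Vec(walks, vector_size=16, window=3, min_count=1, sg=1, epochs=20)

# Nodes that appear in similar trip contexts end up with similar vectors.
print(model.wv.most_similar("user_1", topn=3))
```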

The paper also compares the performance of Trans2Vec against existing algorithms using several recommendation-system evaluation metrics. The results are shown in the table below.

When recommending transportation in Beijing and Shanghai, Trans2Vec achieves strong values on the three metrics other than PREC (precision).
Even on PREC its score is not bad; in fact, it is not far from the best value.
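
As a reference for what a metric like PREC means in a recommendation setting, here is a small illustrative helper for precision and recall at k; the exact evaluation protocol in the paper may differ, and the example data is made up.

```python
# Illustrative helper (not tied to the paper's exact evaluation code):
# precision@k and recall@k for a single user's recommendation list.
def precision_recall_at_k(recommended, relevant, k):
    top_k = recommended[:k]
    hits = len(set(top_k) & set(relevant))
    precision = hits / k
    recall = hits / len(relevant) if relevant else 0.0
    return precision, recall

# Example: the system recommends transport plans ranked by score.
recommended = ["bus", "bike+subway", "taxi", "walk"]
relevant = ["bike+subway", "walk"]          # what the user actually chose
print(precision_recall_at_k(recommended, relevant, k=3))  # (0.333..., 0.5)
```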

Trans2Vec has already been introduced into a production system and is likely to become a framework that will continue to play an active role in the future.

2. “Selective Refinement Network for High Performance Face Detection”

Next, I will introduce a paper on the theme of "face recognition".

Recently, major Japanese companies have been actively developing cashless payment systems based on facial recognition, and updates on their development come up frequently in the news.
Of course, face recognition has uses beyond cashless payments, so it is being researched enthusiastically not only in Japan but in many other countries as well.

The second paper, “Selective Refinement Network for High Performance Face Detection,” comes from China, where face recognition research is particularly active.

Face detection, the task of locating faces in an image and a prerequisite for face recognition, has long been an active research topic. However, detecting many small faces, as in photographs of crowds, has remained difficult, and performance on such images has not been high.

Conventional face detection has two main issues, the first of which is recall.
Even RetinaNet, a state-of-the-art detector proposed in 2017, achieved roughly 90% precision but only about 50% recall; its precision is good, but its recall is not.
The second issue is localization accuracy: the smaller the face, the harder it is to pinpoint its position.

The paper introduced here proposes a new face detection framework called the Selective Refinement Network (SRN).
The SRN consists of two main modules, Selective Two-Stage Classification (STC) and Selective Two-Stage Regression (STR), which reduce false positives and improve localization accuracy, respectively.
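
The rough shape of this two-stage idea can be sketched as follows. This is only a conceptual approximation with random dummy scores and arbitrary thresholds, not the authors' implementation.

```python
# Very rough numpy sketch of the two-stage idea (an approximation, not the
# authors' implementation): a first pass discards obvious non-face anchors,
# then a second pass re-scores and refines the locations of what remains.
import numpy as np

rng = np.random.default_rng(0)
anchors = rng.uniform(0, 1, size=(1000, 4))      # dummy candidate boxes
stage1_scores = rng.uniform(0, 1, size=1000)     # coarse "is this a face?" scores

# Stage 1 (STC-like): drop anchors that are clearly background,
# which shrinks the search space and reduces false positives.
keep = stage1_scores > 0.3
kept_anchors = anchors[keep]

# Stage 2 (STR-like): re-score the survivors and nudge their coordinates.
stage2_scores = rng.uniform(0, 1, size=kept_anchors.shape[0])
refined_boxes = kept_anchors + rng.normal(0, 0.01, size=kept_anchors.shape)

detections = refined_boxes[stage2_scores > 0.5]
print(f"{len(anchors)} anchors -> {len(kept_anchors)} after stage 1 "
      f"-> {len(detections)} final detections")
```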

Below is a comparison of face detection accuracy on face detection datasets such as AFW, PASCAL Face, and FDDB.
You can see that “Ours” (that is, SRN), shown by the red line, is highly accurate in every graph.

The figure below compares face detection accuracy on a dataset called WIDER FACE.
Again, SRN is highly accurate on every split of the dataset.

Since face recognition is a field close to our daily lives, it is likely to develop further, and the quality of the facial-recognition cashless payments that many companies are now researching and developing may well improve.

3. “Deep Interest Evolution Network for Click-Through Rate Prediction”

Finally, I would like to introduce a technology that is already in active use in the marketing field, which I have been interested in recently.

Marketing has terms such as LTV (Life Time Value: the total profit one customer brings to a company over a lifetime) and CPA (Cost Per Action: how much advertising spend it takes to obtain one conversion), and these figures are closely tied to advertising.

With the spread of smart devices, advertising has become an all-too-familiar presence, and for roughly the past two years web advertising has exceeded television advertising in market size.

Against that backdrop, the third paper is research on predicting CTR (Click Through Rate) for web advertising.
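
To keep these terms straight, here is a tiny calculation with made-up numbers:

```python
# Tiny illustrative calculation with invented figures, just to define the terms.
impressions = 100_000   # how many times an ad was shown
clicks = 1_200          # how many times it was clicked
conversions = 60        # how many clicks led to a purchase, sign-up, etc.
ad_cost = 300_000       # total spend on the ad (in yen, say)

ctr = clicks / impressions   # Click Through Rate
cpa = ad_cost / conversions  # Cost Per Action
print(f"CTR = {ctr:.2%}, CPA = {cpa:,.0f} yen per conversion")
```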

DIN (Deep Interest Network) had previously been devised as a method for CTR prediction.
DIN treats a user's behavior directly as interest relative to the target item.
However, a user's latent interests may not be fully reflected in their explicit behavior.
Also, since user interests are constantly changing and evolving, it is very important to capture how interest shifts over time.

To model this process of interest evolution, the paper proposes a new network architecture, DIEN (Deep Interest Evolution Network).
Below is a diagram of DIEN's network structure.

DIEN's network structure differs from DIN's in two ways.
The first is that instead of treating behavior directly as interest, an Interest Extractor Layer is added to extract the user's interests from the behavior sequence.
The second is the addition of an Interest Evolving Layer, which models the process by which the user's interests evolve.

With these two layers added, users' latent interests can also be used for CTR prediction.
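
As a very simplified sketch of what these two extra layers might look like (an approximation, not the actual DIEN model, which uses an attention-based GRU called AUGRU), a plain GRU stands in for the Interest Extractor Layer and a second GRU over attention-weighted states stands in for the Interest Evolving Layer:

```python
# Hedged PyTorch sketch of the two extra layers (a simplification of DIEN):
# a GRU extracts interest states from the behavior sequence, and a second GRU
# over states weighted by attention to the target item approximates the
# interest-evolving step.
import torch
import torch.nn as nn

class TinyDIEN(nn.Module):
    def __init__(self, emb_dim=32):
        super().__init__()
        self.extractor = nn.GRU(emb_dim, emb_dim, batch_first=True)  # Interest Extractor Layer
        self.evolver = nn.GRU(emb_dim, emb_dim, batch_first=True)    # Interest Evolving Layer (simplified)
        self.predictor = nn.Linear(emb_dim * 2, 1)                   # final CTR score

    def forward(self, behavior_seq, target_item):
        # behavior_seq: (batch, seq_len, emb_dim), target_item: (batch, emb_dim)
        interests, _ = self.extractor(behavior_seq)
        # Weight each interest state by its similarity to the target item.
        attn = torch.softmax((interests * target_item.unsqueeze(1)).sum(-1), dim=1)
        weighted = interests * attn.unsqueeze(-1)
        _, final_interest = self.evolver(weighted)
        feats = torch.cat([final_interest.squeeze(0), target_item], dim=-1)
        return torch.sigmoid(self.predictor(feats))

# Example usage with random embeddings standing in for real behavior data.
model = TinyDIEN()
score = model(torch.randn(8, 10, 32), torch.randn(8, 32))
print(score.shape)  # torch.Size([8, 1])
```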

DIEN has already reached the production stage: it was deployed in Taobao's display advertising system in China and reportedly achieved a 20.7% improvement in CTR.

Summary

This time, I introduced papers accepted at AAAI 2019.
Multimodal learning and face recognition have been very popular fields in recent years.
We can expect many more research results that can be put to work in production to be published going forward.

We will continue to pick up and introduce summaries of conference papers and interesting AI papers on the AI Women's Club tech blog, so please look forward to the next post!
