AI trends unraveled from JSAI2022 - AI explainability, android robots (AR), and data augmentation


● Recommended for ●

・Those who are interested in academic trends in AI
・Those who want to read JSAI2022 papers
・Those who are considering using AI in business

Time needed to finish reading this article

5 minutes

event report

The 2022 Annual Conference of the Japanese Society for Artificial Intelligence (JSAI) was held in Kyoto over four days, from June 14 to June 17, 2022. It is an academic conference for presenting research on artificial intelligence.

The conference had been held online since the year before last because of COVID-19, but this year it was held as a hybrid of online and in-person sessions. A total of 727 papers were published this year. Organized sessions and company exhibitions were also held in person, and the venue felt lively for the first time in a long while.

As noted in our past JSAI reports, many keywords related to AI explainability (XAI) have appeared in the accepted papers. This year, there was an increase in keywords for models that handle sequential data such as natural language (Transformer, BERT), for reinforcement learning and dialogue systems, for medicine (ECG, healthcare, etc.), and for brain science (EEG, homeostasis, etc.). Keywords related to "data augmentation" have also increased compared with previous years (see Figure 1). In this article, we pick out and introduce papers with trending keywords identified from the keyword distribution that we created from the content of the JSAI2022 accepted papers.

Figure 1: Created by Macnica based on the contents of the JSAI accepted papers
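As a rough illustration of how a keyword distribution like this can be compared across years, the following minimal sketch counts the share of papers that mention each keyword and reports the change. The keyword lists and the function names `keyword_share` and `share_change` are hypothetical and made up for this example; they are not the actual JSAI data or the method used to create Figure 1.

```python
from collections import Counter

# Hypothetical keyword lists, one list per paper; not the actual JSAI data.
papers_2021 = [
    ["xai", "reinforcement learning"],
    ["xai", "bert"],
    ["data augmentation"],
]
papers_2022 = [
    ["transformer", "bert", "dialogue system"],
    ["reinforcement learning", "xai"],
    ["data augmentation", "ecg"],
    ["data augmentation", "transformer"],
]


def keyword_share(papers):
    """Return the fraction of papers that mention each keyword at least once."""
    counts = Counter()
    for keywords in papers:
        counts.update(set(keywords))  # count each keyword once per paper
    return {kw: n / len(papers) for kw, n in counts.items()}


def share_change(before, after):
    """Return (keyword, change in share) pairs, largest increase first."""
    all_keywords = set(before) | set(after)
    diffs = {kw: after.get(kw, 0.0) - before.get(kw, 0.0) for kw in all_keywords}
    return sorted(diffs.items(), key=lambda item: (-item[1], item[0]))


change = share_change(keyword_share(papers_2021), keyword_share(papers_2022))
for keyword, diff in change:
    print(f"{keyword:25s} {diff:+.2f}")
```

Using the share of papers rather than raw counts keeps years with different numbers of accepted papers comparable.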

Paper trends

Reinforcement learning

First, we introduce "Reinforcement Learning with Neurosymbolic AI" from the IBM Research team. The paper proposes a reinforcement learning method called Logical Optimal Action (LOA) that uses Logical Neural Networks (LNN), and concludes that the method learns well.

LNN is a neuro-symbolic AI method recently proposed by the IBM Research team in "Logical Neural Networks" as a form of deep learning that can ensure explainability.

What is novel about LNN is that it learns the logical formulas used inside the network to derive inference results. A general neural network assigns weights to the features of the input data that indicate how important each feature is to the result, and learns these weights so that a specified formula gives the best predictions. The logical operators (AND, OR, NOT, etc.) in such a formula are usually predefined, whereas LNN can learn the best logical formula within the model. In a text-based coin-collecting game, the paper compares LOA, which uses LNN for reinforcement learning, with LSTM-DQN++, a deep learning model for action selection, and NLM-DQN, a conventional neuro-symbolic reinforcement learning method, and concludes that LOA is superior in model accuracy and learning efficiency. Because LOA learns rules expressed in logical symbols, the learned content can be inspected, and the authors propose that this makes explainable reinforcement learning achievable.

Because AI algorithms are black boxes, their explainability has long been questioned. The model proposed in this paper, whose learned content can be inspected, can be expected to improve the reliability of the answers and inferences derived by AI.
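To make the idea of logical operators with learnable weights more concrete, here is a minimal sketch that assumes a weighted Lukasiewicz-style real-valued logic, one common formulation in which truth values lie between 0 and 1. The function names, the weights, and the coin rule are hypothetical illustrations; this is not the implementation from the paper or from any LNN library.

```python
def clamp(value):
    """Clamp a number to the truth-value range [0, 1]."""
    return max(0.0, min(1.0, value))


def weighted_and(truths, weights, bias=1.0):
    """Weighted AND: bias - sum(w * (1 - x)), clamped to [0, 1]."""
    return clamp(bias - sum(w * (1.0 - x) for x, w in zip(truths, weights)))


def weighted_or(truths, weights, bias=1.0):
    """Weighted OR: 1 - bias + sum(w * x), clamped to [0, 1]."""
    return clamp(1.0 - bias + sum(w * x for x, w in zip(truths, weights)))


def logical_not(truth):
    """Negation in real-valued logic."""
    return 1.0 - truth


# Hypothetical rule for a coin game: take the coin if it is visible
# AND the agent is NOT already holding it.
see_coin, holding_coin = 1.0, 1.0
inputs = [see_coin, logical_not(holding_coin)]

# With both weights at 1 the rule behaves like a classical AND.
strict = weighted_and(inputs, weights=[1.0, 1.0])   # 0.0

# If training drove the second weight to 0, the rule would ignore
# "holding_coin" and reduce to "take the coin whenever it is visible".
relaxed = weighted_and(inputs, weights=[1.0, 0.0])  # 1.0

print(strict, relaxed)
```

Because the weights sit directly on named logical inputs, reading them back shows which conditions the learned rule actually depends on, which is the kind of inspection the text describes.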

Source: 36th Annual Conference of the Japanese Society for Artificial Intelligence Lecture number [3Yin2-56]
"Reinforcement Learning with Neurosymbolic AI"
Caption: Figure 1: Schematic diagram of the proposed method
https://www.jstage.jst.go.jp/article/pjsai/JSAI2022/0/JSAI2022_3Yin256/_pdf/-char/ja

Dialogue system

Next, we introduce a paper on a dialogue system that uses the Transformer model.

Communication support devices have spread remarkably in recent years thanks to advances in technologies such as speech recognition and language processing. Against this background, NTT Communication Science Laboratories and Tokyo Denki University, in "Construction of Multimodal Dialogue Strategy for Android Robots in Symbiotic Society", set out to organize the technical issues of android robots (AR) as human communication partners and conducted an experiment using an AR. The paper uses BERT and the Transformer-based large-scale dialogue model proposed in "Empirical Analysis of Training Strategies of Transformer-based Japanese Chit-chat Systems", published by the same author at NTT Communication Science Laboratories, and implements a dialogue system on the AR using a multimodal dialogue strategy (see Figure 3). A schematic diagram of the interaction scenario is shown in Figure 4.

Source: 36th Annual Conference of the Japanese Society for Artificial Intelligence Lecture number [2N5-OS-7a-02]
"Construction of Multimodal Dialogue Strategy for Android Robots in Symbiotic Society"
Caption: Figure 1: Dialogue between Android I and a customer
Figure 2: Schematic diagram of dialogue scenario
https://www.jstage.jst.go.jp/article/pjsai/JSAI2022/0/JSAI2022_2N5OS7a02/_pdf/-char/en

The paper first points out two issues that are rarely addressed in ordinary robot dialogue. One is AR as a multimodal actuator; the other is AR as an asymmetric communication partner. The former means that movements humans make unconsciously, such as facial expressions and voice, need to be implemented in the AR. The latter means that humans will perceive an AR that imitates a human as different from a human, so action control for the AR needs to be optimized on the premise of that "difference".

An AR dialogue system was implemented and evaluated. The results showed that the dialogue posture, nods, and facial expressions of the AR help foster a sense of intimacy, and that communication is possible without compromising human satisfaction. On the other hand, unnatural behavior patterns that are not observed with a human partner were observed when the partner was an AR, and the paper concluded that optimal action control of the AR that takes this into account is an issue for future study.

In recent years, social implementation of AR has been anticipated as a way to address labor shortages caused by the declining birthrate and aging population. Based on its experiments, the paper concluded that further verification experiments are needed so that AR can acquire natural motions for social implementation.

Synthetic data

Finally, we introduce the paper "Editable medical image generation" by the National Cancer Center Research Institute and the RIKEN Center for Advanced Intelligence Project.

This paper focuses on the ability of generative adversarial networks (GANs) to generate synthetic data, and proposes an approach to synthetic data generation for medical applications. GAN models inherently tend to overfit to the most frequent features in a dataset, and the paper raises the concern that synthetic data may be biased because case images in the medical field show little variation.

Therefore, the paper focuses on the specialized knowledge that clinicians draw on when diagnosing disease, and proposes a generation approach built around a segmentation map. The segmentation map is generated with self-supervised learning and is editable, so physicians can edit it directly during the data generation process. The result is a generation algorithm that can be complemented by their expertise.

This approach is expected to make it possible to build datasets that fully cover the characteristics of specific diseases in the medical field, and to help address the shortage of training data for medical images.
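The edit-then-generate workflow described above can be sketched with a toy example. Everything here is a hypothetical stand-in: the label ids, the `edit_region` helper, and especially `render`, which replaces the paper's learned generator with a fixed lookup table so that the example stays self-contained.

```python
# Hypothetical label ids for a tiny segmentation map; not taken from the paper.
BACKGROUND, ORGAN, LESION = 0, 1, 2


def edit_region(seg_map, top, left, height, width, label):
    """Return a copy of seg_map with a rectangular region relabeled."""
    edited = [row[:] for row in seg_map]
    for r in range(top, top + height):
        for c in range(left, left + width):
            edited[r][c] = label
    return edited


def render(seg_map, intensity):
    """Placeholder for a learned generator: map each label to a gray level."""
    return [[intensity[label] for label in row] for row in seg_map]


# A 4x4 map with an organ in the center.
seg_map = [
    [BACKGROUND, BACKGROUND, BACKGROUND, BACKGROUND],
    [BACKGROUND, ORGAN, ORGAN, BACKGROUND],
    [BACKGROUND, ORGAN, ORGAN, BACKGROUND],
    [BACKGROUND, BACKGROUND, BACKGROUND, BACKGROUND],
]

# An expert adds a lesion that is rare in the training data.
edited_map = edit_region(seg_map, top=1, left=1, height=1, width=1, label=LESION)

# Assumed gray levels per label, chosen only for illustration.
intensity = {BACKGROUND: 0, ORGAN: 128, LESION: 255}
image = render(edited_map, intensity)

for row in image:
    print(row)
```

The point of the sketch is the order of operations: the expert edits the intermediate label map, and the image is generated from the edited map, so rare findings can be introduced deliberately instead of depending on how often they appear in the dataset.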

Source: 36th Annual Conference of the Japanese Society for Artificial Intelligence Lecture number [2K5-OS-1a-01]
"Editable medical image generation"
Caption: Fig.1 Neural network architecture overview.
https://www.jstage.jst.go.jp/article/pjsai/JSAI2022/0/JSAI2022_2K5OS1a01/_pdf/-char/ja

Summary

This time, we introduced three papers on topics seen as trends at JSAI2022.

In "Reinforcement Learning with Neurosymbolic AI", the authors focus on AI explainability, as in past JSAI conferences, and propose a new explainable reinforcement learning method. In the dialogue system paper, by experimenting with issues that had not been considered in previous robot dialogue research, the authors clarified the need for optimal behavior control for more natural AR dialogue systems in the future. These two papers show a trend toward social implementation of AI in ways closely tied to how humans perceive and understand AI. In addition, as the work on synthetic data generation shows, research to solve real-world issues such as lack of data is likely to be actively pursued in order to further improve the accuracy of AI required for social implementation.

■ Sources of content and papers introduced on this page / References

Daiki Kimura, Subhajit Chaudhury, Sarathkrishna Swaminathan, Tsunehiko Tanaka, Don Joven Agravante, Michiaki Tatsubori, Asim Munawar, Alexander Gray
"Reinforcement Learning with Neurosymbolic AI"
Figure 1: Schematic diagram of the proposed method
https://www.jstage.jst.go.jp/article/pjsai/JSAI2022/0/JSAI2022_3Yin256/_pdf/-char/ja

Makoto Kawamoto, Daisuke Kawakubo, Hiroaki Sugiyama, Masaki Shuzo, Eisaku Maeda
"Construction of Multimodal Dialogue Strategy for Android Robots in Symbiotic Society"
Figure 1: Dialogue between Android I and a customer
Figure 2: Schematic diagram of dialogue scenario
https://www.jstage.jst.go.jp/article/pjsai/JSAI2022/0/JSAI2022_2N5OS7a02/_pdf/-char/en

Kazuma Kobayashi, Yasuyuki Takamizawa, Sono Ito, Mototaka Miyake, Yukihide Kanemitsu, Ryuji Hamamoto
"Editable medical image generation"
Fig.1 Neural network architecture overview.
https://www.jstage.jst.go.jp/article/pjsai/JSAI2022/0/JSAI2022_2K5OS1a01/_pdf/-char/en