This article is recommended for those who:

  • want to put cutting-edge AI technology into practical use in business
  • want an overview of pose estimation models
  • want to know application examples of pose estimation models

Time needed to finish reading this article

5 minutes

Introduction

Hello, I'm Makky from Macnica AI Women's Club!
AI continues to advance at a rapid pace.
In China, QR code payments are no longer the newest payment method, and facial recognition payments are starting to spread.
Image recognition such as face recognition is familiar and easy to picture in daily life, but AI is also widely used in the manufacturing industry for anomaly detection and other purposes.

So this time, I would like to briefly explain "pose estimation" models, one type of image recognition, and then introduce an algorithm.

What is a pose estimation model?

A pose estimation model is also known as "human body detection". It learns human joint points from still images, and can detect human poses in real time, from both still images and videos, by connecting those joint points.

Autonomous driving technology is an easy use case to imagine for pose estimation models, but there are many other cases that take advantage of pose estimation.
With conventional pose estimation models, people could overlap or parts of the body could be hidden depending on the camera angle, which made it difficult to detect human joint points correctly from images.
Recent pose estimation models, however, can estimate depth without a high-performance camera and can accurately detect joint points even when people overlap.
These technological advances have enabled pose estimation models to be used in a variety of fields, from sports and security to flow line analysis at event venues and factories.

That said, AI for image recognition, including pose estimation models, has a high training cost (training time and the data required for training), and improving accuracy is not easy.
Still, as with any AI model, image recognition is required to deliver "detection accuracy". This applies not only to object detection, the best-known form of image recognition, but also to pose estimation.

Learn more about pose estimation models

Now, I would like to introduce a pose estimation model that has a low training cost and high accuracy.
Pose estimation models are broadly classified into two types: bottom-up and top-down.
The two types differ in the order of the computations used to detect human joint points.

Bottom-up type

The bottom-up type is a model built with an algorithm that follows the steps below.

1: Identify all key points in the image
2: Match the key points to each person and connect them

Because all the key points (human joint points) in the image are identified up front, the computational cost during training is easier to keep low than with the top-down type described later.
However, after the key points are extracted, a huge amount of pattern matching is needed to assign each key point to the right person. It is therefore difficult to improve matching accuracy, and false detections occur, for example where people overlap.
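
The two steps above can be sketched in a few lines of Python. This is only a toy illustration: the key point values, the hypothetical `group_keypoints` helper, and the nearest-neighbour matching by distance are all assumptions made for this example. Real bottom-up models rely on learned connection information rather than plain distance.

```python
import math

# Step 1 (assumed already done): every key point found in the image,
# as (joint_name, x, y). The values are made up for illustration.
keypoints = [
    ("neck", 10.0, 10.0), ("neck", 50.0, 12.0),
    ("hip", 11.0, 30.0), ("hip", 52.0, 33.0),
]

def group_keypoints(points, root="neck"):
    """Step 2: greedily attach each non-root key point to the nearest root."""
    # Start one person per root joint found in the image.
    people = [{root: (x, y)} for name, x, y in points if name == root]
    for name, x, y in points:
        if name == root:
            continue
        # Only people that do not yet have this joint are candidates.
        free = [p for p in people if name not in p]
        if not free:
            continue
        nearest = min(free, key=lambda p: math.dist(p[root], (x, y)))
        nearest[name] = (x, y)
    return people

people = group_keypoints(keypoints)
for person in people:
    print(person)
```

With many people and many joint types, the number of candidate pairings to compare grows quickly, which is the matching cost described above.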

Top-down type

The top-down type is a model that detects the joint points of a person and estimates the pose using the following procedure.

1: Detect people with an object detection algorithm
2: Estimate the pose of each person

Since the pose is estimated for each person detected in step 1, poses can be estimated with higher accuracy even in images where people overlap.
However, as this procedure suggests, two processes are run, person detection and pose estimation, so computation time becomes an issue.
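
The top-down procedure can be sketched as a two-stage pipeline. In this minimal sketch, `detect_people` and `estimate_pose` are hypothetical placeholders that return fixed values; in a real system each would be a trained model.

```python
from typing import Dict, List, Tuple

Box = Tuple[int, int, int, int]          # (x, y, width, height)
Pose = Dict[str, Tuple[float, float]]    # joint name -> (x, y) in the image

def detect_people(image) -> List[Box]:
    """Step 1 placeholder: a real system would run an object detector here."""
    return [(0, 0, 40, 80), (60, 5, 40, 80)]

def estimate_pose(image, box: Box) -> Pose:
    """Step 2 placeholder: joints are placed at made-up positions inside
    the box and expressed in image coordinates."""
    x, y, w, h = box
    return {"head": (x + w / 2, y + h / 10),
            "hip": (x + w / 2, y + h / 2)}

def top_down(image) -> List[Pose]:
    # One pose estimation call per detected person.
    return [estimate_pose(image, box) for box in detect_people(image)]

poses = top_down(image=None)
print(len(poses), poses[0]["head"])
```

Because pose estimation runs once per detected person, the total cost grows with the number of people in the image, on top of the cost of the detector itself.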

Fast! Accurate! A pose estimation model

I have explained two types, the bottom-up type and the top-down type, and of course there are also models that combine the advantages of both.

This time, I will introduce a deep learning model called Pose Proposal Network, a pose estimation model that has a relatively low training cost and can achieve higher accuracy.

In terms of the types introduced in the previous section, Pose Proposal Network can be classified as a top-down type: after detecting people with an object detection algorithm, it detects the joint points of each person and connects them to estimate the pose.
Because it is a top-down type, it would normally require a large amount of computation. However, it incorporates the following two ideas:

  • The object detection part of the deep learning network, which detects people, is based on YOLO v3, which offers "fast training and high accuracy"
  • The detection of a person's joint points is based on OpenPose, which learns the connection information between joint points in addition to their coordinates, offering a "simple network structure" and "highly accurate joint point detection"

With these two ideas, the computational cost of training can be reduced to some extent.

Source: “Pose Proposal Networks”,
Caption: “Fig. 2. Pipeline of our proposed approach. Pose proposals are generated by parsing RPs of person instances and parts into individual people with limb detections (cf. § 3).”
http://taikisekii.com/PDF/Sekii_ECCV18.pdf
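
YOLO-style detectors make their predictions per cell of a grid laid over the image. As a rough illustration of that grid idea only, the sketch below maps a joint point to the cell that contains it. The `keypoint_to_cell` helper, the 384 x 384 image size, and the 12 x 12 grid are assumptions chosen for this example and are not taken from the paper.

```python
def keypoint_to_cell(x, y, image_size=(384, 384), grid=(12, 12)):
    """Return the (row, col) of the grid cell containing pixel (x, y),
    plus the position of the point inside that cell, each in [0, 1)."""
    width, height = image_size
    rows, cols = grid
    cell_w, cell_h = width / cols, height / rows
    col, row = int(x // cell_w), int(y // cell_h)
    return (row, col), (x / cell_w - col, y / cell_h - row)

# A made-up joint point at pixel (100, 200).
cell, offset = keypoint_to_cell(100.0, 200.0)
print(cell, offset)
```

A coarser grid means fewer cells to predict for, which is one reason grid-based detection can be fast.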

Impressions from trying Pose Proposal Network

I actually trained a model with the Pose Proposal Network algorithm.

The dataset used is “MPII Human Pose”, the one described in the paper, which includes 24,000 images and more than 40,000 annotations (coordinates of people's joint points).
This time, we used YOLOv3 based on MobileNet v2 and ResNet as the network structure for object detection.
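
To make "annotation" concrete, the sketch below shows what one person's joint coordinates might look like once loaded into Python. The field names, file name, and values are hypothetical, and this is not the dataset's actual file format. The `bounding_box` helper is likewise an assumption, added only to show one thing such coordinates can be used for.

```python
# One hypothetical annotation: the joint coordinates of a single person.
annotation = {
    "image_file": "example.jpg",
    "joints": {
        "head": (120.0, 40.0),
        "r_knee": (110.0, 210.0),
        "l_knee": (135.0, 212.0),
    },
}

def bounding_box(joints):
    """Smallest (x_min, y_min, x_max, y_max) box containing all joints."""
    xs = [x for x, _ in joints.values()]
    ys = [y for _, y in joints.values()]
    return min(xs), min(ys), max(xs), max(ys)

box = bounding_box(annotation["joints"])
print(box)
```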

When I actually ran the training to generate the model, my impression was that the network was light, that is, the training time was relatively short.
Furthermore, combining YOLOv3 and OpenPose does not make the network more complex, so the model itself is smaller than models trained with other algorithms.

Possibilities of pose estimation models seen from application examples

Now that we have a general understanding of what pose estimation models are, let's consider their possibilities through specific application examples.
As mentioned earlier, pose estimation has been applied in various fields. Specific uses include the following.

  • Detecting pedestrians in the field of autonomous driving
  • Movement analysis and scoring in sports and dance
  • Monitoring for suspicious movements of people for security purposes
  • Analyzing the characteristics of group behavior from the pose information of multiple people (flow line analysis, etc.)

These applications are expected to develop further, and in the future pose estimation may be used for behavior analysis and applied to robots.
It may also become easier to apply AI to more familiar problems. For example, AI could take over many of the tasks in which humans monitor video for long periods of time to ensure safety.
Furthermore, pose estimation models can detect joint points such as the head, arms, hips, and knees in detail, so they could also be used for product development and for passing on craftsmanship.
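
For uses such as movement analysis, one simple quantity that can be derived from detected joint points is the angle at a joint. The following is a minimal sketch, assuming 2D pixel coordinates for three joints; the `joint_angle` helper and the coordinates are made up for illustration.

```python
import math

def joint_angle(a, b, c):
    """Angle at point b (in degrees) formed by the segments b->a and b->c."""
    v1 = (a[0] - b[0], a[1] - b[1])
    v2 = (c[0] - b[0], c[1] - b[1])
    dot = v1[0] * v2[0] + v1[1] * v2[1]
    norm = math.hypot(*v1) * math.hypot(*v2)
    # Clamp to [-1, 1] to guard against rounding errors before acos.
    return math.degrees(math.acos(max(-1.0, min(1.0, dot / norm))))

# Made-up pixel coordinates for a hip, knee, and ankle.
hip, knee, ankle = (100.0, 100.0), (100.0, 150.0), (150.0, 150.0)
angle = joint_angle(hip, knee, ankle)
print(round(angle, 1))
```

This sketch assumes the three points are distinct; if two of them coincide, the segment length is zero and the division fails.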

Summary

This time, I introduced pose estimation models, which can be used in a wide range of applications depending on your ideas.
"Depending on your ideas"... that sounds easy, but it is actually the hard part.
To be able to think flexibly when it counts, I plan to exercise different parts of my brain every day: reading papers and case studies from a wide range of fields, listening to the opinions of various people, and even watching fantasy movies and losing myself in daydreams!

Sources of AI papers featured in this article / Reference list
Taiki Sekii. “Pose Proposal Networks”. http://taikisekii.com/PDF/Sekii_ECCV18.pdf

Macnica's AI Research & Innovation Hub (ARIH) conducts cutting-edge AI research, investigation, and implementation evaluation, provides knowledge on combining the most appropriate AI technologies, and works to guide companies to the best solutions to their problems.