Recommended for

  • Readers who want an introduction to object detection and how it differs from other computer vision techniques such as classification and segmentation.
  • Readers who want an overview of ARIH’s object detection use cases in visual surveillance, sports, and transportation.

Expected reading duration

10 minutes

Introduction

Welcome to the first part of our Object Detection blog series!
I’m Je of the AI Research and Innovation Hub (ARIH). In this blog, we’ll briefly discuss what object detection is and introduce some of our team’s object detection use cases in visual surveillance, sports, and transportation. This blog is part 1 of a four-part series. In the next parts, we’ll discuss the use cases and the common challenges we encountered in more detail, so please stay tuned!

Parts
1.  Introduction to Object Detection and ARIH’s Use Cases
2.  Deep Learning Approaches for Abandoned Object Detection
3.  AI in Soccer: Smart Analytics from Gameplay Videos
4.  Common challenges and possible solutions in Object Detection

Object Detection and other visual recognition techniques

Object detection is a computer vision technique that aims not only to recognize objects in images, but also to locate them using boxes (called bounding boxes). It’s one of the visual recognition problems in computer vision, and is often distinguished from the following related tasks: image classification, semantic segmentation, and instance segmentation.

To better explain the differences, let’s take a look at the image below:

Source: Recent Advances in Deep Learning for Object Detection
Caption: Figure 1: Comparison of different visual recognition tasks in computer vision.
https://arxiv.org/pdf/1908.03673.pdf

In the image above, we can see a single image processed in four different ways with increasing complexity. Image classification (a) simply outputs a single class or category for the whole image (cow). Object detection (b) improves on this by recognizing multiple objects at once and localizing each of them with a box (cows + box locations). Semantic segmentation (c) predicts the category label at the pixel level, but doesn’t differentiate between individual objects (cow + pixel locations). Lastly, instance segmentation (d) improves on semantic segmentation by also differentiating between individual objects (separate cows + pixel locations). As you may have noticed, the amount of information gained increases from one approach to the next (a to d).
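To make the difference between the four outputs more concrete, here is a minimal sketch in Python. It is not taken from the cited paper: the `Detection` class, the toy 2x4 "image", and the `count_instances` helper are all hypothetical and only illustrate what kind of data each task returns.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Detection:
    """One detected object (hypothetical container, for illustration only)."""
    label: str                      # predicted class, e.g. "cow"
    box: Tuple[int, int, int, int]  # (x_min, y_min, x_max, y_max) in pixels


# Toy 2x4 "image" that contains two cows side by side.

# (a) Image classification: a single label for the whole image.
classification: str = "cow"

# (b) Object detection: one label and one bounding box per object.
detections: List[Detection] = [
    Detection("cow", (0, 0, 2, 2)),
    Detection("cow", (2, 0, 4, 2)),
]

# (c) Semantic segmentation: one class label per pixel.
#     Both cows share the same label, so they cannot be told apart.
semantic_mask = [
    ["cow", "cow", "cow", "cow"],
    ["cow", "cow", "cow", "cow"],
]

# (d) Instance segmentation: each pixel also carries an instance id
#     (0 would mean background), so the two cows are separated.
instance_mask = [
    [1, 1, 2, 2],
    [1, 1, 2, 2],
]


def count_instances(mask: List[List[int]]) -> int:
    """Count distinct non-background instance ids in an instance mask."""
    return len({pixel for row in mask for pixel in row if pixel != 0})


semantic_labels = {pixel for row in semantic_mask for pixel in row}

print(len(detections))                 # objects found by detection
print(count_instances(instance_mask))  # objects found by instance segmentation
print(semantic_labels)                 # semantic segmentation only knows the class
```

In this toy case, both (b) and (d) can tell us that there are two cows, while (a) and (c) only tell us that "cow" is present.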

Now, you may be wondering, why are we writing an “Object Detection blog series” and not “Instance Segmentation blog series”? It seems to give more useful information, doesn’t it?

We made this decision primarily to meet our goals with minimal development complexity. In the use cases we’ve had (as you will learn later on), knowing the location of the objects in the images is enough to solve the problems. Moreover, instance segmentation techniques are more demanding than object detection techniques in terms of training, processing, and dataset requirements; for example, they need pixel-level mask annotations rather than just bounding boxes.

I hope the explanation above clarifies the different visual recognition tasks and justifies our decision to apply object detection techniques to our use cases. If you want to learn more about how object detection models work, along with some of the recent deep learning models, please check this blog (in Japanese) here.

Object Detection Use Cases in ARIH

The following are three real-world problems that our team has worked on using object detection.
With object detection as the main technique, we can build our own logic or systems on top of it to solve complex problems, as we will introduce below.

Visual Surveillance: Potential Threat Detection

Visual surveillance is one of the important applications of computer vision techniques.
As the use of surveillance cameras increases, the demand for automatic video analysis methods also increases.

One use case of visual surveillance is abandoned luggage detection for potential threat detection. Object detection can be used to detect abandoned luggage in crowded public spaces like train stations. With this, automatic warnings can be issued when a piece of luggage has been left unattended for some period.

Source: Real-Time Deep Learning Method for Abandoned Luggage Detection in Video
Caption: Figure 3. Examples of abandoned luggage items detected by our approach based on SOD and CCNN trained with generated samples.
https://arxiv.org/pdf/1803.01160.pdf

Sports: Practice Video Gameplay Information Extraction

Sports organizations have begun to realize that there is a lot of potential in using data mining in sports. Applications that can be developed using sports-related data include, but are not limited to, performance evaluation, training and strategic planning, and pattern prediction*.

One practical application of object detection in sports is the automatic extraction of gameplay information or logs, i.e., player and ball positions during soccer games. Normally, companies focus on generating logs during matches, which takes considerable time and manpower. Automating this process and applying it to practice games could help in formulating strategies that maximize each player’s potential in preparation for an actual match.

*Source: Computational Intelligence in Sports: A Systematic Literature Review, 2018, https://arxiv.org/ftp/arxiv/papers/1810/1810.12850.pdf

Source: Self-Supervised Small Soccer Player Detection and Tracking
Caption: Figure 4: Visual performances of the proposed tracker on two challenging sequences (1 second apart frames).
https://arxiv.org/pdf/2011.10336.pdf
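As a rough idea of what "extracting gameplay logs" from detections could look like, here is a minimal Python sketch. It assumes that a detector (not shown) returns, for each video frame, a list of `(label, box)` pairs with boxes in `(x_min, y_min, x_max, y_max)` pixel format. The function names, the log columns, and the use of the box center as the position are our own illustrative choices, not the approach of the cited paper.

```python
import csv
import io
from typing import Dict, List, Tuple

Box = Tuple[float, float, float, float]  # (x_min, y_min, x_max, y_max)
FrameDetections = List[Tuple[str, Box]]  # [(label, box), ...] for one frame

LOG_FIELDS = ["frame", "time_s", "label", "x", "y"]


def detections_to_log(frames: List[FrameDetections], fps: float) -> List[Dict]:
    """Turn per-frame detections into flat log rows, one row per object."""
    rows = []
    for frame_index, detections in enumerate(frames):
        for label, (x_min, y_min, x_max, y_max) in detections:
            rows.append({
                "frame": frame_index,
                "time_s": frame_index / fps,  # timestamp from frame index
                "label": label,
                "x": (x_min + x_max) / 2,     # box center as the position
                "y": (y_min + y_max) / 2,
            })
    return rows


def log_to_csv(rows: List[Dict]) -> str:
    """Serialize log rows as CSV text."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=LOG_FIELDS)
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


# Toy example: two frames of a video recorded at 2 frames per second.
example_frames = [
    [("player", (0, 0, 10, 20)), ("ball", (4, 4, 6, 6))],
    [("ball", (6, 4, 8, 6))],
]
log_rows = detections_to_log(example_frames, fps=2.0)
print(log_to_csv(log_rows))
```

In practice, pixel positions would usually be mapped to field coordinates and detections linked to individual players by a tracker; both steps are left out here.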

Transportation: Vehicle Traffic Counter

A traffic monitoring system is a critical part of Intelligent Transport Systems (ITS).
With a traffic monitoring system in place, traffic data can be collected and analyzed to help plan roadway systems, improve safety, and establish future transportation plans*.

Automatic identification and counting of vehicles from video feeds is an important field in traffic monitoring. Developing a robust object detection model that can handle different types of vehicles remains a challenge. However, such a model can significantly reduce manual labor and be a critical help in formulating policies and strategies.

*Source: Intelligent Traffic Monitoring Systems for Vehicle Classification: A Survey, 2020, https://arxiv.org/pdf/1910.04656.pdf

Source: Vision-based vehicle detection and counting system using deep learning in highway scenes
Caption: Fig. 12 Trajectory of the vehicle and detection line
https://etrr.springeropen.com/articles/10.1186/s12544-019-0390-4
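One common way to count vehicles from detections is to check when each vehicle’s trajectory crosses a virtual detection line, as the figure caption above suggests. The Python sketch below illustrates the idea under simplifying assumptions and should not be read as the cited paper’s implementation: it assumes a tracker (not shown) has already linked detections into per-vehicle trajectories of box centers, and that the detection line is horizontal at `line_y`. The function name and data layout are hypothetical.

```python
from typing import Dict, List, Tuple

Point = Tuple[float, float]  # (x, y) center of a vehicle box, in pixels


def count_line_crossings(trajectories: Dict[int, List[Point]], line_y: float) -> int:
    """Count vehicles whose trajectory crosses the horizontal line y = line_y.

    Each vehicle (track id) is counted at most once, in either direction.
    """
    count = 0
    for points in trajectories.values():
        for (_, y_prev), (_, y_curr) in zip(points, points[1:]):
            # A crossing happens when two consecutive centers
            # lie on different sides of the detection line.
            if (y_prev < line_y) != (y_curr < line_y):
                count += 1
                break  # do not count the same vehicle twice
    return count


# Toy example with a detection line at y = 100.
example_trajectories = {
    1: [(50, 80), (52, 95), (55, 110)],     # moves down across the line
    2: [(200, 20), (201, 40)],              # never reaches the line
    3: [(120, 130), (121, 105), (119, 90)], # moves up across the line
}
vehicle_count = count_line_crossings(example_trajectories, line_y=100)
print(vehicle_count)
```

Counting per direction or per vehicle class would be a small extension: keep separate counters keyed by the crossing direction or by the detected label.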

Summary

In this blog, we discussed and differentiated the main visual recognition techniques and justified the use of object detection in our use cases. Moreover, we showed the usefulness of object detection techniques in different real-world applications by introducing our three use cases.

In the next two blogs, we’ll provide more detailed information on two of the presented use cases: visual surveillance and sports.

For more information about the use cases, please contact our team.

 

■ Sources of the content and papers introduced on this page / References

Xiongwei Wu, Doyen Sahoo, Steven C.H. Hoi, “Recent Advances in Deep Learning for Object Detection”, Figure 1: Comparison of different visual recognition tasks in computer vision., arXiv:1908.03673v1, https://arxiv.org/pdf/1908.03673.pdf

Sorina Smeureanu, Radu Tudor Ionescu, “Real-Time Deep Learning Method for Abandoned Luggage Detection in Video”, Figure 3. Examples of abandoned luggage items detected by our approach based on SOD and CCNN trained with generated samples., arXiv:1803.01160v3, https://arxiv.org/pdf/1803.01160.pdf

Computational Intelligence in Sports: A Systematic Literature Review, 2018, https://arxiv.org/ftp/arxiv/papers/1810/1810.12850.pdf

Samuel Hurault, Coloma Ballester, Gloria Haro, “Self-Supervised Small Soccer Player Detection and Tracking”, Figure 4: Visual performances of the proposed tracker on two challenging sequences (1 second apart frames)., arXiv:2011.10336v1, https://arxiv.org/pdf/2011.10336.pdf

Intelligent Traffic Monitoring Systems for Vehicle Classification: A Survey, 2020, https://arxiv.org/pdf/1910.04656.pdf

Huansheng Song, Haoxiang Liang, Huaiyu Li, Zhe Dai, Xu Yun, “Vision-based vehicle detection and counting system using deep learning in highway scenes”, Fig. 12 Trajectory of the vehicle and detection line, https://etrr.springeropen.com/articles/10.1186/s12544-019-0390-4

 

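At Macnica’s ARIH (AI Research & Innovation Hub), we evaluate cutting-edge AI through research, investigation, and implementation, and then provide insights that combine the most suitable AI technologies to guide companies to the best solutions for their business challenges.
Please see below for details.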

Related articles

*Tech Blog: AI Women’s Club*
Deep Learning Approaches for Abandoned Object Detection | Object Detective Blog Series (Part 2)

*Tech Blog: AI Women’s Club*
AI in Sports: Soccer Analytics from Gameplay Videos | Object Detective Blog Series (Part 3)

*Tech Blog: AI Women’s Club*
Common challenges and possible solutions in Object Detection | Object Detective Blog Series (Part 4)