
Use Case of Generative AI Running on the Edge: Image Understanding and Conversation Support

The need for edge-based generative AI is here

The use of generative AI has expanded rapidly in recent years, and companies are increasingly adopting it to improve business efficiency and make better use of their knowledge.

In fact, approximately 87% of companies worldwide are "introducing or piloting generative AI," and over 60% of companies have positioned generative AI as "the top priority investment area for the future" (*).

*Source: Bain & Company, "Generative AI virtually ubiquitous in global business as the technology spreads at a near-unprecedented rate" (Bain & Company proprietary survey)

 

On the other hand, most mainstream generative AI services today run in the cloud.

As a result, several issues remain: dependence on network connectivity, the need to send data off-site, and concerns about latency and security.

 

In the manufacturing, infrastructure, and energy sectors in particular, there are many situations where stable communications cannot be guaranteed or confidential information cannot leave the site, which makes it difficult to take full advantage of cloud-based AI.

Furthermore, there are limits to how much humans can continue to configure and troubleshoot increasingly complex equipment.

Given this background, generative AI that operates on the edge is attracting attention.

  

In this article, we introduce a demo video showing how generative AI can interpret camera footage in a local environment and provide on-site support through conversation.

Case study: Edge generative AI that understands images and provides conversational support

Work sites that handle complex equipment depend on a huge amount of information, including setup, operation, troubleshooting, and recovery procedures.

In particular, when a problem occurs, it is necessary to refer to this information immediately and make accurate decisions.

  

In this demo, a generative AI running on an edge device analyzes camera footage in real time and explains the equipment's condition and any possible abnormalities in natural language.

Users can grasp the situation on-site simply by talking to the device.

The system goes beyond simple object detection: it understands the context of the video, interprets what the scene means, and responds in a conversational format.

 

A demo of video analysis and conversation response can be found here (video in English).

The following processing is performed within the edge device based on the captured video:

  

① Detection of objects and conditions using AI

② Situation estimation through scene analysis

③ Generation of natural language responses using generative AI

④ All processing is completed within the local environment
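The four steps above can be sketched roughly as follows. This is only an illustrative outline, not the code used in the demo: every function and field name (`detect_objects`, `estimate_situation`, `generate_response`, `answer_on_device`, the `frame` dictionary) is hypothetical, and the detection and language models are replaced by simple stubs so the flow can be run without any hardware.

```python
from dataclasses import dataclass


@dataclass
class Detection:
    label: str         # e.g. "valve", "warning_lamp"
    state: str         # e.g. "open", "blinking"
    confidence: float  # 0.0 to 1.0


def detect_objects(frame):
    """Step 1 (stub): a real system would run a vision model on the frame."""
    return [Detection(**d) for d in frame["detections"]]


def estimate_situation(detections, min_confidence=0.5):
    """Step 2 (stub): turn confident detections into a short scene description."""
    kept = [d for d in detections if d.confidence >= min_confidence]
    if not kept:
        return "nothing notable detected"
    return "; ".join(f"{d.label} is {d.state}" for d in kept)


def generate_response(situation, question):
    """Step 3 (stub): a real system would prompt a locally hosted language model."""
    return f"Q: {question}\nObserved: {situation}"


def answer_on_device(frame, question):
    """Steps 1 to 3 chained; nothing here calls a network API (step 4)."""
    detections = detect_objects(frame)
    situation = estimate_situation(detections)
    return generate_response(situation, question)


# Hypothetical frame: the stub expects detections to be precomputed.
frame = {"detections": [
    {"label": "warning_lamp", "state": "blinking", "confidence": 0.91},
    {"label": "valve", "state": "open", "confidence": 0.32},
]}
print(answer_on_device(frame, "What is the machine status?"))
```

In this sketch the low-confidence valve detection is dropped in step 2, so only the warning lamp reaches the response. In a real deployment each stub would be replaced by a model running on the edge device.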

  

This allows field personnel to obtain the information they need on the spot without having to search through extensive manuals.

This will improve the speed of decision-making when a problem occurs, reduce variations in response, and standardize work quality.

 

Furthermore, since processing is completed within the local environment without the need for a cloud connection, confidential data is not sent externally, allowing for safe and stable use of AI.

It also offers a practical way to address workplace issues such as labor shortages and the transfer of skills from experienced staff.

Expanding expertise with RAG

This solution uses RAG (Retrieval-Augmented Generation).

RAG is a technique in which a pre-trained generative AI model retrieves relevant passages from external sources, such as manuals and technical documents, at query time and uses them to generate its answer.

  

This makes it possible to extend the system's domain expertise flexibly and quickly, without retraining the model.

You can utilize existing document assets and retrieve the information you need on the spot.

This will speed up repair response, reduce the training burden, and standardize know-how.
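As a minimal sketch of the retrieve-then-generate idea, the example below picks the manual passage that shares the most words with the question and places it in a prompt. It is not the retrieval method used in the demo: the function names (`tokenize`, `retrieve`, `build_prompt`), the sample manual entries, and the error codes are all invented for illustration, and practical RAG systems typically use embedding-based search rather than word overlap.

```python
import re


def tokenize(text):
    """Lowercase the text and return its set of alphanumeric words."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def retrieve(question, documents, top_k=1):
    """Rank documents by how many words they share with the question."""
    q = tokenize(question)
    ranked = sorted(documents, key=lambda doc: len(q & tokenize(doc)), reverse=True)
    return ranked[:top_k]


def build_prompt(question, passages):
    """Combine retrieved passages and the question into one prompt string."""
    context = "\n".join(f"- {p}" for p in passages)
    return f"Answer using only this context:\n{context}\n\nQuestion: {question}"


# Hypothetical manual excerpts; in practice these come from existing documents.
manual = [
    "Error E12: coolant pressure low. Check the pump filter and refill coolant.",
    "Error E40: door interlock open. Close the safety door before restarting.",
    "Daily startup: power on the controller, then home all axes.",
]

question = "What should I do about error E12?"
passages = retrieve(question, manual)
prompt = build_prompt(question, passages)
print(prompt)  # this prompt would then be passed to the local generative model
```

Adding a new entry to `manual` changes what can be retrieved without touching the model, which is the property that lets expertise be extended without retraining.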

Summary: Generative AI goes from a cloud-based tool to a field assistant

As the use of generative AI expands, its true value lies in its ability to be used immediately and safely in the field.

The edge generative AI introduced in this article directly supports on-site decision-making and work by understanding video, interpreting what a situation means, and communicating in natural language.

Furthermore, by utilizing RAG, you can incorporate existing manuals and technical documents, allowing you to flexibly expand your expertise according to your application.

  

Generative AI that runs entirely within a local environment, and is therefore far less constrained by connectivity and data-security requirements, is one practical form of AI adoption in the manufacturing and infrastructure fields.

Generative AI that runs on the edge is not simply a tool for improving business efficiency, but is evolving into a foundation that improves on-site knowledge and judgment.

  

The SiMa.ai MLSoC™ used in this demonstration is a next-generation chip optimized for running AI in the field.

Despite its compact size and energy-saving design, it delivers highly efficient inference of up to 50 TOPS and supports high-speed production lines with real-time processing at 120 FPS. The flexible development environment enabled by the built-in Arm Cortex-A65 is another major advantage.
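To put these figures in perspective, the short calculation below converts 120 FPS into a per-frame time budget and, assuming the 50 TOPS peak could be fully sustained (which real workloads generally do not achieve), an upper bound on operations available per frame. The variable names are illustrative only.

```python
fps = 120        # frame rate quoted above
peak_tops = 50   # peak throughput quoted above, in tera-operations per second

# Time available to process one frame, in milliseconds.
frame_budget_ms = 1000 / fps

# Theoretical upper bound on operations per frame at peak throughput.
peak_ops_per_frame = peak_tops * 1e12 / fps

print(f"{frame_budget_ms:.2f} ms per frame")                          # 8.33 ms per frame
print(f"{peak_ops_per_frame:.2e} operations per frame at peak")       # 4.17e+11
```

In other words, at 120 FPS every stage that must run per frame has to fit within roughly 8.3 ms.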

 

If you are considering introducing AI, you can start small, as in this case study, and verify the improvement on your own production line.

We hope you will find this article useful as your first step in utilizing AI.

Inquiry

Please feel free to contact us with any questions about our products, technical inquiries, sample requests, or estimates.
