It has become difficult to meet OEM requirements with conventional ICs such as SoCs and CPUs.

Voice operation of in-vehicle products such as audio, air conditioning, and car navigation is already being realized in luxury cars.
Being able to operate devices in the car without taking your hands off the steering wheel is very convenient. We expect demand to spread beyond luxury cars in the future.

When considering how to realize voice operation, the usual approach is to rely on conventional ICs such as SoCs and microcontrollers. SoCs dedicated to voice control are also on the market.

However, did you know that distributing part of the processing to an “FPGA” can be more advantageous than relying on conventional ICs alone?

In this article, we introduce the situations in which an “FPGA” is effective for realizing voice operation in in-vehicle products.
Once you have seen how effective an FPGA can be, please consider “introducing an FPGA” as one of your options.

Performance required for voice operation of in-vehicle products and the effectiveness of FPGAs

Flow of voice operation and where an FPGA is effective

First, we will introduce the flow for realizing voice operation and the process in which an “FPGA” is effective.
To realize voice operation, the following four processes are required.

1. Convert the human voice into an audio signal using a microphone (input)

2. Filter the audio signal so that it contains only the information necessary for speech recognition (preprocessing)

3. Based on the filtered audio signal, determine the in-vehicle product to be operated and the output value for it (analysis)

4. Send the output value to the in-vehicle product to be operated and execute the voice operation (output)

Of processes 1 to 4, "2. Preprocessing" and "3. Analysis" are usually performed by the IC responsible for voice operation.
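The four processes above can be pictured as a simple pipeline. The following Python sketch is illustrative only: the noise-gate filter, the command table, and the `recognize` and `send` callables are all hypothetical stand-ins for real signal processing, speech recognition, and in-vehicle communication.

```python
from typing import Callable, Dict, List, Tuple

Samples = List[float]

# Hypothetical command table: recognized phrase -> (target product, output value).
COMMAND_TABLE: Dict[str, Tuple[str, int]] = {
    "volume up": ("audio", 1),
    "cooler": ("air_conditioner", -1),
}


def preprocess(samples: Samples, gate: float = 0.05) -> Samples:
    """Process 2 (preprocessing): zero out samples below a simple noise gate."""
    return [s if abs(s) >= gate else 0.0 for s in samples]


def analyze(samples: Samples, recognize: Callable[[Samples], str]) -> Tuple[str, int]:
    """Process 3 (analysis): map the recognized phrase to a product and a value."""
    phrase = recognize(samples)
    return COMMAND_TABLE[phrase]


def run_voice_operation(
    mic_samples: Samples,                    # process 1 (input) has already happened
    recognize: Callable[[Samples], str],
    send: Callable[[str, int], None],
) -> Tuple[str, int]:
    filtered = preprocess(mic_samples)               # process 2
    product, value = analyze(filtered, recognize)    # process 3
    send(product, value)                             # process 4 (output)
    return product, value


# Usage with stand-in callables (assumed for illustration).
sent: List[Tuple[str, int]] = []
result = run_voice_operation(
    [0.01, 0.4, -0.3, 0.02],
    recognize=lambda samples: "volume up",
    send=lambda product, value: sent.append((product, value)),
)
```

In the division of labor discussed in this article, `preprocess` corresponds to the part that would move to the FPGA, while `analyze` would stay on the IC responsible for voice operation.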

However, in the special environment of a car, the "2. Preprocessing" stage has many performance requirements, and the IC responsible for voice operation may not have sufficient specifications on its own.

Of course, this problem can be solved by introducing a "high-spec IC", but the resulting increase in cost and power consumption makes that difficult.

Therefore, what Macnica recommends is distributing the "2. Preprocessing" stage to an “FPGA”. Many of the capabilities required for "2. Preprocessing" are things that FPGAs are good at.
This could potentially reduce cost and power consumption compared with handling everything on the IC responsible for voice operation alone.

Four performance requirements for voice operation preprocessing and the effectiveness of FPGAs

Next, we will introduce the four main performance requirements of "2. Preprocessing".

Required performance [1] Support for 4 or more microphones

To realize voice operation in the car, we expect each seat to have its own microphone so that voices can be picked up from every seat. Support for four or more microphones, matching the number of seats, will therefore be required.

As the number of sound-collecting microphones increases, the number of interfaces required increases by the same amount. However, conventional ICs only needed to support stereo, so in most cases they have no more than two interfaces. The shortage of interfaces therefore becomes a "challenge".

Also, as the number of sound-collecting microphones increases, the audio signals sent from each microphone must be processed in parallel, but conventional ICs often do not support multiple audio signal inputs.
Even where they do, we think the increase in processing time and latency is likely to become a "problem".

With an "FPGA", on the other hand, the circuit data can be rewritten later, so the number of interfaces can be increased to match the number of microphones.

Also, because an “FPGA” implements processing as hardware circuits, the signals from all microphones can be processed in parallel at high speed.
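As a rough illustration of the two points above, the following Python sketch splits a multi-microphone stream into per-microphone channels and compares the latency of processing the channels one after another with processing them all at once. The interleaved stream layout and the 2 ms per-channel figure are assumptions made for this example, not properties of any particular IC or FPGA.

```python
from typing import List


def deinterleave(stream: List[int], num_mics: int) -> List[List[int]]:
    """Split an interleaved stream (m0, m1, ..., m0, m1, ...) into one list per microphone."""
    if len(stream) % num_mics != 0:
        raise ValueError("stream length must be a multiple of num_mics")
    return [stream[ch::num_mics] for ch in range(num_mics)]


def sequential_latency_ms(num_mics: int, per_channel_ms: float) -> float:
    """Channels processed one after another: latency grows with the microphone count."""
    return num_mics * per_channel_ms


def parallel_latency_ms(num_mics: int, per_channel_ms: float) -> float:
    """Channels processed simultaneously, assuming one processing unit per channel."""
    return per_channel_ms


# Two frames from four microphones (made-up sample values).
stream = [10, 20, 30, 40, 11, 21, 31, 41]
channels = deinterleave(stream, 4)

# Assumed 2 ms of preprocessing per channel.
seq = sequential_latency_ms(4, 2.0)   # 8.0
par = parallel_latency_ms(4, 2.0)     # 2.0
```

Under these assumptions, the sequential figure scales with the number of seats while the parallel figure does not, which is the property the article attributes to hardware parallelism.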

Required performance [2] Support for beamforming

“Beamforming” is a technique for picking up only the sound that comes from a specific direction.
By identifying the microphone that picked up the voice first, the system instantly determines which seat's microphone should be used. Because the unneeded microphones must be turned off so that processing concentrates on the sound from a single seat's microphone, beamforming is an indispensable technology for realizing voice operation.
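The seat selection described above can be sketched as follows. This is a minimal Python illustration that assumes a fixed amplitude threshold marks the start of speech; the function names, the threshold value, and the sample data are hypothetical.

```python
from typing import List, Optional


def onset_index(samples: List[float], threshold: float) -> Optional[int]:
    """Return the index of the first sample whose magnitude reaches the threshold."""
    for i, s in enumerate(samples):
        if abs(s) >= threshold:
            return i
    return None


def select_seat(mic_signals: List[List[float]], threshold: float = 0.2) -> Optional[int]:
    """Return the index of the microphone that picked up the voice first, or None if silent."""
    best_seat: Optional[int] = None
    best_onset: Optional[int] = None
    for seat, samples in enumerate(mic_signals):
        onset = onset_index(samples, threshold)
        if onset is not None and (best_onset is None or onset < best_onset):
            best_seat, best_onset = seat, onset
    return best_seat


# Four seats; the voice reaches microphone 1 first (made-up data).
mic_signals = [
    [0.0, 0.0, 0.0, 0.5, 0.4],
    [0.0, 0.6, 0.5, 0.3, 0.1],
    [0.0, 0.0, 0.3, 0.2, 0.1],
    [0.0, 0.0, 0.0, 0.0, 0.25],
]
active_seat = select_seat(mic_signals)
```

Every channel has to be examined for the same time window, which is why this step benefits from handling all microphone inputs in parallel.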

To achieve beamforming, the multiple audio signals sent from the microphones must be handled quickly and in real time. In other words, the IC must be capable of fast parallel processing.

However, as mentioned above, “conventional ICs” often do not support multiple audio signal inputs, and even when they do, the increase in processing time and latency is likely to become a “problem”.

An "FPGA", on the other hand, is a hardware device that excels at parallel processing and can respond quickly to multiple voices in real time.
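For reference, one common textbook way to favor sound from a given direction is delay-and-sum beamforming: each channel is shifted by its arrival delay so the wanted sound lines up, and the channels are then averaged. The article does not state which algorithm is used, so the Python sketch below is only an illustration, and it assumes the per-microphone delays are already known as whole numbers of samples.

```python
from typing import List


def delay_and_sum(mic_signals: List[List[float]], delays: List[int]) -> List[float]:
    """Align each channel by its arrival delay (in samples) and average the aligned channels."""
    if len(mic_signals) != len(delays):
        raise ValueError("one delay is required per microphone")
    # Only keep the time range for which every shifted channel still has samples.
    usable = min(len(sig) - d for sig, d in zip(mic_signals, delays))
    n = len(mic_signals)
    return [
        sum(sig[d + t] for sig, d in zip(mic_signals, delays)) / n
        for t in range(usable)
    ]


# The same pulse reaches three microphones 0, 1 and 2 samples late (made-up data).
pulse = [0.0, 1.0, 0.0, -1.0, 0.0]
mic_signals = [
    pulse + [0.0, 0.0],
    [0.0] + pulse + [0.0],
    [0.0, 0.0] + pulse,
]

steered = delay_and_sum(mic_signals, [0, 1, 2])      # delays match: pulse is preserved
unsteered = delay_and_sum(mic_signals, [0, 0, 0])    # delays ignored: pulse is smeared
```

With matching delays the pulse comes through at full amplitude, while with mismatched delays its peak is reduced. All channels must be read and combined for every output sample, which again calls for fast parallel processing.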

Required performance [3] Support for noise canceling

The inside of a car is an environment with a lot of noise, including road noise. We expect that a "noise canceling function" will be required to enable accurate voice operation.
Noise canceling here is a technology that "quiets the inside of the car" by picking up and analyzing the cabin noise with a microphone and playing back, from the speakers, a sound that cancels out the noise.
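The idea of generating a sound that cancels the noise can be sketched with a one-tap LMS adaptive filter. This Python example is a deliberate simplification made for illustration: it assumes the noise at the listening position is just a scaled copy of the noise picked up by the reference microphone, and it ignores the acoustic path between speaker and microphone that a practical system would have to model. The step size `mu` and the test signal are arbitrary choices.

```python
import math
from typing import List, Tuple


def lms_cancel(
    reference: List[float], primary: List[float], mu: float = 0.1
) -> Tuple[List[float], float]:
    """One-tap LMS: learn a gain w so that -w * reference cancels the noise in primary."""
    w = 0.0
    residual: List[float] = []
    for x, d in zip(reference, primary):
        anti_noise = -w * x      # sound that would be played back from the speaker
        e = d + anti_noise       # noise that remains after cancellation
        w += mu * e * x          # LMS update of the gain
        residual.append(e)
    return residual, w


# Reference microphone hears a tone; the listener hears it at half amplitude (made-up data).
reference = [math.sin(2 * math.pi * n / 16) for n in range(400)]
primary = [0.5 * x for x in reference]

residual, weight = lms_cancel(reference, primary)
tail_energy = sum(e * e for e in residual[-100:])
```

In this toy setup the learned gain settles near 0.5 and the remaining noise decays toward zero. The update runs once per sample, so any delay in the loop directly limits how well the noise can be canceled.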

To achieve this kind of noise cancellation, you need a noise canceling microphone and speakers. In addition, because the sound that cancels out the noise must be generated instantly, real-time audio processing is paramount. If you use an IC dedicated to noise cancellation, real-time performance will not be a problem.

However, an "FPGA" can provide "other functions" such as beamforming while also performing noise cancellation.
Furthermore, since it supports various interfaces, it can be connected to a wide variety of microphones and speakers.

Required performance [4] Flexible response to device changes

We hope the advantages of an “FPGA” in “preprocessing” are now clear.
In addition to the above, another major advantage is that the same FPGA can be reused even if the type of input microphone for voice operation or the in-vehicle product at the output destination changes.

As explained so far, an "FPGA" has the feature that its "circuit data can be rewritten even later". Because it can flexibly accommodate the number and types of interfaces, changes can be handled by rewriting the circuit data, whether the specifications change during development or the number and types of mounted microphones and in-vehicle products change with a model change.

In addition to reducing the "man-hours required for development", reusing the same FPGA reduces the "frequency of board development/modification", so cost savings can be expected in the long run.

How to introduce an FPGA

We hope it is now clear how effective an “FPGA” is for realizing “voice operation” in in-vehicle products.

However, even if you are interested in FPGAs, many people may feel that "designing and developing an FPGA" is a high hurdle when it comes to actually implementing one.
Therefore, to keep that hurdle as low as possible, Macnica supports the introduction of FPGAs through steps [1] to [5] below.

[1] We ask about the functions you want to realize and the issues with your existing system

[2] Macnica presents a concrete configuration plan realized with an FPGA

[3] We propose additional benefits that make use of the advantages of the FPGA

[4] We confirm the FPGA model number and present an approximate price

[5] We present the FPGA design and development cost and period

All you need to prepare is the text and diagrams summarizing your requirements in [1]. It can be as simple as a PowerPoint presentation.

This lets you proceed with FPGA design and development while minimizing man-hours. We also provide after-sales support after delivery of the FPGA, so you can proceed with confidence.

There is also an article that introduces Macnica support in detail, so please refer to it as well.

If you are wondering whether introducing an FPGA is really realistic, there are also articles that introduce the performance of current FPGAs, so please refer to them as well.

Summary

In this article, we introduced how “FPGA” is effective in “realizing voice operation”.

Because the inside of a car is a special environment, various problems can arise if voice operation is realized using only conventional ICs.
In many cases, it is advantageous to distribute the performance required for preprocessing to an FPGA.
Specifically, we introduced the "problems with conventional ICs" and the "effectiveness of FPGAs" for the four performance requirements below.

Required performance [1] Support for 4 or more microphones

Required performance [2] Support for beamforming

Required performance [3] Support for noise canceling

Required performance [4] Flexible response to device changes

In every case, the required performance can be achieved by taking advantage of two features of an "FPGA": the flexibility of being able to rewrite circuit data later, and the high-speed parallel processing of a hardware device.

We anticipate that the need for voice operation in in-vehicle products will increase in the future.
When you work on realizing voice operation, please consider “introducing an FPGA” as one of your options.

Inquiry

If you have any questions regarding this article, please contact us below.