Stepping up to Machine Learning (ML)

*This article is a Japanese translation by Macnica of a blog written by an engineer at DSP Concepts.

As the DSP Concepts Machine Learning enablement program continues to expand, we wanted to clarify some questions that have emerged as our customer and partner community better understands the purpose, capabilities, and benefits of machine learning (ML) for embedded audio DSP. We sat down with Josh Morris, ML Engineering Manager, to answer some questions.

Machine learning concept

--ML can be loosely described as the branch of computer science concerned with algorithms that recognize patterns in input data, using models that improve automatically with training experience. But "experience" carries many nuances in this context. How would you refine this description?

JM: Supervised learning is probably the most common form of ML in production today. Roughly speaking, models trained with supervised learning are learning from experience. Successful supervised learning requires two things: a loss function and a dataset. The dataset provides a mapping between input features and the desired ground-truth output. The loss function is a differentiable function that measures how closely the model's predicted output matches that ground truth. Backpropagation lets us update the model's weights based on how accurate the predictions are. By showing the model the dataset multiple times, updating the weights each time, the model gradually learns from experience.
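The loop Josh describes can be sketched in a few lines. This is a minimal, illustrative example only: a toy one-weight linear model trained with a mean-squared-error loss and plain gradient descent, with the gradient computed by hand standing in for backpropagation. None of the names (`dataset`, `w`, `lr`) come from any particular framework.

```python
# Dataset: (input feature, ground-truth output) pairs. The true
# relationship here is y = 3 * x, which the model must learn.
dataset = [(1.0, 3.0), (2.0, 6.0), (3.0, 9.0), (4.0, 12.0)]

w = 0.0    # model weight, arbitrary starting point
lr = 0.01  # learning rate

# Each pass over the dataset is one epoch of "experience".
for epoch in range(200):
    for x, y_true in dataset:
        y_pred = w * x
        # Per-sample MSE loss is (y_pred - y_true)^2; its derivative
        # with respect to w is 2 * (y_pred - y_true) * x. This
        # hand-derived gradient plays the role of backpropagation.
        grad = 2.0 * (y_pred - y_true) * x
        w -= lr * grad  # weight update: learn from the error

print(round(w, 3))  # converges toward the true slope, 3.0
```

The same structure — forward pass, loss, gradient, weight update, repeated over the dataset — scales up to deep networks, where backpropagation automates the gradient computation.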

--What is the main difference between the two concepts of ML algorithms and ML models?

JM: Algorithm is a pretty broad term that includes a lot of things other than machine learning. When we talk about models, we're usually talking about trained instances of a learning algorithm, like a neural network. A model is not itself an algorithm, but rather the artifact of a training process that is an algorithm. You can think of it as a file format that stores the results of a training process.

--What are the most important considerations when it comes to training machine learning models?

JM: Data and process. The common frameworks already provide well-defined learning algorithms, so when it comes to model quality, organizational and data practices are the real differentiators. You need to really understand the data you're using and have practices that ensure reproducibility.

--The quality and quantity of training data is crucial to the final performance of a machine learning model. Can you tell us a bit more about features and labeling?

JM: Yes, quality matters more than quantity, though more high-quality data always helps.

Labeling can take many forms. I think of it as a mapping of inputs and outputs to the task we want our model to solve. For classification, the labels would be "dog" or "cat". For a denoising algorithm, the target would be a clean audio recording.
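Both kinds of labeling amount to pairing inputs with target outputs. A hedged illustration, with entirely made-up data: for classification the target is a category label, while for denoising the target is the clean signal itself.

```python
# Classification: each example maps input features to a class label.
# Feature names and values here are invented for illustration.
classification_data = [
    ({"bark_energy": 0.9, "meow_energy": 0.1}, "dog"),
    ({"bark_energy": 0.2, "meow_energy": 0.8}, "cat"),
]

# Denoising: each example maps a noisy recording to its clean target.
# (Short lists stand in for real audio sample buffers.)
denoising_data = [
    ([0.5, -0.2, 0.7], [0.4, -0.1, 0.6]),  # (noisy, clean)
]

labels = [label for _, label in classification_data]
print(labels)  # the label set the classifier must learn to predict
```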

Feature engineering is about taking input data and transforming it into a form suitable for your model. Many audio applications take time-domain audio and transform it into the frequency domain via FFT. By transforming audio into the frequency domain, the data has an inherent 2-dimensional structure with frequency on the Y-axis and time on the X-axis. The convolutional layers of neural networks can take advantage of this structural information because they pass a 2-dimensional filter over the input data. We could skip this transformation, but at the cost of much larger models, as we would have to do more work to extract the relevant information. This is why feature engineering and domain expertise are still so important to machine learning.
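The time-to-frequency transformation above can be sketched as a simple spectrogram: frame the signal, take the magnitude of a DFT per frame, and you get a 2-D array with frequency bins on one axis and time frames on the other — exactly the structure convolutional layers exploit. To keep this dependency-free, a naive O(n²) DFT stands in for the FFT; real code would use an optimized FFT library.

```python
import cmath
import math

def dft_magnitudes(frame):
    """Magnitude of each DFT bin for one frame (naive O(n^2) DFT)."""
    n = len(frame)
    return [
        abs(sum(frame[t] * cmath.exp(-2j * math.pi * k * t / n)
                for t in range(n)))
        for k in range(n // 2 + 1)  # keep only non-negative frequencies
    ]

def spectrogram(signal, frame_size, hop):
    """Per-frame spectra: a (num_frames x (frame_size//2 + 1)) grid."""
    frames = [signal[i:i + frame_size]
              for i in range(0, len(signal) - frame_size + 1, hop)]
    return [dft_magnitudes(f) for f in frames]

# A pure test tone with a period of 8 samples, so its energy should
# land in frequency bin 1 of an 8-point DFT.
signal = [math.sin(2 * math.pi * t / 8) for t in range(64)]
spec = spectrogram(signal, frame_size=8, hop=4)
print(len(spec), len(spec[0]))  # time frames x frequency bins
```

The resulting grid is the 2-D time-frequency structure the answer describes: each row is a moment in time, each column a frequency band.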

--We can distinguish between human intuition and how machines learn patterns. What are the main differences to consider when applying ML to your tasks?

JM: Humans have a much deeper understanding of what they're doing. They can also learn new tasks much more quickly than current ML methods. I tend to think of ML models as correlation machines that make powerful mappings between inputs and outputs based on the data they were trained on. Models are generally not good at extrapolating or generalizing to data that differs from the data they were trained on.


--How do we know if machine learning will work well for a given task or problem?

JM: It's interesting. A lot of the time, the sanity check is whether a human could identify a pattern in the input data. Beyond that, much of the intuition comes from matching the right kind of model to the kind of data you have and the task you're trying to solve.

--What are some audio applications of machine learning?

JM: Recognition, transcription, and denoising are all common applications of machine learning in the speech domain.

--Finally, tell us about some audio-related tasks that DSP Concepts would like to approach using ML in the near future.

JM: Right now, we are very focused on the experience of audio application developers who use Audio Weaver as their development and prototyping platform. One of my team's goals is to reduce the time it takes to move models into production by leveraging Audio Weaver at key points in the ML lifecycle. We are excited to release the Audio Weaver ML Module Pack in January, which provides the support you need for feature extraction, model execution, and model tuning in our platform.
