In this blog post, I aim to do the following:

  1. Show that CNNs, although best known for image datasets, can also be used on time series data (and may be more practical than RNNs)
  2. Present a popular architecture for time series classification (univariate AND multivariate) called the Fully Convolutional Neural Network (FCN)


Time series data can be any sort of information collected successively in time. Since processes are often measured relative to time, this type of data exists in almost every domain. Examples include stock prices, industrial processes, electronic health records, human activities, sensor readings, and language. Because it is so ubiquitous, it makes practical sense to extract value from the time series data around us. Popular ways of doing so are time series analysis, forecasting, and classification.

When we first learn the basic concepts of Deep Learning, we are often given three architecture choices depending on the type of data we are handling:

1. Multi-layered Perceptron (MLP) for tabular data (often used as a baseline)
2. Convolutional Neural Network (CNN)-based models for spatial data (such as images)
3. Recurrent Neural Network (RNN)-based models for sequential data (such as time series)

To briefly explain where these conventions come from: CNNs use convolution operations that capture the spatial information available in images, while RNNs have a memory that can store the temporal information available in time series data. The MLP, on the other hand, is the classical neural network and is nowadays often used as a baseline.

Source: “Towards Better Analysis of Deep Convolutional Neural Networks”
Caption: Fig. 2. The typical architecture of a CNN.
Fig. 3. Illustration of convolution and max-pooling: (a) convolution; (b) max-pooling.

CNN Illustration: (left) Convolutional operation, (right) Typical architecture of CNN

Source: “Understanding LSTM Networks”
Caption: Recurrent Neural Networks have loops.
An unrolled recurrent neural network.

RNN Illustration: (left) Loops in RNN, (right) Unrolled RNN through time

CNN in time series data

As mentioned, CNN convolutions are popularly known to work on spatial, or 2D, data. What's less well known is that there are also convolutions for 1D data. This allows CNNs to be used on more general data types, including text and other sequential data such as time series. Instead of extracting spatial information, you use 1D convolutions to extract information along the time dimension. Pretty neat, right?

Conv1D: Convolving on time dimension
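To make the idea concrete, here is a minimal NumPy sketch of a 1D convolution sliding along the time axis. The function name `conv1d_valid`, the toy series, and the hand-picked kernel are all assumptions made for illustration; in a real CNN the kernel weights are learned, and a library layer would be used instead.

```python
import numpy as np

def conv1d_valid(series, kernel):
    """Slide `kernel` along the time axis of a 1D `series` (no padding).

    Like deep learning libraries, this computes a cross-correlation:
    the kernel is not flipped.
    """
    series = np.asarray(series, dtype=float)
    kernel = np.asarray(kernel, dtype=float)
    n_out = len(series) - len(kernel) + 1
    # One dot product per window position along the time axis.
    return np.array([np.dot(series[t:t + len(kernel)], kernel)
                     for t in range(n_out)])

# Toy univariate series with a sudden jump between t=3 and t=4.
series = [1.0, 1.0, 1.0, 1.0, 5.0, 5.0, 5.0, 5.0]
# Hand-picked difference kernel (illustrative, not learned).
kernel = [-1.0, 1.0]

response = conv1d_valid(series, kernel)
print(response)
```

The response is zero everywhere except at the window that covers the jump, which is the kind of local temporal pattern a learned 1D filter can pick up.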

Trivia: If there's Conv1D and Conv2D, there's also a Conv3D operation, which applies spatial convolutions over volumes! Conv3D is useful for sequences of images such as MRI scans or videos.

Now, why should we use a CNN and not an RNN? Here are some points:

  1. CNNs are computationally cheaper than RNNs: a convolution can process all time steps of a sequence in parallel, while an RNN processes them one after another. As such, an RNN can't parallelize across time because each step must wait for the previous computation.
  2. CNNs don't assume that the history is complete: unlike RNNs, CNNs learn patterns within a time window. If your data has gaps, CNNs can still be useful.
  3. In a way, CNNs can look forward: a standard RNN only learns from data before the time step it needs to predict. CNNs (trained on shuffled time windows) can see the data from a broader perspective.
  4. More active research on CNNs: there are some arguments that RNNs / LSTMs are becoming irrelevant. Whether that's true or not depends, I think, on how we look at it.
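The first point can be illustrated with a small NumPy sketch. It is only a toy comparison under assumed values: the scalar recurrent weights `w_in` and `w_rec` and the smoothing kernel are hypothetical, and no training is involved. The recurrent update has to run in time order, while the convolution windows can be evaluated in any order (and therefore in parallel).

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.normal(size=20)            # toy univariate series, 20 time steps

# Recurrent update: h[t] depends on h[t-1], so the loop must run in order.
w_in, w_rec = 0.5, 0.9             # hypothetical scalar weights
h = np.zeros(len(x))
prev = 0.0
for t in range(len(x)):
    prev = np.tanh(w_in * x[t] + w_rec * prev)
    h[t] = prev

# Convolution: each output only needs its own window of the input,
# so all windows can be computed at once with a single matrix product.
kernel = np.array([0.25, 0.5, 0.25])   # hypothetical kernel weights
k = len(kernel)
windows = np.stack([x[t:t + k] for t in range(len(x) - k + 1)])  # (18, 3)
conv_out = windows @ kernel

# Evaluating the windows in a shuffled order gives the same result,
# which is what makes the operation easy to parallelize.
order = rng.permutation(len(windows))
shuffled = np.empty(len(windows))
shuffled[order] = windows[order] @ kernel
print(np.allclose(conv_out, shuffled))
```

The same reordering trick is not possible for the recurrent loop, because every hidden state is an input to the next one.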

“So, we should ALWAYS use CNNs on time series data then?” The safe answer is no. There are times when you want your model to depend on a long history (as in language) or to handle inputs and outputs of varying sizes. In such cases, RNNs are more suitable.

As always in AI and ML, you can only be sure by consulting domain experts, referring to research papers or other learning resources, and running actual experiments. ¯\_(ツ)_/¯

Example architecture

Now, I’ll introduce an architecture that illustrates the use of convolutions for time series data. The following discussion is primarily based on a review paper on time series classification using deep learning techniques, titled “Deep learning for time series classification: a review”. In the paper, Fawaz et al. compared several time series classification architectures on both univariate and multivariate time series data. The image below shows the general framework used.

Source: “Deep learning for time series classification: a review”
Caption: Fig. 1: A unified deep learning framework for time series classification.

General Framework for time series classification

One of the architectures compared in the paper is the Fully Convolutional Neural Network (FCN). In the assessment, FCN not only performed well but also offered prediction explainability through Class Activation Maps (CAM). So, let’s start with FCN!

Source: “Deep learning for time series classification: a review”
Caption: Fig. 3: Fully Convolutional Neural Network architecture

Fully Convolutional Neural Network Architecture on time series classification

The FCN architecture, first introduced by Wang et al. (2017), is described in the image above. This architecture has the following main properties:

  1. It is a mainly convolutional network without local pooling layers. This means that the length of the time series is kept constant throughout the convolutions.
  2. Instead of a fully connected layer before the output, it uses Global Average Pooling (GAP). This allows us to highlight which parts of the input time series contributed to the class prediction (more on this later).

For those who understand better with code, the authors made the code used in the paper public. The FCN classifier is written in Keras and can be found at the source below.

Source: “Deep learning for time series classification” (link in the References)

FCN Architecture Code
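As a rough, library-free illustration of the two properties listed above (this is not the authors' Keras code), the NumPy sketch below runs a forward pass through three convolution blocks followed by GAP and a softmax output. The helper names, the random untrained weights, and the kernel sizes and channel counts are assumptions chosen to keep the example small. The original FCN also applies batch normalization after each convolution, which is omitted here for brevity.

```python
import numpy as np

def conv1d_same(x, w, b):
    """1D convolution, zero-padded so output length equals input length.

    x: (length, in_channels)
    w: (kernel_size, in_channels, out_channels)
    b: (out_channels,)
    """
    k = w.shape[0]
    pad_left = (k - 1) // 2
    pad_right = k - 1 - pad_left
    xp = np.pad(x, ((pad_left, pad_right), (0, 0)), mode="constant")
    out = np.stack([
        np.tensordot(xp[t:t + k], w, axes=([0, 1], [0, 1]))
        for t in range(x.shape[0])
    ])
    return out + b

def relu(z):
    return np.maximum(z, 0.0)

def softmax(z):
    e = np.exp(z - z.max())
    return e / e.sum()

rng = np.random.default_rng(0)
length, n_vars, n_classes = 50, 3, 2       # toy multivariate series
x = rng.normal(size=(length, n_vars))

# (kernel_size, out_channels) per block: illustrative values only.
blocks = [(8, 16), (5, 32), (3, 16)]

h = x
for kernel_size, c_out in blocks:
    w = rng.normal(scale=0.1, size=(kernel_size, h.shape[1], c_out))
    h = relu(conv1d_same(h, w, np.zeros(c_out)))
    print(h.shape)                         # time length stays at 50

gap = h.mean(axis=0)                       # Global Average Pooling over time
w_out = rng.normal(scale=0.1, size=(gap.shape[0], n_classes))
probs = softmax(gap @ w_out)               # class probabilities
```

Because there is no local pooling, every block outputs one feature vector per time step, and GAP reduces those to a single vector whose size does not depend on the series length.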

As mentioned earlier, using a GAP layer lets us compute a CAM to explain the prediction. In the paper, the authors show example CAM results on the GunPoint dataset. As can be seen in the image below, the CAM highlights different discriminative regions (red and yellow segments) for each class.

Source: “Deep learning for time series classification: a review”
Caption: Fig. 13: Highlighting with the Class Activation Map the contribution of each time series region for both classes in GunPoint when using the FCN and ResNet classifiers.
Red corresponds to high contribution and blue to almost no contribution to the correct class identification (smoothed for visual clarity and best viewed in color) (Color figure online).

Left: CAM on Class-1, Right: CAM on Class-2.
The trends for each graph show each time series’ CAM results.
The color denotes how much contribution the time segment has on the class (whether predicted as Class-1 or Class-2).

With the CAM visualizations, we can see how much each time segment contributes to each class. You can have your CNN time series classifier and explainer in one!
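For a network that ends in GAP followed by a single output layer, the CAM for a class is a weighted sum of the last convolutional feature maps at each time step, using that class's output weights. The NumPy sketch below shows the computation with random stand-in values; `feature_maps` and `w_out` are hypothetical placeholders for what a trained model would provide, and the function name is my own.

```python
import numpy as np

def class_activation_map(feature_maps, class_weights):
    """Contribution of each time step to one class.

    feature_maps: (length, n_filters) output of the last conv layer
    class_weights: (n_filters,) output-layer weights for the class
    """
    return feature_maps @ class_weights

rng = np.random.default_rng(0)
length, n_filters, n_classes = 50, 16, 2

# Stand-ins for a trained model's activations and output weights.
feature_maps = np.maximum(rng.normal(size=(length, n_filters)), 0.0)
w_out = rng.normal(size=(n_filters, n_classes))

cam = class_activation_map(feature_maps, w_out[:, 0])   # shape (50,)

# Averaging the CAM over time recovers the class score before softmax
# (bias ignored), which is why the CAM explains the prediction.
score = feature_maps.mean(axis=0) @ w_out[:, 0]
print(np.isclose(cam.mean(), score))

# Rescale to [0, 1] so the values can be mapped to colors on a plot.
cam_norm = (cam - cam.min()) / (cam.max() - cam.min())
```

Plotting the input series colored by `cam_norm` gives the kind of visualization shown in the figure above.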



In this blog post, I shared another (I believe less well-known) use of Convolutional Neural Networks (CNN), namely for time series data, along with some reasons why this makes sense. To demonstrate the idea, I briefly discussed an architecture, the Fully Convolutional Neural Network (FCN), which uses convolutional layers with Global Average Pooling (GAP) to 1) perform either univariate or multivariate time series classification, and 2) explain the time segments’ contribution to the class prediction.


■ Sources of the content and papers introduced on this page / References

Mengchen Liu, Jiaxin Shi, Zhen Li, Chongxuan Li, Jun Zhu, Shixia Liu, “Towards Better Analysis of Deep Convolutional Neural Networks”, arXiv:1604.07043v3, https://arxiv.org/pdf/1604.07043.pdf

“Understanding LSTM Networks”, https://colah.github.io/posts/2015-08-Understanding-LSTMs/

Hassan Ismail Fawaz, Germain Forestier, Jonathan Weber, Lhassane Idoumghar, Pierre-Alain Muller, “Deep learning for time series classification: a review”, arXiv:1809.04356, https://arxiv.org/pdf/1809.04356v4.pdf

“Deep learning for time series classification”, https://github.com/hfawaz/dl-4-tsc/blob/master/classifiers/fcn.py

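At Macnica's ARIH (AI Research & Innovation Hub), we research, survey, and evaluate cutting-edge AI through implementation, then provide insights that combine the most suitable AI technologies to guide companies toward the best solution for their challenges.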