Multi-output Regression Example with Keras LSTM Network in R

   This tutorial shows how to fit and predict multi-output regression data with an LSTM network in R. As you may already know, LSTM (Long Short-Term Memory) is a type of recurrent neural network used to analyze sequence data. We'll use the keras R package, which provides an R interface to the Keras neural network API. The tutorial covers:
  1. Preparing the data
  2. Defining the model
  3. Predicting and visualizing the result
  4. Source code listing
We'll start by loading the required R packages.

library(keras)
library(caret)
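
If the packages are not installed yet, both are available from CRAN; the keras package also needs a one-time backend setup via install_keras(). A minimal optional setup sketch, assuming internet access:

# optional one-time setup
install.packages(c("keras", "caret"))  # install the R packages from CRAN
keras::install_keras()                 # install the TensorFlow backend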

Preparing the data

   We'll create a multi-output dataset for this tutorial. The data is randomly generated, with the outputs derived from the inputs by simple rules; you can check the generation logic in the code below. There are three inputs (x1, x2, x3) and two outputs (y1, y2) in this dataset. We'll plot the generated data to inspect it visually.

n = 600                                    # number of samples
s = seq(.1, n / 10, .1)                    # base sequence the series are built on
x1 = s * sin(s / 50) - rnorm(n) * 5        # three noisy input series
x2 = s * sin(s) + rnorm(n) * 10
x3 = s * sin(s / 100) + 2 + rnorm(n) * 10
y1 = x1 + x2 + x3 + 2 + rnorm(n) * 2       # two outputs derived from the inputs
y2 = x1 + x2 / 2 - x3 - rnorm(n)
df = data.frame(x1, x2, x3, y1, y2)
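
Note that the data relies on rnorm(), so the values differ on every run and the outputs shown below will not match yours exactly. For reproducibility, you could fix the random seed before the generation code above, for example:

set.seed(123)  # optional: make the randomly generated data reproducible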
 
plot(s, df$y1, ylim = c(min(df), max(df)), type = "l", col = "blue")
lines(s, df$y2, type = "l", col = "red")
lines(s, df$x1, type = "l", col = "green")
lines(s, df$x2, type = "l", col = "yellow")
lines(s, df$x3, type = "l", col = "gray") 

Here, the blue line is y1, the red line is y2, and the green, yellow, and gray lines are the x1, x2, and x3 variables, respectively.
Next, we'll split the dataset into the train and test parts.

indexes = createDataPartition(df$x1, p = .85, list = F)
train = df[indexes,]
test = df[-indexes,]

Then, we'll separate the x and y parts of the data and convert them to matrices.

xtrain = as.matrix(data.frame(train$x1, train$x2, train$x3))
ytrain = as.matrix(data.frame(train$y1, train$y2))
xtest = as.matrix(data.frame(test$x1, test$x2, test$x3))
ytest = as.matrix(data.frame(test$y1, test$y2))
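
As a side note, centering and scaling the inputs often helps neural network training converge; a minimal optional sketch with caret's preProcess() is shown below (the rest of this tutorial keeps the raw values).

# optional: fit a scaler on the train data only, then apply it to both sets
prep = preProcess(xtrain, method = c("center", "scale"))
xtrain_s = predict(prep, xtrain)
xtest_s = predict(prep, xtest)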

Next, we need to reshape the x train and test data into the three-dimensional input that the LSTM layer expects. First, we'll check the current dimensions of the x and y data.

cat("xtrain dimension:",dim(xtrain))
xtrain dimension: 512 3
  
cat("ytrain dimension:",dim(ytrain))
ytrain dimension: 512 2

Now, we'll add a third dimension so that the x input has the (samples, timesteps, features) shape required by the LSTM layer.

xtrain = array(xtrain, dim = c(nrow(xtrain), 3, 1))
xtest = array(xtest, dim = c(nrow(xtest), 3, 1))
 
cat("xtrain dimension:",dim(xtrain))
xtrain dimension: 512 3 1 
 
cat("xtest dimension:",dim(xtest))
xtest dimension: 88 3 1
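
The same reshape can also be written with keras' array_reshape() helper; when only a trailing unit dimension is added, as here, the two approaches produce identical arrays.

# equivalent reshape using the keras helper
xtrain = array_reshape(xtrain, dim = c(nrow(xtrain), 3, 1))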

Next, we'll extract the input and output data dimensions to use in the LSTM model.

in_dim = c(dim(xtrain)[2:3])
out_dim = dim(ytrain)[2]
 
cat(in_dim)
3 1
cat(out_dim)
2


Defining the model

We'll define a sequential model, add an LSTM layer with ReLU activation and a dense output layer, and compile the model with the Adam optimizer and MSE loss function. We'll set the input shape in the first layer and the output dimension in the last layer of the model.

model = keras_model_sequential() %>%
  layer_lstm(units = 64, activation = "relu", input_shape = in_dim) %>%    
  layer_dense(units = out_dim, activation = "linear")
 
model %>% compile(
  loss = "mse",
  optimizer = "adam")
 
model %>% summary()
________________________________________________________________________
Layer (type)                    Output Shape                  Param #    
========================================================================
lstm_1 (LSTM)                   (None, 64)                    16896      
________________________________________________________________________
dense_7 (Dense)                 (None, 2)                     130        
========================================================================
Total params: 17,026
Trainable params: 17,026
Non-trainable params: 0
________________________________________________________________________
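
The parameter counts in the summary can be verified by hand: an LSTM layer carries 4 * ((features + units + 1) * units) weights across its four gates, and a dense layer carries inputs * outputs weights plus one bias per output.

4 * ((1 + 64 + 1) * 64)  # LSTM layer: 16896 parameters
64 * 2 + 2               # dense layer: 130 parameters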

We'll fit the model with the train data and check the training loss (MSE).

model %>% fit(xtrain, ytrain, epochs = 100, batch_size = 12, verbose = 0)
scores = model %>% evaluate(xtrain, ytrain, verbose = 0)
print(scores)
    loss 
2.008807
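
For longer runs, it can help to hold out a validation split and stop training when the validation loss stops improving. A sketch using keras' built-in early-stopping callback (not used for the results in this tutorial):

model %>% fit(xtrain, ytrain, epochs = 100, batch_size = 12, verbose = 0,
              validation_split = 0.2,
              callbacks = list(callback_early_stopping(monitor = "val_loss",
                                                       patience = 10)))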


Predicting and visualizing the result

Finally, we'll predict the test data and check the accuracy of the y1 and y2 predictions with the RMSE metric.

ypred = model %>% predict(xtest)
 
cat("y1 RMSE:", RMSE(ytest[, 1], ypred[, 1]))
y1 RMSE: 2.396092 
 
cat("y2 RMSE:", RMSE(ytest[, 2], ypred[, 2]))
y2 RMSE: 1.247266
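
As a sanity check, the same values can be computed directly in base R, since RMSE is just the square root of the mean squared error.

sqrt(colMeans((ytest - ypred)^2))  # RMSE for y1 and y2 in one call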

We can also check the results visually in a plot.

x_axes = seq_along(ypred[, 1])

plot(x_axes, ytest[, 1], ylim = range(c(ytest, ypred)),
     col = "burlywood", type = "l", lwd = 2)
lines(x_axes, ypred[, 1], col = "red", type = "l", lwd = 2)
lines(x_axes, ytest[, 2], col = "gray", type = "l", lwd = 2)
lines(x_axes, ypred[, 2], col = "blue", type = "l", lwd = 2)
legend("topleft", legend = c("y1-test", "y1-pred", "y2-test", "y2-pred"),
       col = c("burlywood", "red", "gray", "blue"),
       lty = 1, cex = 0.9, lwd = 2, bty = 'n')


   In this tutorial, we've briefly learned how to fit and predict multi-output regression data with the Keras LSTM network model in R. The full source code is listed below.



Source code listing

library(keras)
library(caret)
 
n = 600
s = seq(.1, n / 10, .1)
x1 = s * sin(s / 50) - rnorm(n) * 5
x2 = s * sin(s) + rnorm(n) * 10
x3 = s * sin(s / 100) + 2 + rnorm(n) * 10
y1 = x1 + x2 + x3 + 2 + rnorm(n) * 2
y2 = x1 + x2 / 2 - x3 - rnorm(n)
df = data.frame(x1, x2, x3, y1, y2)
 
plot(s, df$y1, ylim = c(min(df), max(df)), type = "l", col = "blue")
lines(s, df$y2, type = "l", col = "red")
lines(s, df$x1, type = "l", col = "green")
lines(s, df$x2, type = "l", col = "yellow")
lines(s, df$x3, type = "l", col = "gray") 
 
indexes = createDataPartition(df$x1, p = .85, list = F)
train = df[indexes,]
test = df[-indexes,]
 
xtrain = as.matrix(data.frame(train$x1, train$x2, train$x3))
ytrain = as.matrix(data.frame(train$y1, train$y2))
xtest = as.matrix(data.frame(test$x1, test$x2, test$x3))
ytest = as.matrix(data.frame(test$y1, test$y2))
 
xtrain = array(xtrain, dim = c(nrow(xtrain), 3, 1))
xtest = array(xtest, dim = c(nrow(xtest), 3, 1))
 
cat("xtrain dimension:",dim(xtrain))
cat("xtest dimension:",dim(xtest)) 
 
in_dim = c(dim(xtrain)[2:3])
out_dim = dim(ytrain)[2]
cat(in_dim)
cat(out_dim)   
 
model = keras_model_sequential() %>%
  layer_lstm(units = 64, activation = "relu", input_shape = in_dim) %>%    
  layer_dense(units = out_dim, activation = "linear")
 
model %>% compile(
  loss = "mse",
  optimizer = "adam")
 
model %>% summary()
 
model %>% fit(xtrain, ytrain, epochs = 100, batch_size = 12, verbose = 0)
scores = model %>% evaluate(xtrain, ytrain, verbose = 0)
print(scores)

ypred = model %>% predict(xtest)
 
cat("y1 RMSE:", RMSE(ytest[, 1], ypred[, 1]))
cat("y2 RMSE:", RMSE(ytest[, 2], ypred[, 2]))
x_axes = seq_along(ypred[, 1])

plot(x_axes, ytest[, 1], ylim = range(c(ytest, ypred)),
     col = "burlywood", type = "l", lwd = 2)
lines(x_axes, ypred[, 1], col = "red", type = "l", lwd = 2)
lines(x_axes, ytest[, 2], col = "gray", type = "l", lwd = 2)
lines(x_axes, ypred[, 2], col = "blue", type = "l", lwd = 2)
legend("topleft", legend = c("y1-test", "y1-pred", "y2-test", "y2-pred"),
       col = c("burlywood", "red", "gray", "blue"),
       lty = 1, cex = 0.9, lwd = 2, bty = 'n')
