An LSTM network uses memory cells to retain information across time steps, which plain RNNs struggle to do. Each cell contains gates that control the flow of information, weighted by parameters learned during training. The **forget gate** discards information that is no longer useful, the **input gate** decides which new values update the cell state, and the **output gate** determines what the cell emits. In this post, we'll learn how to fit and predict regression data with a Keras LSTM model in R.

This tutorial covers:

- Generating sample dataset
- Reshaping input data
- Building Keras LSTM model
- Predicting and plotting the result

```
library(keras)
```

**Generating sample dataset**

We'll create a sample dataset for this tutorial. Here, the vector 'a' serves as our regression data.

```
N = 400
set.seed(123)
n = seq(1:N)
a = n/10 + 4*sin(n/10) + sample(-1:6, N, replace=TRUE) + rnorm(N)
```

```
head(a,20)
[1] 3.698144 7.307090 3.216936 8.500867 8.003362 1.382323 5.488268
[8] 9.074807 8.684215 6.311856 10.784075 7.171844 10.386709 7.825735
[15] 3.497473 13.273991 5.225496 3.972325 5.448927 10.352474
```

**Reshaping input data**

Next, we'll create the 'x' and 'y' training sequence data. Here, we apply a window method with a window size given by the 'step' value. The result (the y value) is the element that comes right after the window of x values; the window then shifts to the next elements of x, the next y value is collected, and so on.

```
step = 2   # step is the window size
```

To cover all elements in the vector, we'll pad the end of the 'a' vector with 'step' replicates of its last element.

```
a = c(a, replicate(step, tail(a, 1)))
```

Now we create the x (input) and y (output) data:

```
x = NULL
y = NULL
for(i in 1:N) {
  s = i - 1 + step
  x = rbind(x, a[i:s])
  y = rbind(y, a[s+1])
}
```

```
cbind(head(x), head(y))
         [,1]     [,2]     [,3]
[1,] 3.698144 7.307090 3.216936
[2,] 7.307090 3.216936 8.500867
[3,] 3.216936 8.500867 8.003362
[4,] 8.500867 8.003362 1.382323
[5,] 8.003362 1.382323 5.488268
[6,] 1.382323 5.488268 9.074807
```
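As a sanity check, the windowing logic above can be verified on a tiny vector. This is a minimal sketch in base R; the `_demo` names are our own, chosen to avoid clobbering the tutorial's variables:

```
# Build sliding windows over a small vector and confirm that each
# target y is the element immediately following its window.
a_demo = c(10, 20, 30, 40, 50)
step_demo = 2
x_demo = NULL
y_demo = NULL
for(i in 1:(length(a_demo) - step_demo)) {
  s = i - 1 + step_demo
  x_demo = rbind(x_demo, a_demo[i:s])
  y_demo = rbind(y_demo, a_demo[s+1])
}
print(x_demo[1, ])   # first window: 10 20
print(y_demo[1, ])   # its target:   30
```

The first window is (10, 20) with target 30, the second (20, 30) with target 40, and so on, matching the output shown above.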

The LSTM layer expects a 3D array of shape (samples, time steps, features), so we'll reshape the input data.

```
X = array(x, dim=c(N, step, 1))
```

**Building Keras LSTM model**

Next, we'll create a Keras sequential model, add an LSTM layer, and compile it with the defined metrics.

```
model = keras_model_sequential() %>%
  layer_lstm(units=128, input_shape=c(step, 1), activation="relu") %>%
  layer_dense(units=64, activation="relu") %>%
  layer_dense(units=32) %>%
  layer_dense(units=1, activation="linear")

model %>% compile(loss = 'mse',
                  optimizer = 'adam',
                  metrics = list("mean_absolute_error"))

model %>% summary()
```

```
____________________________________________________________________________
Layer (type)                     Output Shape                  Param #
============================================================================
lstm_16 (LSTM)                   (None, 128)                   66560
____________________________________________________________________________
dense_36 (Dense)                 (None, 64)                    8256
____________________________________________________________________________
dense_37 (Dense)                 (None, 32)                    2080
____________________________________________________________________________
dense_38 (Dense)                 (None, 1)                     33
============================================================================
Total params: 76,929
Trainable params: 76,929
Non-trainable params: 0
____________________________________________________________________________
```

**Predicting and plotting the result**

Next, we'll train the model with the X and y data, predict on the X data, and check the errors.

```
model %>% fit(X, y, epochs=50, batch_size=32, shuffle=FALSE)
```

```
y_pred = model %>% predict(X)
scores = model %>% evaluate(X, y, verbose = 0)
print(scores)
```

```
$loss
[1] 11.84502

$mean_absolute_error
[1] 2.810479
```

Finally, we'll plot the results.

```
x_axes = seq(1:length(y_pred))
plot(x_axes, y, type="l", col="red", lwd=2)
lines(x_axes, y_pred, col="blue", lwd=2)
legend("topleft", legend=c("y-original", "y-predicted"),
       col=c("red", "blue"), lty=1, cex=0.8)
```

You may change the step size and check the prediction results.
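To make that experiment convenient, the data preparation above can be wrapped in a helper function. This is a sketch; the `make_xy` name is our own, not from the post:

```
# Hypothetical helper: rebuild the (X, y) pair for a given window size,
# padding the end of the series as the post does so every element
# gets a window.
make_xy = function(series, step) {
  series = c(series, replicate(step, tail(series, 1)))
  N = length(series) - step
  x = NULL; y = NULL
  for(i in 1:N) {
    s = i - 1 + step
    x = rbind(x, series[i:s])
    y = rbind(y, series[s + 1])
  }
  list(X = array(x, dim = c(N, step, 1)), y = y)
}

# Compare two window sizes on the same series (a toy series here;
# in the post this would be the 'a' vector).
series = sin(seq(1, 40) / 4)
d2 = make_xy(series, step = 2)   # X shape: 40 x 2 x 1
d8 = make_xy(series, step = 8)   # X shape: 40 x 8 x 1
```

Remember that `input_shape` in `layer_lstm()` must match whichever step value you choose.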

In this tutorial, we've briefly learned how to use Keras LSTM to predict regression data in R. Thank you for reading!

The full source code is listed below.

```
library(keras)

N = 400
step = 2
set.seed(123)
n = seq(1:N)
a = n/10 + 4*sin(n/10) + sample(-1:6, N, replace=TRUE) + rnorm(N)
a = c(a, replicate(step, tail(a, 1)))

x = NULL
y = NULL
for(i in 1:N) {
  s = i - 1 + step
  x = rbind(x, a[i:s])
  y = rbind(y, a[s+1])
}
X = array(x, dim=c(N, step, 1))

model = keras_model_sequential() %>%
  layer_lstm(units=128, input_shape=c(step, 1), activation="relu") %>%
  layer_dense(units=64, activation="relu") %>%
  layer_dense(units=32) %>%
  layer_dense(units=1, activation="linear")

model %>% compile(loss = 'mse',
                  optimizer = 'adam',
                  metrics = list("mean_absolute_error"))

model %>% summary()

model %>% fit(X, y, epochs=50, batch_size=32, shuffle=FALSE, verbose=0)

y_pred = model %>% predict(X)
scores = model %>% evaluate(X, y, verbose = 0)
print(scores)

x_axes = seq(1:length(y_pred))
plot(x_axes, y, type="l", col="red", lwd=2)
lines(x_axes, y_pred, col="blue", lwd=2)
legend("topleft", legend=c("y-original", "y-predicted"),
       col=c("red", "blue"), lty=1, cex=0.8)
```

**Comments**

This model only looks good because it probably overfits the data. You did not include any test/validation data to see if the model generalizes out of the training sample. Additionally, with only 400 data points but almost 80,000 learnable parameters, the memory capacity of the net is likely too large for this task. This means that the net was probably able to memorize the training data's specific input-output mappings, and will thus lack predictive power.

Good point! But here I did not intend to build a perfect predictive model. The purpose of this post is to show a simple, workable example with random data for beginners. Readers should consider every aspect of model building when they work with real problems.
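For readers who do want a held-out evaluation, a time-ordered train/test split of the windowed data is straightforward. This is a minimal sketch with dummy data and an 80/20 ratio of our own choosing; in the post, X and y would come from the windowing step:

```
# Sketch: time-ordered 80/20 train/test split of windowed LSTM data.
# (Dummy arrays here stand in for the post's X and y.)
set.seed(1)
N = 400; step = 2
X = array(rnorm(N * step), dim = c(N, step, 1))
y = matrix(rnorm(N), ncol = 1)

n_train = floor(0.8 * N)                        # 320 training windows
X_train = X[1:n_train, , , drop = FALSE]
y_train = y[1:n_train, , drop = FALSE]
X_test  = X[(n_train + 1):N, , , drop = FALSE]  # last 80 windows held out
y_test  = y[(n_train + 1):N, , drop = FALSE]

dim(X_train)   # 320 2 1
dim(X_test)    # 80 2 1
```

You would then `fit()` on `X_train`/`y_train` and `evaluate()` on `X_test`/`y_test`; keeping the split time-ordered (no shuffling across the boundary) avoids leaking future values into training.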

Hello, excellent post. I'm in a project using this algorithm and I have one question: if I have more predictors, should I use fit(x1+x2, y, ...) for the model fit and predict(x1+x2) for the predictions? Or am I wrong?

Thanks for your help. Great post.

You are welcome! You need to create a combined X array (containing all the features x1, x2, ...) for your training and prediction. It goes like this:

```
x1, x2,    y
 2,  3,    3
 3,  4,    4
 2,  4, => 4
 3,  5, => 5
 4,  6, => 6
```

Here, each window contains 3 elements of both the x1 and x2 series.

```
2, 3,
3, 4,
2, 4, => 4

3, 4,
2, 4,
3, 5, => 5

2, 4,
3, 5,
4, 6, => 6
```
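A sketch of how those two-feature windows can be packed into the 3D array Keras expects, with shape (samples, time steps, features). The variable names are our own, and the tiny series reproduces the example above:

```
# Two feature series and a target, as in the small example above.
x1 = c(2, 3, 2, 3, 4)
x2 = c(3, 4, 4, 5, 6)
y  = c(3, 4, 4, 5, 6)
step = 3                          # window length
n_win = length(y) - step + 1      # number of complete windows

# X has shape (samples, time steps, features) = (n_win, step, 2)
X = array(0, dim = c(n_win, step, 2))
Y = numeric(n_win)
for(i in 1:n_win) {
  idx = i:(i + step - 1)
  X[i, , 1] = x1[idx]
  X[i, , 2] = x2[idx]
  Y[i] = y[i + step - 1]          # target aligned with window's last row
}

X[1, , ]   # first window: rows (2,3), (3,4), (2,4); target Y[1] = 4
```

With this layout, `layer_lstm(..., input_shape = c(step, 2))` would accept the array directly.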

Thanks, I made an X array with all the predictors and it works. I got mse = 15.9 (nice) with the default parameters, then I tuned the epochs parameter in the fit and got a better prediction. I've been tuning epochs and batch_size, but I don't know how I should change the sequential Keras model itself (dense layers and units). I have 37 observations and 19 predictors. Can you give me advice on this tuning? Thanks for your time and post. My model's predictions are great; in fact I could stop now with my results, but I want to improve and learn more about this model.

Good! Your data is too small to evaluate your model and improve its performance. To check for improvement in your model:

1) Use bigger data,

2) Change the number of units,

3) Add a dense layer,

4) Add a dropout layer, layer_dropout(),

5) Change the optimizer (rmsprop, etc.)
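As an illustration of points 2, 4, and 5, a variant of the post's model with fewer units, a dropout layer, and a different optimizer might look like the sketch below. The specific unit counts and dropout rate are arbitrary choices for illustration, not recommendations:

```
library(keras)

# Sketch: a smaller LSTM with dropout for regularization.
step = 2
model = keras_model_sequential() %>%
  layer_lstm(units = 64, input_shape = c(step, 1), activation = "relu") %>%
  layer_dropout(rate = 0.2) %>%   # randomly zero 20% of activations in training
  layer_dense(units = 32, activation = "relu") %>%
  layer_dense(units = 1, activation = "linear")

model %>% compile(loss = "mse", optimizer = "rmsprop")
```

Dropout mainly helps when the network has far more capacity than the data, as the first comment above points out.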

Hi, I followed your advice and my model has improved, thanks. But (another doubt) the period length in my data varies: at the beginning of the sample the period changes every 4 samples, and at the end it changes every 5. How should I attack this problem? Two models? How can they be ensembled?

Thanks for your time, really.

How do I tune this model?

Hi, I tried this method on time series data with the last 4 years of monthly values. The predicted values are vague and I'm not sure what I did wrong. I also tried changing the step size, but that is not working out either. Can you please help me out with it?
