Convolutional Autoencoder Example with Keras in R

   We can use convolutional neural networks to build autoencoders. In this tutorial, we'll briefly learn how to build a convolutional autoencoder with Keras in R. An autoencoder learns to compress its input into a smaller representation and to reconstruct the input from that representation. The reconstruction is data-specific and lossy: the model only works well on data similar to what it was trained on, and its output is an approximation of the input. Convolutional networks perform well with image data, so we'll train the autoencoder on the MNIST handwritten digits dataset. The tutorial covers:
  1. Preparing the data
  2. Defining the model
  3. Generating from test data
  4. Source code listing
   We'll start by loading the required Keras package in R. Note that this tutorial uses the R interface to the Keras API and RStudio.

library(keras)

Preparing the data

   We'll use the MNIST handwritten digit dataset to train the autoencoder. First, we'll load it and prepare it. An autoencoder is trained to reproduce its own input, so we only use the x part of the dataset and ignore the labels. We'll scale the pixel values into the range [0, 1].

c(c(xtrain, ytrain), c(xtest, ytest)) %<-% dataset_mnist()
xtrain = xtrain/255
xtest = xtest/255
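
As a quick sanity check of the scaling step, the sketch below applies the same division to a small synthetic matrix of 8-bit pixel values and confirms that the result lies in [0, 1]. The fake_pixels name and its values are made up for illustration and are not part of MNIST.

```r
# Hypothetical 2x3 "image" with 8-bit intensities (0 = black, 255 = white).
fake_pixels = matrix(c(0L, 51L, 102L, 153L, 204L, 255L), nrow = 2)

# Same operation as xtrain/255 in the tutorial.
scaled = fake_pixels / 255

# Smallest and largest value after scaling.
print(range(scaled))
```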

Next, we need to change the dimensions of the data by adding one more dimension for the single grayscale channel.

dim(xtrain)
[1] 60000    28    28
 
dim(xtest)
[1] 10000    28    28

x_train = array_reshape(xtrain, dim=c(dim(xtrain)[1], 28, 28, 1))
x_test = array_reshape(xtest, dim=c(dim(xtest)[1], 28, 28, 1))
 
print(dim(x_train))
[1] 60000    28    28     1
 
print(dim(x_test))
[1] 10000    28    28     1
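
The reshape only appends a channel axis of length 1; no pixel values move. The base R sketch below illustrates this on a small synthetic array (the toy array is hypothetical). It assumes array_reshape() is used with its default row-major ordering; for a trailing axis of length 1, setting dim() directly gives the same layout.

```r
# Hypothetical stack of 2 "images", each 3x3, filled with 1..18.
toy = array(1:18, dim = c(2, 3, 3))

# Append a channel axis of length 1, analogous to c(n, 28, 28, 1).
toy4d = toy
dim(toy4d) = c(dim(toy), 1)

print(dim(toy4d))

# Every pixel keeps its position: toy[i, j, k] equals toy4d[i, j, k, 1].
print(all(toy4d[, , , 1] == toy))
```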


Defining the model

   We'll define the autoencoder starting from the input layer. The input layer's shape matches the dimensions of a single input image.

enc_input = layer_input(shape = c(28, 28, 1))

   The encoding part of the autoencoder contains convolutional and max-pooling layers that encode the image. Each max-pooling layer reduces the spatial size of the image by keeping only the maximum value in each pooling window.
   The decoding part of the autoencoder contains convolutional and upsampling layers. Each upsampling layer enlarges the image again, reversing the size reduction of the pooling layers. The final convolutional layer uses a sigmoid activation so that the output pixels fall in the range [0, 1].
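
To make the effect of pooling and upsampling concrete, here is a minimal base R sketch on a 4x4 matrix. The helper names max_pool_2x2 and upsample_2x2 are hypothetical, and the upsampling assumes nearest-neighbour repetition. This is an illustration of the idea, not how Keras implements the layers.

```r
# 2x2 max pooling; assumes both side lengths of m are even.
max_pool_2x2 = function(m) {
  out = matrix(0, nrow(m) / 2, ncol(m) / 2)
  for (i in seq_len(nrow(out))) {
    for (j in seq_len(ncol(out))) {
      rows = (2 * i - 1):(2 * i)
      cols = (2 * j - 1):(2 * j)
      out[i, j] = max(m[rows, cols])  # keep the largest value in the window
    }
  }
  out
}

# Nearest-neighbour upsampling: repeat every row and every column twice.
upsample_2x2 = function(m) {
  m[rep(seq_len(nrow(m)), each = 2),
    rep(seq_len(ncol(m)), each = 2), drop = FALSE]
}

img = matrix(1:16, nrow = 4)
pooled = max_pool_2x2(img)       # 4x4 -> 2x2
restored = upsample_2x2(pooled)  # 2x2 -> 4x4

print(pooled)
print(restored)
```

Upsampling restores the size but not the values that pooling discarded, which is why the decoder also needs convolutional layers to fill in the detail.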

enc_output = enc_input %>% 
  layer_conv_2d(12,kernel_size=c(3,3), activation="relu", padding="same") %>% 
  layer_max_pooling_2d(c(2,2), padding="same") %>% 
  layer_conv_2d(4,kernel_size=c(3,3), activation="relu", padding="same") %>% 
  layer_max_pooling_2d(c(4,4), padding="same")  
 
dec_output = enc_output %>% 
  layer_conv_2d(4, kernel_size=c(3,3), activation="relu", padding="same") %>% 
  layer_upsampling_2d(c(4,4)) %>% 
  layer_conv_2d(12, kernel_size=c(3,3), activation="relu") %>% 
  layer_upsampling_2d(c(2,2)) %>% 
  layer_conv_2d(1, kernel_size=c(3,3), activation="sigmoid", padding="same")
 
aen = keras_model(enc_input, dec_output)
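
It can help to trace the spatial size of the image through the layers by hand. The sketch below does this in base R with hypothetical helper functions, assuming square inputs, a stride of 1 for convolutions, and a pooling stride equal to the pool size.

```r
# Output side length of each layer type for square inputs.
conv_same  = function(n, k) n               # "same" padding keeps the size
conv_valid = function(n, k) n - k + 1       # no padding trims k - 1 pixels
pool_same  = function(n, p) ceiling(n / p)  # pooling with "same" padding rounds up
upsample   = function(n, s) n * s           # upsampling multiplies the size

sizes = c(input = 28)
sizes["enc_conv_12"] = conv_same(sizes[["input"]], 3)
sizes["enc_pool_2"]  = pool_same(sizes[["enc_conv_12"]], 2)
sizes["enc_conv_4"]  = conv_same(sizes[["enc_pool_2"]], 3)
sizes["enc_pool_4"]  = pool_same(sizes[["enc_conv_4"]], 4)
sizes["dec_conv_4"]  = conv_same(sizes[["enc_pool_4"]], 3)
sizes["dec_up_4"]    = upsample(sizes[["dec_conv_4"]], 4)
sizes["dec_conv_12"] = conv_valid(sizes[["dec_up_4"]], 3)
sizes["dec_up_2"]    = upsample(sizes[["dec_conv_12"]], 2)
sizes["dec_conv_1"]  = conv_same(sizes[["dec_up_2"]], 3)

print(sizes)
```

The trace suggests why the second decoder convolution omits padding="same": the 4x4 code is upsampled to 16x16, the unpadded 3x3 convolution trims it to 14x14, and the final upsampling then lands exactly on 28x28.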

We'll compile the model with the RMSprop optimizer and the binary cross-entropy loss function.

aen %>% compile(optimizer="rmsprop", loss="binary_crossentropy")
 
summary(aen)
___________________________________________________________________________________
Layer (type)                         Output Shape                     Param #      
===================================================================================
input_3 (InputLayer)                 (None, 28, 28, 1)                0            
___________________________________________________________________________________
conv2d_41 (Conv2D)                   (None, 28, 28, 12)               120          
___________________________________________________________________________________
max_pooling2d_17 (MaxPooling2D)      (None, 14, 14, 12)               0            
___________________________________________________________________________________
conv2d_42 (Conv2D)                   (None, 14, 14, 4)                436          
___________________________________________________________________________________
max_pooling2d_18 (MaxPooling2D)      (None, 4, 4, 4)                  0            
___________________________________________________________________________________
conv2d_43 (Conv2D)                   (None, 4, 4, 4)                  148          
___________________________________________________________________________________
up_sampling2d_17 (UpSampling2D)      (None, 16, 16, 4)                0            
___________________________________________________________________________________
conv2d_44 (Conv2D)                   (None, 14, 14, 12)               444          
___________________________________________________________________________________
up_sampling2d_18 (UpSampling2D)      (None, 28, 28, 12)               0            
___________________________________________________________________________________
conv2d_45 (Conv2D)                   (None, 28, 28, 1)                109          
===================================================================================
Total params: 1,257
Trainable params: 1,257
Non-trainable params: 0
___________________________________________________________________________________
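
The parameter counts in the summary can be checked by hand. A convolutional layer has one weight per kernel position and input channel, plus one bias, for each output filter. The sketch below uses a hypothetical helper, conv_params, to reproduce the numbers.

```r
# Parameters of a 2D convolution: (kernel_h * kernel_w * channels_in + 1 bias)
# for each of the channels_out filters.
conv_params = function(k_h, k_w, c_in, c_out) (k_h * k_w * c_in + 1) * c_out

params = c(
  enc_conv_12 = conv_params(3, 3, 1, 12),
  enc_conv_4  = conv_params(3, 3, 12, 4),
  dec_conv_4  = conv_params(3, 3, 4, 4),
  dec_conv_12 = conv_params(3, 3, 4, 12),
  dec_conv_1  = conv_params(3, 3, 12, 1)
)

print(params)
print(sum(params))
```

Pooling and upsampling layers have no trainable parameters, so the total comes from the five convolutional layers alone.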

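Finally, we'll fit the model on the training data. Since the autoencoder learns to reproduce its input, x_train is passed as both the input and the target.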

aen %>% fit(x_train, x_train, epochs=20, batch_size=128)
Epoch 1/20
 1152/60000 [..............................] - ETA: 1:29 - loss: 0.6738
 ....
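
The loss value in the training log is the binary cross-entropy between the input pixels and the reconstructed pixels, averaged over all pixels. The following base R sketch computes this quantity for a few made-up pixel values. The binary_crossentropy function here is a hypothetical re-implementation for illustration, and the clipping constant eps is an assumption rather than a value taken from Keras.

```r
binary_crossentropy = function(y_true, y_pred, eps = 1e-7) {
  # Clip predictions away from 0 and 1 so that log() stays finite.
  p = pmin(pmax(y_pred, eps), 1 - eps)
  mean(-(y_true * log(p) + (1 - y_true) * log(1 - p)))
}

target = c(0, 0.5, 1)        # made-up pixel intensities
good   = c(0.05, 0.5, 0.95)  # reconstruction close to the target
bad    = c(0.9, 0.1, 0.2)    # reconstruction far from the target

print(binary_crossentropy(target, good))
print(binary_crossentropy(target, bad))
```

Because MNIST pixels are not strictly 0 or 1, this loss does not reach zero even for a perfect reconstruction, so a training loss that levels off above zero is expected.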

After the training is finished, we can save the model.

save_model_hdf5(aen, "cnn_autodencoder.h5")

You can easily load your saved model next time.

aen = load_model_hdf5("cnn_autodencoder.h5")


Generating from test data

Now, we can reconstruct the test images with the trained autoencoder.

pred = aen %>% predict(x_test)

To check the predicted images, we'll visualize the outputs in a plot.

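n = 10
# one row per digit: original on the left, reconstruction on the right
op = par(mfrow=c(n,2), mar=c(1,0,0,0))
for (i in 1:n) 
{
  plot(as.raster(xtest[i,,]))
  plot(as.raster(pred[i,,,]))
}
par(op)  # restore the previous plot settings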



   In this plot, the digits on the left side are the original images and the digits on the right side are the reconstructed images. To improve the quality of the output images, you can try increasing the number of filters in the layer_conv_2d() layers and the number of epochs in the model fit.
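
Besides visual inspection, a numeric score can help when comparing different settings. The sketch below computes a mean squared error per image in base R. The per_image_mse helper and the synthetic arrays are hypothetical stand-ins for xtest and the model output.

```r
# Mean squared error for each image in a stack with shape (n, height, width).
per_image_mse = function(original, reconstructed) {
  sq_err = (original - reconstructed)^2
  apply(sq_err, 1, mean)  # average over all pixels of each image
}

# Synthetic stand-ins for the test images and their reconstructions.
set.seed(1)
orig = array(runif(3 * 4 * 4), dim = c(3, 4, 4))
recon = orig
recon[2, , ] = 0  # pretend the second image was reconstructed badly

mse = per_image_mse(orig, recon)
print(mse)
```

With the objects from this tutorial, a call such as per_image_mse(xtest, pred[, , , 1]) should give one error value per test image, assuming pred has the shape (10000, 28, 28, 1).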

   In this tutorial, we've briefly learned how to build a convolutional autoencoder with Keras in R. The full source code is listed below.
   In the previous post, we learned how to build a simple autoencoder with Keras in R. Please check out that post to learn more about it.


Source code listing

library(keras)
 
c(c(xtrain, ytrain), c(xtest, ytest)) %<-% dataset_mnist()
xtrain = xtrain/255
xtest = xtest/255
 
dim(xtrain)
dim(xtest)

x_train = array_reshape(xtrain, dim=c(dim(xtrain)[1], 28, 28, 1))
x_test = array_reshape(xtest, dim=c(dim(xtest)[1], 28, 28, 1))
 
print(dim(x_train))
print(dim(x_test))
 
enc_input = layer_input(shape = c(28, 28, 1))
 
enc_output = enc_input %>% 
  layer_conv_2d(12,kernel_size=c(3,3), activation="relu", padding="same") %>% 
  layer_max_pooling_2d(c(2,2), padding="same") %>% 
  layer_conv_2d(4,kernel_size=c(3,3), activation="relu", padding="same") %>% 
  layer_max_pooling_2d(c(4,4), padding="same")  
 
dec_output = enc_output %>% 
  layer_conv_2d(4, kernel_size=c(3,3), activation="relu", padding="same") %>% 
  layer_upsampling_2d(c(4,4)) %>% 
  layer_conv_2d(12, kernel_size=c(3,3), activation="relu") %>% 
  layer_upsampling_2d(c(2,2)) %>% 
  layer_conv_2d(1, kernel_size=c(3,3), activation="sigmoid", padding="same")
 
aen = keras_model(enc_input, dec_output)
 
aen %>% compile(optimizer="rmsprop", loss="binary_crossentropy") 
 
summary(aen)
 
aen %>% fit(x_train, x_train, epochs=20, batch_size=128) 
 
save_model_hdf5(aen, "cnn_autodencoder.h5")
# aen = load_model_hdf5("cnn_autodencoder.h5") 
 
pred = aen %>% predict(x_test) 
 
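n = 10
# one row per digit: original on the left, reconstruction on the right
op = par(mfrow=c(n,2), mar=c(1,0,0,0))
for (i in 1:n) 
{
  plot(as.raster(xtest[i,,]))
  plot(as.raster(pred[i,,,]))
} 
par(op)  # restore the previous plot settings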
 

