Convolutional Autoencoder Example with Keras in Python

   An autoencoder is a neural network that learns to reproduce its input at its output. The compression it learns is data-specific and lossy: it works well only on data similar to the training data, and the reconstruction is not an exact copy of the input. An autoencoder is therefore a neural-network-based method for compressing and reconstructing data.
   For image data, convolutional neural networks are the usual choice for building a deep learning model. In the previous post, we learned how to build simple autoencoders with dense layers. In this tutorial, we'll learn how to build an autoencoder from convolutional layers with Keras in Python. The tutorial covers:
  1. Preparing the data
  2. Defining the convolutional autoencoder
  3. Generating the images
  4. Source code listing
   We'll start by loading the required Python libraries for this tutorial.

from keras.layers import Conv2D
from keras.layers import Input
from keras.layers import MaxPooling2D, UpSampling2D
from keras.models import Model
from keras.datasets.mnist import load_data
from numpy import reshape
import matplotlib.pyplot as plt

Preparing the data

    We'll use the MNIST handwritten digits dataset to train the autoencoder. First, we'll load it and prepare it. An autoencoder needs only the input data, so we use only the x part of the dataset and discard the labels. We'll scale the pixel values into the range [0, 1].

(xtrain, _), (xtest, _) = load_data()
xtrain = xtrain.astype('float32') / 255
xtest = xtest.astype('float32') / 255
 
print(xtrain.shape, xtest.shape)
(60000, 28, 28) (10000, 28, 28) 

A two-dimensional convolutional layer expects a channel dimension, so we need to add one more dimension to the dataset. We can do it with the reshape function.

x_train = reshape(xtrain, (len(xtrain), 28, 28, 1)) 
x_test = reshape(xtest, (len(xtest), 28, 28, 1))
 
print(x_train.shape, x_test.shape) 
(60000, 28, 28, 1) (10000, 28, 28, 1)
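
As a side note, the channel axis can be added in more than one way. The sketch below uses a small random array as a hypothetical stand-in for MNIST (the name fake_images is not part of the tutorial) and shows that reshape, np.expand_dims, and indexing with np.newaxis give the same result.

```python
import numpy as np

# Hypothetical stand-in for MNIST: 5 random 28x28 images with uint8 pixels.
rng = np.random.default_rng(0)
fake_images = rng.integers(0, 256, size=(5, 28, 28), dtype=np.uint8)

# Scale the pixel values into [0, 1], as done for xtrain and xtest.
scaled = fake_images.astype("float32") / 255

# Three equivalent ways to append a channel axis of size 1.
a = np.reshape(scaled, (len(scaled), 28, 28, 1))
b = np.expand_dims(scaled, axis=-1)
c = scaled[..., np.newaxis]

print(a.shape, b.shape, c.shape)
```

Which form to use is mostly a matter of taste; expand_dims avoids repeating the image size.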


Defining the convolutional autoencoder

   We'll define the autoencoder starting from the input layer. The input layer has the same shape as a single input image, (28, 28, 1).

input_img = Input(shape=(28, 28, 1))

   The encoding part of the autoencoder contains convolutional and max-pooling layers that encode the image. A max-pooling layer reduces the height and width of its input by keeping only the maximum value in each pooling window.
   The decoding part of the autoencoder contains convolutional and upsampling layers. An upsampling layer restores the height and width of the image by repeating values, so it reverses the size change made by pooling but not the loss of detail. The last convolutional layer uses a sigmoid activation so that the output pixels fall in the range [0, 1]. Then, we'll combine both parts into the final autoencoder model and compile it with the RMSProp optimizer and the binary cross-entropy loss function.
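
To make the pooling and upsampling steps concrete, here is a minimal NumPy sketch on a single 4x4 feature map. The helper names max_pool2d and upsample2d are hypothetical and are not Keras functions; the sketch assumes non-overlapping windows and nearest-neighbour upsampling, and it only illustrates the idea rather than how Keras implements these layers.

```python
import numpy as np

def max_pool2d(x, size):
    """Non-overlapping max pooling; assumes both sides of x divide evenly by size."""
    h, w = x.shape
    # Split each axis into (blocks, size) and take the maximum inside every block.
    return x.reshape(h // size, size, w // size, size).max(axis=(1, 3))

def upsample2d(x, size):
    """Nearest-neighbour upsampling: repeat each value size times along both axes."""
    return x.repeat(size, axis=0).repeat(size, axis=1)

x = np.arange(16, dtype="float32").reshape(4, 4)
pooled = max_pool2d(x, 2)          # 4x4 -> 2x2
restored = upsample2d(pooled, 2)   # 2x2 -> 4x4

print(pooled)
print(restored)
```

The restored map has the original size, but each 2x2 block now holds a single repeated value. The convolutional layers in the decoder are what learn to fill the detail back in.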

# Encoder: 28x28x1 -> 14x14x12 -> 4x4x8
enc_conv1 = Conv2D(12, (3, 3), activation='relu', padding='same')(input_img)
enc_pool1 = MaxPooling2D((2, 2), padding='same')(enc_conv1)
enc_conv2 = Conv2D(8, (4, 4), activation='relu', padding='same')(enc_pool1)
enc_output = MaxPooling2D((4, 4), padding='same')(enc_conv2)

# Decoder: 4x4x8 -> 16x16x8 -> 14x14x12 -> 28x28x1
dec_conv2 = Conv2D(8, (4, 4), activation='relu', padding='same')(enc_output)
dec_upsample2 = UpSampling2D((4, 4))(dec_conv2)
# The default 'valid' padding trims 16x16 to 14x14, so the next upsampling gives 28x28.
dec_conv3 = Conv2D(12, (3, 3), activation='relu')(dec_upsample2)
dec_upsample3 = UpSampling2D((2, 2))(dec_conv3)
dec_output = Conv2D(1, (3, 3), activation='sigmoid', padding='same')(dec_upsample3)

autoencoder = Model(input_img, dec_output)
autoencoder.compile(optimizer='rmsprop', loss='binary_crossentropy')
 
autoencoder.summary()
_________________________________________________________________
Layer (type)                 Output Shape              Param #   
=================================================================
input_1 (InputLayer)         (None, 28, 28, 1)         0         
_________________________________________________________________
conv2d_1 (Conv2D)            (None, 28, 28, 12)        120       
_________________________________________________________________
max_pooling2d_1 (MaxPooling2 (None, 14, 14, 12)        0         
_________________________________________________________________
conv2d_2 (Conv2D)            (None, 14, 14, 8)         1544      
_________________________________________________________________
max_pooling2d_2 (MaxPooling2 (None, 4, 4, 8)           0         
_________________________________________________________________
conv2d_3 (Conv2D)            (None, 4, 4, 8)           1032      
_________________________________________________________________
up_sampling2d_1 (UpSampling2 (None, 16, 16, 8)         0         
_________________________________________________________________
conv2d_4 (Conv2D)            (None, 14, 14, 12)        876       
_________________________________________________________________
up_sampling2d_2 (UpSampling2 (None, 28, 28, 12)        0         
_________________________________________________________________
conv2d_5 (Conv2D)            (None, 28, 28, 1)         109       
=================================================================
Total params: 3,681
Trainable params: 3,681
Non-trainable params: 0
_________________________________________________________________
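
The shapes and parameter counts in the summary can be checked by hand. The sketch below is plain Python with hypothetical helper names (conv_out, pool_out, conv_params); it assumes stride 1 for the convolutions and a pooling stride equal to the pool size, which are the Keras defaults used in this model.

```python
import math

def conv_out(size, kernel, padding):
    # Stride 1: 'same' keeps the size, 'valid' trims kernel - 1 pixels.
    return size if padding == "same" else size - kernel + 1

def pool_out(size, pool):
    # With padding='same' and stride equal to the pool size, the size is rounded up.
    return math.ceil(size / pool)

def conv_params(kernel, in_ch, out_ch):
    # kernel x kernel x in_ch weights per filter, plus one bias per filter.
    return kernel * kernel * in_ch * out_ch + out_ch

sizes = [28]
sizes.append(conv_out(sizes[-1], 3, "same"))    # conv2d_1
sizes.append(pool_out(sizes[-1], 2))            # max_pooling2d_1
sizes.append(conv_out(sizes[-1], 4, "same"))    # conv2d_2
sizes.append(pool_out(sizes[-1], 4))            # max_pooling2d_2
sizes.append(conv_out(sizes[-1], 4, "same"))    # conv2d_3
sizes.append(sizes[-1] * 4)                     # up_sampling2d_1
sizes.append(conv_out(sizes[-1], 3, "valid"))   # conv2d_4 (no padding='same')
sizes.append(sizes[-1] * 2)                     # up_sampling2d_2
sizes.append(conv_out(sizes[-1], 3, "same"))    # conv2d_5

# (kernel, input channels, output channels) for the five Conv2D layers.
conv_layers = [(3, 1, 12), (4, 12, 8), (4, 8, 8), (3, 8, 12), (3, 12, 1)]
counts = [conv_params(k, i, o) for k, i, o in conv_layers]

print(sizes)
print(counts, sum(counts))
```

The walk-through also shows why the fourth convolution omits padding='same': the 4x4 code is upsampled to 16x16, the unpadded 3x3 convolution trims it to 14x14, and the final 2x upsampling returns to 28x28.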
 
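The model is ready, so we can now fit it on the training data. Because the autoencoder learns to reproduce its input, x_train serves as both the input and the target.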

autoencoder.fit(x_train, x_train, epochs=20, batch_size=128, shuffle=True)
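
For intuition about what the fit call minimizes, the following NumPy sketch computes binary cross-entropy between target pixels and predicted pixels. It follows the textbook definition; the function name and the eps clipping value are assumptions made for this illustration and are not taken from the Keras source.

```python
import numpy as np

def binary_crossentropy(y_true, y_pred, eps=1e-7):
    """Mean binary cross-entropy; eps (an assumed value) keeps log() away from 0."""
    y_pred = np.clip(y_pred, eps, 1 - eps)
    losses = -(y_true * np.log(y_pred) + (1 - y_true) * np.log(1 - y_pred))
    return float(np.mean(losses))

# Four toy "pixels": a target, a good reconstruction, and a poor one.
target = np.array([0.0, 1.0, 1.0, 0.0])
close = np.array([0.1, 0.9, 0.8, 0.2])
far = np.array([0.9, 0.1, 0.2, 0.8])

close_loss = binary_crossentropy(target, close)
far_loss = binary_crossentropy(target, far)
print(close_loss, far_loss)
```

The loss is small when the predicted pixels are close to the targets and grows quickly as they move toward the wrong end of [0, 1], which is why it pairs naturally with the sigmoid output layer.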


Generating the images

   Finally, we'll reconstruct the test images with the trained model and visualize the results in a plot.

decoded_imgs = autoencoder.predict(x_test)

n = 10  # number of test images to display
plt.figure(figsize=(20, 4))
plt.gray()
for i in range(n):
    # Top row: original test images
    ax = plt.subplot(2, n, i + 1)
    plt.imshow(x_test[i].reshape(28, 28))
    ax.get_xaxis().set_visible(False)
    ax.get_yaxis().set_visible(False)

    # Bottom row: images reconstructed by the autoencoder
    ax = plt.subplot(2, n, i + 1 + n)
    plt.imshow(decoded_imgs[i].reshape(28, 28))
    ax.get_xaxis().set_visible(False)
    ax.get_yaxis().set_visible(False)
plt.show()

The first row of the plot shows the original test images. The second row shows the images reconstructed by the autoencoder.
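
Besides inspecting the plot, the reconstruction quality can be summarized with a number. One simple option, sketched here with a hypothetical helper per_image_mse and random stand-in arrays instead of the real x_test and decoded_imgs, is the mean squared error per image.

```python
import numpy as np

def per_image_mse(originals, reconstructions):
    """Mean squared pixel error for each image in a (n, height, width, channels) batch."""
    diff = originals - reconstructions
    return np.mean(diff ** 2, axis=(1, 2, 3))

# Stand-in data: 4 random images and a slightly shifted copy of them.
rng = np.random.default_rng(0)
originals = rng.random((4, 28, 28, 1)).astype("float32")
shifted = np.clip(originals + 0.1, 0.0, 1.0)

exact_errors = per_image_mse(originals, originals)
shifted_errors = per_image_mse(originals, shifted)
print(exact_errors, shifted_errors)
```

With the arrays from this tutorial, the call would be per_image_mse(x_test, decoded_imgs), and sorting the result would point to the digits the model reconstructs worst.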

   In this tutorial, we've briefly learned how to build a convolutional autoencoder with Keras in Python. The full source code is listed below.


Source code listing

from keras.layers import Conv2D
from keras.layers import Input
from keras.layers import MaxPooling2D, UpSampling2D
from keras.models import Model
from keras.datasets.mnist import load_data
from numpy import reshape
import matplotlib.pyplot as plt

(xtrain, _), (xtest, _) = load_data()

xtrain = xtrain.astype('float32') / 255
xtest = xtest.astype('float32') / 255
print(xtrain.shape, xtest.shape) 

x_train = reshape(xtrain, (len(xtrain), 28, 28, 1)) 
x_test = reshape(xtest, (len(xtest), 28, 28, 1)) 
print(x_train.shape, x_test.shape) 

input_img = Input(shape=(28, 28, 1))

# Encoder: 28x28x1 -> 14x14x12 -> 4x4x8
enc_conv1 = Conv2D(12, (3, 3), activation='relu', padding='same')(input_img)
enc_pool1 = MaxPooling2D((2, 2), padding='same')(enc_conv1)
enc_conv2 = Conv2D(8, (4, 4), activation='relu', padding='same')(enc_pool1)
enc_output = MaxPooling2D((4, 4), padding='same')(enc_conv2)

# Decoder: 4x4x8 -> 16x16x8 -> 14x14x12 -> 28x28x1
dec_conv2 = Conv2D(8, (4, 4), activation='relu', padding='same')(enc_output)
dec_upsample2 = UpSampling2D((4, 4))(dec_conv2)
# The default 'valid' padding trims 16x16 to 14x14, so the next upsampling gives 28x28.
dec_conv3 = Conv2D(12, (3, 3), activation='relu')(dec_upsample2)
dec_upsample3 = UpSampling2D((2, 2))(dec_conv3)
dec_output = Conv2D(1, (3, 3), activation='sigmoid', padding='same')(dec_upsample3)

autoencoder = Model(input_img, dec_output)
autoencoder.compile(optimizer='rmsprop', loss='binary_crossentropy')
autoencoder.summary()
 
autoencoder.fit(x_train, x_train, epochs=20, batch_size=128, shuffle=True)

decoded_imgs = autoencoder.predict(x_test)

n = 10  # number of test images to display
plt.figure(figsize=(20, 4))
plt.gray()
for i in range(n):
    # Top row: original test images
    ax = plt.subplot(2, n, i + 1)
    plt.imshow(x_test[i].reshape(28, 28))
    ax.get_xaxis().set_visible(False)
    ax.get_yaxis().set_visible(False)

    # Bottom row: images reconstructed by the autoencoder
    ax = plt.subplot(2, n, i + 1 + n)
    plt.imshow(decoded_imgs[i].reshape(28, 28))
    ax.get_xaxis().set_visible(False)
    ax.get_yaxis().set_visible(False)
plt.show()


