How to Fit Regression Data with CNN Model in Python

   Convolutional Neural Network (CNN) models are mainly used for two-dimensional arrays such as image data. However, we can also apply a CNN to regression problems. In this case, we use a one-dimensional convolutional network and reshape the input data to match it. Keras provides the Conv1D class to add a one-dimensional convolutional layer to the model.
   In this tutorial, we'll learn how to fit and predict regression data with a 1D CNN model using Keras in Python. The tutorial covers:
  1. Preparing the data
  2. Defining and fitting the model
  3. Predicting and visualizing the results
  4. Source code listing
We'll start by loading the required libraries for this tutorial.

from sklearn.datasets import load_boston
from keras.models import Sequential
from keras.layers import Dense, Conv1D, Flatten
from sklearn.model_selection import train_test_split
from sklearn.metrics import mean_squared_error
import matplotlib.pyplot as plt



Preparing the data

We'll use the Boston housing dataset as the regression data. Note that load_boston was removed in scikit-learn 1.2, so this code requires an earlier version. First, we'll load the dataset and check the dimensions of the x data.

boston = load_boston()
x, y = boston.data, boston.target
print(x.shape) 
(506, 13) 

The x data has two dimensions: the number of rows and the number of columns. Conv1D expects a three-dimensional input of the form (samples, steps, channels), so we need to add a third dimension that holds the number of values at each step. In our example it is 1, so each sample has the shape (13, 1). We'll reshape the x data accordingly.

x = x.reshape(x.shape[0], x.shape[1], 1)
print(x.shape)
(506, 13, 1)
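
The reshape only appends a trailing axis of length 1; the values themselves are untouched. Here is a minimal sketch with NumPy, using a synthetic array of the same shape as a stand-in for the Boston features (the names x2d, a, b, and c are illustrative, not part of the tutorial code):

```python
import numpy as np

# Synthetic stand-in with the same shape as the Boston feature matrix.
x2d = np.arange(506 * 13, dtype=float).reshape(506, 13)

# Three equivalent ways to append the channel axis that Conv1D expects.
a = x2d.reshape(x2d.shape[0], x2d.shape[1], 1)
b = np.expand_dims(x2d, axis=-1)
c = x2d[..., np.newaxis]

print(a.shape, b.shape, c.shape)
# The data is unchanged: dropping the last axis recovers the input.
print(np.array_equal(a[:, :, 0], x2d))
```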

Next, we'll split the data into training and test sets.

xtrain, xtest, ytrain, ytest=train_test_split(x, y, test_size=0.15) 
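
Because the split is random, every run produces a different partition and therefore slightly different scores. train_test_split accepts a random_state argument to make the split repeatable. The sketch below is not scikit-learn's implementation; it is a hand-rolled NumPy stand-in (the function split_indices and the seed 42 are hypothetical) that illustrates a seeded shuffle split and the set sizes for 506 rows at test_size=0.15, assuming the test part is rounded up.

```python
import math
import numpy as np

def split_indices(n_samples, test_size, seed):
    """Return (train_idx, test_idx) from a seeded shuffle of row indices."""
    rng = np.random.default_rng(seed)
    order = rng.permutation(n_samples)
    n_test = math.ceil(test_size * n_samples)  # assumption: round the test part up
    return order[n_test:], order[:n_test]

train_idx, test_idx = split_indices(506, 0.15, seed=42)
print(len(train_idx), len(test_idx))

# The same seed gives the same partition.
again_train, again_test = split_indices(506, 0.15, seed=42)
print(np.array_equal(test_idx, again_test))
```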

Defining and fitting the model

We'll define the Keras sequential model and add a one-dimensional convolutional layer. The input shape is (13, 1), as defined above. We'll then add Flatten and Dense layers and compile the model with the MSE loss and the Adam optimizer.

model = Sequential()
model.add(Conv1D(32, 2, activation="relu", input_shape=(13, 1)))
model.add(Flatten())
model.add(Dense(64, activation="relu"))
model.add(Dense(1))
model.compile(loss="mse", optimizer="adam")
 
model.summary()
_________________________________________________________________
Layer (type)                 Output Shape              Param #   
=================================================================
conv1d_10 (Conv1D)           (None, 12, 32)            96        
_________________________________________________________________
flatten_8 (Flatten)          (None, 384)               0         
_________________________________________________________________
dense_355 (Dense)            (None, 64)                24640     
_________________________________________________________________
dense_356 (Dense)            (None, 1)                 65        
=================================================================
Total params: 24,801
Trainable params: 24,801
Non-trainable params: 0
_________________________________________________________________ 
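
The numbers in the summary can be checked by hand. Below is a rough sketch, with illustrative variable names, of the shape and parameter arithmetic for this architecture. It assumes the Conv1D defaults (stride 1, "valid" padding, one bias per filter).

```python
steps, channels = 13, 1        # input_shape=(13, 1)
filters, kernel_size = 32, 2   # Conv1D(32, 2)

# "valid" padding with stride 1: the window fits in steps - kernel_size + 1 positions.
conv_steps = steps - kernel_size + 1
conv_params = kernel_size * channels * filters + filters  # weights + biases

flat_units = conv_steps * filters        # Flatten output size
dense1_params = flat_units * 64 + 64     # Dense(64): weights + biases
dense2_params = 64 * 1 + 1               # Dense(1): weights + bias

total = conv_params + dense1_params + dense2_params
print(conv_steps, conv_params, flat_units, dense1_params, dense2_params, total)
# prints: 12 96 384 24640 65 24801
```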

Next, we'll fit the model on the training data.

model.fit(xtrain, ytrain, batch_size=12,epochs=200, verbose=0)
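
To get a sense of how much training this call performs: each epoch is cut into batches of 12, with a smaller final batch when the rows don't divide evenly. A quick sketch, assuming the 430 training rows that a 76-row test set would leave out of 506 (variable names are illustrative):

```python
import math

n_train = 430                 # assumption: 506 rows minus a 76-row test set
batch_size, epochs = 12, 200

steps_per_epoch = math.ceil(n_train / batch_size)  # the last batch may be smaller
total_updates = steps_per_epoch * epochs
print(steps_per_epoch, total_updates)
# prints: 36 7200
```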


Predicting and visualizing the results

Now we can predict the test data with the trained model.

ypred = model.predict(xtest)

We can evaluate the model on the training data, check the mean squared error (MSE) of the predictions on the test data, and visualize the result in a plot.

print(model.evaluate(xtrain, ytrain))
21.21026409947595 
 
print("MSE: %.4f" % mean_squared_error(ytest, ypred))
MSE: 19.8953 
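
One detail is worth knowing when computing the error by hand: model.predict returns a column of shape (n, 1), while ytest has shape (n,). Subtracting them directly broadcasts to an (n, n) matrix and gives a wrong number. Here is a minimal sketch with made-up values (not actual model output; the _demo names are illustrative):

```python
import numpy as np

ytest_demo = np.array([24.0, 21.6, 34.7, 33.4])          # shape (4,)
ypred_demo = np.array([[25.0], [20.6], [30.7], [35.4]])  # shape (4, 1), like model.predict output

# Pitfall: (4,) minus (4, 1) broadcasts to (4, 4).
print((ytest_demo - ypred_demo).shape)

# Flatten the predictions first, then average the squared differences.
mse = np.mean((ytest_demo - ypred_demo.ravel()) ** 2)
rmse = np.sqrt(mse)
print("MSE: %.4f, RMSE: %.4f" % (mse, rmse))
```

The RMSE is in the same units as the target, which often makes it easier to interpret than the MSE.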

x_ax = range(len(ypred))
plt.scatter(x_ax, ytest, s=5, color="blue", label="original")
plt.plot(x_ax, ypred, lw=0.8, color="red", label="predicted")
plt.legend()
plt.show()
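
Because the test rows are in random order, the red line in this plot zigzags. Sorting both series by the true value may make the fit easier to judge. Here is a sketch with made-up values; order, y_sorted, and p_sorted are illustrative names, and the sorted arrays would be passed to the same plt.scatter and plt.plot calls as above.

```python
import numpy as np

ytest_demo = np.array([24.0, 15.2, 34.7, 21.6])
ypred_demo = np.array([[25.1], [16.0], [31.9], [20.4]])  # (n, 1), like model.predict output

order = np.argsort(ytest_demo)          # indices that sort the true values
y_sorted = ytest_demo[order]
p_sorted = ypred_demo.ravel()[order]    # apply the same ordering to the predictions

print(y_sorted)
print(p_sorted)
```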

   In this tutorial, we've briefly learned how to fit and predict regression data with a Keras CNN model in Python. The full source code is listed below.


Source code listing

from sklearn.datasets import load_boston
from keras.models import Sequential
from keras.layers import Dense, Conv1D, Flatten
from sklearn.model_selection import train_test_split
from sklearn.metrics import mean_squared_error
import matplotlib.pyplot as plt

boston = load_boston()
x, y = boston.data, boston.target
print(x.shape)

x = x.reshape(x.shape[0], x.shape[1], 1)
print(x.shape)

xtrain, xtest, ytrain, ytest=train_test_split(x, y, test_size=0.15)

model = Sequential()
model.add(Conv1D(32, 2, activation="relu", input_shape=(13,1)))
model.add(Flatten())
model.add(Dense(64, activation="relu"))
model.add(Dense(1))
model.compile(loss="mse", optimizer="adam")
model.summary()
model.fit(xtrain, ytrain, batch_size=12,epochs=200, verbose=0)

ypred = model.predict(xtest)
print(model.evaluate(xtrain, ytrain))
print("MSE: %.4f" % mean_squared_error(ytest, ypred))

x_ax = range(len(ypred))
plt.scatter(x_ax, ytest, s=5, color="blue", label="original")
plt.plot(x_ax, ypred, lw=0.8, color="red", label="predicted")
plt.legend()
plt.show()
