How to Save and Load Machine Learning Models with Pickle

    In machine learning, saving and loading models is an important part of the workflow because it makes trained models reusable and deployable. Saving a model means serializing it into a file format that can be easily stored and retrieved. The 'pickle' module in Python's standard library provides a convenient way to serialize and deserialize machine learning models.
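
To make serialization concrete before any model is involved, here is a minimal sketch of a pickle round trip on an ordinary Python object. The settings dictionary is a hypothetical stand-in for a trained model, not part of this tutorial's code.

```python
import pickle

# Hypothetical stand-in for a trained model: any picklable object works.
settings = {"kernel": "rbf", "C": 1.0, "classes": [0, 1, 2]}

# Serialize the object to bytes, then rebuild a copy from those bytes.
payload = pickle.dumps(settings)
restored = pickle.loads(payload)

print("Serialized type:", type(payload).__name__)
print("Equal to the original:", restored == settings)
print("Same object in memory:", restored is settings)
```

The restored object is an equal but separate copy, which is the same behavior we rely on when a model is written to disk and read back later.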

    In this tutorial, we'll explore how to save and load trained machine learning models using 'pickle'. The tutorial covers:

  1. Model training and saving 
  2. Model loading and using
  3. Source code listing

    Let's get started.

    
Model training and saving
 
    First, we'll create a model and train it on a sample dataset. After confirming the model's prediction performance, we can save it to a file. In the code snippet below, we import the necessary libraries and prepare training and test data for the model. Then we create an instance of the 'SVC' model and train it on the prepared data.

 
import pickle
from sklearn.svm import SVC
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split

# Iris dataset example
iris = load_iris()
x, y = iris.data, iris.target

# split dataset into train and test
xtrain, xtest, ytrain, ytest = train_test_split(x, y, test_size=0.10)

# create SVC model instance
svc_model = SVC()

# train model
svc_model.fit(xtrain, ytrain)

 
We can check the model's prediction on a single test sample.
 

# check prediction
ypred = svc_model.predict(xtest[0].reshape(1, -1))
print("Predicted class:", ypred)
 
 
 
Predicted class: [2] 
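
A single prediction says little about overall quality. As a rough sketch, assuming we want a score over the whole test set, accuracy_score from sklearn.metrics can be applied to all test samples. The random_state value is an assumption added here for reproducibility and is not used in the tutorial code.

```python
from sklearn.datasets import load_iris
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC

iris = load_iris()
x, y = iris.data, iris.target

# random_state is an added assumption so the split is the same on every run.
xtrain, xtest, ytrain, ytest = train_test_split(
    x, y, test_size=0.10, random_state=42
)

svc_model = SVC()
svc_model.fit(xtrain, ytrain)

# Predict every test sample and compare with the true labels.
ypred = svc_model.predict(xtest)
acc = accuracy_score(ytest, ypred)
print("Test accuracy:", acc)
```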
  

To save the trained model, we use the pickle.dump() function.

 
# define path and model name
file_path = "/Users/user/Desktop/tmp/svc_model_01.pkl"
with open(file_path, "wb") as file:
    pickle.dump(svc_model, file)
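
The path above is specific to one machine. As a minimal sketch, assuming you would rather not hard-code a directory, the same pickle.dump() call can write into a temporary folder created with the standard tempfile module. The folder and the file name svc_model_demo.pkl are hypothetical choices for illustration.

```python
import pickle
import tempfile
from pathlib import Path

from sklearn.datasets import load_iris
from sklearn.svm import SVC

# Train a small model so the example is self-contained.
iris = load_iris()
svc_model = SVC().fit(iris.data, iris.target)

# Hypothetical location: a fresh temporary directory instead of a fixed path.
model_dir = Path(tempfile.mkdtemp())
file_path = model_dir / "svc_model_demo.pkl"

# Write the model in binary mode, as pickle requires.
with open(file_path, "wb") as file:
    pickle.dump(svc_model, file)

print("Saved:", file_path.exists())
print("File size in bytes:", file_path.stat().st_size)
```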

 
 
Model loading and using
 
    Once we have saved the model, we can easily load it back into memory for further use. The pickle.load() function loads the model object from the file. The loaded model can then be used for making predictions or performing additional analysis. We create a new file to use the trained model.

 
import pickle
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split

# Iris dataset example
iris = load_iris()
x, y = iris.data, iris.target

# split dataset into train and test
xtrain, xtest, ytrain, ytest = train_test_split(x, y, test_size=0.10)

file_path = "/Users/user/Desktop/tmp/svc_model_01.pkl"
loaded_model = None
# load the model
with open(file_path, 'rb') as file:
    loaded_model = pickle.load(file)

if loaded_model is None:
    print("Model has not been loaded properly!")
else:
    pred = loaded_model.predict(xtest[1].reshape(1, -1))
    print("Predicted class:", pred)

  
Predicted class: [2] 
  
 
    In the code snippet above, we loaded the required libraries and test data to check the loaded model's performance. After successfully loading the model, we predicted the class of a test sample.
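
One way to gain confidence in a reloaded model is to compare its predictions with those of the original. The sketch below assumes a fixed random_state (an addition not present in the tutorial code) so the split is reproducible, and it round-trips the model through a temporary file with a hypothetical name rather than the path used in this tutorial.

```python
import pickle
import tempfile
from pathlib import Path

import numpy as np
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC

iris = load_iris()
x, y = iris.data, iris.target

# random_state is an added assumption so that every run, and every script,
# produces the same train/test split.
xtrain, xtest, ytrain, ytest = train_test_split(
    x, y, test_size=0.10, random_state=0
)

svc_model = SVC().fit(xtrain, ytrain)

# Save and reload through a temporary file (hypothetical file name).
file_path = Path(tempfile.mkdtemp()) / "svc_roundtrip.pkl"
with open(file_path, "wb") as file:
    pickle.dump(svc_model, file)
with open(file_path, "rb") as file:
    loaded_model = pickle.load(file)

# The reloaded model should predict exactly what the original predicts.
same = np.array_equal(svc_model.predict(xtest), loaded_model.predict(xtest))
print("Predictions match:", same)
```

Without a fixed random_state, the loading script draws a different random split, so some of its test samples may have been part of the training data. Also keep in mind that pickle.load() can execute arbitrary code while loading, so it is best used only on files you trust.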
 
    In this tutorial, we've briefly explored how to save and load machine learning models using the 'pickle' module in Python. The full source code is listed below.
 
Source code listing
 
# save_model.py
import pickle
from sklearn.svm import SVC
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split

# Iris dataset example
iris = load_iris()
x, y = iris.data, iris.target

# split dataset into train and test
xtrain, xtest, ytrain, ytest = train_test_split(x, y, test_size=0.10)

# create SVC model instance
svc_model = SVC()

# train model
svc_model.fit(xtrain, ytrain)

# check prediction
ypred = svc_model.predict(xtest[0].reshape(1, -1))
print("Predicted class:", ypred)

# define saving path and model name
file_path = "/Users/user/Desktop/tmp/svc_model_01.pkl"
with open(file_path, "wb") as file:
    pickle.dump(svc_model, file)
 
 
# load_model.py
import pickle
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split

# Iris dataset example
iris = load_iris()
x, y = iris.data, iris.target

# split dataset into train and test
xtrain, xtest, ytrain, ytest = train_test_split(x, y, test_size=0.10)

file_path = "/Users/user/Desktop/tmp/svc_model_01.pkl"
loaded_model = None
# load the model
with open(file_path, 'rb') as file:
    loaded_model = pickle.load(file)

if loaded_model is None:
    print("Model has not been loaded properly!")
else:
    pred = loaded_model.predict(xtest[1].reshape(1, -1))
    print("Predicted class:", pred)

