- Preparing the data
- Defining the model
- Predicting and visualizing the result
- Source code listing

    import math

    from numpy import array, hstack
    from numpy.random import uniform
    import matplotlib.pyplot as plt
    from sklearn.model_selection import train_test_split
    from sklearn.metrics import mean_squared_error
    from sklearn.ensemble import GradientBoostingRegressor
    from sklearn.multioutput import MultiOutputRegressor

**Preparing the data**

First, we'll create a multi-output dataset for this tutorial. The data is randomly generated according to the rules in the function below. The dataset has three inputs and two outputs. We'll plot the generated data to check it visually.

    def create_data(n):
        # Three noisy input features, each stored as an (n, 1) column
        x1 = array([math.sin(i)*(i/10) + uniform(-5, 5) for i in range(n)]).reshape(n, 1)
        x2 = array([math.cos(i)*(i/10) + uniform(-9, 5) for i in range(n)]).reshape(n, 1)
        x3 = array([(i/50) + uniform(-10, 10) for i in range(n)]).reshape(n, 1)

        # Two targets built from the inputs plus uniform noise and a constant offset
        y1 = [x1[i] + x2[i] + x3[i] + uniform(-1, 4) + 15 for i in range(n)]
        y2 = [x1[i] - x2[i] - x3[i] - uniform(-4, 2) - 10 for i in range(n)]

        X = hstack((x1, x2, x3))
        Y = hstack((y1, y2))
        return X, Y

    n = 300
    X, Y = create_data(n)

    f = plt.figure()

    # Left panel: the three input series
    f.add_subplot(1, 2, 1)
    plt.title("Xs input data")
    plt.plot(X)
    plt.xlabel("Samples")

    # Right panel: the two output series
    f.add_subplot(1, 2, 2)
    plt.title("Ys output data")
    plt.plot(Y)
    plt.xlabel("Samples")

    plt.show()
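The same generation rules can also be written without Python loops. The sketch below is one possible vectorized variant, not the tutorial's own code: the helper name `create_data_vectorized` and its `seed` parameter are assumptions added for illustration, and because it uses NumPy's `default_rng` it will not reproduce the exact numbers of `create_data`.

```python
import numpy as np

def create_data_vectorized(n, seed=0):
    # Hypothetical helper: same recipe as create_data, but vectorized and seeded
    rng = np.random.default_rng(seed)
    i = np.arange(n)

    # Three noisy input features
    x1 = np.sin(i) * (i / 10) + rng.uniform(-5, 5, n)
    x2 = np.cos(i) * (i / 10) + rng.uniform(-9, 5, n)
    x3 = i / 50 + rng.uniform(-10, 10, n)

    # Two targets built from the inputs plus uniform noise and an offset
    y1 = x1 + x2 + x3 + rng.uniform(-1, 4, n) + 15
    y2 = x1 - x2 - x3 - rng.uniform(-4, 2, n) - 10

    X = np.column_stack((x1, x2, x3))
    Y = np.column_stack((y1, y2))
    return X, Y

X_demo, Y_demo = create_data_vectorized(300)
print(X_demo.shape, Y_demo.shape)  # (300, 3) (300, 2)
```

Seeding the generator makes the dataset repeatable between runs, which helps when comparing models.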

Next, we'll split the dataset into training and test sets, holding out 15% of the samples for testing, and check the data shapes.

    xtrain, xtest, ytrain, ytest = train_test_split(X, Y, test_size=0.15)
    print("xtrain:", xtrain.shape, "ytrain:", ytrain.shape)

xtrain: (255, 3) ytrain: (255, 2)

    print("xtest:", xtest.shape, "ytest:", ytest.shape)

xtest: (45, 3) ytest: (45, 2)
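Because the split above is random, the rows in each part (and therefore the scores later on) change from run to run. If repeatable results are wanted, `train_test_split` accepts a `random_state` argument. A minimal sketch on a small made-up array; the value 42 and the `X_demo`/`Y_demo` names are arbitrary choices for illustration:

```python
import numpy as np
from sklearn.model_selection import train_test_split

# Small stand-in arrays: 20 samples, 3 inputs, 2 outputs
X_demo = np.arange(60).reshape(20, 3)
Y_demo = np.arange(40).reshape(20, 2)

# Two calls with the same seed return the same partition
split_a = train_test_split(X_demo, Y_demo, test_size=0.15, random_state=42)
split_b = train_test_split(X_demo, Y_demo, test_size=0.15, random_state=42)

same = all(np.array_equal(a, b) for a, b in zip(split_a, split_b))
print("identical splits:", same)
```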

**Defining the model**

We'll define the model with the MultiOutputRegressor class of sklearn. As the base estimator, we'll use GradientBoostingRegressor with its default parameters and pass it to the MultiOutputRegressor class. You can check the parameters of the model by printing it.

    gbr = GradientBoostingRegressor()
    model = MultiOutputRegressor(estimator=gbr)
    print(model)
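It may help to know what the wrapper does internally: MultiOutputRegressor fits one independent copy of the base estimator per output column. The sketch below illustrates this on made-up toy data (the `X_demo`/`Y_demo` arrays and the fixed `random_state=0` are assumptions for the demo) by comparing the wrapper's predictions with regressors fitted by hand, one target at a time.

```python
import numpy as np
from sklearn.ensemble import GradientBoostingRegressor
from sklearn.multioutput import MultiOutputRegressor

# Toy data: 3 inputs, 2 outputs
rng = np.random.default_rng(0)
X_demo = rng.uniform(-1, 1, size=(100, 3))
Y_demo = np.column_stack((X_demo.sum(axis=1), X_demo[:, 0] - X_demo[:, 1]))

# Wrapper: one call fits every output
base = GradientBoostingRegressor(random_state=0)
wrapped = MultiOutputRegressor(estimator=base).fit(X_demo, Y_demo)
print("fitted estimators:", len(wrapped.estimators_))  # one per output column

# Manual equivalent: fit a separate regressor on each target column
manual = np.column_stack([
    GradientBoostingRegressor(random_state=0)
    .fit(X_demo, Y_demo[:, j])
    .predict(X_demo)
    for j in range(Y_demo.shape[1])
])
print("same predictions:", np.allclose(wrapped.predict(X_demo), manual))
```

The wrapper is mainly a convenience: it gives estimators that only support a single target a common `fit`/`predict` interface for several targets at once.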

Now we can fit the model on the training data and check the training score. For regressors, the score method returns the coefficient of determination (R²).

    model.fit(xtrain, ytrain)
    score = model.score(xtrain, ytrain)
    print("Training score:", score)

Training score: 0.9952671502749106
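A score computed on the training data tends to be optimistic, so it is worth comparing it with the score on held-out data. The sketch below does this on made-up data (the arrays, noise level, and seeds are assumptions, not the tutorial's dataset) and also uses `r2_score` with `multioutput="raw_values"` to get one R² value per output.

```python
import numpy as np
from sklearn.ensemble import GradientBoostingRegressor
from sklearn.metrics import r2_score
from sklearn.model_selection import train_test_split
from sklearn.multioutput import MultiOutputRegressor

# Made-up data: 3 inputs, 2 noisy outputs
rng = np.random.default_rng(0)
X_demo = rng.uniform(-5, 5, size=(300, 3))
noise = rng.normal(0, 1, size=(300, 2))
Y_demo = np.column_stack((X_demo.sum(axis=1), X_demo[:, 0] - X_demo[:, 1])) + noise

Xtr, Xte, Ytr, Yte = train_test_split(X_demo, Y_demo, test_size=0.15, random_state=0)
demo_model = MultiOutputRegressor(GradientBoostingRegressor(random_state=0))
demo_model.fit(Xtr, Ytr)

# score() averages R^2 uniformly over the outputs
train_r2 = demo_model.score(Xtr, Ytr)
test_r2 = demo_model.score(Xte, Yte)

# One R^2 value per output column
per_output_r2 = r2_score(Yte, demo_model.predict(Xte), multioutput="raw_values")

print("train R^2:", train_r2)
print("test R^2:", test_r2)
print("per-output test R^2:", per_output_r2)
```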

**Predicting and visualizing the result**

We'll predict the test data with the trained model and check the mean squared error (MSE) for the y1 and y2 outputs separately.

    ypred = model.predict(xtest)
    print("y1 MSE:%.4f" % mean_squared_error(ytest[:,0], ypred[:,0]))

y1 MSE:10.9138

    print("y2 MSE:%.4f" % mean_squared_error(ytest[:,1], ypred[:,1]))

y2 MSE:10.8929
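The per-output errors can also be obtained in a single call: `mean_squared_error` accepts two-dimensional targets and a `multioutput` argument. A minimal sketch with small hand-made arrays (the values are invented so the arithmetic is easy to check):

```python
import numpy as np
from sklearn.metrics import mean_squared_error

# Three samples, two outputs
y_true = np.array([[1.0, 10.0], [2.0, 20.0], [3.0, 30.0]])
y_hat = np.array([[1.5, 10.0], [2.0, 22.0], [2.5, 30.0]])

# One MSE per output column: (0.25 + 0 + 0.25) / 3 and (0 + 4 + 0) / 3
per_output = mean_squared_error(y_true, y_hat, multioutput="raw_values")

# Default behaviour: uniform average over the outputs
overall = mean_squared_error(y_true, y_hat)

print(per_output)  # approximately [0.1667, 1.3333]
print(overall)     # 0.75
```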

Finally, we'll plot the predicted values against the actual test values for both outputs.

    x_ax = range(len(xtest))

    # Actual vs. predicted values for each output
    plt.plot(x_ax, ytest[:,0], label="y1-test", color='c')
    plt.plot(x_ax, ypred[:,0], label="y1-pred", color='b')
    plt.plot(x_ax, ytest[:,1], label="y2-test", color='m')
    plt.plot(x_ax, ypred[:,1], label="y2-pred", color='r')
    plt.legend()
    plt.show()

In this tutorial, we've briefly learned how to use the MultiOutputRegressor class in Python. We've trained the model on a multi-output dataset and predicted the test data.

**Source code listing**

```
import math

from numpy import array, hstack
from numpy.random import uniform
import matplotlib.pyplot as plt
from sklearn.model_selection import train_test_split
from sklearn.metrics import mean_squared_error
from sklearn.ensemble import GradientBoostingRegressor
from sklearn.multioutput import MultiOutputRegressor

def create_data(n):
    # Three noisy input features, each stored as an (n, 1) column
    x1 = array([math.sin(i)*(i/10) + uniform(-5, 5) for i in range(n)]).reshape(n, 1)
    x2 = array([math.cos(i)*(i/10) + uniform(-9, 5) for i in range(n)]).reshape(n, 1)
    x3 = array([(i/50) + uniform(-10, 10) for i in range(n)]).reshape(n, 1)

    # Two targets built from the inputs plus uniform noise and a constant offset
    y1 = [x1[i] + x2[i] + x3[i] + uniform(-1, 4) + 15 for i in range(n)]
    y2 = [x1[i] - x2[i] - x3[i] - uniform(-4, 2) - 10 for i in range(n)]

    X = hstack((x1, x2, x3))
    Y = hstack((y1, y2))
    return X, Y

n = 300
X, Y = create_data(n)

# Plot the inputs and outputs side by side
f = plt.figure()
f.add_subplot(1, 2, 1)
plt.title("Xs input data")
plt.plot(X)
plt.xlabel("Samples")
f.add_subplot(1, 2, 2)
plt.title("Ys output data")
plt.plot(Y)
plt.xlabel("Samples")
plt.show()

# Split into training and test sets
xtrain, xtest, ytrain, ytest = train_test_split(X, Y, test_size=0.15)
print("xtrain:", xtrain.shape, "ytrain:", ytrain.shape)
print("xtest:", xtest.shape, "ytest:", ytest.shape)

# Define and fit the model
gbr = GradientBoostingRegressor()
model = MultiOutputRegressor(estimator=gbr)
print(model)

model.fit(xtrain, ytrain)
score = model.score(xtrain, ytrain)
print("Training score:", score)

# Predict the test data and report the MSE per output
ypred = model.predict(xtest)
print("y1 MSE:%.4f" % mean_squared_error(ytest[:,0], ypred[:,0]))
print("y2 MSE:%.4f" % mean_squared_error(ytest[:,1], ypred[:,1]))

# Actual vs. predicted values for each output
x_ax = range(len(xtest))
plt.plot(x_ax, ytest[:,0], label="y1-test", color='c')
plt.plot(x_ax, ypred[:,0], label="y1-pred", color='b')
plt.plot(x_ax, ytest[:,1], label="y2-test", color='m')
plt.plot(x_ax, ypred[:,1], label="y2-pred", color='r')
plt.legend()
plt.show()
```

Hey, it's a very good read. However, a more detailed explanation of the topic would have been great.

The multi-target regression shown here takes all targets together when fitting and evaluating the model. Do you think taking one target at a time would give better results? I wonder why this idea is not taken into account. Appreciate your comments. Thanks again.

You are welcome! Yes, you can do that. But then it becomes a simple regression model that fits and predicts each target in separate steps. Here I wanted to show the multi-output prediction case with a single training and prediction call.
