
### Gradient Boosting Regression Example in Python

The idea of gradient boosting is to combine many weak learners, sequentially improved, into a single strong prediction model. Decision trees are the usual base learners in this algorithm. Each round, the model compares its current predictions to the actual values, computes the error, and uses the gradient of the loss function to fit the next learner so that the error decreases in the next round of training.
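The residual-fitting loop behind this idea can be sketched in a few lines. This is a minimal illustration on synthetic data, not the tutorial's setup; for squared-error loss, the negative gradient is simply the residual `y - f(x)`, so each new tree is fit to the residuals of the ensemble so far:

```python
import numpy as np
from sklearn.tree import DecisionTreeRegressor

rng = np.random.RandomState(0)
X = rng.uniform(0, 10, size=(200, 1))
y = np.sin(X[:, 0]) + rng.normal(scale=0.1, size=200)

learning_rate = 0.1
prediction = np.full_like(y, y.mean())   # initial model: predict the mean
trees = []
for _ in range(100):
    residual = y - prediction            # negative gradient of squared loss
    tree = DecisionTreeRegressor(max_depth=3).fit(X, residual)
    prediction += learning_rate * tree.predict(X)
    trees.append(tree)

mse_start = np.mean((y - y.mean()) ** 2)
mse_final = np.mean((y - prediction) ** 2)
print(mse_final < mse_start)  # True: the ensemble beats the mean baseline
```

GradientBoostingRegressor automates exactly this loop (plus subsampling, shrinkage, and other loss functions).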
In this tutorial, we'll learn how to predict regression data with the GradientBoostingRegressor class (from the sklearn.ensemble module) in Python. The post covers:
1. Preparing data
2. Defining the model
3. Predicting test data and visualizing the result

```python
from sklearn.datasets import load_boston
from sklearn.ensemble import GradientBoostingRegressor
from sklearn.model_selection import train_test_split
from sklearn.metrics import mean_squared_error
import matplotlib.pyplot as plt
```

#### Preparing data

We use the Boston house-price dataset as the regression dataset in this tutorial (note that load_boston has been deprecated and removed in recent versions of scikit-learn). After loading the dataset, we'll first separate the data into x (features) and y (target) parts.

```python
boston = load_boston()
x, y = boston.data, boston.target
```

Then we'll split it into train and test parts, holding out 15 percent of the data as the test set.

```python
xtrain, xtest, ytrain, ytest = train_test_split(x, y, random_state=12,
                                                test_size=0.15)
```
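As a quick sanity check on how test_size behaves, here is the same call on a small toy array (illustrative only; with 20 samples, 15 percent rounds up to 3 test rows):

```python
import numpy as np
from sklearn.model_selection import train_test_split

X = np.arange(40).reshape(20, 2)
y = np.arange(20)
Xtr, Xte, ytr, yte = train_test_split(X, y, random_state=12, test_size=0.15)
print(Xtr.shape, Xte.shape)  # (17, 2) (3, 2)
```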

#### Defining the model

We can define the model with its default parameters or set new parameter values.

```python
# with new parameters
gbr = GradientBoostingRegressor(max_depth=5,
                                learning_rate=0.01,
                                min_samples_split=3)

# with default parameters
gbr = GradientBoostingRegressor()
```
Printing the model shows its current parameters (here, the defaults as displayed by older versions of scikit-learn):

```python
print(gbr)
```

```
GradientBoostingRegressor(alpha=0.9, criterion='friedman_mse', init=None,
             learning_rate=0.1, loss='ls', max_depth=3, max_features=None,
             max_leaf_nodes=None, min_impurity_decrease=0.0,
             min_impurity_split=None, min_samples_leaf=1,
             min_samples_split=2, min_weight_fraction_leaf=0.0,
             n_estimators=100, presort='auto', random_state=None,
             subsample=1.0, verbose=0, warm_start=False)
```
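If you're unsure which values to set, the usual approach is a small grid search over these parameters. The following is a sketch on synthetic data; the grid and dataset here are illustrative choices, not the tutorial's:

```python
from sklearn.datasets import make_regression
from sklearn.ensemble import GradientBoostingRegressor
from sklearn.model_selection import GridSearchCV

X, y = make_regression(n_samples=200, n_features=5, noise=10.0, random_state=12)
grid = {"max_depth": [3, 5], "learning_rate": [0.01, 0.1]}
search = GridSearchCV(GradientBoostingRegressor(random_state=12), grid,
                      cv=3, scoring="neg_mean_squared_error")
search.fit(X, y)
print(search.best_params_)  # the best combination found on this toy data
```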

Next, we'll fit the model with train data.

`gbr.fit(xtrain, ytrain)`

#### Predicting test data and visualizing the result

We can predict the test data and check the mean squared error as follows.

```python
ypred = gbr.predict(xtest)
mse = mean_squared_error(ytest, ypred)
print("MSE: %.2f" % mse)
```
`MSE: 10.41`
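Since boosting adds trees one at a time, you can also watch the test error fall as the ensemble grows using staged_predict. A sketch on synthetic data (the dataset and split sizes are illustrative, not the tutorial's):

```python
from sklearn.datasets import make_regression
from sklearn.ensemble import GradientBoostingRegressor
from sklearn.metrics import mean_squared_error
from sklearn.model_selection import train_test_split

X, y = make_regression(n_samples=300, n_features=8, noise=5.0, random_state=12)
xtr, xte, ytr, yte = train_test_split(X, y, random_state=12, test_size=0.15)
gbr = GradientBoostingRegressor(n_estimators=200, random_state=12).fit(xtr, ytr)

# test MSE after each boosting stage
errors = [mean_squared_error(yte, yp) for yp in gbr.staged_predict(xte)]
print(errors[-1] < errors[0])  # True: error drops as trees are added
```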

Finally, we'll visualize the original and predicted values in a plot.

```python
x_ax = range(len(ytest))
plt.scatter(x_ax, ytest, s=5, color="blue", label="original")
plt.plot(x_ax, ypred, lw=0.8, color="red", label="predicted")
plt.legend()
plt.show()
```
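Beyond the prediction plot, a fitted GradientBoostingRegressor exposes feature_importances_, which is often worth inspecting to see which inputs drive the predictions. A sketch on synthetic data:

```python
from sklearn.datasets import make_regression
from sklearn.ensemble import GradientBoostingRegressor

X, y = make_regression(n_samples=200, n_features=6, n_informative=2,
                       noise=1.0, random_state=12)
gbr = GradientBoostingRegressor(random_state=12).fit(X, y)
print(gbr.feature_importances_.round(3))  # normalized scores, summing to 1
```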

In this post, we've briefly learned how to use the GradientBoostingRegressor class to predict regression data in Python. Thank you for reading!

The full source code is listed below.

```python
from sklearn.datasets import load_boston
from sklearn.ensemble import GradientBoostingRegressor
from sklearn.model_selection import train_test_split
from sklearn.metrics import mean_squared_error
import matplotlib.pyplot as plt

boston = load_boston()
x, y = boston.data, boston.target
xtrain, xtest, ytrain, ytest = train_test_split(x, y, random_state=12,
                                                test_size=0.15)

# with new parameters
gbr = GradientBoostingRegressor(max_depth=5,
                                learning_rate=0.01,
                                min_samples_split=3)
# with default parameters
gbr = GradientBoostingRegressor()

gbr.fit(xtrain, ytrain)

ypred = gbr.predict(xtest)
mse = mean_squared_error(ytest, ypred)
print("MSE: %.2f" % mse)

x_ax = range(len(ytest))
plt.scatter(x_ax, ytest, s=5, color="blue", label="original")
plt.plot(x_ax, ypred, lw=0.8, color="red", label="predicted")
plt.legend()
plt.show()
```