Support Vector Regression (SVR) is a regression algorithm that applies the technique of Support Vector Machines (SVM) to regression analysis. Regression targets are continuous real numbers. To fit such data, the SVR model approximates the target values within a margin called the ε-tube (epsilon-tube; ε sets the tube width), balancing model complexity against the error rate.
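To make the ε-tube idea concrete, here is a minimal sketch of the ε-insensitive loss that SVR training penalizes: residuals smaller than ε cost nothing, and larger ones are charged only for the part that sticks out of the tube. The helper name `epsilon_insensitive_loss` and the sample values are made up for illustration; they are not part of Scikit-learn.

```python
import numpy as np

def epsilon_insensitive_loss(y_true, y_pred, epsilon=0.1):
    """Per-sample loss: zero inside the epsilon-tube, linear outside it."""
    residual = np.abs(np.asarray(y_true) - np.asarray(y_pred))
    return np.maximum(0.0, residual - epsilon)

# Hypothetical targets and predictions, chosen only to show the three cases
y_true = np.array([1.00, 1.00, 1.00])
y_pred = np.array([1.05, 1.30, 0.50])

losses = epsilon_insensitive_loss(y_true, y_pred, epsilon=0.1)
print(losses)  # first point lies inside the tube, so its loss is zero
```

A wider tube (larger ε) ignores more of the small residuals, which usually gives a simpler model at the cost of a looser fit.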

In this tutorial, we'll briefly learn how to fit and predict regression data with the SVR class of the Scikit-learn API in Python. The tutorial covers:

- Preparing the data
- Model fitting and prediction
- Accuracy check
- Source code listing
- Video tutorial

We'll start by loading the required libraries in Python.

    import numpy as np
    from sklearn.svm import SVR
    from sklearn.metrics import mean_squared_error
    import matplotlib.pyplot as plt

**Preparing the data**

We'll use randomly generated regression data as the target data to fit. Here, we can write a simple function to generate the data.

    np.random.seed(21)
    N = 1000

    def makeData(x):
        r = [a/10 for a in x]
        y = np.sin(x)+np.random.uniform(-.5, .2, len(x))
        return np.array(y+r)

    x = [i/100 for i in range(N)]
    y = makeData(x)
    x = np.array(x).reshape(-1,1)

    plt.scatter(x, y, s=5, color="blue")
    plt.show()

**Model fitting and prediction**

We'll use Scikit-learn API's SVR class to define the model. The model can be used with default parameters. We'll fit the model on x and y data.

    svr = SVR().fit(x, y)
    print(svr)

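**SVR(C=1.0, cache_size=200, coef0=0.0, degree=3, epsilon=0.1, gamma='auto_deprecated', kernel='rbf', max_iter=-1, shrinking=True, tol=0.001, verbose=False)**

This output comes from an older Scikit-learn release. Recent releases print only the parameters that differ from their defaults, so the same call prints SVR(), and the default gamma is now 'scale'.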

Here, the kernel, C, and epsilon parameters can be changed according to the characteristics of the regression data. The kernel parameter sets the kernel type used by the algorithm: 'rbf' (the default), 'linear', 'poly', and 'sigmoid' can be used.
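As a rough illustration of how these parameters change the fit, the sketch below trains a few SVR variants on data generated the same way as above and reports R-squared and the number of support vectors. The specific settings (such as epsilon=0.3) are arbitrary picks for demonstration, not tuned recommendations, and the numbers depend on the installed Scikit-learn version.

```python
import numpy as np
from sklearn.svm import SVR

# Data generated the same way as in this tutorial
np.random.seed(21)
x = np.array([i / 100 for i in range(1000)])
y = np.sin(x) + np.random.uniform(-0.5, 0.2, len(x)) + x / 10
X = x.reshape(-1, 1)

# Illustrative settings only; these values are not tuned
settings = {
    "rbf (defaults)": SVR(kernel="rbf", C=1.0, epsilon=0.1),
    "rbf, wider tube": SVR(kernel="rbf", C=1.0, epsilon=0.3),
    "linear": SVR(kernel="linear", C=1.0, epsilon=0.1),
}

results = {}
for name, model in settings.items():
    model.fit(X, y)
    r2 = model.score(X, y)          # R-squared on the fitted data
    n_sv = len(model.support_)      # indices of the support vectors
    results[name] = (r2, n_sv)
    print(f"{name:16s} R-squared={r2:.3f} support vectors={n_sv}")
```

On this sine-shaped data the linear kernel cannot follow the curve, so its R-squared should be well below the RBF fits, while a wider ε-tube typically leaves fewer points outside the tube and therefore fewer support vectors.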

Next, we'll predict on the x data with the svr model.

yfit = svr.predict(x)

To check the predicted result, we'll visualize both the y and yfit data in a plot.
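The plotting calls for this step appear in the full source listing; the sketch below wraps them into a self-contained script. Two things are assumptions added here so it can run without a display: the non-interactive Agg backend and the output file name svr_fit.png. In an interactive session, drop the backend line and call plt.show() instead.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless backend, remove for interactive use
import matplotlib.pyplot as plt
import numpy as np
from sklearn.svm import SVR

# Data generated the same way as in this tutorial
np.random.seed(21)
x = np.array([i / 100 for i in range(1000)])
y = np.sin(x) + np.random.uniform(-0.5, 0.2, len(x)) + x / 10
X = x.reshape(-1, 1)

# Fit with default parameters and predict on the same inputs
yfit = SVR().fit(X, y).predict(X)

# Original points in blue, fitted curve in red
fig, ax = plt.subplots()
ax.scatter(x, y, s=5, color="blue", label="original")
ax.plot(x, yfit, lw=2, color="red", label="fitted")
ax.legend()
fig.savefig("svr_fit.png")  # hypothetical file name
```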

**Accuracy check**

Finally, we'll check the model's prediction accuracy with the R-squared and MSE metrics.

    score = svr.score(x,y)
    print("R-squared:", score)
    print("MSE:", mean_squared_error(y, yfit))

R-squared: 0.9211937698347702
MSE: 0.0411375232810873
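The scores above are computed on the same data the model was fitted on, so they describe the quality of the fit rather than performance on unseen points. A minimal sketch of a held-out check follows, assuming an 80/20 random split with train_test_split; the split ratio and random_state are arbitrary choices made for this example.

```python
import numpy as np
from sklearn.metrics import mean_squared_error, r2_score
from sklearn.model_selection import train_test_split
from sklearn.svm import SVR

# Data generated the same way as in this tutorial
np.random.seed(21)
x = np.array([i / 100 for i in range(1000)])
y = np.sin(x) + np.random.uniform(-0.5, 0.2, len(x)) + x / 10
X = x.reshape(-1, 1)

# Hold out 20% of the points for evaluation (arbitrary ratio and seed)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=21
)

svr = SVR().fit(X_train, y_train)
y_pred = svr.predict(X_test)

r2 = r2_score(y_test, y_pred)            # same metric that svr.score returns
mse = mean_squared_error(y_test, y_pred)
print("Held-out R-squared:", r2)
print("Held-out MSE:", mse)
```

Because the test points are drawn at random from the same x range, this checks interpolation within that range; it says nothing about extrapolating beyond the observed inputs.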

In this tutorial, we've briefly learned how to fit regression data by using the SVR method in Python. The full source code is listed below.

**Source code listing**

    import numpy as np
    from sklearn.svm import SVR
    from sklearn.metrics import mean_squared_error
    import matplotlib.pyplot as plt

    np.random.seed(21)
    N = 1000

    def makeData(x):
        r = [a/10 for a in x]
        y = np.sin(x)+np.random.uniform(-.5, .2, len(x))
        return np.array(y+r)

    x = [i/100 for i in range(N)]
    y = makeData(x)
    x = np.array(x).reshape(-1,1)

    plt.scatter(x, y, s=5, color="blue")
    plt.show()

    svr = SVR().fit(x, y)
    print(svr)

    yfit = svr.predict(x)

    plt.scatter(x, y, s=5, color="blue", label="original")
    plt.plot(x, yfit, lw=2, color="red", label="fitted")
    plt.legend()
    plt.show()

    score = svr.score(x,y)
    print("R-squared:", score)
    print("MSE:", mean_squared_error(y, yfit))

**Video tutorial**
