Support Vector Regression Example with SVM in R

Support Vector Machine (SVM) is a supervised learning method that can be used for both regression and classification problems. The 'e1071' package provides the 'svm' function for building support vector machine models in R. In this post, we'll briefly learn how to use the 'svm' function for a regression problem. The tutorial covers:
1. Preparing the data
2. Fitting the model and predicting test data
3. Accuracy checking
4. Source code listing
We'll start by loading the required libraries for this tutorial. If they are not available on your machine, you can install them with install.packages(c("e1071", "caret")). The Boston dataset used below comes from the MASS package, which ships with standard R installations.

```
library(e1071)
library(caret)
```

Preparing the data

We'll use the Boston housing price dataset as the regression data in this tutorial, with medv (the median home value) as the target. We'll prepare the data by splitting it into a train part (90%) and a test part (10%).

```
boston = MASS::Boston
set.seed(123)
indexes = createDataPartition(boston$medv, p = .9, list = FALSE)
train = boston[indexes, ]
test = boston[-indexes, ]
```
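A quick sanity check on the split can be useful before fitting. The sketch below is a minimal example, assuming caret and MASS are installed; it repeats the split and then prints the row counts and the medv summaries of both parts, which should look broadly similar.

```r
library(caret)

# Same split as in the tutorial: 90% train, 10% test.
boston = MASS::Boston
set.seed(123)
indexes = createDataPartition(boston$medv, p = .9, list = FALSE)
train = boston[indexes, ]
test = boston[-indexes, ]

# Every row should land in exactly one of the two parts.
cat("train rows:", nrow(train), " test rows:", nrow(test), "\n")

# Compare the target distribution in both parts.
print(summary(train$medv))
print(summary(test$medv))
```

If the two summaries differ a lot, trying another seed or a larger test share is a reasonable next step.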

Fitting the model and predicting test data

The train and test data are ready. Now we can define the svm model with default parameters and fit it on the train data. The default kernel is 'radial'; it can be changed to 'linear', 'polynomial', or 'sigmoid' through the kernel argument.

```
model_reg = svm(medv~., data=train)
print(model_reg)

Call:
svm(formula = medv ~ ., data = train)

Parameters:
SVM-Type:  eps-regression
SVM-Kernel:  radial
cost:  1
gamma:  0.07692308
epsilon:  0.1

Number of Support Vectors:  306
```
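To see how the kernel choice affects the fit, the four kernels can be compared on the same data. The following is a minimal sketch, not a tuned benchmark: it uses a plain random 90/10 split (an assumption made to keep the example self-contained, not the caret split used above), and the names kernels and rmse_by_kernel are only illustrative. It also checks where the gamma value in the printed model comes from: with the default settings, gamma is 1 divided by the number of predictors, and Boston has 13 of them.

```r
library(e1071)

# Boston housing data; a plain random 90/10 split is an assumption here.
boston = MASS::Boston
set.seed(123)
train_rows = sample(nrow(boston), size = floor(0.9 * nrow(boston)))
train = boston[train_rows, ]
test = boston[-train_rows, ]

# Fit one model per kernel and record the test RMSE for each.
kernels = c("linear", "polynomial", "radial", "sigmoid")
rmse_by_kernel = sapply(kernels, function(k) {
  fit = svm(medv ~ ., data = train, kernel = k)
  pred = predict(fit, test)
  sqrt(mean((test$medv - pred)^2))
})
print(round(rmse_by_kernel, 3))

# Default gamma is 1 / (number of predictors): 13 predictors give 1/13.
default_fit = svm(medv ~ ., data = train)
print(default_fit$gamma)
```

The kernel with the lowest RMSE on one split is not necessarily the best in general, so the comparison is only a starting point.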

Next, we'll predict the test data and plot the results to compare visually.

```
pred = predict(model_reg, test)

# Observed values as red points, predictions as a blue line
x = 1:length(test$medv)
plot(x, test$medv, pch=18, col="red")
lines(x, pred, lwd=1, col="blue")
```

Accuracy checking

Finally, we'll check the prediction accuracy with the MSE, MAE, RMSE, and R-squared metrics.

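```
# MSE from its definition (base R)
mse = mean((test$medv - pred)^2)
# caret's metric functions take predictions first, then observed values
mae = MAE(pred, test$medv)
rmse = RMSE(pred, test$medv)
r2 = R2(pred, test$medv, form = "traditional")

cat(" MAE:", mae, "\n", "MSE:", mse, "\n",
"RMSE:", rmse, "\n", "R-squared:", r2)

# Output from the author's original run (yours may differ):
MAE: 1.877403
MSE: 6.028015
RMSE: 2.455202
R-squared: 0.914078
```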
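To make clear what these four metrics measure, they can also be computed directly from their definitions in base R. The sketch below uses a hypothetical helper, regression_metrics, which is not part of e1071 or caret, and a few toy values chosen only for illustration.

```r
# Hypothetical helper (not part of e1071 or caret): regression metrics
# computed directly from their definitions.
regression_metrics = function(obs, pred) {
  err = obs - pred
  mse = mean(err^2)
  c(MAE = mean(abs(err)),
    MSE = mse,
    RMSE = sqrt(mse),
    # "Traditional" R-squared: 1 - residual sum of squares / total sum of squares
    R2 = 1 - sum(err^2) / sum((obs - mean(obs))^2))
}

# Toy values chosen only for illustration.
obs = c(3, 5, 7, 9)
pred = c(2.5, 5.5, 7, 8)
metrics = regression_metrics(obs, pred)
print(metrics)
```

Note that MAE, MSE, and RMSE do not change if the observed and predicted values are swapped, but this form of R-squared does, because the total sum of squares is taken over the observed values.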

In this tutorial, we have briefly learned how to use the 'e1071' package's svm function for a regression problem. Thank you for reading! The full source code is listed below.

Source code listing

```
library(e1071)
library(caret)

# Regression example
boston = MASS::Boston
set.seed(123)
indexes = createDataPartition(boston$medv, p = .9, list = FALSE)
train = boston[indexes, ]
test = boston[-indexes, ]

model_reg = svm(medv~., data=train)
print(model_reg)

pred = predict(model_reg, test)

x = 1:length(test$medv)
plot(x, test$medv, pch=18, col="red")
lines(x, pred, lwd=1, col="blue")

# accuracy check (caret's metric functions take predictions first)
mse = mean((test$medv - pred)^2)
mae = MAE(pred, test$medv)
rmse = RMSE(pred, test$medv)
r2 = R2(pred, test$medv, form = "traditional")

cat(" MAE:", mae, "\n", "MSE:", mse, "\n",
"RMSE:", rmse, "\n", "R-squared:", r2)
```

3 comments:

1. http://www.analyticspath.com
The information you have shared is really helpful. I had been searching for this for a while. Looking forward to more such interesting posts from you.

2. I am copying this program and it is working, the seed is correct and I also tried changing it, but my image with the regression line is different, and the R-squared is <0.8. The number of Support vectors is 302. Gamma and Epsilon are the same (since I copied the Source code). Maybe the dataset is different in 2021? Doesn't look like it from the graph. There's a significant difference between Rsquared <0.8 and yours >0.9

3. I also get R2 = .79. Why?