### Anomaly Detection Example with One-Class SVM in Python

One-class classification is used to detect outliers and anomalies in a dataset. The One-class SVM is a variant of the Support Vector Machine (SVM) that learns a boundary around the training data and is commonly used for novelty detection.
In this tutorial, we'll briefly learn how to detect anomalies in a dataset by using the One-class SVM method in Python. The Scikit-learn API provides the OneClassSVM class for this algorithm, and we'll use it in this tutorial. The tutorial covers:
1. Preparing the data
2. Defining the model and prediction
3. Anomaly detection with scores
4. Source code listing

If you want to learn about other anomaly detection methods, please check out my tutorial.

We'll start by loading the required libraries and functions.

```python
from sklearn.svm import OneClassSVM
from sklearn.datasets import make_blobs
from numpy import quantile, where, random
import matplotlib.pyplot as plt
```

#### Preparing the data

We'll create a random sample dataset for this tutorial by using the make_blobs() function, then check the dataset by visualizing it in a plot.

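```python
# Seed NumPy's global generator so make_blobs() returns the same sample on each run
random.seed(13)
# 200 two-dimensional points in a single cluster centered at (8, 8)
x, _ = make_blobs(n_samples=200, centers=1, cluster_std=.3, center_box=(8, 8))

plt.scatter(x[:,0], x[:,1])
plt.show()
```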
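Alongside the plot, a quick numeric check can confirm that the sample looks as expected. The sketch below regenerates the same data and prints its shape and the per-feature mean and standard deviation. The names `center` and `spread` are illustrative, and the exact values depend on the installed NumPy and Scikit-learn versions.

```python
from numpy import random
from sklearn.datasets import make_blobs

# Same settings as in the tutorial: one cluster centered at (8, 8)
random.seed(13)
x, _ = make_blobs(n_samples=200, centers=1, cluster_std=.3, center_box=(8, 8))

# Shape is (n_samples, n_features)
print(x.shape)

# Per-feature mean and standard deviation; these should be near 8 and 0.3
center = x.mean(axis=0)
spread = x.std(axis=0)
print(center, spread)
```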

#### Defining the model and prediction

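We'll define the model by using the OneClassSVM class of the Scikit-learn API. Here, we'll set the kernel type to RBF and define the gamma and nu arguments. The nu argument sets an upper bound on the fraction of training errors (and a lower bound on the fraction of support vectors), so it roughly controls how many samples are flagged as anomalies.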

```python
svm = OneClassSVM(kernel='rbf', gamma=0.001, nu=0.03)
print(svm)
```

```
OneClassSVM(cache_size=200, coef0=0.0, degree=3, gamma=0.001, kernel='rbf',
            max_iter=-1, nu=0.03, shrinking=True, tol=0.001, verbose=False)
```
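To get a feel for the nu argument, the following sketch fits the model with a few nu values and counts how many training samples are labeled -1. The list of nu values and the `flagged` dictionary are arbitrary choices for illustration. Since nu acts as a bound rather than an exact target, the flagged fraction is typically close to nu but need not match it.

```python
from numpy import random
from sklearn.datasets import make_blobs
from sklearn.svm import OneClassSVM

random.seed(13)
x, _ = make_blobs(n_samples=200, centers=1, cluster_std=.3, center_box=(8, 8))

# nu values chosen only for illustration
flagged = {}
for nu in (0.01, 0.03, 0.1):
    model = OneClassSVM(kernel='rbf', gamma=0.001, nu=nu).fit(x)
    # predict() returns -1 for outliers and 1 for inliers
    flagged[nu] = int((model.predict(x) == -1).sum())
    print(f"nu={nu}: {flagged[nu]} of {len(x)} samples flagged")
```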

We'll fit the model on the x dataset with the fit() method and get the predictions with the predict() method. The predict() method returns 1 for inliers and -1 for outliers.

```python
svm.fit(x)
pred = svm.predict(x)
```
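Because the One-class SVM learns a boundary from the training data, a fitted model can also label samples it has not seen. As a minimal sketch, the hypothetical point (20, 20) below lies far from the cluster around (8, 8), so it should be reported as an outlier. The names `new_point` and `new_pred` are not part of the tutorial code.

```python
from numpy import array, random
from sklearn.datasets import make_blobs
from sklearn.svm import OneClassSVM

random.seed(13)
x, _ = make_blobs(n_samples=200, centers=1, cluster_std=.3, center_box=(8, 8))

svm = OneClassSVM(kernel='rbf', gamma=0.001, nu=0.03)
svm.fit(x)

# Hypothetical new observation, far away from the training cluster
new_point = array([[20.0, 20.0]])
new_pred = svm.predict(new_point)
print(new_pred)
```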

Next, we'll extract the samples with negative predictions as the outliers.

```python
anom_index = where(pred==-1)
values = x[anom_index]
```
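The where() function returns a tuple of index arrays, which is why its result can be passed directly to x[...] to select rows. Here is a toy example with made-up labels and data (none of these values come from the model):

```python
from numpy import array, where

# Made-up labels and data, for illustration only
toy_pred = array([1, -1, 1, 1, -1])
toy_x = array([[0.0, 0.1], [5.0, 5.2], [0.2, 0.0], [0.1, 0.3], [-4.0, 6.0]])

toy_index = where(toy_pred == -1)   # tuple holding one array of row indices
toy_values = toy_x[toy_index]       # rows labeled as outliers

print(toy_index[0])
print(toy_values)
```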

Finally, we'll visualize the results in a plot by highlighting the anomalies with a color.

```python
plt.scatter(x[:,0], x[:,1])
# Overlay the detected anomalies in red
plt.scatter(values[:,0], values[:,1], color='r')
plt.show()
```

#### Anomaly detection with scores

We can also find anomalies by using their scores. In this method, we'll define the model, fit it on the x data by using the fit_predict() method, and then mark the samples with the lowest score values as outliers.

```python
svm = OneClassSVM(kernel='rbf', gamma=0.001, nu=0.02)
print(svm)
```

Next, we'll fit the model on the x dataset, then extract the score of each sample.

```python
pred = svm.fit_predict(x)
scores = svm.score_samples(x)
```
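In Scikit-learn, the values returned by score_samples() are the decision function values shifted by the fitted offset_ attribute. The sketch below checks this relation on the tutorial data. It assumes a Scikit-learn version in which OneClassSVM provides both score_samples() and offset_, and the `decision` variable is introduced only for this check.

```python
from numpy import allclose, random
from sklearn.datasets import make_blobs
from sklearn.svm import OneClassSVM

random.seed(13)
x, _ = make_blobs(n_samples=200, centers=1, cluster_std=.3, center_box=(8, 8))

svm = OneClassSVM(kernel='rbf', gamma=0.001, nu=0.02)
pred = svm.fit_predict(x)
scores = svm.score_samples(x)
decision = svm.decision_function(x)

# score_samples() equals decision_function() plus the fitted offset
same = allclose(scores, decision + svm.offset_)
print(same)
```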

Next, we'll obtain the threshold value from the scores by using the quantile function. Here, we'll treat the lowest 3 percent of score values as anomalies.

```python
thresh = quantile(scores, 0.03)
print(thresh)
```

```
3.994389673293594
```
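To make the thresholding step concrete, here is a small sketch that uses synthetic scores rather than the model output. With 200 distinct scores, the 0.03 quantile falls between the 6th and 7th lowest values, so six samples end up at or below the threshold. The `toy_` names and the random scores are assumptions made for illustration.

```python
from numpy import quantile, where
from numpy.random import default_rng

# Synthetic scores standing in for svm.score_samples(x)
rng = default_rng(0)
toy_scores = rng.normal(loc=4.0, scale=0.01, size=200)

toy_thresh = quantile(toy_scores, 0.03)
toy_index = where(toy_scores <= toy_thresh)

# Number of samples at or below the threshold (about 3 percent of 200)
n_flagged = len(toy_index[0])
print(toy_thresh, n_flagged)
```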

Next, we'll extract the anomalies by comparing each score with the threshold value and collecting the corresponding samples.

```python
index = where(scores<=thresh)
values = x[index]
```

Finally, we can visualize the results in a plot by highlighting the anomalies with a color.

```python
plt.scatter(x[:,0], x[:,1])
plt.scatter(values[:,0], values[:,1], color='r')
plt.show()
```

In this tutorial, we've learned how to detect anomalies with the One-class SVM method by using Scikit-learn's OneClassSVM class in Python. We've seen two ways of detecting outliers with OneClassSVM: using the predicted labels and thresholding the sample scores. The full source code is listed below.

#### Source code listing

```python
from sklearn.svm import OneClassSVM
from sklearn.datasets import make_blobs
from numpy import quantile, where, random
import matplotlib.pyplot as plt

# Prepare the data: one cluster of 200 points centered at (8, 8)
random.seed(13)
x, _ = make_blobs(n_samples=200, centers=1, cluster_std=.3, center_box=(8, 8))

plt.scatter(x[:,0], x[:,1])
plt.show()

# Method 1: use the predicted labels (-1 marks an outlier)
svm = OneClassSVM(kernel='rbf', gamma=0.001, nu=0.03)
print(svm)

svm.fit(x)
pred = svm.predict(x)
anom_index = where(pred==-1)
values = x[anom_index]

plt.scatter(x[:,0], x[:,1])
plt.scatter(values[:,0], values[:,1], color='r')
plt.show()

# Method 2: threshold the sample scores at the 3 percent quantile
svm = OneClassSVM(kernel='rbf', gamma=0.001, nu=0.02)
print(svm)

pred = svm.fit_predict(x)
scores = svm.score_samples(x)

thresh = quantile(scores, 0.03)
print(thresh)
index = where(scores<=thresh)
values = x[index]

plt.scatter(x[:,0], x[:,1])
plt.scatter(values[:,0], values[:,1], color='r')
plt.show()
```
