
### Anomaly Detection Example with Elliptical Envelope in Python

The Elliptical Envelope method detects outliers in Gaussian-distributed data.
The scikit-learn API provides the EllipticEnvelope class for applying this method to anomaly detection. In this tutorial, we'll learn how to detect anomalies with the Elliptical Envelope method in Python. The tutorial covers:
1. Preparing the data
2. Defining the model and prediction
3. Anomaly detection with scores
4. Source code listing
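
To make the idea concrete before the walkthrough, here is a minimal sketch of what the method measures. EllipticEnvelope builds on a robust covariance estimate (scikit-learn's MinCovDet) and ranks points by their Mahalanobis distance from the estimated center. The toy data, the variable names, and the fixed random_state below are assumptions made for illustration only.

```python
import numpy as np
from sklearn.covariance import MinCovDet

rng = np.random.RandomState(0)

# 200 Gaussian points around the origin plus one point placed far away on purpose
inliers = rng.normal(loc=0.0, scale=1.0, size=(200, 2))
far_point = np.array([[8.0, 8.0]])
data = np.vstack([inliers, far_point])

# Robust estimate of location and covariance (Minimum Covariance Determinant)
mcd = MinCovDet(random_state=0).fit(data)

# Squared Mahalanobis distance of every sample to the robust center
dist = mcd.mahalanobis(data)

# Index of the sample that lies farthest from the center
farthest = int(np.argmax(dist))
print(farthest, dist[farthest])
```

The deliberately distant point (index 200) should come out with the largest distance, which is the property the envelope uses to separate outliers from the bulk of the data.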

```python
from sklearn.covariance import EllipticEnvelope
from sklearn.datasets import make_blobs
from numpy import quantile, where, random
import matplotlib.pyplot as plt
```

#### Preparing the data

We'll create a random sample dataset for this tutorial by using the make_blobs() function.

```python
random.seed(2)
x, _ = make_blobs(n_samples=200, centers=1, cluster_std=.3, center_box=(20, 5))
```
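
As a quick sanity check on data generated this way, the sketch below prints the array shape and the per-feature mean and standard deviation. It passes random_state to make_blobs() instead of seeding NumPy globally and uses an ordered center_box of (5, 20); both choices, like the x_demo name, are assumptions for this illustration rather than the tutorial's settings.

```python
from sklearn.datasets import make_blobs

# One blob of 200 two-dimensional points with a small spread
x_demo, _ = make_blobs(n_samples=200, centers=1, cluster_std=.3,
                       center_box=(5, 20), random_state=2)

print(x_demo.shape)         # number of samples and features
print(x_demo.mean(axis=0))  # approximate blob center
print(x_demo.std(axis=0))   # should be close to cluster_std
```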

We'll check the dataset by visualizing it in a plot.

```python
plt.scatter(x[:,0], x[:,1])
plt.show()
```

#### Defining the model and prediction

We'll define the model by using the EllipticEnvelope class of the scikit-learn API and set the contamination value in the constructor. The contamination argument defines the expected proportion of outliers in the dataset.

```python
elenv = EllipticEnvelope(contamination=.02)
print(elenv)
```

```
EllipticEnvelope(assume_centered=False, contamination=0.02, random_state=None,
                 store_precision=True, support_fraction=None)
```
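
To see how contamination affects the outcome, the following sketch fits the model with two different values and counts the flagged points. The x_demo data, the fixed random_state values, and the second contamination value of 0.1 are assumptions chosen for illustration.

```python
from sklearn.covariance import EllipticEnvelope
from sklearn.datasets import make_blobs

x_demo, _ = make_blobs(n_samples=200, centers=1, cluster_std=.3, random_state=2)

counts = {}
for c in (0.02, 0.1):
    model = EllipticEnvelope(contamination=c, random_state=0)
    labels = model.fit_predict(x_demo)
    # Points labeled -1 are the ones the model treats as outliers
    counts[c] = int((labels == -1).sum())

print(counts)
```

The number of flagged points should grow roughly in proportion to the contamination value, since the model places its cutoff at that percentile of the training scores.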

We'll fit the model on the x dataset and get the predictions with the fit_predict() method.

`pred = elenv.fit_predict(x)`

The fit_predict() method returns -1 for outliers and 1 for inliers, so next we'll extract the samples labeled -1 as the anomalies.

```python
anom_index = where(pred==-1)
values = x[anom_index]
```
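
The indexing step can be checked on a small hand-made example. The arrays below are made-up values, not output of the model; the sketch only shows that where() returns a tuple of index arrays and that a boolean mask selects the same rows.

```python
import numpy as np

# Hypothetical labels in the format returned by fit_predict()
pred_demo = np.array([1, -1, 1, 1, -1])
x_demo = np.array([[0.0, 0.1], [5.0, 5.0], [0.2, 0.0], [0.1, 0.3], [-4.0, 6.0]])

idx = np.where(pred_demo == -1)    # tuple holding one array of row indices
by_index = x_demo[idx]             # rows picked by index
by_mask = x_demo[pred_demo == -1]  # same rows picked by boolean mask

print(idx[0])
print(by_index)
```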

Finally, we'll visualize the results in a plot by highlighting the anomalies with a color.

```python
plt.scatter(x[:,0], x[:,1])
plt.scatter(values[:,0],values[:,1], color='r')
plt.show()
```

#### Anomaly detection with scores

We can also find anomalies by using their scores. In this method, we'll define the model without setting the contamination argument, so the model applies its default value of 0.1.

```python
elenv = EllipticEnvelope()
print(elenv)
```

```
EllipticEnvelope(assume_centered=False, contamination=0.1, random_state=None,
                 store_precision=True, support_fraction=None)
```

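We'll fit the model on the x dataset, then extract the score of each sample. The scores are negative Mahalanobis distances, so the lower the score, the more anomalous the sample.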

```python
elenv.fit(x)
scores = elenv.score_samples(x)
```
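
For orientation, the sketch below checks how the scores relate to the model's other outputs: score_samples() is the negated result of mahalanobis(), and decision_function() is the same score shifted by the fitted offset_. The x_demo data and the fixed random_state are assumptions for illustration.

```python
import numpy as np
from sklearn.covariance import EllipticEnvelope
from sklearn.datasets import make_blobs

x_demo, _ = make_blobs(n_samples=200, centers=1, cluster_std=.3, random_state=2)
model = EllipticEnvelope(random_state=0).fit(x_demo)

scores_demo = model.score_samples(x_demo)

# score_samples() is the negative squared Mahalanobis distance
same_as_distance = np.allclose(scores_demo, -model.mahalanobis(x_demo))

# decision_function() subtracts the offset_ learned from the contamination value
same_as_shifted = np.allclose(model.decision_function(x_demo),
                              scores_demo - model.offset_)

print(same_as_distance, same_as_shifted)
```

Because the raw scores do not depend on offset_, they can be thresholded by hand without touching the contamination argument.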

Next, we'll obtain the threshold value from the scores by using the quantile function. Here, we'll treat the lowest 2 percent of the scores as anomalies.

```python
thresh = quantile(scores, .02)
print(thresh)
```

```
-9.469243838613968
```

Next, we'll extract the anomalies by selecting the samples whose scores are at or below the threshold.

```python
index = where(scores <= thresh)
values = x[index]
```
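
The quantile threshold and the comparison can be checked on a handful of made-up scores. The numbers below are hypothetical and the 0.2 quantile is chosen only so that the arithmetic is easy to follow with five values.

```python
import numpy as np

# Five hypothetical scores; lower means more anomalous
scores_demo = np.array([-10.0, -5.0, -3.0, -2.0, -1.0])

# The 0.2 quantile is interpolated between the two lowest scores
thresh_demo = np.quantile(scores_demo, 0.2)

# Keep only the scores at or below the threshold
flagged = scores_demo[scores_demo <= thresh_demo]

print(thresh_demo)
print(flagged)
```

With the default linear interpolation the threshold falls between -10 and -5, so only the lowest score is flagged.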

Finally, we can visualize the results in a plot by highlighting the anomalies with a color.

```python
plt.scatter(x[:,0], x[:,1])
plt.scatter(values[:,0],values[:,1], color='r')
plt.show()
```

Both methods above give the same result here, so you can use either of them in your analysis. Lower the threshold quantile or the contamination value to flag only the more extreme cases.
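
One way to check that the two approaches agree is to compare the indices each one flags. This is a sketch under assumptions: the x_demo data is regenerated with a fixed random_state, and both models use the same random_state so that their robust covariance fits match. Without a fixed random_state, two separate fits can differ slightly.

```python
import numpy as np
from sklearn.covariance import EllipticEnvelope
from sklearn.datasets import make_blobs

x_demo, _ = make_blobs(n_samples=200, centers=1, cluster_std=.3, random_state=2)

# Method 1: labels from fit_predict() with contamination=0.02
labels = EllipticEnvelope(contamination=.02, random_state=0).fit_predict(x_demo)
by_label = set(np.where(labels == -1)[0].tolist())

# Method 2: lowest 2 percent of the score_samples() values
model = EllipticEnvelope(random_state=0).fit(x_demo)
scores_demo = model.score_samples(x_demo)
cutoff = np.quantile(scores_demo, .02)
by_score = set(np.where(scores_demo <= cutoff)[0].tolist())

print(sorted(by_label))
print(sorted(by_score))
print(by_label == by_score)
```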

In this tutorial, we've learned how to detect anomalies with the Elliptical Envelope method by using scikit-learn's EllipticEnvelope class in Python. The full source code is listed below.

#### Source code listing

```python
from sklearn.covariance import EllipticEnvelope
from sklearn.datasets import make_blobs
from numpy import quantile, where, random
import matplotlib.pyplot as plt

random.seed(12)
x, _ = make_blobs(n_samples=200, centers=1, cluster_std=.3, center_box=(20, 5))

plt.scatter(x[:,0], x[:,1])
plt.show()

elenv = EllipticEnvelope(contamination=.02)
print(elenv)

pred = elenv.fit_predict(x)
anom_index = where(pred==-1)
values = x[anom_index]

plt.scatter(x[:,0], x[:,1])
plt.scatter(values[:,0],values[:,1], color='r')
plt.show()

elenv = EllipticEnvelope()
print(elenv)

elenv.fit(x)
scores = elenv.score_samples(x)

thresh = quantile(scores, .02)
print(thresh)

index = where(scores <= thresh)
values = x[index]

plt.scatter(x[:,0], x[:,1])
plt.scatter(values[:,0],values[:,1], color='r')
plt.show()
```
