### Anomaly Detection Example with Kernel Density in Python

Kernel density estimation (KDE) is a method for estimating the probability density function of a random variable. Because outliers fall in low-density regions, we can use the estimated density to detect them in a dataset.
In this tutorial, we'll learn how to detect outliers in regression data by using the KernelDensity class of the Scikit-learn API in Python. The tutorial covers:
1. Preparing the data
2. Anomaly detection with KernelDensity
3. Testing with Boston housing dataset
4. Source code listing
5. Video tutorial
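
To make the idea concrete, here is a minimal sketch of a one-dimensional Gaussian kernel density estimate written with plain NumPy. The helper name `gaussian_kde_1d` and the bandwidth `h` are illustrative assumptions, not part of the Scikit-learn API; the point is only that a value far from the rest of the data receives a much lower estimated density.

```python
import numpy as np

def gaussian_kde_1d(points, data, h=1.0):
    """Hypothetical helper: Gaussian KDE of `data`, evaluated at `points`."""
    points = np.asarray(points, dtype=float)[:, None]    # shape (m, 1)
    data = np.asarray(data, dtype=float)[None, :]        # shape (1, n)
    u = (points - data) / h                               # scaled distances
    kernel = np.exp(-0.5 * u ** 2) / np.sqrt(2 * np.pi)   # standard normal pdf
    # Average the kernel contributions and rescale by the bandwidth.
    return kernel.sum(axis=1) / (data.shape[1] * h)

data = [-0.2, 0.0, 0.1, 0.3, 6.0]   # four clustered values and one isolated value
dens = gaussian_kde_1d(data, data)
print(dens)  # the last entry (the isolated value) has the lowest density
```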

We'll start by importing the required libraries and functions.

```python
from sklearn.neighbors import KernelDensity
from numpy import where, random, array, quantile
from sklearn.preprocessing import scale
import matplotlib.pyplot as plt
```

#### Preparing the data

We'll use randomly generated regression data as the target dataset. Here, we'll write a simple function that generates sample data with a few injected outliers, then plot the result to inspect it.

```python
random.seed(124)

def makeData(N):
    x = []
    for i in range(N):
        # Slow upward trend plus uniform noise
        a = i / 1000 + random.uniform(-3, 2)
        r = random.uniform(-5, 10)
        # Push the extreme draws further out to create outliers
        if r >= 9.8:
            r = r + 10
        elif r < -4.8:
            r = r - 10
        x.append([a + r])
    return array(x)

n = 500
x = makeData(n)

x_ax = range(n)
plt.plot(x_ax, x)
plt.show()
```
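
As a rough check on the generator above, the expected share of shifted samples follows from the uniform draw of `r` on the interval from -5 to 10. The sketch below only does that arithmetic; the variable names are illustrative.

```python
# r is drawn uniformly from [-5, 10), an interval of width 15.
width = 10 - (-5)

p_high = (10 - 9.8) / width      # chance that r >= 9.8 (shifted up by 10)
p_low = (-4.8 - (-5)) / width    # chance that r < -4.8 (shifted down by 10)
p_shift = p_high + p_low

n = 500
print(round(p_shift, 4))      # 0.0267
print(round(n * p_shift, 1))  # 13.3
```

So on average about 13 of the 500 samples are shifted, which is more than the 1% of samples that the quantile threshold used later will flag.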

Next, we'll standardize the dataset to zero mean and unit variance with the scale() function.

`x = scale(x)`
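
For reference, `scale()` with its default arguments standardizes each column to zero mean and unit variance. The sketch below checks this on a small made-up array (the values are arbitrary) and compares the result with the manual formula.

```python
import numpy as np
from sklearn.preprocessing import scale

demo = np.array([[1.0], [2.0], [3.0], [10.0]])  # arbitrary example column

scaled = scale(demo)
manual = (demo - demo.mean(axis=0)) / demo.std(axis=0)  # population std (ddof=0)

print(np.allclose(scaled, manual))              # True
print(scaled.mean(axis=0), scaled.std(axis=0))  # approximately 0 and 1
```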

#### Anomaly detection with KernelDensity

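We'll use the KernelDensity class of the Scikit-learn API to define the kernel density model and fit it on x. Called with no arguments, it uses a Gaussian kernel with a bandwidth of 1.0.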

```python
kernaldens = KernelDensity().fit(x)
print(kernaldens)
```
KernelDensity(algorithm='auto', atol=0, bandwidth=1.0, breadth_first=True, kernel='gaussian', leaf_size=40, metric='euclidean', metric_params=None, rtol=0)
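
The printed parameters show the defaults, a Gaussian kernel with `bandwidth=1.0`. As a small sketch on made-up data, the snippet below passes those two parameters explicitly and then tries a narrower bandwidth; the value 0.3 is an arbitrary choice for illustration, not a recommendation.

```python
import numpy as np
from sklearn.neighbors import KernelDensity

rng = np.random.default_rng(0)
demo = rng.normal(size=(200, 1))   # made-up, roughly standardized data

default_kde = KernelDensity().fit(demo)
explicit_kde = KernelDensity(kernel='gaussian', bandwidth=1.0).fit(demo)
narrow_kde = KernelDensity(kernel='gaussian', bandwidth=0.3).fit(demo)

default_scores = default_kde.score_samples(demo)
explicit_scores = explicit_kde.score_samples(demo)
narrow_scores = narrow_kde.score_samples(demo)

# Spelling out the defaults reproduces the same scores.
print(np.allclose(default_scores, explicit_scores))   # True

# A different bandwidth changes the estimated log-densities.
print(np.allclose(default_scores, narrow_scores))     # False
```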

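We'll obtain the score of each sample in x by using the score_samples() method. Each score is the log of the estimated density at that sample, so lower scores indicate less typical points.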

```python
scores = kernaldens.score_samples(x)
```
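
Since `score_samples()` returns log-densities, the values are often negative, and `exp()` converts them back to densities. A minimal sketch on made-up data with one isolated value:

```python
import numpy as np
from sklearn.neighbors import KernelDensity

# 50 clustered values plus one isolated value at 8.0 (made-up data).
demo = np.concatenate([np.linspace(-1, 1, 50), [8.0]]).reshape(-1, 1)

kde = KernelDensity().fit(demo)      # defaults: Gaussian kernel, bandwidth 1.0
log_dens = kde.score_samples(demo)   # log of the estimated density
dens = np.exp(log_dens)              # back to the density scale

print(log_dens.argmin())             # 50, the index of the isolated value
print(dens[50] < dens[:50].min())    # True
```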

Then, we'll extract the threshold value from the scores by using the quantile() function. Here we take the 1% quantile, so roughly the lowest 1% of scores will be treated as anomalies.

```python
thresh = quantile(scores, .01)
print(thresh)
```
-4.071068385863522
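
To see what the 0.01 quantile does in practice, the sketch below applies the same call to 500 made-up scores. With NumPy's default linear interpolation, the threshold lands between the 5th and 6th lowest values, so five samples are flagged when there are no ties.

```python
import numpy as np

rng = np.random.default_rng(1)
demo_scores = rng.normal(size=500)        # stand-in for 500 log-density scores

thresh = np.quantile(demo_scores, 0.01)   # 1% quantile of the scores
flagged = demo_scores <= thresh

ordered = np.sort(demo_scores)
print(ordered[4] <= thresh < ordered[5])  # True
print(flagged.sum())                      # 5
```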

Using the threshold, we'll find the samples whose scores are equal to or lower than it.

```python
# Indices of the samples whose score is at or below the threshold
index = where(scores <= thresh)
values = x[index]
```
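
Note that `where()` called with a single condition returns a tuple holding one index array per dimension. A short sketch with made-up values shows the shapes involved:

```python
import numpy as np

demo_x = np.array([[0.1], [5.0], [-0.2], [-6.0]])   # made-up column of samples
demo_scores = np.array([-1.2, -4.5, -1.1, -5.0])    # made-up log-density scores
thresh = -4.0

index = np.where(demo_scores <= thresh)
print(type(index), len(index))   # <class 'tuple'> 1

values = demo_x[index]           # selects rows 1 and 3
print(values.shape)              # (2, 1)

rows = index[0]                  # plain 1-D index array, if that is more convenient
print(rows.tolist())             # [1, 3]
```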

Finally, we'll plot the data and highlight the anomalies in red.

```python
plt.plot(x_ax, x)
plt.scatter(index, values, color='r')
plt.show()
```

#### Testing with Boston housing dataset

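We can apply the same method to the Boston housing dataset. We'll use only the target values (y) of the dataset, reshaping them into a single column and scaling them before passing them to the KernelDensity model. Note that load_boston() was removed in scikit-learn 1.2, so this part requires an older release.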

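```python
from sklearn.datasets import load_boston  # requires scikit-learn < 1.2

boston = load_boston()
y = boston.target

# KernelDensity expects a 2-D array, so turn y into a single column
y = y.reshape(y.shape[0], 1)
y = scale(y)
```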

Next, we'll define the model, fit it on the y data, and compute the score of each sample. Then, we'll collect the anomalies by using the threshold value.

```python
kernaldens = KernelDensity().fit(y)
print(kernaldens)

scores = kernaldens.score_samples(y)
thresh = quantile(scores, .01)
print(thresh)
index = where(scores <= thresh)
values = y[index]
```
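
Since the same steps are repeated for both datasets, they could be wrapped in a small helper. The function below is a sketch, not part of Scikit-learn: the name `kde_outliers` and its `q` parameter are assumptions made for illustration, and it is exercised on made-up data rather than the Boston targets.

```python
import numpy as np
from sklearn.neighbors import KernelDensity
from sklearn.preprocessing import scale

def kde_outliers(data, q=0.01):
    """Hypothetical helper: indices of samples whose log-density is at or below the q quantile."""
    data = scale(np.asarray(data, dtype=float).reshape(-1, 1))
    scores = KernelDensity().fit(data).score_samples(data)
    thresh = np.quantile(scores, q)
    return np.where(scores <= thresh)[0], thresh

rng = np.random.default_rng(2)
demo = np.append(rng.normal(size=99), 25.0)   # 99 ordinary values and one extreme value

idx, thresh = kde_outliers(demo)
print(idx)   # index 99, the extreme value
```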

As before, we'll plot the data and highlight the anomalies in red.

```python
x_ax = range(y.shape[0])
plt.plot(x_ax, y)
plt.scatter(index, values, color='r')
plt.show()
```

In this tutorial, we've briefly learned how to detect anomalies with the kernel density method by using Scikit-learn's KernelDensity class in Python. The full source code is listed below.

