
Scattered Data Spline Fitting Example in Python

Interpolation is a method of estimating unknown data points within a given range. Spline interpolation is a piecewise polynomial interpolation method, and it is also useful for smoothing curve or surface data.
In my previous posts, I explained how to implement spline interpolation and B-spline curve fitting in Python. The same spline smoothing approach can be applied to scattered data. In this tutorial, you'll learn how to fit scattered data with spline functions in Python.

The tutorial covers:

1. Preparing test data
2. Spline curve fitting
3. Fitting with different numbers of knots

We'll start by importing the required libraries.

from scipy import interpolate
import matplotlib.pyplot as plt
import numpy as np

Preparing test data

As target data, we can use the Boston housing dataset. We need only the label (target) part of the dataset; the code below shows how to extract it and plot it on a graph.

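# load_boston was removed in scikit-learn 1.2, so this needs an older release
from sklearn.datasets import load_boston

boston = load_boston()
y = boston.target
x = range(0, len(y))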

plt.figure(figsize=(10, 5))
plt.plot(x, y, '.', c="g")
plt.grid()
plt.show()
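If the Boston dataset is not available in your environment, a synthetic vector of the same length can stand in for it. The sketch below is only an assumption for illustration: the trend, noise level, and seed are made up and are not related to the real housing data.

```python
import numpy as np

# Hypothetical stand-in for the Boston target vector (not the real data).
rng = np.random.default_rng(0)        # fixed seed for reproducibility
n_samples = 506                       # the Boston dataset has 506 samples

# A slow oscillation plus Gaussian noise gives a scattered-looking series.
trend = 22 + 8 * np.sin(np.linspace(0, 6 * np.pi, n_samples))
y = trend + rng.normal(scale=4.0, size=n_samples)
x = range(0, len(y))
```

The rest of the tutorial only relies on x and y, so the same steps should apply to this stand-in.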

Spline curve fitting

To construct a smooth spline fit, we need to specify the number of knots for the target data. Knots are the points where the polynomial segments join.
Based on the number of knots, we'll compute the interior knot positions as evenly spaced quantiles of the x data by using the 'quantile' function.

knot_numbers = 5
x_new = np.linspace(0, 1, knot_numbers+2)[1:-1]
q_knots = np.quantile(x, x_new)
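To see what this step produces, here is a small sketch on a toy x vector. The toy range and the name probs are assumptions for illustration; the logic mirrors the snippet above.

```python
import numpy as np

x = range(0, 10)                      # toy x vector: 0, 1, ..., 9
knot_numbers = 3

# Evenly spaced probabilities, with the end points 0 and 1 dropped.
probs = np.linspace(0, 1, knot_numbers + 2)[1:-1]

# Interior knots are the x values at those quantiles.
q_knots = np.quantile(x, probs)

print(probs.tolist())                 # [0.25, 0.5, 0.75]
print(q_knots.tolist())               # [2.25, 4.5, 6.75]
```

Dropping the first and last probabilities keeps the knots strictly inside the data range, which is what 'splrep' expects for interior knots.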

Next, we'll find the spline coefficients by using 'splrep'. The 'splrep' function returns a (t, c, k) tuple containing the vector of knots, the B-spline coefficients, and the degree of the spline.

Once we have these values, we'll use the BSpline class to evaluate the spline fit on the x vector.

t,c,k = interpolate.splrep(x, y, t=q_knots, s=1)
yfit = interpolate.BSpline(t,c,k)(x)
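The returned tuple can be inspected on a small synthetic series. The data below is made up for illustration. The sketch also checks one detail worth knowing: as far as I understand the 'splrep' documentation, supplying t switches the routine to a least-squares fit on those fixed knots, in which case the s value should not influence the result.

```python
import numpy as np
from scipy import interpolate

# Synthetic noisy sine wave (an assumption, not the tutorial's dataset).
rng = np.random.default_rng(1)
x = np.arange(100)
y = np.sin(x / 10) + rng.normal(scale=0.1, size=x.size)

# Five interior knots at evenly spaced quantiles of x.
q_knots = np.quantile(x, np.linspace(0, 1, 5 + 2)[1:-1])

t, c, k = interpolate.splrep(x, y, t=q_knots, s=1)
print(k)                       # degree of the spline (cubic by default)
print(len(q_knots), len(t))    # t also holds k + 1 boundary knots per end

# Refit with the same knots but a different s and compare the curves.
t3, c3, k3 = interpolate.splrep(x, y, t=q_knots, s=3)
yfit_s1 = interpolate.BSpline(t, c, k)(x)
yfit_s3 = interpolate.BSpline(t3, c3, k3)(x)
print(np.allclose(yfit_s1, yfit_s3))
```

If the two fits match, the shape of the curve is controlled by the knots alone, which is why the next section varies the number of knots rather than s.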

Finally, we can visualize the constructed spline curve on a graph.

plt.figure(figsize=(12, 6))
plt.title("Spline curve fitting")
plt.plot(x, y, '.', c="g", label="original")
plt.plot(x, yfit, '-', c="r", label="spline fit")
plt.legend()  # show the labels defined above
plt.grid()
plt.show()

Fitting with different numbers of knots

The curve looks good, but we should check how it changes with the number of knots. Here, we'll visualize the curves obtained with various knot counts. First, we'll write a function that fits the y data with a given number of knots, as shown below.

def spline(knots, y):
    x = range(0, len(y))
    x_new = np.linspace(0, 1, knots+2)[1:-1]
    q_knots = np.quantile(x, x_new)
    t, c, k = interpolate.splrep(x, y, t=q_knots, s=3)
    yfit = interpolate.BSpline(t,c, k)(x)
    return yfit

Then, we'll fit the data with each number of knots and visualize the results on a graph.

knots = [3, 10, 20, 30]
i = 0

fig, ax = plt.subplots(nrows=2, ncols=2, figsize=(8, 5))

for row in range(2):
    for col in range(2):
        ax[row][col].plot(x, y, '.',c="g", markersize=2)
        yfit = spline(knots[i], y)
        ax[row][col].plot(x, yfit, 'r')
        ax[row][col].set_title("Knots number - "+str(knots[i]))
        ax[row][col].grid()
        i=i+1

plt.tight_layout()
plt.show()

As you can see, the number of knots changes the fitted curve. After checking the graphs above, we can choose the number of knots for our target data according to our evaluation criteria.
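One possible evaluation criterion is the error between the fitted curve and the data. The sketch below is a hypothetical helper, not part of the original code: it computes the in-sample RMSE for each knot count on a made-up noisy sine series.

```python
import numpy as np
from scipy import interpolate

def spline_rmse(knots, x, y):
    """Hypothetical helper: in-sample RMSE of a least-squares spline fit."""
    q_knots = np.quantile(x, np.linspace(0, 1, knots + 2)[1:-1])
    t, c, k = interpolate.splrep(x, y, t=q_knots)
    residuals = y - interpolate.BSpline(t, c, k)(x)
    return float(np.sqrt(np.mean(residuals ** 2)))

# Synthetic data (an assumption): about three sine periods plus noise.
rng = np.random.default_rng(2)
x = np.arange(200)
y = np.sin(x / 10) + rng.normal(scale=0.1, size=x.size)

# Compare the same knot counts used in the plots above.
rmse = {knots: spline_rmse(knots, x, y) for knots in [3, 10, 20, 30]}
for knots, err in rmse.items():
    print(f"knots={knots:2d}  RMSE={err:.3f}")
```

Keep in mind that the in-sample error tends to fall as knots are added, so a low value alone does not mean a better model. Measuring the error on held-out points is a more reliable guide against overfitting.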

In this tutorial, we've briefly learned how to fit scattered data with the spline method in Python. The full source code is listed below.

Source code listing

from scipy import interpolate
import matplotlib.pyplot as plt
import numpy as np

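# load_boston was removed in scikit-learn 1.2, so this needs an older release
from sklearn.datasets import load_boston

boston = load_boston()
y = boston.target
x = range(0, len(y))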

plt.figure(figsize=(12, 6))
plt.plot(x, y, '.', c="g")
plt.grid()
plt.show()

knot_numbers = 5
x_new = np.linspace(0, 1, knot_numbers+2)[1:-1]
q_knots = np.quantile(x, x_new)

t,c,k = interpolate.splrep(x, y, t=q_knots, s=1)
yfit = interpolate.BSpline(t,c,k)(x)

plt.figure(figsize=(8, 4))
plt.title("Spline curve fitting")
plt.plot(x, y, '.', c="g", label="original")
plt.plot(x, yfit, '-', c="r", label="spline fit")
plt.legend()  # show the labels defined above
plt.grid()
plt.show()

def spline(knots, y):
    x = range(0, len(y))
    x_new = np.linspace(0, 1, knots+2)[1:-1]
    q_knots = np.quantile(x, x_new)
    t, c, k = interpolate.splrep(x, y, t=q_knots, s=3)
    yfit = interpolate.BSpline(t,c, k)(x)
    return yfit

knots = [3, 10, 20, 30]
i = 0

fig, ax = plt.subplots(nrows=2, ncols=2, figsize=(8, 5))

for row in range(2):
    for col in range(2):
        ax[row][col].plot(x, y, '.',c="g", markersize=2)
        yfit = spline(knots[i], y)
        ax[row][col].plot(x, yfit, 'r')
        ax[row][col].set_title("Knots number - "+str(knots[i]))
        ax[row][col].grid()
        i=i+1

plt.tight_layout()
plt.show()
