Scattered Data Spline Fitting Example in Python

    Interpolation is a method of estimating unknown data points within a given range. Spline interpolation is a piecewise polynomial interpolation method, and it is also useful for smoothing curve or surface data.
    In my previous posts, I explained how to implement spline interpolation and B-spline curve fitting in Python. The same spline smoothing method can be applied to scattered data. In this tutorial, you'll learn how to fit scattered data by using spline functions in Python.

    The tutorial covers:

  1. Preparing test data
  2. Spline curve fitting
  3. Fitting with various numbers of knots

     We'll start by loading the required libraries for this tutorial.

 
# Note: load_boston was removed in scikit-learn 1.2, so this import requires an older release.
from sklearn.datasets import load_boston
from scipy import interpolate
import matplotlib.pyplot as plt
import numpy as np 
 
 
 
Preparing test data
 
    As target data, we can use the Boston housing dataset. We need only the label (target) part of the dataset; the code below shows how to extract it and plot it on a graph.
 
 
boston = load_boston()
y = boston.target
x = range(0, len(y))

plt.figure(figsize=(10, 5))
plt.plot(x, y, '.', c="g")
plt.grid()
plt.show() 
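
    If load_boston is unavailable in your scikit-learn installation, a synthetic series can stand in for the target vector while following along. The sketch below is only an illustration: the helper name make_scattered_data, the sine-plus-noise shape, the length of 506 points, and the seed are all assumptions, not properties of the original dataset.

```python
import numpy as np

def make_scattered_data(n=506, seed=0):
    """Return (x, y): a noisy, wavy series used only as stand-in test data."""
    rng = np.random.default_rng(seed)
    x = np.arange(n)
    # Slow oscillation around 22 plus Gaussian noise (all values are arbitrary)
    y = 22.0 + 6.0 * np.sin(x / 40.0) + rng.normal(0.0, 3.0, size=n)
    return x, y

x, y = make_scattered_data()
print(x.shape, y.shape)
```

    The resulting x and y can be passed to the plotting and fitting code in the same way as the dataset's target values.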
 
 
 
Spline curve fitting
 
    To construct a smoothed spline fit, we need to specify the number of knots for the target data. Knots are the points where the polynomial segments join.
    Based on the number of knots, we'll place the interior knots at evenly spaced quantiles of the x data by using NumPy's 'quantile' function.
 

knot_numbers = 5
x_new = np.linspace(0, 1, knot_numbers+2)[1:-1]
q_knots = np.quantile(x, x_new) 
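
    To see what these lines produce, here is a minimal sketch that wraps them in a helper and prints the knot positions. The helper name quantile_knots is hypothetical, and the index vector of 506 points is an assumption made for the example.

```python
import numpy as np

def quantile_knots(x, n_knots):
    """Interior knots at evenly spaced quantiles of x (end points excluded)."""
    # n_knots + 2 evenly spaced probabilities, dropping 0 and 1
    probs = np.linspace(0, 1, n_knots + 2)[1:-1]
    return np.quantile(x, probs)

x = np.arange(506)
knots = quantile_knots(x, 5)
print(np.round(knots, 2))
```

    For this x, the five knots fall at multiples of 505/6, roughly 84.2, 168.3, 252.5, 336.7 and 420.8, so they split the range into six equal parts.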

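    Next, we'll find the spline coefficients by using 'splrep'. The 'splrep' function returns a (t, c, k) tuple containing the full knot vector, the B-spline coefficients, and the degree of the spline. Because we pass the interior knots through the 't' argument, 'splrep' computes a least-squares spline on those knots, and the smoothing factor 's' has no effect on the result.

    After getting these values, we'll use the BSpline class to evaluate the spline fit on the x data.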


t,c,k = interpolate.splrep(x, y, t=q_knots, s=1)
yfit = interpolate.BSpline(t,c,k)(x) 
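
    A small check on synthetic data can make the roles of t and s clearer. The sketch below assumes a noisy sine series (not the Boston data) and shows two things: the returned knot vector holds the interior knots padded with k+1 copies of each end point, and, when t is passed, changing s does not change the fitted coefficients.

```python
import numpy as np
from scipy import interpolate

rng = np.random.default_rng(0)
x = np.arange(200, dtype=float)
y = np.sin(x / 15.0) + rng.normal(0.0, 0.2, size=x.size)  # assumed test signal

# Interior knots at evenly spaced quantiles, as in the tutorial
interior = np.quantile(x, np.linspace(0, 1, 5 + 2)[1:-1])

# Same knots, two very different smoothing factors
t, c, k = interpolate.splrep(x, y, t=interior, s=1)
t2, c2, k2 = interpolate.splrep(x, y, t=interior, s=1000)

print("degree:", k)
print("full knot vector length:", len(t))  # interior + 2 * (k + 1)
print("interior part:", t[k + 1:-(k + 1)])
print("same coefficients for both s values:", np.allclose(c, c2))

# Evaluate the fitted spline on the original x values
yfit = interpolate.BSpline(t, c, k)(x)
```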
 

    Finally, we can visualize the constructed spline curve on a graph.
 

plt.figure(figsize=(12, 6))
plt.title("Spline curve fitting")
plt.plot(x, y, '.', c="g", label="original")
plt.plot(x, yfit, '-', c="r", label="spline fit")
plt.legend(loc='best', fancybox=True, shadow=True)
plt.grid()
plt.show() 
 
   
 


Fitting with various numbers of knots

     The curve looks good, but we should check other options by changing the number of knots. Here, we'll visualize the curves obtained with various knot counts. First, we'll write a function that fits the y data with a given number of knots, as shown below.

 
def spline(knots, y):
    x = range(0, len(y))
    x_new = np.linspace(0, 1, knots+2)[1:-1]
    q_knots = np.quantile(x, x_new)
    t, c, k = interpolate.splrep(x, y, t=q_knots, s=3)
    yfit = interpolate.BSpline(t,c, k)(x)
    return yfit
 
  
    Then, we'll fit the data with each knot count and visualize the results on a graph.
 
 
knots = [3, 10, 20, 30]
i = 0

fig, ax = plt.subplots(nrows=2, ncols=2, figsize=(8, 5))
 
for row in range(2):
    for col in range(2):
        ax[row][col].plot(x, y, '.',c="g", markersize=2)
        yfit = spline(knots[i], y)
        ax[row][col].plot(x, yfit, 'r')
        ax[row][col].set_title("Knots number - "+str(knots[i]))
        ax[row][col].grid()
        i=i+1
        
plt.tight_layout()        
plt.show() 
 
  
 

    As you can see, the number of knots changes the fitted curve. After checking the above graphs, we can choose the number of knots for our target data according to our evaluation criteria.
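
    One possible way to put a number on this comparison is to compute an error measure for each knot count. The sketch below is only an illustrative assumption, not part of the workflow above: it uses a synthetic noisy sine series and the root-mean-square error (RMSE) on the fitted points. The helper names fit_spline and rmse are hypothetical.

```python
import numpy as np
from scipy import interpolate

def fit_spline(x, y, n_knots):
    """Least-squares cubic spline with n_knots interior knots at quantiles of x."""
    probs = np.linspace(0, 1, n_knots + 2)[1:-1]
    q_knots = np.quantile(x, probs)
    t, c, k = interpolate.splrep(x, y, t=q_knots)
    return interpolate.BSpline(t, c, k)(x)

def rmse(y, yfit):
    """Root-mean-square error between the data and the fitted values."""
    return float(np.sqrt(np.mean((y - yfit) ** 2)))

# Assumed test signal: about six sine periods plus Gaussian noise
rng = np.random.default_rng(0)
x = np.arange(300, dtype=float)
y = 5.0 * np.sin(x / 8.0) + rng.normal(0.0, 1.0, size=x.size)

errors = {n: rmse(y, fit_spline(x, y, n)) for n in [3, 10, 20, 30]}
for n, err in errors.items():
    print(f"knots={n:2d}  RMSE={err:.3f}")
```

    A lower error on the fitted points alone does not mean a better choice, since more knots can also follow the noise; comparing the error on held-out points is a common alternative.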

    In this tutorial, we've briefly learned how to fit scattered data with the spline method in Python. The full source code is listed below.
 
 
Source code listing

 
from sklearn.datasets import load_boston
from scipy import interpolate
import matplotlib.pyplot as plt
import numpy as np


boston = load_boston()
y = boston.target
x = range(0, len(y))

plt.figure(figsize=(12, 6))
plt.plot(x, y, '.', c="g")
plt.grid()
plt.show()


knot_numbers = 5
x_new = np.linspace(0, 1, knot_numbers+2)[1:-1]
q_knots = np.quantile(x, x_new) 
 
t,c,k = interpolate.splrep(x, y, t=q_knots, s=1)
yfit = interpolate.BSpline(t,c,k)(x)

plt.figure(figsize=(8, 4))
plt.title("Spline curve fitting")
plt.plot(x, y, '.', c="g", label="original")
plt.plot(x, yfit, '-', c="r", label="spline fit")
plt.legend(loc='best', fancybox=True, shadow=True)
plt.grid()
plt.show() 
 
  
def spline(knots, y):
    x = range(0, len(y))
    x_new = np.linspace(0, 1, knots+2)[1:-1]
    q_knots = np.quantile(x, x_new)
    t, c, k = interpolate.splrep(x, y, t=q_knots, s=3)
    yfit = interpolate.BSpline(t,c, k)(x)
    return yfit


knots = [3, 10, 20, 30]
i = 0

fig, ax = plt.subplots(nrows=2, ncols=2, figsize=(8, 5))
 
for row in range(2):
    for col in range(2):
        ax[row][col].plot(x, y, '.',c="g", markersize=2)
        yfit = spline(knots[i], y)
        ax[row][col].plot(x, yfit, 'r')
        ax[row][col].set_title("Knots number - "+str(knots[i]))
        ax[row][col].grid()       
        i=i+1
        
plt.tight_layout()        
plt.show() 
  
 