Curve Fitting Example with leastsq() Function in Python

    The SciPy API provides a 'leastsq()' function in its optimization library to implement the least-square method to fit the curve data with a given function. The leastsq() is used for solving nonlinear least squares problems, which often arise in data fitting and parameter estimation. The function is specifically designed to minimize the sum of squared residuals between the observed and predicted values. 

    In this tutorial, we'll learn how to fit the data with the leastsq() function by using objective functions. 

    We'll start by loading the required libraries.

 
from numpy import array
from scipy.optimize import leastsq
import matplotlib.pyplot as plt 
    

   
We define a simple x-input and y-output data for this tutorial. Feel free to use your own data instead.
 
 
y = array([12, 8, 11, 7, 5, 2, 3, 5, 6, 4, 5, 7, 8, 13, 19, 22, 25])
x = array(range(len(y)))

    The leastsq() function requires the objective function, initial parameters, and x and y input data to perform fitting.

    In this tutorial, we define various objective functions to use and check differences in their fitting. In the code below, I have defined three types of functions to fit. You can also add or change the formulas to observe the fitting differences.

We use the following equations as fitting functions.

            y = ax^2 + bx + c

            y = ax^3 + bx + c

            y = ax^2 + bx

The curve-fitting function takes initial parameters of a, b, and c, and computes the residuals (differences between the observed y and the predicted values based on the quadratic model). This is implemented in Python as shown below.

 
def func1(params, x, y):
    a, b, c = params[0], params[1], params[2]
    residual = y - (a*x**2+b*x+c)
    return residual

def func2(params, x, y):
    a, b, c = params[0], params[1], params[2]
    residual = y - (a*x**3+b*x+c)
    return residual

def func3(params, x, y):
    a, b, c = params[0], params[1], params[2]
    residual = y - (a*x**2+b*x)
    return residual

 
    We define initial parameters, and we can start with values of 0.
 
   
params = [0, 0, 0]
  
   
    Now, we set the target function, initial parameters, and input data (x and y) into the leastsq() function to obtain output data containing optimized values for parameters a, b, and c. The function returns a tuple containing the optimized solution, covariance matrix, optimization information, and a message. 
    Then, we calculate the fitted y values using the derived a, b, and c values for each data point.
 
 
result = leastsq(func1, params, (x, y))
a, b, c = result[0][0], result[0][1], result[0][2]
yfit1 = a*x**2+b*x+c

result = leastsq(func2, params, (x, y))
a, b, c = result[0][0], result[0][1], result[0][2]
yfit2 = a*x**3+b*x+c

result = leastsq(func3, params, (x, y))
a, b, c = result[0][0], result[0][1], result[0][2]
yfit3 = a*x**2+b*x 
  

    Finally, we'll visualize the results in a plot to check the deference visually.

 
plt.plot(x, y, 'bo', label="y-original")
plt.plot(x, yfit1, color="red", label="y=ax^2+bx+c")
plt.plot(x, yfit2, color="orange", label="y=ax^3+b+c")
plt.plot(x, yfit3, color="green", label="y=ax^2+bx")
plt.xlabel('x')
plt.ylabel('y')
plt.legend(loc='best', fancybox=True, shadow=True)
plt.grid(True)
plt.show()



    In this tutorial, we've briefly learned curve fitting with SciPy leastsq() function in Python.  By minimizing the sum of squared residuals between observed and predicted values, it efficiently determines optimal parameter values for a given model. The full source code is listed below.


Source code listing

 
from numpy import array
from scipy.optimize import leastsq
import matplotlib.pyplot as plt

y = array([12, 8, 11, 7, 5, 2, 3, 5, 6, 4, 5, 7, 8, 13, 19, 22, 25])
x = array(range(len(y)))


def func1(params, x, y):
    a, b, c = params[0], params[1], params[2]
    residual = y - (a*x**2+b*x+c)
    return residual

def func2(params, x, y):
    a, b, c = params[0], params[1], params[2]
    residual = y - (a*x**3+b*x+c)
    return residual

def func3(params, x, y):
    a, b, c = params[0], params[1], params[2]
    residual = y - (a*x**2+b*x)
    return residual

params=[0, 0, 0]

result = leastsq(func1, params, (x, y))
a, b, c = result[0][0], result[0][1], result[0][2]
yfit1 = a*x**2+b*x+c

result = leastsq(func2, params, (x, y))
a, b, c = result[0][0], result[0][1], result[0][2]
yfit2 = a*x**3+b*x+c

result = leastsq(func3, params, (x, y))
a, b, c = result[0][0], result[0][1], result[0][2]
yfit3 = a*x**2+b*x

plt.plot(x, y, 'bo', label="y-original")
plt.plot(x, yfit1, color="red", label="y=ax^2+bx+c")
plt.plot(x, yfit2, color="orange", label="y=ax^3+b+c")
plt.plot(x, yfit3, color="green", label="y=ax^2+bx")
plt.xlabel('x')
plt.ylabel('y')
plt.legend(loc='best', fancybox=True, shadow=True)
plt.grid(True)
plt.show()
  
 
References:

4 comments:

  1. The way this algorithms are structured helped me to do a regression analysis to evaluate the performance of a chiller. Thanks!

    ReplyDelete
  2. Very useful example. Thank you!

    ReplyDelete
  3. small error, should be plt.plot(x, yfit2, color="orange", label="y=ax^3+b+c")

    ReplyDelete