Curve Fitting Example With Nonlinear Least Squares in R

    The Nonlinear Least Squares (NLS) fitting is a statistical method used to fit a model to data by minimizing the sum of the squares of the differences between the observed and predicted values.

    In this tutorial, we'll briefly learn how to fit nonlinear data by using the 'nls' function in R. The 'nls' comes in a 'stats' base package. The tutorial covers:
  1. The concept of NLS
  2. Preparing the data
  3. NLS fitting
  4. NLS fitting with multiple functions
  5. Source code listing

 

The concept of NLS

    Nonlinear Least Squares fitting is a statistical method used to fit nonlinear models to data by minimizing the sum of squared differences between observed and predicted values. It adjusts model parameters iteratively to find the best fit. Unlike linear regression, NLS can handle complex relationships between variables. 
    The method starts with initial parameter estimates and uses iterative optimization algorithms like Gauss-Newton or Levenberg-Marquardt to converge to the best-fitting parameters.  
    NLS fitting is commonly used in fields such as statistics, engineering, biology, and economics, where relationships between variables are nonlinear and it's helpful for estimating parameters in models that cannot be easily linearized.
 
 

Preparing the data

   We'll start by generating simple test data for this tutorial. First we define a polynomial function. Then we generate sequence data for x, evaluate the function f(x) at each point in x, add random numbers from a uniform distribution [-1, 1], and finally create a data frame with columns x and y, containing the generated data points.

 
# Define the function
f <- function(x) x^3 + 2*x^2 + 5

# Generate noisy data
set.seed(123)
x <- seq(-0.99, 1, by = 0.01)
y <- f(x) + runif(length(x), -1, 1)
df <- data.frame(x = x, y = y)
 
print(head(df))
 

 
x y
1 -0.99 5.565056
2 -0.98 6.556218
3 -0.97 5.787081
4 -0.96 6.724499
5 -0.95 6.828560
6 -0.94 5.027729 



NLS fitting
 
   The nls() function in R performs nonlinear least squares regression to fit a model to data. It starts the optimization process with initial estimates for the parameters and applies an iterative optimization algorithm to minimize the sum of squared residuals between the observed data and predicted data. The optimization algorithm continues iterating until certain convergence criteria are met. 
    We use the nls() function to fit a quadratic model y = ax^2 + bx + c to the data df. The start argument defines the initial parameter values for the optimization algorithm. In this case, we start with a = 0, b = 0, and c = 0. To check the fitted model parameters we can print the model.
 
 
# Fit the quadratic model
model <- nls(y ~ a * x^2 + b * x + c, data = df, start = list(a=0, b=0, c=0))
print(model)
 

The fitted model details looks as follows:
 

Nonlinear regression model
model: y ~ a * x^2 + b * x + c
data: df
a b c
2.2031 0.5897 4.9471
residual sum-of-squares: 62.86

Number of iterations to convergence: 1
Achieved convergence tolerance: 1.096e-08 

 
    We use the predict() function to generate predicted values based on the fitted model and the input x.
 
 
# Predict values using the fitted model
pred <- predict(model, x)

 
    Finally, we create a plot with original data points using gray dots and the fitted curve  on the same plot as a solid line.
 
 
# Plot the original data and the fitted curve
plot(x, y, pch = 20, col = "darkgray", main = "NLS fitting Example", xlab = "x", ylab = "y")
lines(x, pred, lwd = 3, col = "blue")

# Add legend
legend("topleft", legend = c("y ~ ax^2 + bx + c"), col = "blue", lty = 1, lwd = 2, bty = "n")

# Add grid
grid()


 
NLS fitting with multiple functions
 
    In this part of the tutorial, we perform NLS fitting with three different models and plot the original data along with the fitted curves.
    We use the following quadratic, cubic, and exponential models as the fitting functions:

            y = ax^2 + bx + c

            y = ax^3 + bx^2 + c

            y = a*exp(bx^2) + c

We use nls() function to fit the each model specifying the formula, the input data, and initial parameters values. The predicted y data can be taken using the predict() method for each trained model. Finally, we create a plot to visualize the original and fitted curves. 

 
# Fit a quadratic model
model_quad <- nls(y ~ a * x^2 + b * x + c,
data = df,
start = list(a = 0.5, b = 0, c=0))

# Fit a cubic model
model_cubic <- nls(y ~ a * x^3 + b * x^2 + c,
data = df,
start = list(a = 0.1, b = 0.1, c = 0))

# Fit an exponential model
model_exp <- nls(y ~ a * exp(b * x^2) + c,
data = df,
start = list(a = 1, b = 1, c = 0))

# Plot the data and fitted curves
plot(df$x, df$y, pch = 20, col = "darkgray",
main = "NLS fitting Example", xlab = "x", ylab = "y")
lines(df$x, predict(model_quad), type = "l", col = "red", lwd = 2)
lines(df$x, predict(model_cubic), type = "l", col = "green", lwd = 2)
lines(df$x, predict(model_exp), type = "l", col = "blue", lwd = 2)

# Add legend with same colors as lines
legend("topleft",
legend = c("y ~ ax^2 + bx + c",
"y ~ ax^3 + bx^2 + c",
"y ~ a * exp(b * x^2) + c"),
col = c("red", "green", "blue"),
lty = 1, lwd = 2, bty = "n")

# Add grid
grid()
 

     The results show slight differences between each fitting model. You can modify or add other fitting functions to achieve better results based on the characteristics of your input data.
 
 
Conclusion
 
    In this tutorial, we learned how to implement curve fitting with NLS method in R.     
    NLS curve fitting enables the modeling of nonlinear relationships between variables. By minimizing the sum of squared differences between observed and predicted values, NLS allows for the estimation of parameters in complex models.
 
 
 
Source code listing

 
# Define the function
f <- function(x) x^3 + 2*x^2 + 5

# Generate noisy data
set.seed(123)
x <- seq(-0.99, 1, by = 0.01)
y <- f(x) + runif(length(x), -1, 1)
df <- data.frame(x = x, y = y)
print(head(df))


# Fit the quadratic model
model <- nls(y ~ a * x^2 + b * x + c, data = df, start = list(a=0, b=0, c=0))
print(model)

# Predict values using the fitted model
pred <- predict(model, x)

# Plot the original data and the fitted curve
plot(x, y, pch = 20, col = "darkgray", main = "NLS fitting Example", xlab = "x", ylab = "y")
lines(x, pred, lwd = 3, col = "blue")

# Add legend
legend("topleft", legend = c("y ~ ax^2 + bx + c"), col = "blue", lty = 1, lwd = 2, bty = "n")

# Add grid
grid()


# NLS fitting with different fitting functions
# Fit a quadratic model
model_quad <- nls(y ~ a * x^2 + b * x + c,
data = df,
start = list(a = 0.5, b = 0, c=0))

# Fit a cubic model
model_cubic <- nls(y ~ a * x^3 + b * x^2 + c,
data = df,
start = list(a = 0.1, b = 0.1, c = 0))

# Fit an exponential model
model_exp <- nls(y ~ a * exp(b * x^2) + c,
data = df,
start = list(a = 1, b = 1, c = 0))

# Plot the data and fitted curves
plot(df$x, df$y, pch = 20, col = "darkgray",
main = "NLS fitting Example", xlab = "x", ylab = "y")
lines(df$x, predict(model_quad), type = "l", col = "red", lwd = 2)
lines(df$x, predict(model_cubic), type = "l", col = "green", lwd = 2)
lines(df$x, predict(model_exp), type = "l", col = "blue", lwd = 2)

# Add legend with same colors as lines
legend("topleft",
legend = c("y ~ ax^2 + bx + c",
"y ~ ax^3 + bx^2 + c",
"y ~ a * exp(b * x^2) + c"),
col = c("red", "green", "blue"),
lty = 1, lwd = 2, bty = "n")

# Add grid
grid()



2 comments:

  1. Your line: fit = nls(y~a*x^2+b*x, data = df, start(a=0, b=0))
    Should be: fit = nls(y~a*x^2+b*x, data = df, start=list(a=0, b=0))

    ReplyDelete
  2. y = peq(x) + runif(200) kicks back an error. I think you meant y = p(x) + runif(200)

    ReplyDelete