Understanding Moving Average with R

   The Moving Average (MA) technique computes the mean of a fixed-size window (subset) of consecutive values and slides that window across the entire data series. It is a simple and widely used technique for analyzing time-series data in statistics. Replacing each value with the mean of its neighbors smooths the series: most of the short-term ups and downs and sharp fluctuations are reduced, and the long-term trend becomes easier to see.
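
To make the definition concrete, here is a minimal worked example on a made-up five-value series. The vector `x`, the window size `k`, and the name `ma_small` are illustrative assumptions, not part of this tutorial's data. With a window of three, each output value is the mean of three consecutive inputs:

```r
# Illustrative data, not the tutorial's series
x = c(2, 4, 6, 8, 10)
k = 3  # window size (assumed for this example)

# One mean per full window: (2+4+6)/3, (4+6+8)/3, (6+8+10)/3
starts = 1:(length(x) - k + 1)
ma_small = sapply(starts, function(i) mean(x[i:(i + k - 1)]))
print(ma_small)  # 4 6 8
```

Note that five inputs yield only three averages, because a full window does not fit at the two ends of the series.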


   In this tutorial, we'll learn how to calculate an MA with window sizes of three and five for a given data series in R. The tutorial covers:

  1. Creating sample data
  2. Implementing the MA function
  3. Visualizing in a plot
  4. Source code listing
Let's get started.


Creating sample data

First, we'll create sample numerical series data for this tutorial and visualize it in a plot.

n = 100
# Index values 0.1, 0.2, ..., 10
s = seq(.1, n/10, .1)
# Slowly rising trend plus Gaussian and uniform noise
price = s*sin(s/20) + rnorm(n) + runif(n)
plot(price, type="l", col="blue")
grid()
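
Because rnorm() and runif() draw random numbers, each run of the code above produces a different series. If repeatable results are wanted, one option is to fix the random seed before generating the data. The sketch below assumes an arbitrary seed value of 123 and the names `price_a` and `price_b`; the numbers it produces will not match the output printed later in this tutorial.

```r
n = 100
s = seq(.1, n/10, .1)

set.seed(123)  # arbitrary seed, assumed for this sketch
price_a = s*sin(s/20) + rnorm(n) + runif(n)

set.seed(123)  # same seed again, so the same random draws
price_b = s*sin(s/20) + rnorm(n) + runif(n)

identical(price_a, price_b)  # TRUE
```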




Implementing the MA function

Next, we'll write a function to calculate the MA in R. The function computes centered moving averages with window sizes of three and five.

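ma = function(x)
{
    n = length(x)
    # Positions that are never assigned stay NA and are dropped by na.omit() below
    y_ma3 = numeric()
    y_ma5 = numeric()
    # Parentheses are required: 2:n-1 is parsed as (2:n)-1, i.e. 1:(n-1),
    # which would read x[0] on the first iteration
    for (i in 2:(n-1))
    {
        # 3-point average centered on x[i]
        y_ma3[i]=(x[i-1]+x[i]+x[i+1])/3
        # 5-point average centered on x[j], where j = i+1 (stored at position i);
        # reading past the end of x yields NA
        j=i+1
        y_ma5[i]=(x[j-2]+x[j-1]+x[j]+x[j+1]+x[j+2])/5
    }
    return(list(ma3=na.omit(y_ma3), ma5=na.omit(y_ma5)))
}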

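A centered window cannot be computed at the edges of the series, so the raw results contain NA values at the beginning and the end. The function drops them with na.omit() before returning, which is why the outputs are shorter than the input.
Now we can calculate MA values for the 'price' data.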

y = ma(price)
str(y)
 
 List of 2
 $ ma3: atomic [1:98] 0.6664 1.0633 0.816 -0.0465 -0.0128 ...
  ..- attr(*, "na.action")=Class 'omit'  int 1
 $ ma5: atomic [1:96] 0.481 0.444 0.567 0.219 -0.147 ...
  ..- attr(*, "na.action")=Class 'omit'  int [1:3] 1 98 99
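
As a cross-check, base R already provides a moving-average tool: stats::filter() with equal weights and sides = 2 computes a centered average and leaves NA where the window does not fit. The sketch below uses a small made-up vector `x` (an assumption for illustration, as are the names `f3` and `m3`) and compares the built-in result with a hand-written loop:

```r
x = c(1, 3, 5, 4, 6, 8, 7)  # illustrative data

# Built-in centered 3-point moving average; the two ends are NA
f3 = as.numeric(stats::filter(x, rep(1/3, 3), sides = 2))

# Hand-written equivalent for comparison
m3 = rep(NA_real_, length(x))
for (i in 2:(length(x) - 1)) {
    m3[i] = (x[i-1] + x[i] + x[i+1]) / 3
}

all.equal(f3, m3)  # TRUE
```

Unlike a result passed through na.omit(), this output has the same length as the input, so each average stays at the position of the value it is centered on.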


Visualizing in a plot

Next, we'll visualize the results in a graph.

par(mar = c(2,2,1,1))
par(mfrow = c(2,1))
plot(price, type="l", col="blue")
# ma3 values are centered on positions 2..(n-1), so shift x by one
lines(seq_along(y$ma3) + 1, y$ma3, col="red", lwd=2)
legend("topleft", box.col = "white", legend=c("original", "MA-3"),
       col=c("blue", "red"), lty=1, cex=0.6)
grid()

plot(price, type="l", col="blue")
# ma5 values are centered on positions 3..(n-2), so shift x by two
lines(seq_along(y$ma5) + 2, y$ma5, col="orange", lwd=2)
legend("topleft", box.col = "white", legend=c("original", "MA-5"),
       col=c("blue", "orange"), lty=1, cex=0.6)
grid()



The plot shows that the MA values are smoother than the original data, and the MA-5 line is smoother than the MA-3 line because it averages over more points. This makes the overall trend of the data easy to see.
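
Smoothness can also be checked numerically. One rough measure, chosen here for illustration rather than taken from the tutorial, is the standard deviation of point-to-point changes, which should shrink after averaging. The sketch regenerates a series with the same formula so that it runs on its own; the seed and the name `roughness` are assumptions, and the averages are computed with base R's stats::filter().

```r
set.seed(1)  # arbitrary seed, assumed for this sketch
n = 100
s = seq(.1, n/10, .1)
price = s*sin(s/20) + rnorm(n) + runif(n)

# Centered 3-point and 5-point averages, with the NA edges removed
ma3 = as.numeric(na.omit(stats::filter(price, rep(1/3, 3), sides = 2)))
ma5 = as.numeric(na.omit(stats::filter(price, rep(1/5, 5), sides = 2)))

# Roughness: spread of point-to-point changes (smaller means smoother)
roughness = c(original = sd(diff(price)),
              ma3 = sd(diff(ma3)),
              ma5 = sd(diff(ma5)))
print(round(roughness, 3))
```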

   In this tutorial, we've learned how to implement an MA with window sizes of three and five for a given data series. The source code is listed below.


Source code listing

n = 100
s = seq(.1, n/10, .1)
price = s*sin(s/20) + rnorm(n) + runif(n)
 
plot(price, type="l", col="blue")
grid()
 
ma = function(x)
{
    n = length(x)
    y_ma3 = numeric()
    y_ma5 = numeric()
    for (i in 2:(n-1))  # parentheses required: 2:n-1 means (2:n)-1
    {
        y_ma3[i]=(x[i-1]+x[i]+x[i+1])/3
        j=i+1
        y_ma5[i]=(x[j-2]+x[j-1]+x[j]+x[j+1]+x[j+2])/5
    }
    return(list(ma3=na.omit(y_ma3), ma5=na.omit(y_ma5)))
}
 
y = ma(price)
str(y) 
 
par(mar = c(2,2,1,1))
par(mfrow = c(2,1))
plot(price, type="l", col="blue")
# ma3 values are centered on positions 2..(n-1), so shift x by one
lines(seq_along(y$ma3) + 1, y$ma3, col="red", lwd=2)
legend("topleft", box.col = "white", legend=c("original", "MA-3"),
       col=c("blue", "red"), lty=1, cex=0.6)
grid()

plot(price, type="l", col="blue")
# ma5 values are centered on positions 3..(n-2), so shift x by two
lines(seq_along(y$ma5) + 2, y$ma5, col="orange", lwd=2)
legend("topleft", box.col = "white", legend=c("original", "MA-5"),
       col=c("blue", "orange"), lty=1, cex=0.6)
grid()
