How to Forecast Time-series Data in R


   Forecasting the future direction of a time series, such as a price or a sales trend, is a common task in data analysis. Time series forecasting estimates future values from historical observations.
   In this post, we'll learn how to forecast time series data and plot it in R by using the forecast package. We use simple simulated time-series data in this tutorial. We'll start by loading the required libraries. You can install them by typing 'install.packages(c("forecast", "ggplot2"))' at the R command prompt.

library(forecast)
library(ggplot2)

  First, we'll generate daily price data running from 2017/01/01 to 2018/04/01. Based on that period, we will forecast the price trend for the following 30 days.

# Dates covered by the simulated (observed) data
actual_days = seq.Date(from = as.Date("2017/01/01"),
                       to = as.Date("2018/04/01"),
                       by = "day")
 
# Dates of the 30-day forecast period
forecast_days = seq.Date(from = as.Date("2018/04/02"),
                         to = as.Date("2018/05/01"),
                         by = "day")
 
n = length(actual_days)
s = seq(.1, n/10, .1)
price = 10 + s*sin(s/500) + rnorm(n) + runif(n)
df = data.frame(date = actual_days, price = price)
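
Because the price series is built from rnorm() and runif(), each run produces different numbers. If you want repeatable results, one option is to fix the random seed before simulating. The sketch below is only an illustration; the seed value 42 and the names a and b are arbitrary choices, not part of the original code.

```r
# Fix the seed, draw some random values, then repeat with the same seed
set.seed(42)
a = rnorm(5) + runif(5)

set.seed(42)
b = rnorm(5) + runif(5)

# The two draws match exactly because the seed was reset
identical(a, b)
```

Calling set.seed() once, just before the line that creates price, should be enough to make the whole tutorial reproducible.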


We can visualize the simulated data and check the trend.

ggplot(df, aes(x = date, y = price)) + 
   geom_line(color="blue") +
   scale_x_date(date_labels="%Y-%m", date_breaks="months") +
   theme(axis.text.x = element_text(angle=70, hjust=1))



Forecasting

   First, we need to create a time-series object. The ts() function creates a time-series object from a given vector or matrix of observations. Here, price is the observation data, and we set the frequency parameter to 7 so that each period (one week) contains seven daily observations.

ts_price = ts(price, frequency = 7)
str(ts_price)
 Time-Series [1:456] from 1 to 66: 12.3 10.2 10.3 9.7 10.8 ... 
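
The "from 1 to 66" in the output follows from the frequency setting: with 7 observations per period, 456 daily values span a little over 65 weekly periods. The following sketch shows how to check this; the series x is a hypothetical stand-in for price, used only to keep the example self-contained.

```r
# Hypothetical stand-in for the simulated price series (456 daily values)
set.seed(123)
x = 10 + rnorm(456)

ts_x = ts(x, frequency = 7)

# tsp() returns the start time, end time, and frequency of the series
tsp(ts_x)

# cycle() gives the position of each observation within its 7-day period
as.integer(cycle(ts_x))[1:10]
```

The end time is 1 + (456 - 1) / 7, which is why the series runs from 1 to 66 rather than over calendar dates.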

Now we can forecast the ts_price object by using the forecast() function. The h parameter sets the forecast horizon; we set it to 30 to forecast 30 days ahead.

fc = forecast(ts_price, h=30)
names(fc)
 [1] "model"     "mean"      "level"     "x"         "upper"    
 [6] "lower"     "fitted"    "method"    "series"    "residuals"


You can inspect these elements of the 'fc' object to learn more about the forecast.
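
As a rough illustration of what some of these elements hold, the sketch below fits a forecast on a small hypothetical series x (not the tutorial's price data) and looks at a few of them. When no model is supplied, forecast() chooses one automatically; for a short-period series like this one, that is an exponential smoothing (ETS) model as far as I know, and fc$method reports what was actually used.

```r
library(forecast)

# Hypothetical stand-in series with a weekly period
set.seed(123)
x = ts(50 + cumsum(rnorm(200, sd = 0.2)), frequency = 7)

fc = forecast(x, h = 30)

fc$method        # description of the model that was selected
fc$level         # levels of the prediction intervals
head(fc$mean)    # point forecasts
head(fc$lower)   # lower interval bounds, one column per level
head(fc$upper)   # upper interval bounds, one column per level
```

The fitted element holds the in-sample one-step-ahead fitted values, while mean holds the forecasts for the future period.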

Next, we'll visualize the forecasted data in a plot.

plot(fc, ylab = "price", xaxt = "n")
lines(fc$fitted, col = "red", lwd = 2)
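
As an alternative to base plot(), the forecast package also provides an autoplot() method for forecast objects, which returns a ggplot2 plot. The following is a minimal sketch on a hypothetical series x, not the tutorial's price data.

```r
library(forecast)
library(ggplot2)

# Hypothetical stand-in series with a weekly period
set.seed(123)
x = ts(10 + rnorm(200), frequency = 7)
fc = forecast(x, h = 30)

# autoplot() draws the series, the point forecasts, and the interval bands
p = autoplot(fc) + ylab("price")
p
```

Since the result is a regular ggplot2 object, further layers and themes can be added with the usual + syntax.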


   The plot shows the forecast in the default format. To show dates on the x-axis, we can combine the observed and forecasted values into one data frame and plot it again with ggplot2.

fc_df = rbind(df, data.frame(date=forecast_days, price = NA))
fc_df = cbind(fc_df, fitted = c(fc$fitted, fc$mean))
 
ggplot() + 
  geom_line(aes(date, price), fc_df,color = "blue") +  
  geom_line(aes(date, fitted), fc_df, color = "red", lwd = 1) +
  scale_x_date(date_labels = "%Y-%m", date_breaks = "months")+
  theme(axis.text.x = element_text(angle = 70, hjust = 1))
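
The plot above shows only the point forecasts. If you also want to show the uncertainty, one possible approach is to draw the prediction interval stored in fc$lower and fc$upper as a shaded band. The sketch below assumes the default levels (80 and 95), so column 2 is the 95% interval; the data frame name band and the simplified price simulation are illustrative choices, not part of the original code.

```r
library(forecast)
library(ggplot2)

set.seed(123)
actual_days = seq.Date(as.Date("2017/01/01"), as.Date("2018/04/01"), by = "day")
forecast_days = seq.Date(as.Date("2018/04/02"), as.Date("2018/05/01"), by = "day")
n = length(actual_days)
price = 10 + rnorm(n) + runif(n)   # simplified stand-in for the simulated price

fc = forecast(ts(price, frequency = 7), h = 30)

# One row per forecast day: point forecast plus the 95% interval bounds
band = data.frame(date  = forecast_days,
                  mean  = as.numeric(fc$mean),
                  lower = as.numeric(fc$lower[, 2]),
                  upper = as.numeric(fc$upper[, 2]))

p = ggplot() +
  geom_line(aes(date, price), data.frame(date = actual_days, price = price),
            color = "blue") +
  geom_ribbon(aes(date, ymin = lower, ymax = upper), band,
              fill = "red", alpha = 0.2) +
  geom_line(aes(date, mean), band, color = "red")
p
```

Using fc$lower[, 1] and fc$upper[, 1] instead would draw the narrower 80% interval.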



   The plot shows the observed period together with the forecasted trend for the next 30 days. In this post, we've briefly learned how to forecast time-series data in R by using the 'forecast' package. Thank you for reading!


Source code listing

 
library(forecast)
library(ggplot2)
 
actual_days = seq.Date(from = as.Date("2017/01/01"),
                       to = as.Date("2018/04/01"),
                       by = "day")
 
forecast_days = seq.Date(from = as.Date("2018/04/02"),
                         to = as.Date("2018/05/01"),
                         by = "day")
 
n = length(actual_days)
s = seq(.1, n/10, .1)
price = 10 + s*sin(s/500) + rnorm(n) + runif(n)
df = data.frame(date = actual_days, price = price)
 
ggplot(df, aes(x = date, y = price)) + 
  geom_line(color="blue") +
  scale_x_date(date_labels="%Y-%m", date_breaks="months") +
  theme(axis.text.x = element_text(angle=70, hjust=1))
 
ts_price = ts(price, frequency = 7)
str(ts_price) 
 
fc = forecast(ts_price, h=30)
names(fc)
 
plot(fc, ylab = "price", xaxt = "n")
lines(fc$fitted, col = "red", lwd = 2)
 
fc_df = rbind(df, data.frame(date=forecast_days, price = NA))
fc_df = cbind(fc_df, fitted = c(fc$fitted, fc$mean))
 
ggplot() + 
   geom_line(aes(date, price), fc_df,color = "blue") +  
   geom_line(aes(date, fitted), fc_df, color = "red", lwd = 1) +
   scale_x_date(date_labels = "%Y-%m", date_breaks = "months")+
   theme(axis.text.x = element_text(angle = 70, hjust = 1))
 
