Hair Parra

Summary

The webpage provides a comprehensive guide on time series analysis using R, with a focus on classical decomposition models, trend estimation, seasonality analysis, and the practical application of these concepts to antidiabetic drug sales data.

Abstract

The article delves into the classical decomposition model for time series analysis, building upon previous discussions on trend estimation with moving averages. It introduces methods for estimating seasonality, which is crucial for understanding recurring patterns within data. The author illustrates these techniques by analyzing antidiabetic drug sales data, demonstrating how to detect and quantify seasonal trends and remove them to achieve a white noise residual. The process involves using R libraries to perform exponential smoothing, moving average fitting, and seasonality estimation. The article emphasizes the importance of obtaining white noise residuals for better model fitting, which can then be used for predictions by reincorporating the trend and seasonal components. The author concludes by hinting at the next topic, differencing, and encourages readers to apply these techniques independently.

Opinions

  • The author believes that understanding both trend and seasonality is essential for accurate time series analysis.
  • The use of R and its libraries, such as fpp2, is advocated for performing time series decomposition and analysis.
  • Exponential smoothing and moving average models are presented as viable options for trend estimation, with a preference for methods that yield residuals resembling white noise.
  • The article suggests that the classical decomposition analysis is a reliable approach for analyzing time series data, particularly when the goal is to isolate and understand seasonal effects.
  • There is an emphasis on the practical application of theoretical concepts, with a full example using real-world data to demonstrate the methodology.
  • The author implies that mastery of these techniques will empower readers to conduct their own time series analyses and improve their predictive modeling capabilities.
  • The anticipation of future articles on differencing indicates the author's view that there is always more to learn and apply in the field of time series analysis.

A Complete Introduction To Time Series Analysis (with R): Classical Decomposition Model (Part II)

The antidiabetic drug sales data shows a clear pattern: by the end of every year, antidiabetic drug sales spike. Perhaps those Christmas candies and cookies are to blame!

In the last article, we introduced the classical decomposition model, and had a comprehensive discussion of trend estimation, notably using the moving average filter. This time, we will discuss the final missing part: seasonality.

Estimating Seasonality

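Assume that $Y_t$ is some white noise process, and consider the classical decomposition model

$$X_t = m_t + s_t + Y_t, \qquad s_{t+d} = s_t, \qquad \sum_{j=1}^{d} s_j = 0,$$

where $m_t$ is the trend, $s_t$ is the seasonal component, and $d$ is the period of the seasonal component.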

The equations above are just mathematical formalizations of the notion that the process “kind of behaves similarly” every certain period or season. For instance, you can expect winter to start at roughly the same time every year (even in Canada, where we have winter 8 months a year). How can we estimate this seasonal component? Suppose we have observed some series $x_1, \dots, x_n$.

Suppose also that these data have a seasonal component with period d, and let k = 1, ..., d be an index ranging over the seasons. To estimate the seasonal component, we follow this procedure:

The following table helps visualize how the data are organized for 12 observations and four seasons:

Visualization of data organization for n=12 and d=4
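As a minimal sketch of how such an estimate can be computed, the snippet below lays out n = 12 values in rows of d = 4 seasons, averages the values that fall in each season, and centres those averages so that they sum to zero. It assumes the trend has already been removed (the toy series has a constant level), and the names `s_true`, `idx`, `w`, and `s_hat` are illustrative, not taken from the original code.

```r
# Toy series: constant level 10 plus a seasonal pattern of period d = 4
d <- 4
s_true <- c(1, -2, 3, -2)            # sums to zero by construction
x <- 10 + rep(s_true, times = 3)     # n = 12 observations

# Index layout: one row per cycle, one column per season k = 1, ..., d
idx <- matrix(seq_along(x), ncol = d, byrow = TRUE)
print(idx)
#      [,1] [,2] [,3] [,4]
# [1,]    1    2    3    4
# [2,]    5    6    7    8
# [3,]    9   10   11   12

# Average the observations that fall in each season (column means)
w <- colMeans(matrix(x, ncol = d, byrow = TRUE))

# Centre the averages so that the seasonal estimates sum to zero
s_hat <- w - mean(w)
print(s_hat)                         # recovers s_true for this noise-free toy series
```

With real data, the same averaging would be applied to the detrended values rather than to the raw observations.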

Classical Decomposition Analysis

Now that we are equipped with the tools to estimate seasonality, let’s jump right to the classical decomposition analysis, that is, the procedure to follow when analyzing any time series under the assumption of a classical decomposition.

2. Estimate the seasonal component using the procedure presented before, and obtain deseasonalized data dt.

Hopefully, the resulting Yt-tilde looks like white noise, which suggests that our analysis was adequate. Let’s see a fully worked example!
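Before turning to the drug sales data, here is a rough base-R sketch of the whole analysis, following the standard textbook version of the classical decomposition. Several things are assumptions here: the built-in `co2` series is only a stand-in dataset, the order-5 moving average used to re-estimate the trend is an arbitrary choice, and all variable names are illustrative.

```r
# Stand-in data: built-in monthly CO2 series (not the drug sales data)
x <- co2
d <- frequency(x)                          # 12 observations per year
k <- as.integer(cycle(x))                  # season index of each observation

# Step 1: preliminary trend, a centred moving average spanning one full period
# (half weights at both ends because d is even)
w <- c(0.5, rep(1, d - 1), 0.5) / d
m1 <- stats::filter(x, w, sides = 2)

# Step 2: seasonal component from the detrended values
dev <- as.numeric(x - m1)                  # NA where the filter has no full window
wk <- tapply(dev, k, mean, na.rm = TRUE)   # average deviation per season
s_k <- as.numeric(wk - mean(wk))           # centre so the estimates sum to zero
s_hat <- ts(s_k[k], start = start(x), frequency = d)
d_t <- x - s_hat                           # deseasonalized data

# Step 3: re-estimate the trend from the deseasonalized data
m_hat <- stats::filter(d_t, rep(1 / 5, 5), sides = 2)

# Step 4: residuals, which should ideally resemble white noise
y_tilde <- x - m_hat - s_hat
plot(y_tilde, main = "Detrended and deseasonalized series")
```

The residual series has missing values at both ends, since the moving average needs a full window of neighbours.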

How to R

First, load all the required libraries:

In this example, we consider the antidiabetic drug sales data, which can be found in the fpp2 package:

It is clear that there is not only a trend, but also a seasonal component that repeats every year, when sales suddenly spike. We can estimate the trend with the ma function (that is, by fitting a moving average) and also with the ses function, for exponential smoothing with alpha=0.2.
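To make explicit what these two smoothers compute, here is a small base-R sketch on toy numbers. It is a stand-in, not the article's code: with fpp2 loaded, the corresponding calls would presumably be along the lines of `ma(a10, order = 5)` and `ses(a10, alpha = 0.2)`, where the order 5 is inferred from the "MA5" label. The starting level of the exponential smoother is also an assumption (the first observation is used here).

```r
x <- c(2, 4, 6, 8, 10, 12)        # toy data, not the drug sales series

# Centred moving average of order 5: the mean of each point and its two
# neighbours on either side (undefined for the first and last two points)
ma5 <- stats::filter(x, rep(1 / 5, 5), sides = 2)

# Simple exponential smoothing: level_i = alpha * x_i + (1 - alpha) * level_{i-1}
alpha <- 0.2
level <- numeric(length(x))
level[1] <- x[1]                  # assumed starting level
for (i in 2:length(x)) {
  level[i] <- alpha * x[i] + (1 - alpha) * level[i - 1]
}

# What is left after subtracting each trend estimate
res_ma5 <- x - as.numeric(ma5)
res_ses <- x - level

print(as.numeric(ma5))
print(level)
```

A small alpha such as 0.2 gives most of the weight to the previous level, so the smoothed curve reacts slowly to new observations.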

Original data and trend estimations for the antidiabetic drugs data.

We see that these average estimates also capture some of the seasonality. How do we know which one is better? We can inspect the residuals:

We see from the residuals that the exponential smoothing was perhaps overkill, and overall the MA5 residuals look closer to White Noise. Now, our task is to estimate seasonality: to do so, we will obtain the frequency of the data (that is, the number of seasons per period), and plot what this seasonality looks like:

We see that the peaks indeed occur exactly every 12 seasons (that is, every 12 months, or one year). We can even obtain the coefficients for this model:
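One way to obtain the frequency and one coefficient per season in base R is sketched below. This is an assumption about how it could be done, not a reproduction of the original code: it uses `frequency()` together with `decompose()` from the stats package, and the built-in `co2` series as a stand-in for the drug sales data.

```r
x <- co2                                   # stand-in monthly series
d <- frequency(x)                          # number of seasons per period
print(d)

# Additive classical decomposition; $figure holds one estimate per season
dec <- decompose(x, type = "additive")
coefs <- setNames(dec$figure, month.abb)   # co2 starts in January
print(round(coefs, 2))

# Quick look at the estimated seasonal pattern
barplot(coefs, main = "Estimated seasonal coefficients")
```

For an additive decomposition the coefficients are centred, so they sum to zero up to rounding error.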

Next, we remove this seasonal component, and hope that the result looks somewhat like a White Noise process:

The second graph here shows the residuals after removing the seasonal component. Don’t be fooled by the shape! Note that the range goes (mostly) from -2.5 to 2, as opposed to -3.5 to almost 6 in the MA5 residuals. Except for the last couple of years, it looks like the process is closer to White Noise. Let’s inspect the ACF plots:
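As a rough illustration of what the ACF reveals, the sketch below compares simulated white noise with a simulated series that still contains a period-12 pattern. Both series are made up for the example, and the Ljung-Box test is an extra check that the text above does not use.

```r
set.seed(42)
n <- 120
noise <- rnorm(n)                                        # white noise
leftover <- 2 * sin(2 * pi * (1:n) / 12) + rnorm(n, sd = 0.3)  # seasonality not removed

# Sample autocorrelations up to lag 24
# (for residuals containing NAs, acf() also accepts na.action = na.pass)
acf_noise <- acf(noise, lag.max = 24, plot = FALSE)
acf_leftover <- acf(leftover, lag.max = 24, plot = FALSE)

# Autocorrelation at lag 12 is element 13, because element 1 is lag 0
print(acf_noise$acf[13])
print(acf_leftover$acf[13])

# Approximate 95% band for white noise: most lags should fall inside it
bound <- 1.96 / sqrt(n)
print(bound)

# Ljung-Box test: a small p-value suggests the series is not white noise
print(Box.test(leftover, lag = 24, type = "Ljung-Box"))
```

For white noise, nearly all sample autocorrelations beyond lag 0 should stay within the band, whereas leftover seasonality shows up as large spikes at multiples of the period.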

The residuals definitely look better for the detrended and deseasonalized series. For the sake of illustration and as a summary, we can compare the original data against the deseasonalized data only, as well as against the detrended and deseasonalized residuals:

That’s quite some progress! You might still be wondering “that’s awesome, but why are we doing this?”. We will answer that question in-depth much later, but the spoiler is the following: if we can obtain good, approximately White Noise residuals, we can fit better models, and then simply add back the trend and seasonal components at prediction time :)
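A minimal sketch of what "adding the components back" could look like is given below, under strong simplifying assumptions: the built-in `co2` series stands in for the data, the trend is modelled as a straight line fitted with `lm()`, and the residuals are treated as white noise, so their forecast is zero. None of these choices come from the original article.

```r
x <- co2                                  # stand-in monthly series
d <- frequency(x)
dec <- decompose(x, type = "additive")    # x = trend + seasonal + random
s_fig <- dec$figure                       # one seasonal estimate per month

# Deseasonalize, then fit a deliberately simple straight-line trend
tm <- as.numeric(time(x))
d_t <- as.numeric(x - dec$seasonal)
fit <- lm(d_t ~ tm)

# Extrapolate the trend h months ahead
h <- 12
new_time <- max(tm) + (1:h) / d
trend_fc <- as.numeric(predict(fit, newdata = data.frame(tm = new_time)))

# Seasons of the future months, continuing the cycle after the last observation
last_pos <- as.integer(cycle(x))[length(x)]
season_pos <- ((last_pos + 1:h - 1) %% d) + 1

# Forecast = trend + seasonal component (+ 0 for the white noise residual)
forecast_x <- trend_fc + s_fig[season_pos]
print(round(forecast_x, 2))
```

If the residuals were not white noise, a model fitted to them would supply a third term in the final sum.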

Next time

That was a lot! If you have been following this article series, by now you should feel comfortable performing a full classical decomposition analysis on any time series on your own. In the next article, we will learn about another tool that will help us better analyze trend and seasonality: differencing. Stay tuned, and happy learning!

Last article: Classical Decomposition Model (Part I)

Main page

Follow me at

  1. https://blog.jairparraml.com/
  2. https://www.linkedin.com/in/hair-parra-526ba19b/
  3. https://github.com/JairParra
  4. https://medium.com/@hair.parra