Hair Parra



A Complete Introduction To Time Series Analysis (with R):: Tests for Stationarity

Different stationarity tests for the Lynx data

In the last article, we saw how we could estimate the autocovariance by using a slightly modified version of the usual sample covariance estimator. Further, we saw that in reasonably large samples, this estimator is approximately normally distributed. In this article, we will use this fact to construct some useful hypothesis tests for stationarity. These let us check, for instance, whether a decomposition of a series into trend + seasonal components was adequate, that is, whether the residuals left after estimating and removing these components behave like stationary noise.

Confidence bounds for the ACF

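Assuming the sample ACF $\hat{\rho}(h)$ is approximately normally distributed, with mean 0 and variance $1/n$ when the series is iid noise, an approximate 95% confidence interval at each lag $h \ge 1$ is

$$\hat{\rho}(h) \pm \frac{1.96}{\sqrt{n}}.$$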

Lagwise Test

We can make direct use of the C.I. above to assess whether a series behaves like stationary noise: an iid (and therefore stationary) sequence has zero autocovariance, and hence zero autocorrelation, at every nonzero lag, so we can test the hypotheses

$$H_0: \rho(h) = 0 \quad \text{vs.} \quad H_1: \rho(h) \neq 0, \qquad h \ge 1.$$

So in particular, if $H_0$ is true, we should have that

$$|\hat{\rho}(h)| \le \frac{1.96}{\sqrt{n}}$$

for roughly 95% of the lags. That is, the estimated confidence interval should contain the value 0 for most lags.
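The lagwise check can be sketched in base R as follows. This is only an illustration: the helper name `lagwise_check`, the choice of 20 lags, and the simulated noise series are assumptions made here, not part of the original article.

```r
# Lagwise check: which sample autocorrelations fall outside +/- 1.96 / sqrt(n)?
lagwise_check <- function(x, max_lag = 20) {
  x <- as.numeric(x)
  n <- length(x)
  # acf() returns lag 0 first; drop it because rho(0) is always 1
  rho <- as.numeric(acf(x, lag.max = max_lag, plot = FALSE)$acf)[-1]
  bound <- qnorm(0.975) / sqrt(n)
  outside <- which(abs(rho) > bound)
  list(bound = bound, outside = outside,
       share_outside = length(outside) / max_lag)
}

set.seed(1)
res_noise <- lagwise_check(rnorm(200))  # simulated iid noise
res_lynx  <- lagwise_check(lynx)        # lynx ships with base R (datasets)

res_noise$share_outside  # should be small for iid noise
res_lynx$outside         # lags at which the lynx ACF leaves the bounds
```

For iid noise we expect only about 5% of the lags to fall outside the bounds by chance, so a much larger share is a sign of dependence.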

Portmanteau Test

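This test is also quite straightforward; consider the hypotheses

$$H_0: \rho(1) = \rho(2) = \dots = \rho(h) = 0 \quad \text{vs.} \quad H_1: \rho(k) \neq 0 \text{ for some } 1 \le k \le h.$$

Then, using the statistic

$$Q = n \sum_{k=1}^{h} \hat{\rho}(k)^2,$$

which is approximately $\chi^2(h)$ distributed under $H_0$, we can test the hypothesis above: we reject $H_0$ at level $\alpha$ when $Q$ exceeds the $1-\alpha$ quantile of the $\chi^2(h)$ distribution.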
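As a minimal sketch, the Portmanteau (Box-Pierce) statistic can be computed by hand and compared against base R's `Box.test`. The helper name `box_pierce` and the choice of 20 lags are assumptions made for illustration.

```r
# Portmanteau (Box-Pierce) statistic: Q = n * sum of squared sample autocorrelations
box_pierce <- function(x, max_lag = 20) {
  x <- as.numeric(x)
  n <- length(x)
  rho <- as.numeric(acf(x, lag.max = max_lag, plot = FALSE)$acf)[-1]
  q <- n * sum(rho^2)
  # Under H0, Q is approximately chi-square with max_lag degrees of freedom
  p_value <- pchisq(q, df = max_lag, lower.tail = FALSE)
  c(Q = q, p_value = p_value)
}

bp_lynx <- box_pierce(lynx)
bp_ref  <- Box.test(lynx, lag = 20, type = "Box-Pierce")  # built-in equivalent

bp_lynx
bp_ref
```

Both computations should agree, which is a handy sanity check when implementing a test statistic manually.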

Ljung-Box Q Test

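Consider the same hypotheses as in the Portmanteau test,

$$H_0: \rho(1) = \rho(2) = \dots = \rho(h) = 0 \quad \text{vs.} \quad H_1: \rho(k) \neq 0 \text{ for some } 1 \le k \le h.$$

Then, using the statistic

$$Q_{LB} = n(n+2) \sum_{k=1}^{h} \frac{\hat{\rho}(k)^2}{n-k},$$

which is again approximately $\chi^2(h)$ distributed under $H_0$, we can test the hypothesis above. Note that this is no more than a modified version of the Portmanteau test. Because each weight $(n+2)/(n-k)$ is larger than 1, $Q_{LB}$ is always slightly larger than $Q$, so this test is somewhat more inclined to reject $H_0$. The weighting also makes the chi-square approximation more accurate in small samples.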
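A small base R sketch of the Ljung-Box statistic, again checked against `Box.test`. The helper name `ljung_box` and the 20-lag choice are illustrative assumptions.

```r
# Ljung-Box statistic: Q_LB = n (n + 2) * sum_k rho_hat(k)^2 / (n - k)
ljung_box <- function(x, max_lag = 20) {
  x <- as.numeric(x)
  n <- length(x)
  rho <- as.numeric(acf(x, lag.max = max_lag, plot = FALSE)$acf)[-1]
  k <- seq_len(max_lag)
  q <- n * (n + 2) * sum(rho^2 / (n - k))
  p_value <- pchisq(q, df = max_lag, lower.tail = FALSE)
  c(Q = q, p_value = p_value)
}

lb_lynx <- ljung_box(lynx)
lb_ref  <- Box.test(lynx, lag = 20, type = "Ljung-Box")   # built-in equivalent
bp_stat <- Box.test(lynx, lag = 20, type = "Box-Pierce")  # for comparison

lb_lynx
```

Comparing `lb_ref$statistic` with `bp_stat$statistic` shows the effect of the weights: the Ljung-Box value is the larger of the two.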

Nonparametric tests

So far, we have seen tests that rely on normality assumptions. There also exist so-called nonparametric tests, which do not make these assumptions but still attempt to test whether the series behaves like iid (and hence stationary) noise. In practice, these can be “harsher”, in that they may reject H0 more readily.

Turning points test

Illustration of a turning point

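Consider a time series $x_1, \dots, x_n$,

and consider any three consecutive observations from it, $x_{i-1}, x_i, x_{i+1}$.

Then, a point $x_i$ is called a turning point if

$$x_{i-1} < x_i \text{ and } x_i > x_{i+1}, \quad \text{or} \quad x_{i-1} > x_i \text{ and } x_i < x_{i+1},$$

as depicted in the figure at the beginning. Now, for an iid (and hence stationary) sequence, let T be the random variable that represents the number of turning points it contains. Then we have that

$$P(x_i \text{ is a turning point}) = \frac{2}{3}, \qquad E[T] = \frac{2(n-2)}{3}, \qquad \mathrm{Var}(T) = \frac{16n - 29}{90}.$$

And of course, that’s entirely clear and intuitive. NOT!! WHAT THE HECK IS THIS?? Let’s think about it: there are 3! = 6 possible orderings of three distinct points, and under the iid assumption all of them are equally likely. Out of these, exactly four place the middle point above or below both of its neighbours, so the probability that a given point is a turning point is 4/6 = 2/3. For the expectation, we multiply by (n-2), since the two endpoints cannot be turning points. The variance is a lot more complicated so I won’t go over it here, but if you’re curious, you can read this stackexchange post (https://math.stackexchange.com/questions/1774706/variance-of-turning-point-time-series). What is important to us is that we can use this statistic to test for an iid sequence, considering the hypotheses

$$H_0: \text{the series is iid} \quad \text{vs.} \quad H_1: \text{the series is not iid}.$$

For large n, T is approximately normal under $H_0$, so we reject $H_0$ at level $\alpha$ when $|T - E[T]| / \sqrt{\mathrm{Var}(T)}$ exceeds the $1-\alpha/2$ quantile of the standard normal distribution.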
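To make the turning point statistic concrete, here is a minimal base R sketch. The function name `turning_point_test` and the toy series are made up for illustration; the mean 2(n-2)/3 and variance (16n-29)/90 are the standard values for an iid sequence.

```r
# Turning point test: count strict local maxima / minima and compare the count
# with its mean and variance under the iid hypothesis.
turning_point_test <- function(x) {
  x <- as.numeric(x)
  n <- length(x)
  left  <- x[1:(n - 2)]
  mid   <- x[2:(n - 1)]
  right <- x[3:n]
  is_turn <- (mid > left & mid > right) | (mid < left & mid < right)
  count <- sum(is_turn)
  mu <- 2 * (n - 2) / 3
  sigma2 <- (16 * n - 29) / 90
  z <- (count - mu) / sqrt(sigma2)
  list(count = count, mean = mu, var = sigma2, z = z,
       p_value = 2 * pnorm(-abs(z)))  # two-sided normal approximation
}

tp_toy  <- turning_point_test(c(1, 3, 2, 4, 3, 5))  # every interior point turns
tp_lynx <- turning_point_test(lynx)

tp_toy$count
tp_lynx[c("count", "mean", "p_value")]
```

A smooth, cyclical series such as lynx has far fewer turning points than an iid sequence of the same length would, which is what drives the statistic away from its expected value.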

Sign Test

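The difference-sign test counts the number S of indices i = 2, ..., n for which $x_i > x_{i-1}$, that is, the number of times the differenced series $x_i - x_{i-1}$ is positive. Under H0 (an iid sequence), it follows that

$$E[S] = \frac{n-1}{2}, \qquad \mathrm{Var}(S) = \frac{n+1}{12},$$

and for large n, S is approximately normal. We therefore reject $H_0$ at level $\alpha$ when $|S - E[S]| / \sqrt{\mathrm{Var}(S)}$ exceeds the $1-\alpha/2$ quantile of the standard normal distribution. A value of S well above (below) its mean points to an increasing (decreasing) trend.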
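A short base R sketch of the difference-sign test follows. The function name `difference_sign_test` and the two toy series are assumptions made for illustration; the mean (n-1)/2 and variance (n+1)/12 are the standard values under the iid hypothesis.

```r
# Difference-sign test: count how often the series goes up from one point to the next
difference_sign_test <- function(x) {
  x <- as.numeric(x)
  n <- length(x)
  count <- sum(diff(x) > 0)
  mu <- (n - 1) / 2
  sigma2 <- (n + 1) / 12
  z <- (count - mu) / sqrt(sigma2)
  list(count = count, mean = mu, var = sigma2, z = z,
       p_value = 2 * pnorm(-abs(z)))  # two-sided normal approximation
}

ds_up  <- difference_sign_test(1:20)                      # pure upward trend
ds_zig <- difference_sign_test(c(1, 2, 1, 2, 1, 2, 1, 2)) # perfectly cyclic

ds_up[c("count", "mean", "p_value")]
ds_zig[c("count", "mean")]
```

The zig-zag series is clearly not iid, yet its count sits close to the iid mean. This illustrates why the sign test is good at detecting trends but can miss cyclic behaviour, and why it should be read together with the other tests.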

Other tests

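Some other tests which I will not discuss here, but which you will see, for instance, in the output of the R functions, are:

  1. McLeod-Li Q test (https://pdfs.semanticscholar.org/e75e/0790e9532cc5d6787c4a388c51d2fa34b355.pdf): the Ljung-Box Q statistic applied to the squared series.
  2. Rank test: to test for iid noise; it is particularly useful for detecting a linear trend.
  3. Normal QQ plot (https://en.wikipedia.org/wiki/Q%E2%80%93Q_plot): to check whether the distribution looks normal.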
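For the curious, the rank test is short enough to sketch in base R. This is an illustration under the usual textbook definition (P counts the pairs i < j with x_j > x_i, with mean n(n-1)/4 and variance n(n-1)(2n+5)/72 under the iid hypothesis); the function name `rank_test` is made up here.

```r
# Rank test: count the pairs (i, j) with i < j and x[j] > x[i]
rank_test <- function(x) {
  x <- as.numeric(x)
  n <- length(x)
  less <- outer(x, x, "<")              # less[i, j] is TRUE when x[i] < x[j]
  count <- sum(less[upper.tri(less)])   # keep only the pairs with i < j
  mu <- n * (n - 1) / 4
  sigma2 <- n * (n - 1) * (2 * n + 5) / 72
  z <- (count - mu) / sqrt(sigma2)
  list(count = count, mean = mu, var = sigma2, z = z,
       p_value = 2 * pnorm(-abs(z)))    # two-sided normal approximation
}

rk_up <- rank_test(1:5)   # strictly increasing: every pair is counted
rk_up[c("count", "mean")]
```

A count far above its mean suggests an upward trend, and a count far below it suggests a downward one.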

How to R

Great! Now we have 5 different tests to check for stationarity. Thankfully, you don’t have to implement these manually, as they are already available in R packages. Let’s first load the necessary packages

Next, let’s inspect the data by plotting the points
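The original code is not reproduced in this copy, so here is a minimal sketch of this step. It assumes the data is the `lynx` series that ships with base R (annual Canadian lynx trappings), which matches the article's subtitle but is still an assumption; the plot title and axis label are made up.

```r
# lynx is part of the datasets package that comes with base R
data(lynx)
n_obs <- length(lynx)

# Plot the raw series to get a first impression of trend and cycles
plot(lynx, main = "Annual Canadian lynx trappings", ylab = "Number trapped")
```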

Next, let’s apply the tests on the raw data using the forecast::checkresiduals and the itsmr::test functions:
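A hedged sketch of this step, assuming the series is the built-in `lynx` data and that the `forecast` and `itsmr` packages are installed. The calls are guarded so the script still runs without them, and a base R Ljung-Box test with an arbitrarily chosen 20 lags is included as a fallback.

```r
# Residual diagnostics from the forecast package (plots plus a Ljung-Box test)
if (requireNamespace("forecast", quietly = TRUE)) {
  forecast::checkresiduals(lynx)
}

# Battery of randomness tests from the itsmr package
if (requireNamespace("itsmr", quietly = TRUE)) {
  itsmr::test(lynx)
}

# Base R fallback: Ljung-Box Q test on the raw series
lb_raw <- Box.test(lynx, lag = 20, type = "Ljung-Box")
lb_raw
```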

You might get a warning saying that R could not find appropriate degrees of freedom for the Chi-square statistic. This most likely arises because we passed a raw series rather than a fitted model, so there are no model degrees of freedom to use; it is not by itself evidence of non-stationarity. The plots, however, tell a clear story: we can observe from the ACF that most of the lags fall outside the bounds, and the underlying distribution does not look very Gaussian.

Recall that for the Ljung-Box Q test, the null hypothesis is that the series is iid noise (and hence stationary), which we clearly reject given a p-value that is essentially 0. Although I did not talk about the McLeod-Li Q test, its null hypothesis is the same. We also reject under the Turning points and Rank P tests, which indicates a strong departure from iid noise. Let’s now difference the series and test again:
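A minimal sketch of the differencing step, assuming the built-in `lynx` series. The `itsmr` call is guarded in case the package is not installed, and the 20-lag choice for the base R test is an assumption.

```r
# Lag-1 difference: y_t = x_t - x_{t-1}; one observation is lost
lynx_diff <- diff(lynx, lag = 1)

# Re-run the test battery on the differenced series
if (requireNamespace("itsmr", quietly = TRUE)) {
  itsmr::test(lynx_diff)
}

# Base R fallback: Ljung-Box Q test on the differenced series
lb_diff <- Box.test(lynx_diff, lag = 20, type = "Ljung-Box")
lb_diff
```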

The tricky thing with these tests is that sometimes they may disagree, so it is up to the analyst or researcher to reach some meaningful conclusion and decide what to do next. In this case, we observe that the differenced series is more zero-centered, although it still does not look much like actual white noise. Indeed, by inspecting the ACF and PACF, we can observe a strong correlation at the first couple of lags, which fall out of bounds, although far fewer lags do so than before. We do see that the empirical distribution appears to be more Gaussian, judging from the histogram along with the QQ-plot.

We now see from the tests that the first three clearly reject the null hypothesis of iid noise, but the difference sign test fails to reject it. At the 0.05 significance level, the Rank P test also fails to reject, providing some evidence for stationarity. Notice that in this example, we only applied a lag-1 difference! You could try to apply higher lag differencing (https://readmedium.com/a-complete-introduction-to-time-series-analysis-with-r-differencing-db94bc4df0ae) or even perform a full classical decomposition analysis (https://readmedium.com/a-complete-introduction-to-time-series-analysis-with-r-classical-decomposition-model-part-ii-aa43b524680d), as we studied in previous articles.

Next time

With this, we close the chapter on stationarity tests. Next time, we will start seeing how to obtain the best linear predictor, employing probability and optimization theory. Stay tuned, and happy learning!

Last time

Estimating Autocorrelation


Follow me at

  1. https://www.linkedin.com/in/hair-parra-526ba19b/
  2. https://github.com/JairParra
  3. https://medium.com/@hair.parra