A Practical Guide to ARIMA Models using PyCaret — Part 2
Understanding the Trend Term
📚 Introduction
In the previous article, we saw a brief overview of the ARIMA model and its various hyperparameters. We started with a very simple ARIMA model, one that does not exhibit any temporal dependence. This article looks at the “trend” component of the ARIMA model: how it is modeled and what it represents.
📖 Suggested Previous Reads
The previous article in this series can be found below. I would recommend that readers go through it first before continuing with this article. This article builds upon the concepts described in the previous one as well as reuses some work done in it.
A Practical Guide to ARIMA Models using PyCaret — Part 1
1️⃣ “Trend” Overview in ARIMA Models
As the name suggests, a trend determines how a time series changes over time. In ARIMA models, this can be modeled using the following underlying equation [1], in which the trend component is a straight line in time “t”:
trend(t) = a + b × t
Here “a” is called the “intercept” and “b” is called the “drift” term. “Drift” is nothing but the slope of a straight line.

There are several ways in which the trend can be modeled using this equation.
(1) A trend component that does not change with time. In this case, drift “b” = 0.
(2) A trend component that starts from 0 and linearly changes with time (zero intercept “a” and non-zero slope/drift “b”).
(3) A trend component that does not start from 0 and linearly changes with time (non-zero intercept “a” and non-zero slope/drift “b”).
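To make the three cases concrete, here is a minimal NumPy sketch that evaluates a straight-line trend a + b × t for each of them. The helper name `linear_trend`, the time index, and the values of “a” and “b” are all illustrative assumptions, not values from this article's dataset.

```python
import numpy as np


def linear_trend(t, a=0.0, b=0.0):
    """Evaluate the straight-line trend a + b * t at each time step in t."""
    return a + b * np.asarray(t, dtype=float)


# Illustrative time index (assumption: five steps, 0 through 4)
t = np.arange(5)

# (1) Trend that does not change with time: drift b = 0
flat = linear_trend(t, a=10.0)

# (2) Trend that starts from 0: intercept a = 0, non-zero drift b
through_origin = linear_trend(t, b=2.0)

# (3) Trend with non-zero intercept a and non-zero drift b
offset_line = linear_trend(t, a=10.0, b=2.0)

print("flat          :", flat)
print("through origin:", through_origin)
print("offset line   :", offset_line)
```

The only difference between the three cases is which of the two coefficients is allowed to be non-zero.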
2️⃣️ Understanding the Trend Term using PyCaret
To understand this concept better, we will use the same dataset that we used in the last part of this series. As a refresher, this dataset was a “white noise” dataset. Details can be found in the Jupyter notebook for this article (available at the end of the article).

👉 Step 1 & 2: Setup PyCaret Time Series Experiment and Perform EDA
Since we have already performed these steps in the previous article, we will not go over them again. Please refer to the previous article for details.
👉 Step 3: Theoretical Calculations
As mentioned above, “trend” in ARIMA models can be calculated in 3 ways.
(1) The first method, where the trend component is fixed, is the same model that was covered in the previous article, so the same theoretical calculations apply. As a refresher, the estimate of the predictions (both in-sample and out-of-sample) was the mean of the training dataset (i.e. “a” = 176.02). The estimate of the prediction interval was 174.08–177.96.
(2) The second method models the trend component as a straight line with intercept “a”= 0. In order to model this theoretically, we can recreate this using scikit-learn’s Linear Regression model. When we eventually model this using an ARIMA model, we would expect to see the same results.
import numpy as np
from sklearn.linear_model import LinearRegression

# The time index is the only feature; fit_intercept=False forces the line through the origin
X_train = np.arange(len(y_train)).reshape(-1, 1)
reg = LinearRegression(fit_intercept=False).fit(X_train, y_train)
print(f"Expected Intercept: {reg.intercept_}")
print(f"Expected Slope: {reg.coef_}")
>>> Expected Intercept: 0.0
>>> Expected Slope: [0.85317393]
(3) The third method models the trend component as a straight line with a non-zero intercept “a”. We can again recreate this using scikit-learn as follows.
X_train = np.arange(len(y_train)).reshape(-1, 1)
reg = LinearRegression(fit_intercept=True).fit(X_train, y_train)
print(f"Expected Intercept: {reg.intercept_}")
print(f"Expected Slope: {reg.coef_}")
>>> Expected Intercept: 175.95815015173943
>>> Expected Slope: [0.00038807]
👉 Step 4: Build the Models
Now that we have covered the theoretical calculations, let’s see these models in practice.
# Trend with method 1 ----
model2a = exp.create_model(
"arima",
order=(0, 0, 0),
seasonal_order=(0, 0, 0, 0),
trend="c"
)
# Trend with method 2 ----
model2b = exp.create_model(
"arima",
order=(0, 0, 0),
seasonal_order=(0, 0, 0, 0),
trend="t"
)
# Trend with method 3 ----
model2c = exp.create_model(
"arima",
order=(0, 0, 0),
seasonal_order=(0, 0, 0, 0),
trend="ct"
)
👉 Step 5: Analyze the Results
We will reuse the same helper functions that we created in the previous article to analyze the results.
Method 1: trend = “c”
summarize_model(model2a)
get_residual_properties(model2a)
These results match our theoretical calculations and are exactly the same as those in the previous article, so we will not cover them in detail again. You can refer to the underlying Jupyter Notebook at the end of this article for more details.
Method 2: trend = “t”
summarize_model(model2b)
get_residual_properties(model2b)
As we can see, the drift from the model (0.8532) matches the theoretical calculation using scikit-learn.
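It may seem surprising that white noise produces a drift this far from 0. A line forced through the origin has only its slope available to reach the level of the data, so the slope absorbs the mean. The sketch below illustrates this with the closed-form least-squares slope for a no-intercept fit, cross-checked against NumPy's generic solver. The data is a synthetic stand-in (noise around 176 with an assumed length and spread), not the article's dataset, so the number it prints will differ from 0.8532.

```python
import numpy as np

rng = np.random.default_rng(42)

# Synthetic stand-in for the article's data (assumed: 300 points, noise around 176)
y = 176.0 + rng.normal(0.0, 1.0, size=300)

# Same 0-based time index as in the scikit-learn snippet
t = np.arange(len(y), dtype=float)

# Closed-form least-squares slope when the intercept is forced to zero
slope_closed_form = np.sum(t * y) / np.sum(t * t)

# Cross-check with a generic least-squares solver on the single-column design matrix
slope_lstsq = np.linalg.lstsq(t.reshape(-1, 1), y, rcond=None)[0][0]

print(f"closed form : {slope_closed_form:.6f}")
print(f"lstsq       : {slope_lstsq:.6f}")
```

Both approaches should agree, and the slope ends up well above 0 even though the series itself has no upward movement.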

Similarly, the manually calculated residual sigma2 (~7777) matches the value reported by the model. Note that this is not a good model for this dataset, since the unexplained variance (sigma2) is much higher than before. Let’s also look at the forecasts.
plot_predictions(model2b)

We can see that the forecast (in-sample) starts from close to 0 and increases linearly. PyCaret provides interactive plots so users can zoom into sections of the forecasts to analyze them in greater detail. For example, one could hover over consecutive points and notice that they differ by exactly the drift value of 0.8532.
The out-of-sample forecasts are a continuation of the in-sample forecasts and again rise linearly with a slope/drift of 0.8532.
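The constant step between consecutive forecasts follows directly from the drift-only form of the trend. As a small sketch, the snippet below evaluates drift × t over a run of consecutive time steps and checks the step-to-step change. The drift is the value reported above; the time steps themselves are arbitrary and chosen only for illustration.

```python
import numpy as np

drift = 0.8532  # drift reported by the model above

# Arbitrary consecutive time steps (illustrative assumption, not the article's index)
t_in = np.arange(100, 105)   # treated as in-sample steps
t_out = np.arange(105, 108)  # steps that immediately follow, treated as out-of-sample

# Drift-only trend: forecast = drift * t
forecast_in = drift * t_in
forecast_out = drift * t_out

# Change between each pair of consecutive forecasts, across the in/out boundary
step_changes = np.diff(np.concatenate([forecast_in, forecast_out]))
print(step_changes)
```

Every step change equals the drift, including the one that crosses from the in-sample to the out-of-sample period, which is why the out-of-sample forecasts simply continue the same line.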
A slight nuance for the sharp-eyed observer: the first in-sample forecast is not 0. It starts at 0.8532 (the drift value). Could you explain why?
Method 3: trend = “ct”
summarize_model(model2c)
get_residual_properties(model2c)
As we can see, the intercept (175.9578) and drift (0.0004) from the model match the theoretical calculation using scikit-learn. The drift being so close to 0 is expected since this is white noise data. We can also see that there is almost no difference in the “goodness of fit” (lower sigma2 = better fit) between this model and the model from Method 1.
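The comparison of fit quality across the three methods can be sketched without ARIMA at all, by fitting each trend shape with ordinary least squares and computing the mean squared residual as a stand-in for sigma2. The helper `fit_trend` is hypothetical, the data is synthetic white noise around 176 (not the article's dataset), and a fitted ARIMA model may compute its sigma2 slightly differently.

```python
import numpy as np


def fit_trend(y, trend):
    """Least-squares fit of a deterministic trend.

    trend: "c" (constant only), "t" (slope only), or "ct" (constant + slope).
    Returns (coefficients, mean squared residual).
    """
    y = np.asarray(y, dtype=float)
    t = np.arange(len(y), dtype=float)  # 0-based index, as in the scikit-learn snippets
    columns = {
        "c": [np.ones_like(t)],
        "t": [t],
        "ct": [np.ones_like(t), t],
    }[trend]
    X = np.column_stack(columns)
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ coef
    return coef, float(np.mean(resid ** 2))


rng = np.random.default_rng(0)
y = 176.0 + rng.normal(0.0, 1.0, size=300)  # synthetic white noise (assumed values)

results = {trend: fit_trend(y, trend) for trend in ("c", "t", "ct")}
for trend, (coef, sigma2) in results.items():
    print(f"trend={trend!r:5} coef={np.round(coef, 4)} sigma2={sigma2:.4f}")
```

On data like this, the "c" and "ct" fits should leave nearly identical residual variance, while the slope-only "t" fit leaves a far larger one.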
Let’s also look at the forecasts. They should look very much like the ones from method 1 with the exception of the small slope/drift component. Again, the interactive plots in PyCaret allow for easy examination of the results as shown in the zoomed results below.

It may be obvious, but it is worth noting explicitly that although Method 1 and Method 3 gave a similar fit in this case (because the data was white noise), this is not always so. When the data exhibits a linear trend, Method 3 will generally give a better fit than Method 1. I would encourage readers to create a dummy linear dataset and try this out.
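As a starting point for that exercise, here is a minimal sketch that builds a dummy linear dataset and compares a constant-only fit (the Method 1 analogue) with an intercept-plus-slope fit (the Method 3 analogue). It uses plain NumPy rather than PyCaret, and the intercept, slope, noise level, and series length are all made-up values.

```python
import numpy as np

rng = np.random.default_rng(1)
t = np.arange(200, dtype=float)

# Dummy linear series: intercept 50, slope 0.5, unit-variance noise (all assumed)
y = 50.0 + 0.5 * t + rng.normal(0.0, 1.0, size=t.size)

# Method 1 analogue: constant only; the best-fitting constant is the mean
resid_const = y - y.mean()
sigma2_const = np.mean(resid_const ** 2)

# Method 3 analogue: intercept plus slope, fitted by least squares
# (np.polyfit returns coefficients from the highest degree down)
slope, intercept = np.polyfit(t, y, deg=1)
resid_line = y - (intercept + slope * t)
sigma2_line = np.mean(resid_line ** 2)

print(f"constant only     : sigma2 = {sigma2_const:.2f}")
print(f"intercept + drift : sigma2 = {sigma2_line:.2f} (slope = {slope:.3f})")
```

The same series could then be passed to a PyCaret experiment and modeled with trend="c" and trend="ct", as in Step 4, to see the difference in the ARIMA summaries.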
🚀 Conclusion
Hopefully this simple model has laid a good foundation for us to understand the inner workings of the trend component of an ARIMA model. In the next set of articles, we will start covering the other parameters one by one and see the impact that they have on the model’s behavior. Until then, if you would like to connect with me on my social channels (I post about Time Series Analysis frequently), you can find me below. That’s it for now. Happy forecasting!
📘 GitHub
📗 Resources
- Jupyter Notebook containing the code for this article
📚 References
[1] Constants and ARIMA models in R, Hyndsight
[2] Chapter 8 ARIMA Models, Forecasting: Principles and Practice, Rob J Hyndman and George Athanasopoulos





