Summary

This operator creates a seasonal ARIMA model for time-related observations. A forecast is created with the help of this model.

In time series analysis, an autoregressive integrated moving average (ARIMA) model is a generalization of an autoregressive moving average (ARMA) model. These models are fitted to time series data either to better understand the data or to predict future points in the series (forecasting). They are applied in some cases where the data show evidence of non-stationarity; an initial differencing step (corresponding to the "integrated" part of the model) can then be applied to reduce the non-stationarity (source: Wikipedia).
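
The differencing step mentioned above can be illustrated with a minimal sketch in plain Python. The helper name `difference` and the sample values are illustrative assumptions and not part of TIS. A lag of 1 removes a linear trend; a lag equal to the season length removes a repeating seasonal pattern.

```python
def difference(series, lag=1):
    """Return the lag-differenced series: y[t] - y[t - lag]."""
    if lag < 1:
        raise ValueError("lag must be >= 1")
    return [series[t] - series[t - lag] for t in range(lag, len(series))]


# Hypothetical trending series: not stationary, the level keeps rising.
trend = [10, 12, 14, 16, 18, 20]
print(difference(trend))            # [2, 2, 2, 2, 2]

# Hypothetical series with a repeating pattern of length 3.
seasonal = [1, 5, 2, 1, 5, 2, 1, 5, 2]
print(difference(seasonal, lag=3))  # [0, 0, 0, 0, 0, 0]
```

In both cases the differenced series is constant, which is the kind of stabilization the "integrated" part of the model aims for.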

Example: Estimating a SARIMA model

Situation

A value is measured across 14 days. The data show that the value increases by 1 each day and decreases by 1 on the seventh day. 

Settings

  • Add the operation "Seasonal ARIMA Analysis 5.0" to the data node.
  • Enter the settings shown below.
  • We want to see both historical and forecast data in the resulting data node, so we choose "Forecast + History" as the result to deliver.

Result

  • The data node shown below is the result of the SARIMA analysis.
  • The first 14 rows are the observed values (history) and are therefore indicated by an "H". The following 7 rows are the forecasted values and therefore indicated by "F".

The operator settings also show the parameter estimates for Lambda (Box-Cox transformation) and the coefficients of the AR and MA processes of the basic and seasonal models. Additionally, the estimated ACF vector of the residuals is shown.

To visualize the results, e.g., in a chart, please add a new data node with the operation Chart: Histogram Time Pattern. The result will look similar to the chart below.

Project-File

Confluence Op SARIMA.gzip

Want to learn more?

Settings

Columns of input table

Parameter

More detailed information

Data requirements

  • Input data must be sorted by the time stamps in the column "Date + time". Otherwise, the SARIMA analysis cannot be conducted.
  • At least one full season plus one observation is required to conduct a SARIMA analysis. Calculating a SARIMA model with less data does not make sense.
  • Missing time stamps in the input data are added as rows and treated as missing observations.
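
The three requirements above can be sketched as a pre-check in plain Python. This is only an illustration under stated assumptions, not the TIS implementation: the function name `prepare_series`, the strict-ordering check (duplicate time stamps are rejected), and counting only observed rows toward the minimum length are all assumptions.

```python
from datetime import datetime, timedelta


def prepare_series(rows, step, season_length):
    """Check the data requirements and fill gaps (hypothetical helper).

    rows: list of (timestamp, value) tuples
    step: expected timedelta between two observations
    season_length: number of observations per season
    Returns a list of (timestamp, value-or-None) on a regular grid.
    """
    stamps = [ts for ts, _ in rows]
    # Requirement 1: input must already be sorted by time stamp.
    if any(a >= b for a, b in zip(stamps, stamps[1:])):
        raise ValueError("rows must be sorted by strictly increasing time stamp")
    # Requirement 2: at least one season plus one observation (assumed to
    # count observed rows only).
    if len(rows) < season_length + 1:
        raise ValueError("need at least one season + 1 observation")
    # Requirement 3: missing time stamps become missing observations (None).
    values = dict(rows)
    filled = []
    current = stamps[0]
    while current <= stamps[-1]:
        filled.append((current, values.get(current)))
        current += step
    return filled


# Hypothetical daily data over 9 days with one missing day (day index 4).
start = datetime(2024, 1, 1)
rows = [(start + timedelta(days=d), float(d)) for d in range(9) if d != 4]
filled = prepare_series(rows, timedelta(days=1), season_length=7)
print(len(filled))   # 9
print(filled[4][1])  # None
```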

Using SARIMA

The TIS-GUI and the SARIMA operator description provide additional information and (if necessary) warnings.

  • To run an ARIMA analysis without seasonality, choose 000 as the seasonal model and 1 as the duration of the season.
  • Days known to be 0 (e.g., Sundays in retail) should be excluded from the time series data BEFORE the analysis.
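
Excluding days that are known to be 0 can be done in any preprocessing step before the operator is applied. The following sketch shows the idea in plain Python; the helper name `drop_weekday` and the sample data are hypothetical, and inside TIS the equivalent step would be a filter on the data node.

```python
from datetime import date, timedelta


def drop_weekday(rows, weekday):
    """Remove all rows whose date falls on `weekday` (Monday=0 ... Sunday=6)."""
    return [(d, v) for d, v in rows if d.weekday() != weekday]


# Hypothetical two weeks of daily retail data, closed on Sundays.
start = date(2024, 1, 1)  # a Monday
rows = []
for i in range(14):
    d = start + timedelta(days=i)
    rows.append((d, 0 if d.weekday() == 6 else 100 + i))

cleaned = drop_weekday(rows, weekday=6)
print(len(cleaned))  # 12
```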

Statistical info

  • It is difficult to forecast several periods into the future. To forecast further ahead, calculate on an aggregated level (e.g., weeks instead of days). The disadvantage is that information is lost.
  • For some data it makes sense to use Autoregression = 2 or Moving Average = 2. However, this leads to slow parameter estimation and rarely provides much better forecasts. Therefore, these models are not provided in TIS at the moment. They can, however, be provided on demand.
  • Hint: A random walk is an AR(1) process with coefficient 1, i.e., the basic model 010 without seasonality. Since the random walk model has no parameter to estimate and its forecast is simply the last observation, this model cannot be chosen directly. A random walk model can be approximated by choosing, e.g., basic model 011 and seasonal model 000; if the data follow a random walk, the estimated parameter is very close to 0.
  • Autocorrelation function (ACF) of the residuals:
    • e.g., a value of 0.7 shows that the model does not fit
    • e.g., ABS(x) < 0.15 indicates a good model
  • Parameter estimation in TIS is based on maximum likelihood estimation (MLE).
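
Two of the points above, the random walk forecast and the residual ACF check, can be sketched in plain Python. The function names are hypothetical, the ACF is the usual sample autocorrelation (TIS may compute it differently), and the 0.15 threshold is the rule of thumb quoted above.

```python
def acf(values, max_lag):
    """Sample autocorrelation for lags 1..max_lag."""
    n = len(values)
    mean = sum(values) / n
    centered = [v - mean for v in values]
    denom = sum(c * c for c in centered)
    if denom == 0:
        raise ValueError("series is constant; autocorrelation is undefined")
    return [
        sum(centered[t] * centered[t - k] for t in range(k, n)) / denom
        for k in range(1, max_lag + 1)
    ]


def random_walk_forecast(history, horizon):
    """Random walk forecast: repeat the last observation."""
    return [history[-1]] * horizon


def residuals_look_ok(residuals, max_lag, threshold=0.15):
    """Rule of thumb: every |ACF| value must stay below the threshold."""
    return all(abs(r) < threshold for r in acf(residuals, max_lag))


print(random_walk_forecast([3, 5, 4], horizon=3))  # [4, 4, 4]

# Hypothetical residuals that alternate in sign: strong lag-1 correlation.
alternating = [1, -1, 1, -1, 1, -1, 1, -1]
print(acf(alternating, max_lag=1))                # [-0.875]
print(residuals_look_ok(alternating, max_lag=1))  # False
```

Residuals with such a large absolute ACF value still contain structure that the model has not captured, so a different basic or seasonal model should be tried.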

 

Examples

Example 2: SARIMA model with hourly interval data

Situation

On two consecutive days, a value is measured every hour between 09:00 and 15:00.

Settings

  • Add the operation "Seasonal ARIMA Analysis 5.0" to the data node.
  • Enter the settings shown below.
  • We want to see both historical and forecast data in the resulting data node, so we choose "Forecast + History" as the result to deliver.

Result

  • The data node shown below is the result of the SARIMA analysis.
  • The first 14 rows are the observed values (history) and are therefore indicated by an "H". The following 7 rows are the forecasted values and therefore indicated by "F".

The operator settings also show the parameter estimates for Lambda (Box-Cox transformation) and the coefficients of the AR and MA processes of the basic and seasonal models. Additionally, the estimated ACF vector of the residuals is shown.

Troubleshooting

Problem: Error message, or "n. def."

Frequent cause: The error can be caused by the raw data of the combination of identifiers. E.g., there are too few values to calculate a certain figure.

Solution: Create larger groups, i.e., less filtering by identifier instances.

Related topics