\subsection{Results by Model Family}
\label{fams}
\begin{center}
\captionof{table}{Ranking of benchmark and horizontal models
($1~\text{km}^2$ pixel size, 60-minute time steps):
the table shows the ranks for cases with $2.5 < ADD < 25$
(and $25 < ADD < \infty$ in parentheses if they differ)}
\label{t:hori}
\begin{tabular}{|c|ccc|cccccccc|}
\hline
\multirow{2}{*}{\rotatebox{90}{\thead{\scriptsize{Training}}}}
& \multicolumn{3}{c|}{\thead{Benchmarks}}
& \multicolumn{8}{c|}{\thead{Horizontal (whole-day-ahead)}} \\
\cline{2-12}
~ & \textit{naive} & \textit{fnaive} & \textit{pnaive}
& \textit{harima} & \textit{hcroston} & \textit{hets} & \textit{hholt}
& \textit{hhwinters} & \textit{hses} & \textit{hsma} & \textit{htheta} \\
\hline \hline
3 & 11 & 7 (2) & 8 (5) & 5 (7) & 4 & 3
& 9 (10) & 10 (9) & 2 (6) & 1 & 6 (8) \\
4 & 11 & 7 (2) & 8 (3) & 5 (6) & 4 (5) & 3 (1)
& 9 (10) & 10 (9) & 2 (7) & 1 (4) & 6 (8) \\
5 & 11 & 7 (2) & 8 (4) & 5 (3) & 4 (9) & 3 (1)
& 9 (10) & 10 (5) & 2 (8) & 1 (6) & 6 (7) \\
6 & 11 & 8 (5) & 9 (6) & 5 (4) & 4 (7) & 2 (1)
& 10 & 7 (2) & 3 (8) & 1 (9) & 6 (3) \\
7 & 11 & 8 (5) & 10 (6) & 5 (4) & 4 (7) & 2 (1)
& 9 (10) & 7 (2) & 3 (8) & 1 (9) & 6 (3) \\
8 & 11 & 9 (5) & 10 (6) & 5 (4) & 4 (7) & 2 (1)
& 8 (10) & 7 (2) & 3 (8) & 1 (9) & 6 (3) \\
\hline
\end{tabular}
\end{center}
\

Besides the overall results, we provide an in-depth comparison of the models
within each family.
Instead of reporting the MASE per model, we rank the models while holding the
training horizon fixed, which makes the comparison easier.
Table \ref{t:hori} presents the models trained on horizontal time series.
In addition to \textit{naive}, we already include \textit{fnaive} and
\textit{pnaive} here as more competitive benchmarks.
The tables in this section report two rankings simultaneously:
The first number is the rank resulting from lumping the low- and
medium-demand clusters together, which yields almost the same rankings as
analyzing them individually.
The ranks obtained from only the high-demand pixels are given in parentheses
whenever they differ.
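
To make the ranking procedure concrete, the following sketch shows one way
such per-horizon ranks can be derived from raw MASE values; the data layout
and column names are illustrative assumptions, not the exact pipeline used
in this study.

\begin{verbatim}
import pandas as pd

# Hypothetical long-format results: one MASE value per pixel,
# model, and training horizon (in weeks).
results = pd.DataFrame({
    "training_weeks": [3, 3, 3, 3, 4, 4, 4, 4],
    "model": ["hets", "hsma", "naive", "fnaive"] * 2,
    "mase": [0.81, 0.79, 1.00, 0.93, 0.80, 0.78, 1.00, 0.91],
})

# Average the MASE over all pixels per model while holding the
# training horizon fixed, then rank (1 = most accurate).
avg = results.groupby(["training_weeks", "model"])["mase"].mean()
ranks = avg.groupby(level="training_weeks").rank(method="min")
print(ranks.astype(int))
\end{verbatim}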
A first insight is that \textit{fnaive} is the best benchmark in all
scenarios:
Decomposing the series flexibly by tuning the $ns$ parameter is worth the
computational cost.
Further, if one can only employ a small number of non-na\"{i}ve methods,
\textit{hets} is the best compromise and works well across all demand
levels.
For high demand, it is also the best model independent of the training
horizon.
With low or medium demand, \textit{hsma} is the clear overall winner; yet,
with high demand, models with a seasonal fit (i.e., \textit{harima},
\textit{hets}, and \textit{hhwinters}) are more accurate, in particular
for longer training horizons.
This is due to the weekday demand patterns becoming more pronounced with
higher overall demand.
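
To illustrate what such a flexible decomposition may look like, the sketch
below builds a seasonal-na\"{i}ve forecast on top of an STL decomposition
whose seasonal smoothing window $ns$ is a tunable parameter; the exact
decomposition and the handling of the remainder are assumptions for
illustration, not necessarily the implementation behind \textit{fnaive}.

\begin{verbatim}
import numpy as np
from statsmodels.tsa.seasonal import STL

def stl_naive(y, period, ns, steps):
    """Sketch of an STL-based seasonal-naive forecast.

    STL isolates the seasonal component with a tunable smoothing
    window ``ns``; the forecast repeats the last seasonal cycle
    on top of the final trend level.
    """
    res = STL(y, period=period, seasonal=ns).fit()
    last_cycle = res.seasonal[-period:]
    return res.trend[-1] + np.resize(last_cycle, steps)

# Example: three weeks of hourly demand (period = 168 hours).
rng = np.random.default_rng(0)
hours = np.arange(3 * 168)
y = 10 + 3 * np.sin(hours * 2 * np.pi / 168) + rng.normal(0, 1, len(hours))
print(stl_naive(y, period=168, ns=7, steps=24))
\end{verbatim}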
\begin{center}
\captionof{table}{Ranking of classical models on vertical time series
($1~\text{km}^2$ pixel size, 60-minute time steps):
the table shows the ranks for cases with $2.5 < ADD < 25$
(and $25 < ADD < \infty$ in parentheses if they differ)}
\label{t:vert}
\begin{tabular}{|c|cc|ccccc|ccccc|}
\hline
\multirow{2}{*}{\rotatebox{90}{\thead{\scriptsize{Training}}}}
& \multicolumn{2}{c|}{\thead{Benchmarks}}
& \multicolumn{5}{c|}{\thead{Vertical (whole-day-ahead)}}
& \multicolumn{5}{c|}{\thead{Vertical (real-time)}} \\
\cline{2-13}
~ & \textit{hets} & \textit{hsma} & \textit{varima} & \textit{vets}
& \textit{vholt} & \textit{vses} & \textit{vtheta} & \textit{rtarima}
& \textit{rtets} & \textit{rtholt} & \textit{rtses} & \textit{rttheta} \\
\hline \hline
3 & 2 (10) & 1 (7) & 6 (4) & 8 (6) & 10 (9)
& 7 (5) & 11 (12) & 4 (1) & 5 (3) & 9 (8) & 3 (2) & 12 (11) \\
4 & 2 (8) & 1 (10) & 6 (4) & 8 (6) & 10 (9)
& 7 (5) & 12 (11) & 3 (1) & 5 (3) & 9 (7) & 4 (2) & 11 (12) \\
5 & 2 (3) & 1 (10) & 7 (5) & 8 (7) & 10 (9)
& 6 & 11 & 4 (1) & 5 (4) & 9 (8) & 3 (2) & 12 \\
6 & 2 (1) & 1 (10) & 6 (5) & 8 (7) & 10 (9)
& 7 (6) & 11 (12) & 3 (2) & 5 (4) & 9 (8) & 4 (3) & 12 (11) \\
7 & 2 (1) & 1 (10) & 8 (5) & 7 & 10 (9)
& 6 & 11 (12) & 5 (2) & 4 & 9 (8) & 3 & 12 (11) \\
8 & 2 (1) & 1 (9) & 8 (5) & 7 (6) & 10 (8)
& 6 & 12 (10) & 5 (2) & 4 & 9 (7) & 3 & 11 \\
\hline
\end{tabular}
\end{center}
\

Table \ref{t:vert} extends the previous analysis to classical models trained
on vertical time series.
Now, the winners from before, \textit{hets} and \textit{hsma}, serve as
benchmarks.
Whereas no improvements can be obtained for low and medium demand,
\textit{rtarima} and \textit{rtses} are the most accurate with high demand
and short training horizons.
For six or more training weeks, \textit{hets} is still optimal.
Independent of retraining and the demand level, the models' relative
performances are consistent:
The \textit{*arima} and \textit{*ses} models are best, followed by
\textit{*ets}, \textit{*holt}, and \textit{*theta}.
Thus, models that capture auto-correlations and short-term forecasting
errors, as expressed by moving-average terms, and that cannot be distracted
by trend terms are optimal for vertical series.
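
As a reminder of why the absence of a trend term matters here, the standard
simple exponential smoothing recursion (a textbook formula, not specific to
our implementation) unrolls into a weighted moving average of past
observations:
\[
\hat{y}_{t+1}
= \alpha y_t + (1 - \alpha) \hat{y}_t
= \sum_{j=0}^{t-1} \alpha (1 - \alpha)^j \, y_{t-j}
+ (1 - \alpha)^t \hat{y}_1 ,
\]
so recent forecasting errors are smoothed away exponentially, and no trend
can be extrapolated.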
Finally, Table \ref{t:ml} compares the two ML-based models against the
best-performing classical models and answers \textbf{Q2}:
Again, no improvements can be obtained with low and medium demand; however,
with high demand, \textit{vrfr} has the edge over \textit{rtarima} for
training horizons of up to six weeks.
We conjecture that \textit{vrfr} fits auto-correlations better than
\textit{varima} and is not distracted by short-term noise, as
\textit{rtarima} may be due to its retraining.
With seven or eight training weeks, \textit{hets} remains the overall winner.
Interestingly, \textit{vsvr} is more accurate than \textit{vrfr} for low and
medium demand.
We assume that \textit{vrfr} performs well only with strong
auto-correlations, which are not present with low and medium demand.
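
To illustrate how a model such as \textit{vrfr} can exploit
auto-correlations, the sketch below turns a vertical series into a matrix of
lagged features for a random forest; the lag choices and hyper-parameters
are assumptions for illustration, not the configuration used in this study.

\begin{verbatim}
import numpy as np
from sklearn.ensemble import RandomForestRegressor

def make_lagged(y, lags):
    """Stack lagged copies of y into features X and targets z."""
    X = np.column_stack(
        [y[max(lags) - lag:len(y) - lag] for lag in lags]
    )
    return X, y[max(lags):]

# Hypothetical hourly demand; lags of 1 h, 24 h, and 168 h let the
# forest pick up short- and long-range auto-correlations.
rng = np.random.default_rng(1)
hours = np.arange(6 * 168)
y = 10 + 3 * np.sin(hours * 2 * np.pi / 168) + rng.normal(0, 1, len(hours))

X, z = make_lagged(y, lags=[1, 24, 168])
forest = RandomForestRegressor(n_estimators=100, random_state=0)
forest.fit(X, z)

# One-step-ahead forecast from the most recent lagged observations.
x_next = np.array([[y[-1], y[-24], y[-168]]])
print(forest.predict(x_next))
\end{verbatim}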
\begin{center}
\captionof{table}{Ranking of ML models on vertical time series
($1~\text{km}^2$ pixel size, 60-minute time steps):
the table shows the ranks for cases with $2.5 < ADD < 25$
(and $25 < ADD < \infty$ in parentheses if they differ)}
\label{t:ml}
\begin{tabular}{|c|cccc|cc|}
\hline
\multirow{2}{*}{\rotatebox{90}{\thead{\scriptsize{Training}}}}
& \multicolumn{4}{c|}{\thead{Benchmarks}}
& \multicolumn{2}{c|}{\thead{ML}} \\
\cline{2-7}
~ & \textit{fnaive} & \textit{hets} & \textit{hsma}
& \textit{rtarima} & \textit{vrfr} & \textit{vsvr} \\
\hline \hline
3 & 6 & 2 (5) & 1 (4) & 3 (1) & 5 (2) & 4 (3) \\
4 & 6 (5) & 2 (4) & 1 (6) & 3 (2) & 5 (1) & 4 (3) \\
5 & 6 (5) & 2 (4) & 1 (6) & 4 (2) & 5 (1) & 3 \\
6 & 6 (5) & 2 & 1 (6) & 4 & 5 (1) & 3 \\
7 & 6 (5) & 2 (1) & 1 (6) & 4 & 5 (2) & 3 \\
8 & 6 (5) & 2 (1) & 1 (6) & 4 & 5 (2) & 3 \\
\hline
\end{tabular}
\end{center}
\

We also created tables analogous to Tables \ref{t:hori} to \ref{t:ml} for
the forecasts with 90-minute and 120-minute time steps and found that the
relative rankings do not change significantly.
The same holds true for the rankings with varying pixel sizes.
For conciseness, we do not include these additional tables in this article.
In summary, the relative performances of the model families are rather
stable throughout this case study.