\subsection{Overall Results}
\label{overall_results}

Table \ref{t:results} summarizes the overall best-performing models grouped by
training horizon and a pixel's average daily demand (ADD) for a
pixel size of $1~\text{km}^2$ and 60-minute time steps.
Each combination of pixel and test day counts as one case, and the total
number of cases is denoted as $n$.
Clustering the individual results revealed that a pixel's ADD over the
training horizon is the primary indicator of similarity and that three to four
clusters suffice to obtain cohesive groups:
we labeled them ``no'', ``low'', ``medium'', and ``high'' demand pixels with
increasing ADD, and report the average MASE per cluster.
The case counts $n$ do not vary significantly across the training horizons,
which confirms that the platform did not grow area-wise and is indeed in a
steady state.

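For reference, and assuming the standard definition of the metric, the MASE of
forecasts $\hat{y}_1, \dots, \hat{y}_H$ scales the out-of-sample mean absolute
error by the in-sample mean absolute error of a naive benchmark with seasonal
period $m$ (with $m = 1$ recovering the original, non-seasonal definition):
\begin{equation*}
\text{MASE}
  = \frac{\frac{1}{H} \sum_{h=1}^{H} \left| y_h - \hat{y}_h \right|}
         {\frac{1}{T - m} \sum_{t=m+1}^{T} \left| y_t - y_{t-m} \right|},
\end{equation*}
where $T$ denotes the number of observations in the training horizon.
Values below $1$ thus indicate that a method is, on average, more accurate
than the naive benchmark.
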
\begin{center}
\captionof{table}{Top-3 models by training weeks and average daily demand
    ($1~\text{km}^2$ pixel size, 60-minute time steps)}
\label{t:results}
\begin{tabular}{|c|c|*{12}{c|}}

\hline
\multirow{3}{*}{\rotatebox{90}{\thead{Training Weeks}}}
    & \multirow{3}{*}{\rotatebox{90}{\thead{Rank}}}
    & \multicolumn{3}{c|}{\thead{No Demand}}
    & \multicolumn{3}{c|}{\thead{Low Demand}}
    & \multicolumn{3}{c|}{\thead{Medium Demand}}
    & \multicolumn{3}{c|}{\thead{High Demand}} \\
~ & ~
    & \multicolumn{3}{c|}{(ADD 0 -- 2.5)}
    & \multicolumn{3}{c|}{(ADD 2.5 -- 10)}
    & \multicolumn{3}{c|}{(ADD 10 -- 25)}
    & \multicolumn{3}{c|}{(ADD 25 -- $\infty$)} \\
\cline{3-14}
~ & ~
    & Method & MASE & $n$
    & Method & MASE & $n$
    & Method & MASE & $n$
    & Method & MASE & $n$ \\

\hline \hline
\multirow{3}{*}{3} & 1
    & \textbf{\textit{trivial}}
        & 0.785 & \multirow{3}{*}{\rotatebox{90}{4586}}
    & \textbf{\textit{hsma}}
        & 0.819 & \multirow{3}{*}{\rotatebox{90}{2975}}
    & \textbf{\textit{hsma}}
        & 0.839 & \multirow{3}{*}{\rotatebox{90}{2743}}
    & \textbf{\textit{rtarima}}
        & 0.872 & \multirow{3}{*}{\rotatebox{90}{2018}} \\
~ & 2
    & \textit{hsma} & 0.809 & ~
    & \textit{hses} & 0.844 & ~
    & \textit{hses} & 0.858 & ~
    & \textit{rtses} & 0.873 & ~ \\
~ & 3
    & \textit{pnaive} & 0.958 & ~
    & \textit{hets} & 0.846 & ~
    & \textit{hets} & 0.859 & ~
    & \textit{rtets} & 0.877 & ~ \\

\hline
\multirow{3}{*}{4} & 1
    & \textbf{\textit{trivial}}
        & 0.770 & \multirow{3}{*}{\rotatebox{90}{4532}}
    & \textbf{\textit{hsma}}
        & 0.825 & \multirow{3}{*}{\rotatebox{90}{3033}}
    & \textbf{\textit{hsma}}
        & 0.837 & \multirow{3}{*}{\rotatebox{90}{2687}}
    & \textbf{\textit{vrfr}}
        & 0.855 & \multirow{3}{*}{\rotatebox{90}{2016}} \\
~ & 2
    & \textit{hsma} & 0.788 & ~
    & \textit{hses} & 0.848 & ~
    & \textit{hses} & 0.850 & ~
    & \textbf{\textit{rtarima}} & 0.855 & ~ \\
~ & 3
    & \textit{pnaive} & 0.917 & ~
    & \textit{hets} & 0.851 & ~
    & \textit{hets} & 0.854 & ~
    & \textit{rtses} & 0.860 & ~ \\

\hline
\multirow{3}{*}{5} & 1
    & \textbf{\textit{trivial}}
        & 0.780 & \multirow{3}{*}{\rotatebox{90}{4527}}
    & \textbf{\textit{hsma}}
        & 0.841 & \multirow{3}{*}{\rotatebox{90}{3055}}
    & \textbf{\textit{hsma}}
        & 0.837 & \multirow{3}{*}{\rotatebox{90}{2662}}
    & \textbf{\textit{vrfr}}
        & 0.850 & \multirow{3}{*}{\rotatebox{90}{2019}} \\
~ & 2
    & \textit{hsma} & 0.803 & ~
    & \textit{hses} & 0.859 & ~
    & \textit{hets} & 0.845 & ~
    & \textbf{\textit{rtarima}} & 0.852 & ~ \\
~ & 3
    & \textit{pnaive} & 0.889 & ~
    & \textit{hets} & 0.861 & ~
    & \textit{hses} & 0.845 & ~
    & \textit{vsvr} & 0.854 & ~ \\

\hline
\multirow{3}{*}{6} & 1
    & \textbf{\textit{trivial}}
        & 0.741 & \multirow{3}{*}{\rotatebox{90}{4470}}
    & \textbf{\textit{hsma}}
        & 0.847 & \multirow{3}{*}{\rotatebox{90}{3086}}
    & \textbf{\textit{hsma}}
        & 0.840 & \multirow{3}{*}{\rotatebox{90}{2625}}
    & \textbf{\textit{vrfr}}
        & 0.842 & \multirow{3}{*}{\rotatebox{90}{2025}} \\
~ & 2
    & \textit{hsma} & 0.766 & ~
    & \textit{hses} & 0.863 & ~
    & \textit{hets} & 0.842 & ~
    & \textbf{\textit{hets}} & 0.847 & ~ \\
~ & 3
    & \textit{pnaive} & 0.837 & ~
    & \textit{hets} & 0.865 & ~
    & \textit{hses} & 0.848 & ~
    & \textit{vsvr} & 0.848 & ~ \\

\hline
\multirow{3}{*}{7} & 1
    & \textbf{\textit{trivial}}
        & 0.730 & \multirow{3}{*}{\rotatebox{90}{4454}}
    & \textbf{\textit{hsma}}
        & 0.858 & \multirow{3}{*}{\rotatebox{90}{3132}}
    & \textbf{\textit{hets}}
        & 0.845 & \multirow{3}{*}{\rotatebox{90}{2597}}
    & \textbf{\textit{hets}}
        & 0.840 & \multirow{3}{*}{\rotatebox{90}{2007}} \\
~ & 2
    & \textit{hsma} & 0.754 & ~
    & \textit{hses} & 0.871 & ~
    & \textit{hsma} & 0.847 & ~
    & \textbf{\textit{vrfr}} & 0.845 & ~ \\
~ & 3
    & \textit{pnaive} & 0.813 & ~
    & \textit{hets} & 0.872 & ~
    & \textbf{\textit{vsvr}} & 0.850 & ~
    & \textit{vsvr} & 0.847 & ~ \\

\hline
\multirow{3}{*}{8} & 1
    & \textbf{\textit{trivial}}
        & 0.735 & \multirow{3}{*}{\rotatebox{90}{4402}}
    & \textbf{\textit{hsma}}
        & 0.867 & \multirow{3}{*}{\rotatebox{90}{3159}}
    & \textbf{\textit{hets}}
        & 0.846 & \multirow{3}{*}{\rotatebox{90}{2575}}
    & \textbf{\textit{hets}}
        & 0.836 & \multirow{3}{*}{\rotatebox{90}{2002}} \\
~ & 2
    & \textit{hsma} & 0.758 & ~
    & \textit{hets} & 0.877 & ~
    & \textbf{\textit{vsvr}} & 0.850 & ~
    & \textbf{\textit{vrfr}} & 0.842 & ~ \\
~ & 3
    & \textit{pnaive} & 0.811 & ~
    & \textit{hses} & 0.880 & ~
    & \textit{hsma} & 0.851 & ~
    & \textit{vsvr} & 0.849 & ~ \\

\hline
\end{tabular}
\end{center}

We use this table to answer \textbf{Q1} regarding the overall best methods
under different ADDs.
All result tables in the main text report MASEs calculated over all time
steps of a day.
In contrast, \ref{peak_results} shows the same tables with MASEs calculated
over time steps within peak times only (i.e., lunch from 12 pm to 2 pm and
dinner from 6 pm to 8 pm).
The differences lie mainly in the decimals of the individual MASE averages,
and the ranks of the forecasting methods change only in rare cases.
This shows that the reported accuracies are driven by the forecasting methods'
accuracies at peak times:
intuitively, all methods correctly predict zero demand for non-peak times.

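A minimal sketch of how such a peak-time restriction can be implemented is
given below; the function and variable names are ours and purely illustrative,
and the study's actual computation may differ (e.g., in how the scaling
denominator of the MASE is formed).
\begin{verbatim}
import numpy as np

def mase(actual, forecast, train, m=1, mask=None):
    """MASE, optionally restricted to selected (e.g., peak-time) steps."""
    errors = np.abs(np.asarray(actual) - np.asarray(forecast))
    if mask is not None:
        errors = errors[np.asarray(mask)]
    train = np.asarray(train)
    # in-sample MAE of the (seasonal) naive benchmark used for scaling
    scale = np.mean(np.abs(train[m:] - train[:-m]))
    return errors.mean() / scale

# peak times for hourly steps: lunch (12 pm - 2 pm), dinner (6 pm - 8 pm)
hours = np.arange(24)
peak_mask = ((hours >= 12) & (hours < 14)) | ((hours >= 18) & (hours < 20))
\end{verbatim}
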
Unsurprisingly, the best model for pixels without demand (i.e.,
$0 < \text{ADD} < 2.5$) is \textit{trivial}.
While \textit{hsma} also adapts well, its accuracy is lower, and none of the
more sophisticated models reaches a similar accuracy.
The intuition behind this is that \textit{trivial} is the least distorted by
the relatively large proportion of noise given the low-count nature of these
time series.

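To make this concrete, assume (as the name suggests) that \textit{trivial}
always predicts zero demand, and consider a day with a single order in one
time step and no orders in the remaining 23 hourly steps.
The all-zero forecast then accumulates a total absolute error of $1$, whereas
a method that smooths this noise into a flat forecast of $1/24$ orders per
step errs in every step and accumulates
$23 \cdot \tfrac{1}{24} + \bigl(1 - \tfrac{1}{24}\bigr) = \tfrac{46}{24}
\approx 1.92$, i.e., roughly twice the error.
Low-count noise therefore penalizes any model that reacts to it.
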
For low demand (i.e., $2.5 < \text{ADD} < 10$), there is also a clear
best-performing model, namely \textit{hsma}.
As the non-seasonal \textit{hses} reaches an accuracy similar to that of its
potentially seasonal generalization \textit{hets}, we conclude that the
weekday seasonality is not yet pronounced enough to be recognized in
low-demand pixels.
So, in the absence of seasonality, models that capture only a trend component
are the least susceptible to the noise.

For medium demand (i.e., $10 < \text{ADD} < 25$) and training horizons of up
to six weeks, the best-performing models are the same as for low demand.
For longer horizons, \textit{hets} provides the highest accuracy.
Thus, fitting a seasonal pattern requires longer training horizons.
While \textit{vsvr} enters the top three, \textit{hets} has the edge as it
requires neither parameter tuning nor real-time data.

In summary, except for high demand, simple models trained on horizontal time
series work best.
By contrast, high demand (i.e., $25 < \text{ADD} < \infty$) combined with less
than six training weeks is the only situation where classical models trained
on vertical time series work well.
There, \textit{rtarima} outperforms its siblings from Sub-sections
\ref{vert} and \ref{rt}.
We conjecture that intra-day auto-correlations, caused, for example, by
weather, are the reason for that.
Intuitively, a certain amount of demand (i.e., a high enough signal-to-noise
ratio) is required for models that exploit auto-correlations to detect them
through all the noise.
That idea is supported by \textit{vrfr} reaching a similar accuracy under
high demand, as its tree structure allows it to fit auto-correlations.
As both \textit{rtarima} and \textit{vrfr} incorporate recent demand,
real-time information can indeed improve accuracy.
However, once models are trained on longer horizons, \textit{hets} is more
accurate than \textit{vrfr}.
Thus, to answer \textbf{Q4}, we conclude that real-time information only
improves accuracy if only three or four weeks of training data are
available.

In addition to looking at the results in tables covering the entire one-year
horizon, we also created sub-analyses for the distinct seasons spring,
summer (incl. the long holiday season in France), and fall.
Yet, none of the results portrayed in this and the subsequent sections changes
in significant ways.
We conjecture that there could be differences if the overall demand of the UDP
increased to a scale beyond the one this case study covers, and we leave that
to a follow-up study with a bigger UDP.