

\subsubsection{Vertical and Real-time Forecasts without Retraining.}
\label{ml_models}
The lower-right part of Figure \ref{f:inputs} shows how ML models take
    real-time order data into account without retraining.
Based on the seasonally-adjusted time series $a_t$, we employ the feature
matrix and label vector representations from Sub-section \ref{learning}
and set $n$ to the number of daily time steps, $H$, to cover all potential
auto-correlations.
The ML models are trained once before a test day starts.
For training, the matrix and vector are populated such that $y_T$ is set to
the last time step of the day before the forecasts, $a_T$.
As the splitting during CV is done with whole days, the training sets of the
    \gls{ml} models contain samples from all times of the day in equal
    proportion.
Thus, the ML models learn to predict each time of the day.
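As an illustration, and assuming the lagged layout of Sub-section
    \ref{learning} with $n = H$, the training data may be sketched as follows.
The symbols $\mathbf{X}$, $\mathbf{y}$, and the index $t_0$ of the first
    usable time step are notation introduced only for this sketch.
\[
    \mathbf{X} =
    \left[ \begin{array}{cccc}
        a_{t_0-H} & a_{t_0-H+1} & \cdots & a_{t_0-1} \\
        \vdots    & \vdots      &        & \vdots    \\
        a_{T-H-1} & a_{T-H}     & \cdots & a_{T-2}   \\
        a_{T-H}   & a_{T-H+1}   & \cdots & a_{T-1}
    \end{array} \right],
    \qquad
    \mathbf{y} =
    \left[ \begin{array}{c}
        a_{t_0} \\ \vdots \\ a_{T-1} \\ a_T
    \end{array} \right]
\]
Each row holds the $H$ observations preceding its label, so that consecutive
    rows cycle through all times of the day.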
For prediction on a test day, the $H$ observations preceding the time
step to be forecast are used as the input vector after seasonal
adjustment.
As a result, real-time data are included.
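Under the assumption that the forecast is made one step ahead, the
    prediction for the $h$-th time step of the test day may be written as
    follows, where $f$ denotes the model trained once before the test day and
    $\hat{a}_{T+h}$ the forecast of the seasonally-adjusted series (both
    symbols are notation introduced only for this sketch).
\[
    \hat{a}_{T+h} = f\left( a_{T+h-H}, \ldots, a_{T+h-1} \right),
    \qquad h = 1, \ldots, H
\]
For $h > 1$, the inputs $a_{T+1}, \ldots, a_{T+h-1}$ are observed on the test
    day itself, which is how real-time data enter without refitting $f$.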
The models in this family are:
\begin{enumerate}
\item \textit{\gls{vrfr}}: RF trained on the matrix as described
\item \textit{\gls{vsvr}}: SVR trained on the matrix as described
\end{enumerate}
We tried other ML models such as gradient boosting machines but found
only RFs and SVRs to perform well in our study.
In the case of gradient boosting machines, this is to be expected as they are
    known not to perform well in the presence of high noise, which is natural
    with low count data, as shown, for example, by \cite{ma2018} or
    \cite{mason2000}.
Also, deep learning methods are not applicable as the feature matrices only
    consist of several hundred to several thousand rows (cf.\ Sub-section
    \ref{params}).
In \ref{tabular_ml_models}, we provide an alternative feature matrix
representation that exploits the two-dimensional structure of time tables
without decomposing the time series.
In \ref{enhanced_feats}, we show how feature matrices are extended
to include predictors other than historical order data.
However, to answer \textbf{Q5} already here, none of the external data sources
improves the results in our study.
Due to the high number of time series in our study, we must use an automated
    approach to analyzing individual time series in order to investigate why
    no external sources improve the forecasts.
\cite{barbour2014} provide a spectral density estimation approach, called
    the Shannon entropy, that measures the signal-to-noise ratio in a
    database as a number normalized between 0 and 1, where lower values
    indicate a higher signal-to-noise ratio.
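For intuition, one common way to define such a normalized entropy from a
    spectral density estimate is sketched below; the exact estimator used by
    \cite{barbour2014} may differ.
Here, $\hat{f}(\omega_k)$ is the estimated spectral density at the $k$-th of
    $K$ frequencies, and $p_k$ and $\eta$ are notation introduced only for
    this sketch, with the convention $0 \log 0 = 0$.
\[
    p_k = \frac{\hat{f}(\omega_k)}{\sum_{j=1}^{K} \hat{f}(\omega_j)},
    \qquad
    \eta = - \frac{1}{\log K} \sum_{k=1}^{K} p_k \log p_k \in [0, 1]
\]
A flat spectrum, as for white noise, gives $p_k = 1/K$ and thus $\eta = 1$,
    whereas a spectrum concentrated at a single frequency gives $\eta = 0$.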
We then look at averages of the estimates on a daily level per pixel and
    find that including any of the external data sources from
    \ref{enhanced_feats} always leads to significantly lower signal-to-noise
    ratios.
Thus, we conclude that, at least for the demand faced by our industry
    partner, the historical data contain all of the signal.