% tex/3_mod/7_models/1_intro.tex
\subsection{Forecasting Models}
\label{models}

This sub-section describes the concrete models in our study.
Figure \ref{f:inputs} shows how we classify them into four families according
to the type of time series they are trained on, horizontal or vertical, and
the moment at which a model is trained:
Solid lines indicate that the corresponding time steps lie before the
training, and dotted lines show the time horizon predicted by a model.
For conciseness, we only show the forecasts for one test day; the setup is
the same for each inner validation day.

\begin{center}
\captionof{figure}{Classification of the models by input type and training
    moment}
\label{f:inputs}
\includegraphics[width=.95\linewidth]{static/model_inputs_gray.png}
\end{center}
% tex/3_mod/7_models/2_hori.tex

\subsubsection{Horizontal and Whole-day-ahead Forecasts.}
\label{hori}

The upper-left quadrant of Figure \ref{f:inputs} illustrates the simplest way
to generate forecasts for a test day before it has started:
For each time of the day, the corresponding horizontal slice becomes the
input for a model.
With whole days being the unified time interval, each model is trained $H$
times, each time providing a one-step-ahead forecast.
While it is possible to select models of a different type per time step,
that did not improve the accuracy in our empirical study.
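To make the per-slice setup concrete, the following is a minimal sketch of
training $H$ models, one per horizontal slice. A simple exponential smoothing
with a fixed, hypothetical `alpha` stands in for the family's actual methods,
which calibrate their parameters:

```python
import numpy as np

def horizontal_forecasts(demand, alpha=0.3):
    """One forecast per time of day from horizontal slices.

    demand has shape (n_days, H): rows are past days, columns are the
    times of day. One model is fit per column, i.e. H models in total,
    and each yields a one-step-ahead forecast for the test day.
    (Illustrative sketch; alpha is a fixed, uncalibrated constant.)
    """
    n_days, H = demand.shape
    forecasts = np.empty(H)
    for h in range(H):            # one model per horizontal slice
        series = demand[:, h]
        level = series[0]         # simple exponential smoothing
        for obs in series[1:]:
            level = alpha * obs + (1 - alpha) * level
        forecasts[h] = level      # one-step-ahead forecast
    return forecasts
```

Note that no observation from the test day itself enters any of the $H$
training sets, which is exactly why this family serves as a benchmark.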
As the models in this family do not include the test day's demand data in
their training sets, we see them as benchmarks to answer \textbf{Q4},
checking if a UDP can take advantage of real-time information.
The models in this family are as follows; we use prefixes, such as ``h''
here, when methods are applied in other families as well:
\begin{enumerate}
\item \textit{\gls{naive}}:
    Observation from the same time step one week prior
\item \textit{\gls{trivial}}:
    Predict $0$ for all time steps
\item \textit{\gls{hcroston}}:
    Intermittent demand method introduced by \cite{croston1972}
\item \textit{\gls{hholt}},
    \textit{\gls{hhwinters}},
    \textit{\gls{hses}},
    \textit{\gls{hsma}}, and
    \textit{\gls{htheta}}:
    Exponential smoothing without calibration
\item \textit{\gls{hets}}:
    ETS calibrated as described by \cite{hyndman2008b}
\item \textit{\gls{harima}}:
    ARIMA calibrated as described by \cite{hyndman2008a}
\end{enumerate}
\textit{naive} and \textit{trivial} provide an absolute benchmark for the
actual forecasting methods.
\textit{hcroston} is often mentioned in the context of intermittent demand;
however, it did not perform well in our study.
Besides \textit{hhwinters}, which always fits a seasonal component, the
calibration heuristics behind \textit{hets} and \textit{harima} may do so
as well.
With $k=7$, an STL decomposition is unnecessary here.
% tex/3_mod/7_models/3_vert.tex

\subsubsection{Vertical and Whole-day-ahead Forecasts without Retraining.}
\label{vert}

The upper-right quadrant of Figure \ref{f:inputs} shows an alternative way
to generate forecasts for a test day before it has started:
First, a seasonally adjusted time series $a_t$ is obtained from a vertical
time series by STL decomposition.
Then, the actual forecasting model, trained on $a_t$, makes an
$H$-step-ahead prediction.
Lastly, we add the $H$ seasonal na\"{i}ve forecasts of the seasonal
component $s_t$ to obtain the actual predictions for the test day.
Thus, only one training is required per model type, and no real-time data
are used.
By decomposing the raw time series, all long-term patterns are assumed to be
captured by the seasonal component $s_t$, and $a_t$ only contains the level
with a potential trend and auto-correlations.
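The three steps can be sketched as follows. For self-containment, a simple
period-mean decomposition stands in for STL, and a flat smoothed level stands
in for the actual $H$-step-ahead model; both are illustrative simplifications,
not the methods used in the study:

```python
import numpy as np

def vertical_forecast(y, H, alpha=0.3):
    """Whole-day-ahead forecast from a vertical time series.

    y is assumed to cover whole days with H time steps each.
    (1) split y into a seasonal component s_t and an adjusted series
    a_t (period means stand in for STL here), (2) make an
    H-step-ahead forecast for a_t (a flat exponentially smoothed
    level, with a fixed hypothetical alpha), and (3) add the H
    seasonal naive forecasts of s_t back.
    """
    y = np.asarray(y, dtype=float)
    # (1) seasonal component: centered mean per time of day
    s = np.array([y[h::H].mean() for h in range(H)])
    s -= s.mean()
    a = y - np.tile(s, len(y) // H)   # seasonally adjusted series
    # (2) H-step-ahead forecast of a_t via a smoothed level
    level = a[0]
    for obs in a[1:]:
        level = alpha * obs + (1 - alpha) * level
    # (3) add the seasonal naive forecasts for the test day
    return level + s
```

Only one fit of the model in step (2) is needed per model type, in contrast
to the $H$ fits of the horizontal family.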
The models in this family are:
\begin{enumerate}
\item \textit{\gls{fnaive}},
    \textit{\gls{pnaive}}:
    Sum of the na\"{i}ve forecasts of STL's trend and seasonal components
\item \textit{\gls{vholt}},
    \textit{\gls{vses}}, and
    \textit{\gls{vtheta}}:
    Exponential smoothing without calibration and seasonal fit
\item \textit{\gls{vets}}:
    ETS calibrated as described by \cite{hyndman2008b}
\item \textit{\gls{varima}}:
    ARIMA calibrated as described by \cite{hyndman2008a}
\end{enumerate}
As mentioned in Sub-section \ref{unified_cv}, we include the sum of the
(seasonal) na\"{i}ve forecasts of STL's trend and seasonal components as
forecasts in their own right:
For \textit{fnaive}, we tune the ``flexible'' $ns$ parameter, and for
\textit{pnaive}, we set it to a ``periodic'' value.
Thus, we implicitly assume that there is no signal in the remainder $r_t$
and predict $0$ for it.
\textit{fnaive} and \textit{pnaive} serve as two more simple benchmarks.
% tex/3_mod/7_models/4_rt.tex

\subsubsection{Vertical and Real-time Forecasts with Retraining.}
\label{rt}

The lower-left quadrant of Figure \ref{f:inputs} shows how models trained on
vertical time series are extended with real-time order data as they become
available during a test day:
Instead of obtaining an $H$-step-ahead forecast, we retrain a model after
every time step and predict only one step ahead.
The remainder is as in the previous sub-section, and the models are:
\begin{enumerate}
\item \textit{\gls{rtholt}},
    \textit{\gls{rtses}}, and
    \textit{\gls{rttheta}}:
    Exponential smoothing without calibration and seasonal fit
\item \textit{\gls{rtets}}:
    ETS calibrated as described by \cite{hyndman2008b}
\item \textit{\gls{rtarima}}:
    ARIMA calibrated as described by \cite{hyndman2008a}
\end{enumerate}
Retraining \textit{fnaive} and \textit{pnaive} did not increase accuracy,
so we left them out.
A downside of this family is the significant increase in computing costs.
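The retraining loop can be sketched as follows; a fixed-parameter simple
exponential smoothing again stands in for the family's actual models, as the
point here is the loop itself, which also makes the cost of refitting on the
full history after every time step visible:

```python
import numpy as np

def realtime_forecasts(a_hist, a_test, alpha=0.3):
    """One-step-ahead forecasts with retraining after every step.

    a_hist: seasonally adjusted history up to the test day; a_test:
    the test day's adjusted observations, revealed one at a time.
    (Illustrative sketch with a fixed, hypothetical alpha.)
    """
    def ses_level(series):
        level = series[0]
        for obs in series[1:]:
            level = alpha * obs + (1 - alpha) * level
        return level

    series = list(a_hist)
    predictions = []
    for obs in a_test:
        predictions.append(ses_level(series))  # retrain, predict one step
        series.append(obs)                     # real-time data arrive
    return np.array(predictions)
```

Each iteration refits on the entire extended series, so one test day costs
$H$ trainings per model type instead of one.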
% tex/3_mod/7_models/5_ml.tex

\subsubsection{Vertical and Real-time Forecasts without Retraining.}
\label{ml_models}

The lower-right quadrant of Figure \ref{f:inputs} shows how ML models take
real-time order data into account without retraining.
Based on the seasonally adjusted time series $a_t$, we employ the feature
matrix and label vector representations from Sub-section \ref{learning}
and set $n$ to the number of daily time steps, $H$, to cover all potential
auto-correlations.
The ML models are trained once before a test day starts.
For training, the matrix and vector are populated such that the last label
$y_T$ is set to the last time step of the day before the forecasts, $a_T$.
As the splitting during CV is done with whole days, the \gls{ml} models are
trained on training sets consisting of samples from all times of a day in
equal proportion.
Thus, the ML models learn to predict each time of the day.
For prediction on a test day, the $H$ observations preceding the time step
to be forecast are used as the input vector after seasonal adjustment.
As a result, real-time data are included.
The models in this family are:
\begin{enumerate}
\item \textit{\gls{vrfr}}: RF trained on the matrix as described
\item \textit{\gls{vsvr}}: SVR trained on the matrix as described
\end{enumerate}
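A minimal sketch of the lag-based representation follows. The resulting pair
could then be passed to, for instance, scikit-learn's RandomForestRegressor
or SVR; that pipeline choice is an assumption, as the text does not name a
library:

```python
import numpy as np

def lag_matrix(a, n):
    """Feature matrix and label vector of lagged observations.

    Each row of X holds the n = H observations preceding its label in
    y, so the training set contains samples from all times of the day
    in equal proportion. At prediction time, the input vector for the
    next step is simply the last n observations, a[-n:], which is
    where the real-time data enter without any retraining.
    """
    a = np.asarray(a, dtype=float)
    X = np.stack([a[t - n:t] for t in range(n, len(a))])
    y = a[n:]
    return X, y
```

Because the model is fit once on (X, y) and only the input vector changes
during the test day, this family avoids the retraining costs of the
previous one.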
We tried other ML models, such as gradient boosting machines, but found
only RFs and SVRs to perform well in our study.
In the case of gradient boosting machines, this is to be expected, as they
are known not to perform well in the presence of high noise, which is
natural with low count data, as shown, for example, by \cite{ma2018} and
\cite{mason2000}.
Also, deep learning methods are not applicable, as the feature matrices
only consist of several hundred to a few thousand rows (cf., Sub-section
\ref{params}).
In \ref{tabular_ml_models}, we provide an alternative feature matrix
representation that exploits the two-dimensional structure of time tables
without decomposing the time series.
In \ref{enhanced_feats}, we show how feature matrices are extended to
include predictors other than historical order data.
However, to answer \textbf{Q5} already here, none of the external data
sources improves the results in our study.
Due to the high number of time series in our study, investigating why no
external source improves the forecasts requires an automated approach to
analyzing individual time series.
\cite{barbour2014} provide a spectral density estimation approach based on
the Shannon entropy that measures the signal-to-noise ratio with a number
normalized between 0 and 1, where lower values indicate a higher
signal-to-noise ratio.
We looked at averages of these estimates on a daily level per pixel and
found that including any of the external data sources from
\ref{enhanced_feats} always leads to significantly lower signal-to-noise
ratios.
Thus, we conclude that, at least for the demand faced by our industry
partner, the historical data contains all of the signal.
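The idea behind such a measure can be sketched with a normalized spectral
(Shannon) entropy of the periodogram; this is an illustrative stand-in for
the estimator of \cite{barbour2014}, not a reproduction of it:

```python
import numpy as np

def spectral_entropy(y):
    """Normalized spectral entropy of a demand series.

    The periodogram is normalized to a probability distribution and
    its Shannon entropy is scaled to [0, 1]: a strongly periodic
    signal concentrates power in few frequencies (values near 0),
    while pure noise spreads it evenly (values near 1).
    """
    y = np.asarray(y, dtype=float)
    y = y - y.mean()
    psd = np.abs(np.fft.rfft(y)) ** 2
    psd = psd[1:]                      # drop the zero-frequency bin
    p = psd / psd.sum()
    p = p[p > 0]                       # avoid log(0)
    return float(-(p * np.log(p)).sum() / np.log(len(psd)))
```

On a pure sine wave this yields a value near 0, while white noise yields a
value near 1, matching the convention that lower values indicate a higher
signal-to-noise ratio.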