diff --git a/paper.tex b/paper.tex index 4659df3..8c7c3d7 100644 --- a/paper.tex +++ b/paper.tex @@ -52,8 +52,6 @@ \newpage \input{tex/apx/peak_results} \newpage -\input{tex/apx/glossary} -\newpage \bibliographystyle{static/elsarticle-harv} \bibliography{tex/references} diff --git a/tex/1_intro.tex b/tex/1_intro.tex index c35cf91..cdab6ad 100644 --- a/tex/1_intro.tex +++ b/tex/1_intro.tex @@ -11,8 +11,7 @@ A common feature of these platforms is that they do not operate kitchens but related processes in simple smartphone apps, and managing the delivery via a fleet of either employees or crowd-sourced sub-contractors. -Various kinds of urban delivery platforms - (\gls{udp}; \ref{glossary} provides a glossary with all abbreviations) +Various kinds of urban delivery platforms (UDP) have received attention in recent scholarly publications. \cite{hou2018} look into heuristics to simultaneously optimize courier scheduling and routing in general, while \cite{masmoudi2018} do so @@ -20,8 +19,7 @@ Various kinds of urban delivery platforms the effect of different fulfillment strategies in the context of urban meal delivery. \cite{ehmke2018} and \cite{alcaraz2019} focus their research on the routing - aspect, which is commonly modeled as a so-called vehicle routing problem - (\gls{vrp}). + aspect, which is commonly modeled as a so-called vehicle routing problem (VRP). Not covered in the recent literature is research focusing on the demand forecasting problem a UDP faces. @@ -69,7 +67,7 @@ In this paper, we develop a rigorous methodology as to how to build and We implement such a system with a broad set of commonly used forecasting methods. We not only apply established (i.e., "classical") time series methods but also - machine learning (\gls{ml}) models that have gained traction in recent + machine learning (ML) models that have gained traction in recent years due to advancements in computing power and the availability of larger amounts of data. In that regard, the classical methods serve as benchmarks for the ML methods. @@ -100,4 +98,4 @@ The subsequent Section \ref{lit} reviews the literature on the forecasting Section \ref{mod} introduces our forecasting system, and Section \ref{stu} discusses the results obtained in the empirical study. Lastly, Section \ref{con} summarizes our findings and concludes - with an outlook on further research opportunities. \ No newline at end of file + with an outlook on further research opportunities. diff --git a/tex/2_lit/2_class/4_stl.tex b/tex/2_lit/2_class/4_stl.tex index 39c987a..f832f0e 100644 --- a/tex/2_lit/2_class/4_stl.tex +++ b/tex/2_lit/2_class/4_stl.tex @@ -40,8 +40,7 @@ Their main advantages stem from the fact that the models calibrate themselves \cite{cleveland1990} introduce a seasonal and trend decomposition using a repeated locally weighted regression - the so-called Loess procedure - to smooth the trend and seasonal components, which can be viewed as a - generalization of the methods above and is denoted by the acronym - \gls{stl}. + generalization of the methods above and is denoted by the acronym STL.
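To make the decomposition concrete, here is a minimal sketch using the statsmodels implementation of STL; the file name, column name, and weekly period are assumptions for illustration, not the paper's configuration:

```python
import pandas as pd
from statsmodels.tsa.seasonal import STL

# Hypothetical daily demand series; "demand.csv", the "orders" column,
# and the weekly period (7) are assumptions for this sketch only.
demand = pd.read_csv("demand.csv", index_col=0, parse_dates=True)["orders"]

res = STL(demand, period=7, robust=True).fit()  # Loess-smoothed decomposition
trend, seasonal, remainder = res.trend, res.seasonal, res.resid
```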
In contrast to the X11, X13, and SEATS methods, the STL supports seasonalities of any lag $k$ that must, however, be determined with additional statistical tests or set with out-of-band knowledge by the forecaster diff --git a/tex/2_lit/3_ml/1_intro.tex b/tex/2_lit/3_ml/1_intro.tex index f04f137..a763822 100644 --- a/tex/2_lit/3_ml/1_intro.tex +++ b/tex/2_lit/3_ml/1_intro.tex @@ -4,8 +4,7 @@ ML methods have been employed in all kinds of prediction tasks in recent years. In this section, we restrict ourselves to the models that performed well in - our study: Random Forest (\gls{rf}) and Support Vector Regression - (\gls{svr}). + our study: Random Forest (RF) and Support Vector Regression (SVR). RFs are generally well-suited for datasets without a priori knowledge about the patterns, while SVR is known to perform well on time series data, as shown by \cite{hansen2006} in general and \cite{bao2004} specifically for diff --git a/tex/2_lit/3_ml/3_cv.tex b/tex/2_lit/3_ml/3_cv.tex index abfdebc..8662189 100644 --- a/tex/2_lit/3_ml/3_cv.tex +++ b/tex/2_lit/3_ml/3_cv.tex @@ -5,7 +5,7 @@ Because ML models are trained by minimizing a loss function $L$, the resulting value of $L$ underestimates the true error we see when predicting into the actual future by design. To counter that, one popular and model-agnostic approach is cross-validation - (\gls{cv}), as summarized, for example, by \cite{hastie2013}. + (CV), as summarized, for example, by \cite{hastie2013}. CV is a resampling technique that randomly splits the samples into a training and a test set. Trained on the former, an ML model makes forecasts on the latter. diff --git a/tex/2_lit/3_ml/4_rf.tex b/tex/2_lit/3_ml/4_rf.tex index eeb7161..7ffd79f 100644 --- a/tex/2_lit/3_ml/4_rf.tex +++ b/tex/2_lit/3_ml/4_rf.tex @@ -2,7 +2,7 @@ \label{rf} \cite{breiman1984} introduce the classification and regression tree - (\gls{cart}) model that is built around the idea that a single binary + (CART) model that is built around the idea that a single binary decision tree maps learned combinations of intervals of the feature columns to a label. Thus, each sample in the training set is associated with one leaf node that diff --git a/tex/2_lit/3_ml/5_svm.tex b/tex/2_lit/3_ml/5_svm.tex index 528fe9e..c98d2ea 100644 --- a/tex/2_lit/3_ml/5_svm.tex +++ b/tex/2_lit/3_ml/5_svm.tex @@ -2,7 +2,7 @@ \label{svm} \cite{vapnik1963} and \cite{vapnik1964} introduce the so-called support vector - machine (\gls{svm}) model, and \cite{vapnik2013} summarizes the research + machine (SVM) model, and \cite{vapnik2013} summarizes the research conducted since then. In their basic version, SVMs are linear classifiers, modeling a binary decision, that fit a hyperplane into the feature space of $\mat{X}$ to diff --git a/tex/3_mod/5_mase.tex b/tex/3_mod/5_mase.tex index 173d433..ebab8bc 100644 --- a/tex/3_mod/5_mase.tex +++ b/tex/3_mod/5_mase.tex @@ -41,7 +41,7 @@ These numerical instabilities occurred so often in our studies that we argue against using such measures. \item \textbf{Scaled Errors}: \cite{hyndman2006} contribute this category and introduce the mean absolute - scaled error (\gls{mase}). + scaled error (MASE). It is defined as the MAE from the actual forecasting method on the test day (i.e., "out-of-sample") divided by the MAE from the (seasonal) na\"{i}ve method on the entire training set (i.e., "in-sample").
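A minimal sketch of this definition in Python; the array names and the seasonal lag `m` are illustrative, mirroring the description above rather than the paper's exact implementation:

```python
import numpy as np

def mase(y_test, y_pred, y_train, m=1):
    """Out-of-sample MAE of the forecasts, scaled by the in-sample MAE
    of the (seasonal) naive method with lag m on the training set."""
    mae_out = np.mean(np.abs(np.asarray(y_test) - np.asarray(y_pred)))
    y_train = np.asarray(y_train)
    mae_in = np.mean(np.abs(y_train[m:] - y_train[:-m]))  # seasonal naive errors
    return mae_out / mae_in
```

Values below 1 mean the method's out-of-sample MAE is smaller than the naive method's in-sample MAE.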
@@ -84,4 +84,4 @@ We conjecture that percentage error measures may be usable for UDPs facing a higher overall demand with no intra-day down-times in between but have to leave that to a future study. Yet, even with high and steady demand, divide-by-zero errors are likely to - occur. \ No newline at end of file + occur. diff --git a/tex/3_mod/7_models/2_hori.tex b/tex/3_mod/7_models/2_hori.tex index 21cc627..899bee2 100644 --- a/tex/3_mod/7_models/2_hori.tex +++ b/tex/3_mod/7_models/2_hori.tex @@ -15,21 +15,21 @@ As the models in this family do not include the test day's demand data in The models in this family are as follows; we use prefixes, such as "h" here, when methods are applied in other families as well: \begin{enumerate} -\item \textit{\gls{naive}}: +\item \textit{naive}: Observation from the same time step one week prior -\item \textit{\gls{trivial}}: +\item \textit{trivial}: Predict $0$ for all time steps -\item \textit{\gls{hcroston}}: +\item \textit{hcroston}: Intermittent demand method introduced by \cite{croston1972} -\item \textit{\gls{hholt}}, - \textit{\gls{hhwinters}}, - \textit{\gls{hses}}, - \textit{\gls{hsma}}, and - \textit{\gls{htheta}}: +\item \textit{hholt}, + \textit{hhwinters}, + \textit{hses}, + \textit{hsma}, and + \textit{htheta}: Exponential smoothing without calibration -\item \textit{\gls{hets}}: +\item \textit{hets}: ETS calibrated as described by \cite{hyndman2008b} -\item \textit{\gls{harima}}: +\item \textit{harima}: ARIMA calibrated as described by \cite{hyndman2008a} \end{enumerate} \textit{naive} and \textit{trivial} provide an absolute benchmark for the diff --git a/tex/3_mod/7_models/3_vert.tex b/tex/3_mod/7_models/3_vert.tex index 43aaaf1..949a485 100644 --- a/tex/3_mod/7_models/3_vert.tex +++ b/tex/3_mod/7_models/3_vert.tex @@ -16,17 +16,17 @@ By decomposing the raw time series, all long-term patterns are assumed to be a potential trend and auto-correlations. The models in this family are: \begin{enumerate} -\item \textit{\gls{fnaive}}, - \textit{\gls{pnaive}}: +\item \textit{fnaive}, + \textit{pnaive}: Sum of STL's trend and seasonal components' na\"{i}ve forecasts -\item \textit{\gls{vholt}}, - \textit{\gls{vses}}, and - \textit{\gls{vtheta}}: +\item \textit{vholt}, + \textit{vses}, and + \textit{vtheta}: Exponential smoothing without calibration and seasonal fit -\item \textit{\gls{vets}}: +\item \textit{vets}: ETS calibrated as described by \cite{hyndman2008b} -\item \textit{\gls{varima}}: +\item \textit{varima}: ARIMA calibrated as described by \cite{hyndman2008a} \end{enumerate} As mentioned in Sub-section \ref{unified_cv}, we include the sum of the diff --git a/tex/3_mod/7_models/4_rt.tex b/tex/3_mod/7_models/4_rt.tex index 6fa038d..94a5756 100644 --- a/tex/3_mod/7_models/4_rt.tex +++ b/tex/3_mod/7_models/4_rt.tex @@ -8,13 +8,13 @@ Instead of obtaining an $H$-step-ahead forecast, we retrain a model after every time step and only predict one step. 
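As an illustration of this retraining scheme, here is a sketch of a one-step-ahead loop, using simple exponential smoothing from statsmodels as a stand-in for any model of this family; `train_series`, `H`, and the `observe` callback are assumptions:

```python
from statsmodels.tsa.holtwinters import SimpleExpSmoothing

H = 12                        # number of one-step forecasts (assumption)
history = list(train_series)  # past demand; 'train_series' is assumed given
forecasts = []
for step in range(H):
    fit = SimpleExpSmoothing(history).fit()  # re-train after every time step
    forecasts.append(fit.forecast(1)[0])     # predict one step only
    history.append(observe(step))            # hypothetical: realized demand arrives
```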
The remainder is as in the previous sub-section, and the models are: \begin{enumerate} -\item \textit{\gls{rtholt}}, - \textit{\gls{rtses}}, and - \textit{\gls{rttheta}}: +\item \textit{rtholt}, + \textit{rtses}, and + \textit{rttheta}: Exponential smoothing without calibration and seasonal fit -\item \textit{\gls{rtets}}: +\item \textit{rtets}: ETS calibrated as described by \cite{hyndman2008b} -\item \textit{\gls{rtarima}}: +\item \textit{rtarima}: ARIMA calibrated as described by \cite{hyndman2008a} \end{enumerate} Retraining \textit{fnaive} and \textit{pnaive} did not increase accuracy, and diff --git a/tex/3_mod/7_models/5_ml.tex b/tex/3_mod/7_models/5_ml.tex index 7ca00c4..9f37684 100644 --- a/tex/3_mod/7_models/5_ml.tex +++ b/tex/3_mod/7_models/5_ml.tex @@ -10,7 +10,7 @@ Based on the seasonally-adjusted time series $a_t$, we employ the feature The ML models are trained once before a test day starts. For training, the matrix and vector are populated such that $y_T$ is set to the last time step of the day before the forecasts, $a_T$. -As the splitting during CV is done with whole days, the \gls{ml} models are +As the splitting during CV is done with whole days, the ML models are trained with training sets consisting of samples from all times of a day in an equal manner. Thus, the ML models learn to predict each time of the day. @@ -20,8 +20,8 @@ For prediction on a test day, the $H$ observations preceding the time As a result, real-time data are included. The models in this family are: \begin{enumerate} -\item \textit{\gls{vrfr}}: RF trained on the matrix as described -\item \textit{\gls{vsvr}}: SVR trained on the matrix as described +\item \textit{vrfr}: RF trained on the matrix as described +\item \textit{vsvr}: SVR trained on the matrix as described \end{enumerate} We tried other ML models such as gradient boosting machines but found only RFs and SVRs to perform well in our study. diff --git a/tex/4_stu/4_overall.tex b/tex/4_stu/4_overall.tex index c36c876..43b5bc1 100644 --- a/tex/4_stu/4_overall.tex +++ b/tex/4_stu/4_overall.tex @@ -2,7 +2,7 @@ \label{overall_results} Table \ref{t:results} summarizes the overall best-performing models grouped by - training horizon and a pixel's average daily demand (\gls{add}) for a + training horizon and a pixel's average daily demand (ADD) for a pixel size of $1~\text{km}^2$ and 60-minute time steps. Each combination of pixel and test day counts as one case, and the total number of cases is denoted as $n$. diff --git a/tex/5_con/4_further_research.tex b/tex/5_con/4_further_research.tex index be2a006..4e605ee 100644 --- a/tex/5_con/4_further_research.tex +++ b/tex/5_con/4_further_research.tex @@ -19,11 +19,11 @@ Thus, we suggest conducting more detailed analyses on how to incorporate model Future research should also integrate our forecasting system into a predictive routing application and evaluate its business impact. This embeds our research into the vast literature on the VRP. -Initially introduced by \cite{dantzig1959}, \gls{vrp}s are concerned with +Initially introduced by \cite{dantzig1959}, VRPs are concerned with finding optimal routes serving customers. We refer to \cite{toth2014} for a comprehensive overview. The two variants relevant for the UDP case are the dynamic VRP and - the pickup and delivery problem (\gls{pdp}). + the pickup and delivery problem (PDP). A VRP is dynamic if the data to solve a problem only becomes available as the operations are underway. 
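Returning briefly to the ML family above: a minimal sketch of a lag-feature matrix of the kind described for \textit{vrfr} and \textit{vsvr}, assuming a seasonally-adjusted series `adjusted` and an illustrative horizon; the paper's exact feature layout may differ:

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor

def lag_features(a, H):
    """Each row of X holds the H observations preceding its target in y."""
    X = np.array([a[i:i + H] for i in range(len(a) - H)])
    y = np.asarray(a[H:])
    return X, y

a = np.asarray(adjusted)      # seasonally-adjusted series (assumed given)
X, y = lag_features(a, H=12)  # H = 12 is illustrative
rf = RandomForestRegressor(n_estimators=500, random_state=0).fit(X, y)
next_step = rf.predict(a[-12:].reshape(1, -1))  # forecast from the latest lags
```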
\cite{thomas2010}, \cite{pillac2013}, and \cite{psaraftis2016} describe how @@ -39,7 +39,7 @@ Forecasts by our system extend this idea naturally as dummy nodes could be The concrete case of a meal delivering UDP is contained in a recent literature stream started by \cite{ulmer2017} and extended by \cite{reyes2018} and \cite{yildiz2018}: They coin the term meal delivery - routing problem (\gls{mdrp}). + routing problem (MDRP). The MDRP is a special case of the dynamic PDP where the defining characteristic is that once a vehicle is scheduled, a modification of the route is inadmissible. diff --git a/tex/apx/glossary.tex b/tex/apx/glossary.tex deleted file mode 100644 index 77eb8b3..0000000 --- a/tex/apx/glossary.tex +++ /dev/null @@ -1,144 +0,0 @@ -\section{Glossary} -\label{glossary} - -% Abbreviations for technical terms. -\newglossaryentry{add}{ - name=ADD, description={Average Daily Demand} -} -\newglossaryentry{cart}{ - name=CART, description={Classification and Regression Trees} -} -\newglossaryentry{cv}{ - name=CV, description={Cross Validation} -} -\newglossaryentry{mase}{ - name=MASE, description={Mean Absolute Scaled Error} -} -\newglossaryentry{mdrp}{ - name=MDRP, description={Meal Delivery Routing Proplem} -} -\newglossaryentry{ml}{ - name=ML, description={Machine Learning} -} -\newglossaryentry{pdp}{ - name=PDP, description={Pickup and Delivery Problem} -} -\newglossaryentry{rf}{ - name=RF, description={Random Forest} -} -\newglossaryentry{stl}{ - name=STL, description={Seasonal and Trend Decomposition using Loess} -} -\newglossaryentry{svm}{ - name=SVM, description={Support Vector Machine} -} -\newglossaryentry{svr}{ - name=SVR, description={Support Vector Regression} -} -\newglossaryentry{udp}{ - name=UDP, description={Urban Delivery Platform} -} -\newglossaryentry{vrp}{ - name=VRP, description={Vehicle Routing Problem} -} - -% Model names. 
-\newglossaryentry{naive}{ - name=naive, description={(Seasonal) Na\"{i}ve Method} -} -\newglossaryentry{fnaive}{ - name=fnaive, description={"Flexible" STL Decomposition, - with tuned ns parameter} -} -\newglossaryentry{pnaive}{ - name=pnaive, description={"Periodic" STL Decomposition, - with ns parameter set to large number} -} -\newglossaryentry{trivial}{ - name=trivial, description={Trivial Method} -} -\newglossaryentry{hcroston}{ - name=hcroston, description={Croston's Method, - trained on horizontal time series} -} -\newglossaryentry{hholt}{ - name=hholt, description={Holt's Linear Trend Method, - trained on horizontal time series} -} -\newglossaryentry{vholt}{ - name=vholt, description={Holt's Linear Trend Method, - trained on vertical time series} -} -\newglossaryentry{rtholt}{ - name=rtholt, description={Holt's Linear Trend Method, - (re)trained on vertical time series} -} -\newglossaryentry{hhwinters}{ - name=hhwinters, description={Holt-Winter's Seasonal Method, - trained on horizontal time series} -} -\newglossaryentry{hses}{ - name=hses, description={Simple Exponential Smoothing Method, - trained on horizontal time series} -} -\newglossaryentry{vses}{ - name=vses, description={Simple Exponential Smoothing Method, - trained on vertical time series} -} -\newglossaryentry{rtses}{ - name=rtses, description={Simple Exponential Smoothing Method, - (re)trained on vertical time series} -} -\newglossaryentry{hsma}{ - name=hsma, description={Simple Moving Average Method, - trained on horizontal time series} -} -\newglossaryentry{htheta}{ - name=htheta, description={Theta Method, - trained on horizontal time series} -} -\newglossaryentry{vtheta}{ - name=vtheta, description={Theta Method, - trained on vertical time series} -} -\newglossaryentry{rttheta}{ - name=rttheta, description={Theta Method, - (re)trained on vertical time series} -} -\newglossaryentry{hets}{ - name=hets, description={ETS State Space Method, - trained on horizontal time series} -} -\newglossaryentry{vets}{ - name=vets, description={ETS State Space Method, - trained on vertical time series} -} -\newglossaryentry{rtets}{ - name=rtets, description={ETS State Space Method, - (re)trained on vertical time series} -} -\newglossaryentry{harima}{ - name=harima, description={Autoregressive Integrated Moving Average - Method, - trained on horizontal time series} -} -\newglossaryentry{varima}{ - name=varima, description={Autoregressive Integrated Moving Average - Method, - trained on vertical time series} -} -\newglossaryentry{rtarima}{ - name=rtarima, description={Autoregressive Integrated Moving Average - Method, - (re)trained on vertical time series} -} -\newglossaryentry{vrfr}{ - name=vrfr, description={Random Forest Regression Method, - trained on vertical time series} -} -\newglossaryentry{vsvr}{ - name=vsvr, description={Support Vector Regression Method, - trained on vertical time series} -} - -\printglossary[title=] \ No newline at end of file diff --git a/tex/preamble.tex b/tex/preamble.tex index 207afd3..27b5f3b 100644 --- a/tex/preamble.tex +++ b/tex/preamble.tex @@ -1,9 +1,6 @@ % Use the document width more effectively. \usepackage[margin=2.5cm]{geometry} -\usepackage[acronym]{glossaries} -\makeglossaries - % Enable captions for figures and tables. \usepackage{caption}