Merge branch 'main' into develop

This commit is contained in:
Alexander Hess 2021-03-01 14:20:08 +01:00
commit 6c03261b07
Signed by: alexander
GPG key ID: 344EA5AB10D868E0
22 changed files with 618055 additions and 12 deletions


@@ -16,7 +16,7 @@ that iteratively build on each other.
 ### Data Cleaning
 The UDP provided its raw data as a PostgreSQL dump.
-This [notebook](https://nbviewer.jupyter.org/github/webartifex/urban-meal-delivery/blob/develop/research/clean_data.ipynb)
+This [notebook](https://nbviewer.jupyter.org/github/webartifex/urban-meal-delivery/blob/develop/research/01_clean_data.ipynb)
 cleans the data extensively
 and maps them onto the [ORM models](https://github.com/webartifex/urban-meal-delivery/tree/develop/src/urban_meal_delivery/db)
 defined in the `urban-meal-delivery` package
@@ -28,7 +28,27 @@ neither the raw nor the cleaned data are published as of now.
 However, previews of the data can be seen throughout the [research/](https://github.com/webartifex/urban-meal-delivery/tree/develop/research) folder.
-### Real-time Demand Forecasting
+### Tactical Demand Forecasting
+Before any optimizations of the UDP's operations are done,
+a **demand forecasting** system for *tactical* purposes is implemented.
+To achieve that, the cities first undergo a **gridification** step
+where each *pickup* location is assigned to a pixel on a "checkerboard"-like grid.
+The main part of the source code that implements that is in this [file](https://github.com/webartifex/urban-meal-delivery/blob/develop/src/urban_meal_delivery/db/grids.py#L60).
+Visualizations of the various grids can be found in the [visualizations/](https://github.com/webartifex/urban-meal-delivery/tree/develop/research/visualizations) folder
+and in this [notebook](https://nbviewer.jupyter.org/github/webartifex/urban-meal-delivery/blob/develop/research/03_grid_visualizations.ipynb).
+Then, demand is aggregated on a per-pixel level,
+and different kinds of order time series are generated.
+The latter are the input to different kinds of forecasting `*Model`s.
+They all have in common that they predict demand into the *short-term* future (e.g., one hour)
+and are thus used for tactical purposes, in particular predictive routing (cf. the next section).
+The details of how this works can be found in the first academic paper
+published in the context of this research project,
+titled "*Real-time Demand Forecasting for an Urban Delivery Platform*"
+(cf. the [repository](https://github.com/webartifex/urban-meal-delivery-demand-forecasting) with the LaTeX files).
+All demand-forecasting-related code is in the [forecasts/](https://github.com/webartifex/urban-meal-delivery/tree/develop/src/urban_meal_delivery/forecasts) sub-package.
 ### Predictive Routing
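The gridification step described above can be sketched in a few lines. The helper below is a hypothetical illustration, not the actual `grids.py` code; it assumes pickup locations are already projected into x/y coordinates in meters relative to a city's south-west corner:

```python
# Hypothetical sketch of gridification: assign (x, y) locations, given in
# meters, to square pixels of a "checkerboard"-like grid.
from collections import defaultdict


def gridify(locations, side_length):
    """Map each location to the (column, row) pixel that contains it."""
    pixels = defaultdict(list)
    for x, y in locations:
        # Integer division by the side length yields the pixel indices.
        pixel = (int(x // side_length), int(y // side_length))
        pixels[pixel].append((x, y))
    return dict(pixels)


# Two locations fall into the same 1000m pixel, one into a neighboring pixel.
grid = gridify([(120.0, 950.0), (880.0, 40.0), (1500.0, 200.0)], side_length=1000)
```

Only pixels that actually contain a location are materialized, which matches the notebook output further down where far fewer pixels than addresses appear per city.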


@@ -114,7 +114,7 @@
 "    shutil.rmtree(r_libs_path)\n",
 "except FileNotFoundError:\n",
 "    pass\n",
-"os.mkdir(r_libs_path)"
+"os.makedirs(r_libs_path)"
 ]
 },
 {
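The cell changed above implements a "wipe and recreate" pattern for the R libraries folder; switching from `os.mkdir` to `os.makedirs` also creates any missing parent directories. A self-contained sketch of the same idea, with a stand-in temporary path instead of the project's real `r_libs_path`:

```python
# Recreate a directory from scratch: remove it if present, then create it
# anew, including any missing parents (hence makedirs, not mkdir).
import os
import shutil
import tempfile

# Stand-in for the project's R_LIBS_PATH; note the nested, not-yet-existing parent.
r_libs_path = os.path.join(tempfile.mkdtemp(), "r-libs", "4.0")

try:
    shutil.rmtree(r_libs_path)
except FileNotFoundError:
    pass  # nothing to remove on a fresh run
os.makedirs(r_libs_path)  # also creates the intermediate "r-libs" directory
```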


@@ -192,7 +192,7 @@
 "source": [
 "%cd -q ..\n",
 "!alembic upgrade f11cd76d2f45\n",
-"%cd -q notebooks"
+"%cd -q research"
 ]
 },
 {
@@ -7647,7 +7647,7 @@
 "name": "python",
 "nbconvert_exporter": "python",
 "pygments_lexer": "ipython3",
-"version": "3.8.5"
+"version": "3.8.6"
 }
 },
 "nbformat": 4,


@@ -0,0 +1,168 @@
{
"cells": [
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# Gridification"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"This notebook runs the gridification script and creates all the pixels in the database."
]
},
{
"cell_type": "code",
"execution_count": 1,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"\u001b[32murban-meal-delivery\u001b[0m, version \u001b[34m0.3.0\u001b[0m\n"
]
}
],
"source": [
"!umd --version"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Upgrade Database Schema"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"This database migration also de-duplicates redundant addresses and removes obvious outliers."
]
},
{
"cell_type": "code",
"execution_count": 2,
"metadata": {},
"outputs": [],
"source": [
"%cd -q .."
]
},
{
"cell_type": "code",
"execution_count": 3,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"INFO [alembic.runtime.migration] Context impl PostgresqlImpl.\n",
"INFO [alembic.runtime.migration] Will assume transactional DDL.\n",
"INFO [alembic.runtime.migration] Running upgrade f11cd76d2f45 -> 888e352d7526, Add pixel grid.\n",
"INFO [alembic.runtime.migration] Running upgrade 888e352d7526 -> e40623e10405, Add demand forecasting.\n",
"INFO [alembic.runtime.migration] Running upgrade e40623e10405 -> 26711cd3f9b9, Add confidence intervals to forecasts.\n",
"INFO [alembic.runtime.migration] Running upgrade 26711cd3f9b9 -> e86290e7305e, Remove orders from restaurants with invalid location ...\n"
]
}
],
"source": [
"!alembic upgrade e86290e7305e"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Create the Grids"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Put all restaurant locations in pixels."
]
},
{
"cell_type": "code",
"execution_count": 4,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"3 cities retrieved from the database\n",
"\n",
"Creating grids for Lyon\n",
"Creating grid with a side length of 707 meters\n",
" -> created 62 pixels\n",
"Creating grid with a side length of 1000 meters\n",
" -> created 38 pixels\n",
"Creating grid with a side length of 1414 meters\n",
" -> created 24 pixels\n",
"=> assigned 358 out of 48058 addresses in Lyon\n",
"\n",
"Creating grids for Paris\n",
"Creating grid with a side length of 707 meters\n",
" -> created 199 pixels\n",
"Creating grid with a side length of 1000 meters\n",
" -> created 111 pixels\n",
"Creating grid with a side length of 1414 meters\n",
" -> created 66 pixels\n",
"=> assigned 1133 out of 108135 addresses in Paris\n",
"\n",
"Creating grids for Bordeaux\n",
"Creating grid with a side length of 707 meters\n",
" -> created 30 pixels\n",
"Creating grid with a side length of 1000 meters\n",
" -> created 22 pixels\n",
"Creating grid with a side length of 1414 meters\n",
" -> created 15 pixels\n",
"=> assigned 123 out of 21742 addresses in Bordeaux\n"
]
}
],
"source": [
"!umd gridify"
]
},
{
"cell_type": "code",
"execution_count": 5,
"metadata": {},
"outputs": [],
"source": [
"%cd -q research"
]
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.8.6"
}
},
"nbformat": 4,
"nbformat_minor": 4
}
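The three side lengths in the notebook output above (707, 1000, and 1414 meters) form a geometric sequence with ratio √2, so each grid's pixel area doubles from one level to the next. Whether that was the design rationale is an assumption read off the numbers, but the relationship itself is easy to verify:

```python
# The side lengths 707, 1000, 1414 differ by factors of sqrt(2), so the
# pixel areas (side_length ** 2) differ by factors of 2.
import math

base = 1000  # meters
side_lengths = [round(base / math.sqrt(2)), base, round(base * math.sqrt(2))]
areas = [s ** 2 for s in side_lengths]
```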

File diff suppressed because one or more lines are too long (4 files)
File diff suppressed because it is too large (9 files)


@@ -49,9 +49,9 @@ class Config:
     TIME_STEPS = [60]
     # Training horizons (in full weeks) used to train the forecasting models.
-    # For now, we only use 8 weeks as that was the best performing in
+    # For now, we only use 7 and 8 weeks as that was the best performing in
     # a previous study (note:4f79e8fa).
-    TRAIN_HORIZONS = [8]
+    TRAIN_HORIZONS = [7, 8]
     # The demand forecasting methods used in the simulations.
     FORECASTING_METHODS = ['hets', 'rtarima']
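The `Config` values above span a small grid of simulation settings. The class below copies the values from the diff; enumerating them with `itertools.product` is an illustrative sketch, not code from the repository:

```python
# Enumerate all combinations of the simulation settings shown in the diff.
from itertools import product


class Config:
    TIME_STEPS = [60]  # minutes per forecasting time step
    TRAIN_HORIZONS = [7, 8]  # training horizons in full weeks
    FORECASTING_METHODS = ['hets', 'rtarima']


# Adding 7 weeks doubles the number of simulated settings from 2 to 4.
combinations = list(
    product(Config.TIME_STEPS, Config.TRAIN_HORIZONS, Config.FORECASTING_METHODS),
)
```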


@@ -104,8 +104,13 @@ def tactical_heuristic(  # noqa:C901,WPS213,WPS216,WPS231
     # commands are added the make `Forecast`s without the heuristic!
     # Continue with forecasting on the day the last prediction was made ...
     last_predict_at = (  # noqa:ECE001
-        db.session.query(func.max(db.Forecast.start_at))
+        db.session.query(func.max(db.Forecast.start_at))  # noqa:WPS221
+        .join(db.Pixel, db.Forecast.pixel_id == db.Pixel.id)
+        .join(db.Grid, db.Pixel.grid_id == db.Grid.id)
         .filter(db.Forecast.pixel == pixel)
+        .filter(db.Grid.side_length == side_length)
+        .filter(db.Forecast.time_step == time_step)
+        .filter(db.Forecast.train_horizon == train_horizon)
         .first()
     )[0]
     # ... or start `train_horizon` weeks after the first `Order`
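The extended query above narrows the `max(start_at)` lookup via joins to the pixel's grid and filters on the forecast parameters. To see what such a chained SQLAlchemy query compiles down to, here is the equivalent raw SQL run against a throwaway SQLite schema; all table names, columns, and values are made up for illustration:

```python
# Illustrative only: a toy schema mimicking the forecasts/pixels/grids tables,
# used to show the shape of the extended max(start_at) query.
import sqlite3

con = sqlite3.connect(":memory:")
con.executescript("""
    CREATE TABLE grids (id INTEGER PRIMARY KEY, side_length INTEGER);
    CREATE TABLE pixels (id INTEGER PRIMARY KEY, grid_id INTEGER);
    CREATE TABLE forecasts (
        pixel_id INTEGER, start_at TEXT, time_step INTEGER, train_horizon INTEGER
    );
    INSERT INTO grids VALUES (1, 1000), (2, 707);
    INSERT INTO pixels VALUES (10, 1), (20, 2);
    INSERT INTO forecasts VALUES
        (10, '2016-07-01 12:00', 60, 8),
        (10, '2016-07-02 12:00', 60, 8),
        (20, '2016-07-09 12:00', 60, 8);
""")

row = con.execute(
    """
    SELECT max(f.start_at)
    FROM forecasts AS f
    JOIN pixels AS p ON f.pixel_id = p.id
    JOIN grids AS g ON p.grid_id = g.id
    WHERE f.pixel_id = 10
      AND g.side_length = 1000
      AND f.time_step = 60
      AND f.train_horizon = 8
    """,
).fetchone()
last_predict_at = row[0]  # latest forecast for this pixel/grid/parameter combo
```

Without the extra joins and filters, forecasts belonging to other grids or parameter combinations would leak into the `max()` aggregation, which is presumably what the commit fixes.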


@@ -542,9 +542,9 @@ class OrderHistory:
             pixel_id=pixel_id, predict_day=predict_day, train_horizon=train_horizon,
         )
-        # For now, we only make forecasts with 8 weeks
+        # For now, we only make forecasts with 7 and 8 weeks
         # as the training horizon (note:4f79e8fa).
-        if train_horizon == 8:
+        if train_horizon in {7, 8}:
            if add >= 25:  # = "high demand"
                return models.HorizontalETSModel(order_history=self)
            elif add >= 10:  # = "medium demand"
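The thresholds above dispatch to different `*Model` classes by aggregate demand (`add`). A simplified stand-in of that selection logic is sketched below; the strings replace the real model classes, and the medium-demand return value and the fallback are assumptions, as the diff does not show them:

```python
# Simplified stand-in for the model selection in the diff: dispatch on the
# aggregate demand `add` for the allowed training horizons. The real code
# returns model instances; strings are used here for illustration.
def select_model(train_horizon, add):
    if train_horizon in {7, 8}:
        if add >= 25:  # = "high demand"
            return 'HorizontalETSModel'
        elif add >= 10:  # = "medium demand"
            return 'MediumDemandModel'  # assumed branch, not shown in the diff
        return 'LowDemandModel'  # assumed fallback, not shown in the diff
    raise ValueError('no model for this training horizon')  # assumed behavior
```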


@@ -3,7 +3,7 @@
 The purpose of this module is to import all the R packages that are installed
 into a sub-folder (see `config.R_LIBS_PATH`) in the project's root directory.
-The Jupyter notebook "research/r_dependencies.ipynb" can be used to install all
+The Jupyter notebook "research/00_r_dependencies.ipynb" can be used to install all
 R dependencies on a Ubuntu/Debian based system.
 """
@@ -24,5 +24,5 @@ try:  # noqa:WPS229
     rpackages.importr('zoo')
 except rpackages.PackageNotInstalledError:  # pragma: no cover
-    msg = 'See the "research/r_dependencies.ipynb" notebook!'
+    msg = 'See the "research/00_r_dependencies.ipynb" notebook!'
     raise rpackages.PackageNotInstalledError(msg) from None
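The `try`/`except` around `rpackages.importr(...)` follows a common "fail with a pointer to the setup docs" pattern. A dependency-free sketch of the same idea using only the standard library (the helper name and hint text are illustrative, not from the repository):

```python
# Generic version of the pattern in the diff: try to load an optional
# dependency and, on failure, re-raise with a pointer to setup instructions.
import importlib


def import_or_explain(module_name, setup_hint):
    try:
        return importlib.import_module(module_name)
    except ModuleNotFoundError:
        # `from None` suppresses the original traceback, as in the diff.
        raise ModuleNotFoundError(f'{module_name!r} is missing; {setup_hint}') from None


# A module that always exists, to show the happy path.
math_module = import_or_explain('math', 'see the "research/00_r_dependencies.ipynb" notebook!')
```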