urban-meal-delivery/notebooks
Alexander Hess 6333f1af1e
Clean the raw data
- clean the raw data given by the undisclosed meal delivery platform:
  + keep data only for the three target cities:
    * Bordeaux
    * Lyon
    * Paris
  + merge duplicates
    * it appears that redundant addresses were created
      for each order by the same customer
      =>  merging them significantly reduces the number of addresses
    * propagate the merges to the other tables
      that reference records merged away
  + cast data types and keep their scopes narrow
  + normalize the data
  + remove obvious outliers
  + adjust/discard implausible values
- map the cleaned data onto the ORM models
- store the cleaned data in a new database schema
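
The "merge duplicates" and "propagate the merges" steps above can be
illustrated with a minimal sketch. It is not the notebook's actual code:
the field names (`street`, `zip`, `city`, `delivery_address_id`), the sample
records, and the matching rule (case-folded, whitespace-collapsed fields)
are all assumptions made for illustration.

```python
# Hypothetical sample records; the real schema is not shown in the commit.
addresses = [
    {"id": 1, "street": "12 Rue de la Paix", "zip": "75002", "city": "Paris"},
    {"id": 2, "street": "12 rue de la paix ", "zip": "75002", "city": "Paris"},
    {"id": 3, "street": "3 Place de la Bourse", "zip": "33000", "city": "Bordeaux"},
]
orders = [
    {"id": 100, "delivery_address_id": 1},
    {"id": 101, "delivery_address_id": 2},
    {"id": 102, "delivery_address_id": 3},
]


def normalize(address):
    """Build a comparison key: whitespace-collapsed, case-folded fields."""
    fields = ("street", "zip", "city")  # assumed matching rule
    return tuple(" ".join(address[f].split()).casefold() for f in fields)


def merge_duplicates(addresses):
    """Keep the lowest-id record per key and map every id to its survivor."""
    survivor_by_key = {}
    id_map = {}
    for address in sorted(addresses, key=lambda a: a["id"]):
        survivor = survivor_by_key.setdefault(normalize(address), address)
        id_map[address["id"]] = survivor["id"]
    return list(survivor_by_key.values()), id_map


def propagate(orders, id_map, column="delivery_address_id"):
    """Rewrite foreign keys so no order references a merged-away address."""
    return [{**order, column: id_map[order[column]]} for order in orders]


clean_addresses, id_map = merge_duplicates(addresses)
clean_orders = propagate(orders, id_map)
```

Returning the old-to-surviving id mapping from the merge step is what lets
every referencing table be updated with the same lookup, so no foreign key
is left pointing at a deleted record.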
2020-09-30 13:39:48 +02:00
00_clean_data.ipynb Clean the raw data 2020-09-30 13:39:48 +02:00