- clean the raw data given by the undisclosed meal delivery platform:
+ keep data only for the three target cities:
* Bordeaux
* Lyon
* Paris
+ merge duplicates
* it appears that a redundant address was created
for each order by the same customer
=> significant reduction in the number of addresses
* propagate the merges to the other tables
that reference the records merged away
+ cast data types and keep their scopes narrow
+ normalize the data
+ remove obvious outliers
+ adjust/discard implausible values
- map the cleaned data onto the ORM models
- store the cleaned data in a new database schema
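
The city filter, the duplicate merge, and the propagation step could look roughly like the sketch below. It is only an illustration under stated assumptions: the field names (`id`, `customer_id`, `street`, `zip_code`, `city`, `address_id`), the sample records, and the rule that two addresses are duplicates when they share customer, normalized street, zip code, and city are all hypothetical and not taken from the platform's actual schema.

```python
TARGET_CITIES = {"Bordeaux", "Lyon", "Paris"}


def normalize(text):
    """Lowercase and collapse whitespace so trivially different spellings match."""
    return " ".join(text.lower().split())


def merge_duplicate_addresses(addresses):
    """Merge duplicates; return the surviving addresses and an old-id -> surviving-id map."""
    survivors = {}  # duplicate key -> first address seen with that key
    id_map = {}
    for address in sorted(addresses, key=lambda a: a["id"]):
        # Assumed duplicate rule: same customer, street, zip code, and city.
        key = (
            address["customer_id"],
            normalize(address["street"]),
            address["zip_code"],
            address["city"],
        )
        survivor = survivors.setdefault(key, address)
        id_map[address["id"]] = survivor["id"]
    return list(survivors.values()), id_map


def propagate_merges(orders, id_map):
    """Re-point orders at surviving addresses; drop orders whose address was filtered out."""
    return [
        {**order, "address_id": id_map[order["address_id"]]}
        for order in orders
        if order["address_id"] in id_map
    ]


# Made-up sample records, for illustration only.
addresses = [
    {"id": 1, "customer_id": 10, "street": "1 Rue de la Paix", "zip_code": "75002", "city": "Paris"},
    {"id": 2, "customer_id": 10, "street": "1  rue de la paix", "zip_code": "75002", "city": "Paris"},
    {"id": 3, "customer_id": 11, "street": "5 Cours Victor Hugo", "zip_code": "33000", "city": "Bordeaux"},
    {"id": 4, "customer_id": 12, "street": "2 Place du Capitole", "zip_code": "31000", "city": "Toulouse"},
]
orders = [
    {"id": 100, "address_id": 1},
    {"id": 101, "address_id": 2},
    {"id": 102, "address_id": 3},
    {"id": 103, "address_id": 4},
]

in_target_cities = [a for a in addresses if a["city"] in TARGET_CITIES]
unique_addresses, id_map = merge_duplicate_addresses(in_target_cities)
cleaned_orders = propagate_merges(orders, id_map)
```

Keeping the explicit `id_map` is a design choice: every table that references addresses can be rewritten with the same mapping, so no foreign key is left pointing at a record that was merged away.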