This repository is a case study of applying various machine learning models
to the problem of predicting house prices.
The dataset is publicly available
and can be downloaded, for example, from [Kaggle](https://www.kaggle.com/c/house-prices-advanced-regression-techniques).
The case study is based on this [research paper](static/paper.pdf).
A video presentation of the case study is available on [YouTube <img height="12" style="display: inline-block" src="static/link/to_yt.png">](https://www.youtube.com/watch?v=VSeGseoJsNA).
### Table of Contents
The analyses are presented in four notebooks that can be worked
with interactively by following these links:
- *Notebook 0*: [Data Cleaning](https://mybinder.org/v2/gh/webartifex/ames-housing/main?urlpath=lab/tree/00_data_cleaning.ipynb)
- *Notebook 1*: [Correlations](https://mybinder.org/v2/gh/webartifex/ames-housing/main?urlpath=lab/tree/01_pairwise_correlations.ipynb)
- *Notebook 2*: [Visualizations](https://mybinder.org/v2/gh/webartifex/ames-housing/main?urlpath=lab/tree/02_descriptive_visualizations.ipynb)
- *Notebook 3*: [Predictions](https://mybinder.org/v2/gh/webartifex/ames-housing/main?urlpath=lab/tree/03_predictive_models.ipynb)
The **main goal** is to **show** students
how **Python** can be used to solve a typical **data science** task.
To keep the material suitable for *beginners*, there are *no* formal prerequisites.
It is only expected that the student has:
- a *solid* understanding of the **English** language and
- knowledge of **basic mathematics** from high school.
Some background knowledge in Python is still helpful.
To learn about Python and programming in detail,
this [introductory course <img height="12" style="display: inline-block" src="static/link/to_gh.png">](https://github.com/webartifex/intro-to-python) is recommended.
To follow this workshop, an installation of **Python 3.9** or higher is expected.
A popular and beginner-friendly way is
to install the [Anaconda Distribution](https://www.anaconda.com/download),
which not only ships Python itself
but also comes pre-packaged with many third-party libraries.
Detailed instructions can be found [here <img height="12" style="display: inline-block" src="static/link/to_gh.png">](https://github.com/webartifex/intro-to-python#installation).
As this project depends on a couple of third-party packages
that are *not* part of the Anaconda Distribution,
it is most likely necessary
to run the command `pip install -r requirements.txt`
before working with the notebook files.
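Before installing the requirements, it may help to confirm that the active interpreter meets the version requirement stated above. The following is only a minimal sketch: the helper name `meets_minimum_version` is hypothetical and not part of this repository.

```python
import sys

# Minimum version taken from the installation notes above.
MIN_VERSION = (3, 9)


def meets_minimum_version(version_info=sys.version_info, minimum=MIN_VERSION):
    """Return True if the (major, minor) version is at least `minimum`.

    Hypothetical helper for illustration; not part of the repository.
    """
    return tuple(version_info[:2]) >= tuple(minimum)


if meets_minimum_version():
    print("Python version is sufficient.")
else:
    print("Please upgrade to Python 3.9 or higher.")
```

If the check passes, `pip install -r requirements.txt` can be run with that same interpreter.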
## Contributing
Feedback **is highly encouraged** and will be incorporated.
Open an issue in the [issues tracker <img height="12" style="display: inline-block" src="static/link/to_gh.png">](https://github.com/webartifex/ames-housing/issues)
or initiate a [pull request <img height="12" style="display: inline-block" src="static/link/to_gh.png">](https://help.github.com/en/articles/about-pull-requests)
if you are familiar with the concept.
Simple issues that *anyone* can **help fix** are, for example,
**spelling mistakes** or **broken links**.
If you feel that some topic is missing entirely, you may also mention that.
The materials here are considered a **permanent work-in-progress**.
## About the Author
Alexander Hess is a PhD student
at the Chair of Logistics Management at [WHU - Otto Beisheim School of Management](https://www.whu.edu)
where he conducts research on urban delivery platforms
and teaches coding courses based on Python in the BSc and MBA programs.
Connect with him on [LinkedIn](https://www.linkedin.com/in/webartifex).