A case study on predicting house prices in Ames, Iowa

Ames Housing

This repository is a case study of applying various machine learning models to the problem of predicting house prices.

The dataset is publicly available and can be downloaded, for example, from Kaggle.

The case study is based on this research paper.

A video presentation of the case study is available on YouTube.

Table of Contents

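The analyses are presented in four notebooks that may be worked with interactively:

  • 00_data_cleaning.ipynb
  • 01_pairwise_correlations.ipynb
  • 02_descriptive_visualizations.ipynb
  • 03_predictive_models.ipynb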

Objective

The main goal is to show students how Python can be used to solve a typical data science task.

Prerequisites

The case study is aimed at beginners, so there are no formal prerequisites. The student is only expected to have:

  • a solid understanding of the English language and
  • knowledge of basic mathematics from high school.

Some background knowledge in Python is still helpful. To learn about Python and programming in detail, this introductory course is recommended.

Getting started & Installation

To follow this workshop, an installation of Python 3.9 or higher is expected.
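
The version requirement can be checked from within Python itself. The following is a minimal sketch; the helper name `python_is_recent_enough` is hypothetical and not part of this repository.

```python
import sys

# Minimum version stated in this README.
MIN_VERSION = (3, 9)


def python_is_recent_enough(version_info=sys.version_info, minimum=MIN_VERSION):
    """Return True if the (major, minor) version meets the minimum."""
    return tuple(version_info[:2]) >= tuple(minimum)


# Report on the interpreter that runs this snippet.
if python_is_recent_enough():
    print("Python version is sufficient:", sys.version.split()[0])
else:
    print("Please install Python 3.9 or higher.")
```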

A popular and beginner-friendly way to get Python is to install the Anaconda Distribution, which not only ships Python itself but also comes pre-packaged with many third-party libraries, including Python's scientific stack.

Detailed instructions can be found here.

As this project assumes a couple of third-party packages that are not part of the Anaconda Distribution, it is most likely necessary to run the command pip install -r requirements.txt before working with the notebook files.
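
To see whether the command is needed at all, one can check which of the pinned packages are still missing from the current environment. The sketch below assumes that requirements.txt uses the common `name==version` format; the helper names `requirement_names` and `missing_packages` are hypothetical and not part of this repository, and the parsing is deliberately simplistic.

```python
import re
from importlib import metadata
from pathlib import Path


def requirement_names(lines):
    """Extract package names from requirements-style lines.

    Assumes simple entries such as "numpy==1.26.4"; blank lines,
    comments, and option lines (e.g., "--hash=...") are skipped.
    """
    names = []
    for raw in lines:
        line = raw.split("#", 1)[0].strip()
        if not line or line.startswith("-"):
            continue
        # Cut the name off at the first version specifier, marker, or extra.
        name = re.split(r"[=<>!~;\[\s\\]", line, maxsplit=1)[0]
        if name:
            names.append(name)
    return names


def missing_packages(lines):
    """Return the names of listed packages that are not installed."""
    missing = []
    for name in requirement_names(lines):
        try:
            metadata.version(name)
        except metadata.PackageNotFoundError:
            missing.append(name)
    return missing


# Only inspect the file if it exists in the current working directory.
path = Path("requirements.txt")
if path.exists():
    print("Missing packages:", missing_packages(path.read_text().splitlines()))
```

The check only looks at whether a package is installed, not whether the installed version matches the pinned one, so running the pip command remains the reliable way to reproduce the environment.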

Contributing

Feedback is highly encouraged and will be incorporated. Open an issue in the issue tracker or initiate a pull request if you are familiar with the concept. Simple issues that anyone can help fix are, for example, spelling mistakes or broken links. If you feel that some topic is missing entirely, you may also mention that. The materials here are considered a permanent work in progress.

About the Author

Alexander Hess is a PhD student at the Chair of Logistics Management at WHU - Otto Beisheim School of Management where he conducts research on urban delivery platforms and teaches coding courses based on Python in the BSc and MBA programs.

Connect with him on LinkedIn.