Merge branch 'refurbish-project' into develop

Alexander Hess 2021-05-25 08:38:05 +02:00
commit f3ab015744
Signed by: alexander
GPG key ID: 344EA5AB10D868E0
18 changed files with 13651 additions and 13239 deletions

4353
00_data_cleaning.ipynb Normal file

File diff suppressed because one or more lines are too long

File diff suppressed because it is too large

@ -1,6 +1,6 @@
MIT License
Copyright (c) 2018-2020 Alexander Hess [alexander@webartifex.biz]
Copyright (c) 2018-2021 Alexander Hess [alexander@webartifex.biz]
Permission is hereby granted, free of charge, to any person obtaining a copy
of this software and associated documentation files (the "Software"), to deal

@ -1,52 +1,80 @@
# Ames Housing
This repository is a case study of applying various machine learning models to
the problem of predicting house prices.
This repository is a case study of applying various machine learning models
to the problem of predicting house prices.
The dataset is publicly available and can be downloaded, for example, at
[Kaggle](https://www.kaggle.com/c/house-prices-advanced-regression-techniques).
The dataset is publicly available
and can be downloaded, for example, at [Kaggle](https://www.kaggle.com/c/house-prices-advanced-regression-techniques).
The case study is based on this [research paper](paper.pdf).
The case study is based on this [research paper](static/paper.pdf).
A video presentation of the case study is available on [YouTube <img height="12" style="display: inline-block" src="static/link/to_yt.png">](https://www.youtube.com/watch?v=VSeGseoJsNA).
### Table of Contents
The analyses are presented in four notebooks that may be interactively worked
with by following these links:
- [Data Cleaning](https://mybinder.org/v2/gh/webartifex/ames-housing/master?urlpath=lab/tree/01_data_cleaning.ipynb)
- [Correlations](https://mybinder.org/v2/gh/webartifex/ames-housing/master?urlpath=lab/tree/02_pairwise_correlations.ipynb)
- [Visualizations](https://mybinder.org/v2/gh/webartifex/ames-housing/master?urlpath=lab/tree/03_descriptive_visualizations.ipynb)
- [Predictions](https://mybinder.org/v2/gh/webartifex/ames-housing/master?urlpath=lab/tree/04_predictive_models.ipynb)
A video presentation of the case study is available on
[YouTube <img height="12" style="display: inline-block" src="link_to_yt.png">](https://www.youtube.com/watch?v=VSeGseoJsNA).
- *Notebook 0*: [Data Cleaning](https://mybinder.org/v2/gh/webartifex/ames-housing/main?urlpath=lab/tree/00_data_cleaning.ipynb)
- *Notebook 1*: [Correlations](https://mybinder.org/v2/gh/webartifex/ames-housing/main?urlpath=lab/tree/01_pairwise_correlations.ipynb)
- *Notebook 2*: [Visualizations](https://mybinder.org/v2/gh/webartifex/ames-housing/main?urlpath=lab/tree/02_descriptive_visualizations.ipynb)
- *Notebook 3*: [Predictions](https://mybinder.org/v2/gh/webartifex/ames-housing/main?urlpath=lab/tree/03_predictive_models.ipynb)
## Installation
### Objective
The project can be cloned and may be worked with under the MIT open source
license.
Python 3.7 was used to prepare and test the provided code.
Albeit the [poetry](https://python-poetry.org/) tool was used to manage the
dependencies, a [requirements.txt](requirements.txt) file is also provided as
an alternative.
The **main goal** is to **show** students
how **Python** can be used to solve a typical **data science** task.
On a Unix system, run:
- `git clone https://github.com/webartifex/ames-housing.git` (or use HTTPS
instead)
- either `poetry install` or `pip install -r requirements.txt` (in the latter
case, it is suggested that a virtual environment be used)
- after installation, `jupyter lab` opens a new tab in one's web browser where
the notebooks and data files may be opened
Alternatively, the project should also be runnable with the
[Anaconda Distribution](https://www.anaconda.com/products/individual).
### Prerequisites
To be suitable for *beginners*, there are *no* formal prerequisites.
It is only expected that the student has:
- a *solid* understanding of the **English** language and
- knowledge of **basic mathematics** from high school.
Some background knowledge in Python is still helpful.
To learn about Python and programming in detail,
this [introductory course <img height="12" style="display: inline-block" src="static/link/to_gh.png">](https://github.com/webartifex/intro-to-python) is recommended.
### Getting started & Installation
To follow this workshop, an installation of **Python 3.8** or higher is expected.
A popular and beginner-friendly way is
to install the [Anaconda Distribution](https://www.anaconda.com/products/individual)
that not only ships Python itself
but also comes pre-packaged with a lot of third-party libraries
including [Python's scientific stack](https://scipy.org/about.html).
Detailed instructions can be found [here <img height="12" style="display: inline-block" src="static/link/to_gh.png">](https://github.com/webartifex/intro-to-python#installation).
As this project assumes a couple of third-party packages
that are *not* part of the Anaconda Distribution,
it is most likely necessary
to run the command `pip install -r requirements.txt`
before working with the notebook files.
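The setup described above can be checked before opening the notebooks. The following is a minimal sketch, not part of the repository: the helper names `python_is_recent_enough` and `missing_packages` and the list `REQUIRED` are hypothetical, and the import names are assumed from the project's declared dependencies (scikit-learn is imported as `sklearn`).

```python
import importlib.util
import sys

# Import names assumed from the project's declared dependencies;
# note that scikit-learn is imported as "sklearn".
REQUIRED = [
    "jupyterlab", "matplotlib", "missingno", "numpy", "pandas", "requests",
    "sklearn", "seaborn", "statsmodels", "tabulate", "tqdm", "xlrd",
]


def python_is_recent_enough(version_info=sys.version_info, minimum=(3, 8)):
    """Return True if the given interpreter version meets the minimum."""
    return tuple(version_info[:2]) >= minimum


def missing_packages(names):
    """Return the top-level module names that cannot be found."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    if not python_is_recent_enough():
        print("Python 3.8 or higher is expected.")
    missing = missing_packages(REQUIRED)
    if missing:
        print("Missing:", ", ".join(missing))
        print("Try: pip install -r requirements.txt")
    else:
        print("All required packages were found.")
```

The check only looks up whether each module can be found; it does not compare installed versions against the pinned ones.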
## Contributing
Feedback **is highly encouraged** and will be incorporated.
Open an issue in the [issues tracker <img height="12" style="display: inline-block" src="static/link/to_gh.png">](https://github.com/webartifex/ames-housing/issues)
or initiate a [pull request <img height="12" style="display: inline-block" src="static/link/to_gh.png">](https://help.github.com/en/articles/about-pull-requests)
if you are familiar with the concept.
Simple issues that *anyone* can **help fix** are, for example,
**spelling mistakes** or **broken links**.
If you feel that some topic is missing entirely, you may also mention that.
The materials here are considered a **permanent work-in-progress**.
## About the Author
Alexander Hess is a PhD student at the Chair of Logistics Management at the
[WHU - Otto Beisheim School of Management](https://www.whu.edu) where he
conducts research on urban delivery platforms and teaches an introductory
course on Python (cf., [Fall Term 2019](https://vlv.whu.edu/campus/all/event.asp?objgguid=0xE57C2715B01B441AAFD3E79AA05CACCF&from=vvz&gguid=0x6A2B0ED5B2B949E69957A2099E7DE2F1&mode=own&tguid=0x3980A9BBC3BF4A638E977F2DC163F44B&lang=en),
[Spring Term 2020](https://vlv.whu.edu/campus/all/event.asp?objgguid=0x3354F4C108FF4E959CDD692A325D9AFE&from=vvz&gguid=0x262E29795DD742CFBDE72B12B69CEFD6&mode=own&lang=en&tguid=0x2E4A7D1FF3C34AD08FF07685461781C9)).
Connect him on [LinkedIn](https://www.linkedin.com/in/webartifex).
Alexander Hess is a PhD student
at the Chair of Logistics Management at [WHU - Otto Beisheim School of Management](https://www.whu.edu)
where he conducts research on urban delivery platforms
and teaches coding courses based on Python in the BSc and MBA programs.
Connect with him on [LinkedIn](https://www.linkedin.com/in/webartifex).

@ -428,7 +428,6 @@ Order,PID,1st Flr SF,1st Flr SF (box-cox-0),2nd Flr SF,3Ssn Porch,Bedroom AbvGr,
430,528108130,2020.0,7.61085279039525,0.0,0.0,3,TA,Av,1,0,Ex,788.0,1232.0,0.0,GLQ,Unf,SBrkr,0.0,NA,Gd,1,2,Typ,896.0,3,TA,RFn,TA,2020.0,7.61085279039525,0,1,Ex,Gtl,12350.0,15.654685543608181,Reg,0.0,450.0,0.0,5,98.0,5,9,Y,0.0,NA,0.0,7,3.0,2020.0,290.0,4040.0,21.31759972987615,AllPub,192.0,0,1,1,0,0,0,0,1,0,1,1,1,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,1,1,0,1,1,404000.0,12.909170156943286
431,528108140,2020.0,7.61085279039525,0.0,0.0,3,TA,No,1,0,Ex,570.0,1436.0,0.0,GLQ,Unf,SBrkr,0.0,NA,Gd,1,2,Typ,900.0,3,TA,Fin,TA,2020.0,7.61085279039525,1,1,Ex,Gtl,12220.0,15.627551834008008,Reg,0.0,305.0,0.0,9,54.0,5,10,Y,0.0,NA,0.0,9,3.5,2006.0,210.0,4026.0,21.29933447351643,AllPub,156.0,0,1,1,0,0,0,1,0,0,1,1,1,0,1,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,1,0,1,1,0,0,0,402861.0,12.906346868281233
432,528110010,1728.0,7.454719949364001,568.0,0.0,3,TA,Gd,1,0,Ex,384.0,1338.0,0.0,GLQ,Unf,SBrkr,0.0,NA,Gd,1,2,Typ,842.0,3,TA,RFn,TA,2296.0,7.738923757439457,1,1,Ex,Gtl,13478.0,15.879897099046154,IR1,0.0,420.0,0.0,6,274.0,5,10,Y,0.0,NA,0.0,10,3.5,1722.0,656.0,4018.0,21.28887435905055,AllPub,382.0,0,1,1,0,0,0,0,1,1,1,1,1,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,1,1,0,1,1,451950.0,13.021326833226556
433,528110020,2674.0,7.891330757661889,0.0,0.0,2,TA,Gd,2,0,Ex,342.0,2288.0,0.0,GLQ,Unf,SBrkr,0.0,NA,Gd,2,2,Typ,762.0,3,TA,Fin,TA,2674.0,7.891330757661889,1,1,Ex,Gtl,13693.0,15.920887120524267,Reg,0.0,472.0,0.0,3,50.0,5,10,Y,0.0,NA,0.0,8,4.5,2630.0,410.0,5304.0,22.790138257062,AllPub,360.0,0,1,1,0,0,0,0,1,0,1,1,1,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,1,0,0,1,1,0,2,2,610000.0,13.321214236149494
434,528110090,1734.0,7.458186157340487,1088.0,0.0,4,TA,Gd,0,0,Ex,1734.0,0.0,0.0,Unf,Unf,SBrkr,0.0,NA,Gd,1,3,Typ,1020.0,3,TA,RFn,TA,2822.0,7.945201132412759,1,1,Ex,Gtl,13891.0,15.958126895168148,Reg,0.0,424.0,0.0,1,170.0,5,9,Y,0.0,NA,192.0,12,3.5,1734.0,414.0,4556.0,21.95794279603217,AllPub,52.0,0,1,1,0,0,0,0,1,1,1,1,1,0,1,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,1,0,1,1,1,1,0,582933.0,13.275827535915461
435,528112040,1736.0,7.459338895220296,0.0,0.0,3,TA,No,0,0,Ex,1736.0,0.0,0.0,Unf,Unf,SBrkr,0.0,NA,Gd,1,2,Typ,834.0,3,TA,RFn,TA,1736.0,7.459338895220296,0,1,Ex,Gtl,11578.0,15.489619617350394,Reg,0.0,302.0,0.0,7,90.0,5,9,Y,0.0,NA,0.0,7,2.0,1736.0,409.0,3472.0,20.53206536701657,AllPub,319.0,0,1,1,0,0,0,0,1,0,1,1,1,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,1,1,0,1,1,360000.0,12.793859310432293
436,528116010,1782.0,7.485491608030754,0.0,0.0,3,TA,Gd,1,0,Ex,251.0,1531.0,0.0,GLQ,Unf,SBrkr,0.0,NA,Gd,2,2,Typ,932.0,3,TA,Fin,TA,1782.0,7.485491608030754,0,1,Gd,Gtl,16870.0,16.46741385670273,IR1,0.0,238.0,0.0,4,82.0,5,8,Y,0.0,NA,0.0,7,3.0,1782.0,181.0,3564.0,20.66596179182348,AllPub,99.0,0,1,1,0,0,0,0,1,0,1,1,1,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,1,1,1,5,4,296000.0,12.598114733306197
@ -700,7 +699,6 @@ Order,PID,1st Flr SF,1st Flr SF (box-cox-0),2nd Flr SF,3Ssn Porch,Bedroom AbvGr,
706,902201110,768.0,6.643789733147672,560.0,0.0,3,TA,No,0,0,TA,384.0,384.0,0.0,BLQ,Unf,FuseA,0.0,MnPrv,NA,0,1,Typ,308.0,1,TA,Unf,TA,1328.0,7.191429330036379,1,1,TA,Gtl,6000.0,13.86795031058074,Reg,0.0,0.0,0.0,3,12.0,7,5,Y,0.0,NA,0.0,6,1.5,768.0,12.0,2096.0,18.080661941177674,AllPub,0.0,0,1,1,0,0,0,1,0,1,1,0,1,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,1,59,39,129500.0,11.771436160121729
707,902202150,811.0,6.698268054115413,576.0,0.0,3,TA,No,0,0,TA,811.0,0.0,0.0,Unf,Unf,FuseA,0.0,NA,NA,0,2,Typ,256.0,1,TA,Unf,TA,1387.0,7.234898420314831,0,2,Gd,Gtl,6000.0,13.86795031058074,Reg,0.0,0.0,0.0,11,0.0,4,5,Y,0.0,NA,0.0,7,2.0,811.0,0.0,2198.0,18.30105269110471,AllPub,0.0,0,1,0,1,0,0,1,0,1,1,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,55,55,93000.0,11.440354772135393
708,902204080,861.0,6.7580945044277305,0.0,0.0,2,TA,No,0,0,Fa,861.0,0.0,0.0,Unf,Unf,SBrkr,128.0,NA,NA,0,1,Typ,288.0,2,TA,Unf,TA,861.0,6.7580945044277305,0,1,TA,Gtl,7404.0,14.375113185074957,Reg,0.0,0.0,0.0,11,0.0,6,4,N,0.0,NA,0.0,5,1.0,861.0,128.0,1722.0,17.190987754073017,AllPub,0.0,0,1,1,0,0,1,0,0,0,1,0,1,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,1,89,59,80000.0,11.289781913656018
709,902205010,612.0,6.416732282512326,0.0,0.0,1,NA,NA,0,0,NA,0.0,0.0,0.0,NA,NA,FuseA,25.0,NA,NA,0,1,Typ,308.0,1,Fa,Unf,TA,612.0,6.416732282512326,0,1,TA,Gtl,5925.0,13.837946210420888,Reg,0.0,0.0,0.0,10,0.0,4,2,N,0.0,NA,0.0,4,1.0,0.0,25.0,612.0,13.043479427110126,AllPub,0.0,0,0,1,0,0,1,0,0,0,0,0,1,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,1,69,59,45000.0,10.714417768752456
710,902205030,600.0,6.396929655216146,368.0,0.0,2,TA,No,0,0,TA,600.0,0.0,0.0,Unf,Unf,SBrkr,0.0,GdWo,NA,0,1,Typ,0.0,0,NA,NA,NA,968.0,6.875232087276577,0,1,TA,Gtl,5925.0,13.837946210420888,Reg,0.0,0.0,0.0,5,0.0,6,3,Y,0.0,NA,0.0,6,1.0,600.0,0.0,1568.0,16.7790642448278,AllPub,0.0,1,0,1,0,0,1,0,0,1,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,1,99,59,37900.0,10.542706391070517
711,902206260,886.0,6.786716950605081,0.0,0.0,2,TA,No,0,0,Fa,190.0,0.0,0.0,Unf,Unf,FuseA,80.0,NA,NA,0,1,Typ,273.0,1,TA,Unf,TA,886.0,6.786716950605081,0,1,TA,Gtl,5784.0,13.780601121178693,Reg,0.0,0.0,0.0,12,20.0,8,5,Y,0.0,NA,0.0,4,1.0,190.0,244.0,1076.0,15.19912043380998,AllPub,144.0,0,1,1,0,0,1,0,0,0,1,0,1,0,1,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,1,71,13,91300.0,11.421906066583059
712,902207170,792.0,6.674561391814426,0.0,0.0,2,TA,No,0,0,Fa,624.0,0.0,0.0,Unf,Unf,SBrkr,81.0,GdWo,NA,0,1,Typ,287.0,1,TA,Unf,TA,792.0,6.674561391814426,0,1,TA,Gtl,8520.0,14.719743466735071,Reg,0.0,0.0,0.0,2,0.0,8,5,Y,0.0,NA,0.0,5,1.0,624.0,81.0,1416.0,16.33942215073927,AllPub,0.0,0,1,1,0,0,0,0,1,0,1,0,1,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,1,86,59,99500.0,11.507912923146684
@ -1745,6 +1743,7 @@ Order,PID,1st Flr SF,1st Flr SF (box-cox-0),2nd Flr SF,3Ssn Porch,Bedroom AbvGr,
1770,528354110,1383.0,7.232010331664759,1015.0,0.0,3,TA,No,1,0,Gd,660.0,719.0,0.0,GLQ,Unf,SBrkr,0.0,NA,TA,1,2,Typ,834.0,3,TA,Fin,TA,2398.0,7.7823903355874595,1,1,Gd,Gtl,11787.0,15.53526259606626,IR1,0.0,594.0,0.0,8,60.0,5,7,Y,0.0,NA,0.0,8,3.5,1379.0,299.0,3777.0,20.96566210764097,AllPub,239.0,0,1,1,0,0,0,0,1,1,1,1,1,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,1,1,11,10,315750.0,12.662706040212738
1771,528358040,1214.0,7.101675971619444,1306.0,0.0,4,TA,No,0,0,Gd,638.0,565.0,0.0,GLQ,Unf,SBrkr,0.0,NA,TA,1,2,Typ,721.0,3,TA,RFn,TA,2520.0,7.832014180505469,1,1,Gd,Gtl,9950.0,15.106276534404287,IR1,0.0,290.0,0.0,6,114.0,5,7,Y,0.0,NA,0.0,9,2.5,1203.0,338.0,3723.0,20.8909872937205,AllPub,224.0,1,1,1,0,0,0,0,1,1,1,1,1,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,12,12,290000.0,12.577636201962656
1772,528364110,1328.0,7.191429330036379,1203.0,0.0,4,TA,Av,0,0,Gd,1198.0,56.0,64.0,LwQ,ALQ,SBrkr,0.0,NA,TA,1,2,Typ,752.0,3,TA,RFn,TA,2531.0,7.836369760545124,1,1,Gd,Gtl,12257.0,15.635300851369571,IR1,0.0,513.0,0.0,11,98.0,5,8,Y,0.0,NA,0.0,9,2.5,1318.0,320.0,3849.0,21.06391115400039,AllPub,222.0,0,1,1,0,0,0,0,1,1,1,1,1,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,12,12,290000.0,12.577636201962656
1773,528366050,3228.0,8.07961802938984,0.0,0.0,4,TA,No,1,0,Gd,1969.0,1231.0,0.0,GLQ,Unf,SBrkr,291.0,NA,Gd,1,3,Typ,546.0,2,TA,RFn,TA,3228.0,8.07961802938984,0,1,Gd,Gtl,12692.0,15.724859195735945,IR1,0.0,0.0,0.0,5,75.0,5,8,Y,0.0,NA,0.0,10,4.0,3200.0,630.0,6428.0,23.879201299438567,AllPub,264.0,0,1,1,0,0,0,0,1,0,1,1,1,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,15,14,430000.0,12.971540487669746
1774,528366070,1105.0,7.007600613951853,1097.0,0.0,4,TA,No,1,0,Ex,770.0,335.0,0.0,GLQ,Unf,SBrkr,0.0,NA,TA,1,2,Typ,517.0,2,TA,RFn,TA,2202.0,7.697121317282625,1,1,Gd,Gtl,11762.0,15.529841439857076,Reg,0.0,309.0,0.0,9,65.0,5,8,Y,0.0,NA,144.0,9,3.5,1105.0,209.0,3307.0,20.284644002311033,AllPub,0.0,0,1,1,0,0,0,0,1,1,1,1,1,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,15,14,263000.0,12.479909311159902
1775,528376010,1335.0,7.19668657083435,1203.0,0.0,4,TA,No,0,0,Gd,100.0,1225.0,0.0,GLQ,Unf,SBrkr,0.0,NA,TA,1,2,Typ,933.0,3,TA,RFn,TA,2538.0,7.839131648274333,1,1,Gd,Gtl,9044.0,14.867724959619174,IR1,0.0,526.0,0.0,5,92.0,5,8,Y,0.0,NA,0.0,8,2.5,1325.0,290.0,3863.0,21.08284412453433,AllPub,198.0,0,1,1,0,0,0,0,1,1,1,1,1,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,1,1,11,10,330000.0,12.706847933442663
1776,528427040,1369.0,7.221835825288449,0.0,0.0,3,TA,No,0,0,Gd,1369.0,0.0,0.0,Unf,Unf,SBrkr,0.0,NA,NA,0,2,Typ,605.0,2,TA,Unf,TA,1369.0,7.221835825288449,0,1,Gd,Gtl,9910.0,15.096165253640338,Reg,0.0,0.0,0.0,9,203.0,6,7,Y,0.0,NA,0.0,5,2.0,1369.0,203.0,2738.0,19.34762998557629,AllPub,0.0,0,1,1,0,0,0,0,1,0,1,0,1,0,1,1,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,1,0,1,1,0,0,0,213133.0,12.269671662972327
@ -2621,6 +2620,7 @@ Order,PID,1st Flr SF,1st Flr SF (box-cox-0),2nd Flr SF,3Ssn Porch,Bedroom AbvGr,
2664,902329070,684.0,6.52795791762255,684.0,0.0,3,TA,No,0,0,TA,684.0,0.0,0.0,Unf,Unf,FuseA,0.0,NA,NA,0,1,Typ,216.0,1,Fa,Unf,TA,1368.0,7.221105098182496,0,1,TA,Gtl,3600.0,12.679331552660544,Reg,0.0,0.0,0.0,10,158.0,7,6,N,0.0,NA,0.0,7,1.0,684.0,158.0,2052.0,17.982934302945672,AllPub,0.0,0,0,1,0,0,0,1,0,1,1,0,1,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,1,96,13,114504.0,11.648365035864053
2665,902329090,998.0,6.905753276311464,764.0,0.0,4,TA,No,1,0,Fa,596.0,0.0,0.0,Unf,Unf,SBrkr,221.0,NA,NA,0,1,Typ,576.0,2,TA,Unf,TA,1762.0,7.474204806496124,1,1,Gd,Gtl,7200.0,14.307105706203597,Reg,0.0,0.0,0.0,10,0.0,7,7,N,0.0,NA,0.0,8,2.5,596.0,257.0,2358.0,18.630818497230923,AllPub,36.0,0,1,1,0,0,1,0,0,1,1,0,1,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,1,86,56,157000.0,11.964001084330445
2666,902400090,1200.0,7.090076835776092,0.0,0.0,4,TA,No,0,0,TA,1200.0,0.0,0.0,Unf,Unf,FuseA,228.0,NA,NA,0,1,Typ,312.0,1,Fa,Unf,Fa,1200.0,7.090076835776092,0,1,TA,Gtl,11340.0,15.43673148163438,Reg,0.0,0.0,0.0,3,0.0,5,6,Y,0.0,NA,0.0,7,1.0,1200.0,228.0,2400.0,18.71440609779312,AllPub,0.0,0,1,1,0,0,1,0,0,0,1,0,1,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,1,83,56,125000.0,11.736069016284437
2667,902400110,1518.0,7.325148957955575,1518.0,0.0,4,TA,Mn,0,0,TA,1107.0,0.0,0.0,Unf,Unf,SBrkr,0.0,GdPrv,TA,2,2,Typ,840.0,3,TA,Unf,Ex,3608.0,8.190908881182514,1,1,Ex,Gtl,22950.0,17.294696462432224,IR2,572.0,0.0,0.0,6,260.0,9,10,Y,0.0,NA,410.0,12,2.5,1107.0,670.0,4715.0,22.143531063375193,AllPub,0.0,0,1,1,0,0,1,0,0,1,1,1,1,0,1,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,1,114,13,475000.0,13.071070083016778
2668,902401090,624.0,6.436150368369428,624.0,0.0,2,TA,No,0,0,TA,624.0,0.0,0.0,Unf,Unf,FuseA,256.0,NA,NA,0,2,Typ,0.0,0,NA,NA,NA,1248.0,7.129297548929373,0,2,TA,Gtl,5976.0,13.858385901870495,Reg,0.0,0.0,0.0,12,130.0,7,5,N,0.0,NA,0.0,8,2.0,624.0,386.0,1872.0,17.56478361538355,AllPub,0.0,0,0,0,1,0,0,0,1,1,1,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,1,86,56,93500.0,11.44571671527678
2669,902406030,960.0,6.866933284461882,0.0,0.0,3,TA,No,0,0,TA,960.0,0.0,0.0,Unf,Unf,SBrkr,0.0,NA,NA,0,1,Typ,624.0,2,TA,Unf,TA,960.0,6.866933284461882,0,1,TA,Gtl,9750.0,15.055349293727405,Reg,0.0,0.0,4500.0,7,0.0,5,5,Y,0.0,NA,0.0,5,1.0,960.0,0.0,1920.0,17.679331552660543,AllPub,0.0,0,1,1,0,0,0,1,0,0,1,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,48,48,125000.0,11.736069016284437
2670,902456015,1020.0,6.927557906278317,0.0,0.0,2,Fa,No,0,0,TA,1020.0,0.0,0.0,Unf,Unf,FuseP,105.0,NA,NA,0,1,Typ,0.0,0,NA,NA,NA,1020.0,6.927557906278317,0,1,Fa,Gtl,4761.0,13.322216261602419,Reg,0.0,0.0,0.0,10,0.0,3,3,N,0.0,NA,0.0,5,1.0,1020.0,105.0,2040.0,17.95599057784113,AllPub,0.0,0,0,1,0,0,1,0,0,0,1,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,88,56,64500.0,11.074420502783864

2261
poetry.lock generated

File diff suppressed because it is too large

@ -1,14 +1,15 @@
[build-system]
build-backend = "poetry.masonry.api"
requires = ["poetry>=0.12"]
[tool.black]
line-length = 79
requires = ["poetry-core>=1.0.0"]
build-backend = "poetry.core.masonry.api"
[tool.poetry]
authors = ["Alexander Hess <alexander@webartifex.biz>"]
name = "ames-housing"
version = "0.1.0.dev0"
authors = [
"Alexander Hess <alexander@webartifex.biz>",
]
description = "A case study on predicting house prices in Ames, Iowa"
homepage = "https://github.com/webartifex/ames-housing"
keywords = [
"data-science",
"data-cleaning",
@ -18,27 +19,26 @@ keywords = [
"predictive-analytics",
]
license = "MIT"
name = "ames-housing"
readme = "README.md"
homepage = "https://github.com/webartifex/ames-housing"
repository = "https://github.com/webartifex/ames-housing"
version = "0.1.0"
[tool.poetry.dependencies]
jupyterlab = "^2.1.5"
matplotlib = "^3.2.2"
python = "^3.8"
jupyterlab = "^3.0.16"
missingno = "^0.4.2"
numpy = "^1.19.0"
pandas = "^1.0.5"
python = "^3.7"
requests = "^2.24.0"
seaborn = "^0.10.1"
sklearn = "^0.0"
statsmodels = "^0.11.1"
tabulate = "^0.8.7"
tqdm = "^4.47.0"
xlrd = "^1.2.0"
xlwt = "^1.3.0"
matplotlib = "^3.4.2"
numpy = "^1.20.3"
pandas = "^1.2.4"
requests = "^2.25.1"
scikit-learn = "^0.24.2"
seaborn = "^0.11.1"
statsmodels = "^0.12.2"
tabulate = "^0.8.9"
tqdm = "^4.61.0"
xlrd = "^2.0.1"
[tool.poetry.dev-dependencies]
black = "^19.10b0"
pylint = "^2.5.3"
black = "^21.5b1"
pylint = "^2.8.2"

@ -1,82 +1,85 @@
appdirs==1.4.4
astroid==2.4.2
attrs==19.3.0
backcall==0.2.0
black==19.10b0
bleach==3.1.5
certifi==2020.6.20
chardet==3.0.4
click==7.1.2
cycler==0.10.0
decorator==4.4.2
defusedxml==0.6.0
entrypoints==0.3
idna==2.10
importlib-metadata==1.7.0
ipykernel==5.3.0
ipython==7.16.1
ipython-genutils==0.2.0
isort==4.3.21
jedi==0.17.1
Jinja2==2.11.2
joblib==0.15.1
json5==0.9.5
jsonschema==3.2.0
jupyter-client==6.1.3
jupyter-core==4.6.3
jupyterlab==2.1.5
jupyterlab-server==1.1.5
kiwisolver==1.2.0
lazy-object-proxy==1.4.3
MarkupSafe==1.1.1
matplotlib==3.2.2
mccabe==0.6.1
anyio==3.1.0; python_full_version >= "3.6.2" and python_version >= "3.6"
appnope==0.1.2; sys_platform == "darwin" and python_version >= "3.7" and platform_system == "Darwin"
argon2-cffi==20.1.0; python_version >= "3.6"
async-generator==1.10; python_full_version >= "3.6.1" and python_version >= "3.6"
attrs==21.2.0; python_version >= "3.6" and python_full_version < "3.0.0" or python_full_version >= "3.5.0" and python_version >= "3.6"
babel==2.9.1; python_version >= "3.6" and python_full_version < "3.0.0" or python_full_version >= "3.4.0" and python_version >= "3.6"
backcall==0.2.0; python_version >= "3.7"
bleach==3.3.0; python_version >= "3.6" and python_full_version < "3.0.0" or python_full_version >= "3.5.0" and python_version >= "3.6"
certifi==2020.12.5; python_version >= "3.6" and python_full_version < "3.0.0" or python_full_version >= "3.5.0" and python_version >= "3.6"
cffi==1.14.5; implementation_name == "pypy" and python_version >= "3.6"
chardet==4.0.0; python_version >= "3.6" and python_full_version < "3.0.0" or python_full_version >= "3.5.0" and python_version >= "3.6"
colorama==0.4.4; python_version >= "3.7" and python_full_version < "3.0.0" and sys_platform == "win32" or sys_platform == "win32" and python_version >= "3.7" and python_full_version >= "3.5.0"
cycler==0.10.0; python_version >= "3.7"
decorator==5.0.9; python_version >= "3.7"
defusedxml==0.7.1; python_version >= "3.6" and python_full_version < "3.0.0" or python_full_version >= "3.5.0" and python_version >= "3.6"
entrypoints==0.3; python_version >= "3.6"
idna==2.10; python_full_version >= "3.6.2" and python_version >= "3.6" and (python_version >= "3.6" and python_full_version < "3.0.0" or python_full_version >= "3.5.0" and python_version >= "3.6")
ipykernel==5.5.5; python_version >= "3.6"
ipython-genutils==0.2.0; python_version >= "3.7"
ipython==7.23.1; python_version >= "3.7"
jedi==0.18.0; python_version >= "3.7"
jinja2==3.0.1; python_version >= "3.6"
joblib==1.0.1; python_version >= "3.6"
json5==0.9.5; python_version >= "3.6"
jsonschema==3.2.0; python_version >= "3.6"
jupyter-client==6.2.0; python_full_version >= "3.6.1" and python_version >= "3.6"
jupyter-core==4.7.1; python_full_version >= "3.6.1" and python_version >= "3.6"
jupyter-server==1.8.0; python_version >= "3.6"
jupyterlab-pygments==0.1.2; python_version >= "3.6"
jupyterlab-server==2.5.2; python_version >= "3.6"
jupyterlab==3.0.16; python_version >= "3.6"
kiwisolver==1.3.1; python_version >= "3.7"
markupsafe==2.0.1; python_version >= "3.6"
matplotlib-inline==0.1.2; python_version >= "3.7"
matplotlib==3.4.2; python_version >= "3.7"
missingno==0.4.2
mistune==0.8.4
nbconvert==5.6.1
nbformat==5.0.7
notebook==6.0.3
numpy==1.19.0
packaging==20.4
pandas==1.0.5
pandocfilters==1.4.2
parso==0.7.0
pathspec==0.8.0
patsy==0.5.1
pexpect==4.8.0
pickleshare==0.7.5
prometheus-client==0.8.0
prompt-toolkit==3.0.5
ptyprocess==0.6.0
Pygments==2.6.1
pylint==2.5.3
pyparsing==2.4.7
pyrsistent==0.16.0
python-dateutil==2.8.1
pytz==2020.1
pyzmq==19.0.1
regex==2020.6.8
requests==2.24.0
scikit-learn==0.23.1
scipy==1.5.0
seaborn==0.10.1
Send2Trash==1.5.0
six==1.15.0
sklearn==0.0
statsmodels==0.11.1
tabulate==0.8.7
terminado==0.8.3
testpath==0.4.4
threadpoolctl==2.1.0
toml==0.10.1
tornado==6.0.4
tqdm==4.47.0
traitlets==4.3.3
typed-ast==1.4.1
urllib3==1.25.9
wcwidth==0.2.5
webencodings==0.5.1
wrapt==1.12.1
xlrd==1.2.0
xlwt==1.3.0
zipp==3.1.0
mistune==0.8.4; python_version >= "3.6"
nbclassic==0.3.1; python_version >= "3.6"
nbclient==0.5.3; python_full_version >= "3.6.1" and python_version >= "3.6"
nbconvert==6.0.7; python_version >= "3.6"
nbformat==5.1.3; python_full_version >= "3.6.1" and python_version >= "3.6"
nest-asyncio==1.5.1; python_full_version >= "3.6.1" and python_version >= "3.6"
notebook==6.4.0; python_version >= "3.6"
numpy==1.20.3; python_version >= "3.7"
packaging==20.9; python_version >= "3.6" and python_full_version < "3.0.0" or python_full_version >= "3.5.0" and python_version >= "3.6"
pandas==1.2.4; python_full_version >= "3.7.1"
pandocfilters==1.4.3; python_version >= "3.6" and python_full_version < "3.0.0" or python_full_version >= "3.4.0" and python_version >= "3.6"
parso==0.8.2; python_version >= "3.7"
patsy==0.5.1; python_version >= "3.6"
pexpect==4.8.0; sys_platform != "win32" and python_version >= "3.7"
pickleshare==0.7.5; python_version >= "3.7"
pillow==8.2.0; python_version >= "3.7"
prometheus-client==0.10.1; python_version >= "3.6" and python_full_version < "3.0.0" or python_full_version >= "3.4.0" and python_version >= "3.6"
prompt-toolkit==3.0.18; python_full_version >= "3.6.1" and python_version >= "3.7"
ptyprocess==0.7.0; sys_platform != "win32" and python_version >= "3.7" and os_name != "nt"
py==1.10.0; python_version >= "3.6" and python_full_version < "3.0.0" and implementation_name == "pypy" or implementation_name == "pypy" and python_version >= "3.6" and python_full_version >= "3.4.0"
pycparser==2.20; python_version >= "3.6" and python_full_version < "3.0.0" or python_full_version >= "3.4.0" and python_version >= "3.6"
pygments==2.9.0; python_version >= "3.7"
pyparsing==2.4.7; python_version >= "3.7" and python_full_version < "3.0.0" or python_full_version >= "3.4.0" and python_version >= "3.7"
pyrsistent==0.17.3; python_version >= "3.6"
python-dateutil==2.8.1; python_full_version >= "3.7.1" and python_version >= "3.7" and (python_version >= "3.7" and python_full_version < "3.0.0" or python_full_version >= "3.3.0" and python_version >= "3.7")
pytz==2021.1; python_full_version >= "3.7.1" and python_version >= "3.6" and (python_version >= "3.6" and python_full_version < "3.0.0" or python_full_version >= "3.4.0" and python_version >= "3.6")
pywin32==300; sys_platform == "win32" and python_version >= "3.6"
pywinpty==1.1.1; os_name == "nt" and python_version >= "3.6"
pyzmq==22.0.3; python_full_version >= "3.6.1" and python_version >= "3.6"
requests==2.25.1; (python_version >= "2.7" and python_full_version < "3.0.0") or (python_full_version >= "3.5.0")
scikit-learn==0.24.2; python_version >= "3.6"
scipy==1.6.1; python_version >= "3.7"
seaborn==0.11.1; python_version >= "3.6"
send2trash==1.5.0; python_version >= "3.6"
six==1.16.0; python_version >= "3.7" and python_full_version < "3.0.0" or python_full_version >= "3.5.0" and python_version >= "3.7"
sniffio==1.2.0; python_full_version >= "3.6.2" and python_version >= "3.6"
statsmodels==0.12.2; python_version >= "3.6"
tabulate==0.8.9
terminado==0.10.0; python_version >= "3.6"
testpath==0.5.0; python_version >= "3.6"
threadpoolctl==2.1.0; python_version >= "3.6"
tornado==6.1; python_full_version >= "3.6.1" and python_version >= "3.6"
tqdm==4.61.0; (python_version >= "2.7" and python_full_version < "3.0.0") or (python_full_version >= "3.4.0")
traitlets==5.0.5; python_full_version >= "3.6.1" and python_version >= "3.7"
urllib3==1.26.4; python_version >= "3.6" and python_full_version < "3.0.0" or python_full_version >= "3.5.0" and python_version < "4" and python_version >= "3.6"
wcwidth==0.2.5; python_full_version >= "3.6.1" and python_version >= "3.7"
webencodings==0.5.1; python_version >= "3.6" and python_full_version < "3.0.0" or python_full_version >= "3.5.0" and python_version >= "3.6"
websocket-client==1.0.1; python_version >= "3.6"
xlrd==2.0.1; (python_version >= "2.7" and python_full_version < "3.0.0") or (python_full_version >= "3.6.0")
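The new pins above follow the pattern `name==version; marker`, where the optional part after the semicolon is an environment marker that restricts when the pin applies. As a rough illustration only, the sketch below splits such a line with plain string operations; the function name `parse_pin` is hypothetical, and the sketch does not evaluate markers or handle other requirement syntax (a dedicated parser such as the `packaging` library would be the safer choice for real use).

```python
def parse_pin(line):
    """Split 'name==version; marker' into (name, version, marker or None)."""
    # Everything after the first semicolon is treated as the marker.
    requirement, _, marker = line.partition(";")
    # The remaining part is expected to look like 'name==version'.
    name, _, version = requirement.strip().partition("==")
    return name.strip(), version.strip(), marker.strip() or None


if __name__ == "__main__":
    for pin in [
        "tabulate==0.8.9",
        'numpy==1.20.3; python_version >= "3.7"',
    ]:
        print(parse_pin(pin))
```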

2
static/link/README.md Normal file

@ -0,0 +1,2 @@
This folder contains small images
that are used to enhance the links in the notebooks and markdown files.

BIN
static/link/to_gh.png Normal file

Binary file not shown (after: 1.2 KiB).

Image file (before: 912 B, after: 912 B).