Merge branch 'develop' into main
commit
ea94290970
18 changed files with 13651 additions and 13239 deletions
4353
00_data_cleaning.ipynb
Normal file
File diff suppressed because one or more lines are too long
2426
01_pairwise_correlations.ipynb
Normal file
File diff suppressed because one or more lines are too long
4970
02_descriptive_visualizations.ipynb
Normal file
File diff suppressed because one or more lines are too long
File diff suppressed because it is too large
File diff suppressed because one or more lines are too long
File diff suppressed because one or more lines are too long
File diff suppressed because one or more lines are too long
|
@@ -1,6 +1,6 @@
|
|||
MIT License
|
||||
|
||||
Copyright (c) 2018-2020 Alexander Hess [alexander@webartifex.biz]
|
||||
Copyright (c) 2018-2021 Alexander Hess [alexander@webartifex.biz]
|
||||
|
||||
Permission is hereby granted, free of charge, to any person obtaining a copy
|
||||
of this software and associated documentation files (the "Software"), to deal
|
98
README.md
|
@@ -1,52 +1,80 @@
|
|||
# Ames Housing
|
||||
|
||||
This repository is a case study of applying various machine learning models to
|
||||
the problem of predicting house prices.
|
||||
This repository is a case study of applying various machine learning models
|
||||
to the problem of predicting house prices.
|
||||
|
||||
The dataset is publicly available and can be downloaded, for example, at
|
||||
[Kaggle](https://www.kaggle.com/c/house-prices-advanced-regression-techniques).
|
||||
The dataset is publicly available
|
||||
and can be downloaded, for example, at [Kaggle](https://www.kaggle.com/c/house-prices-advanced-regression-techniques).
|
||||
|
||||
The case study is based on this [research paper](paper.pdf).
|
||||
The case study is based on this [research paper](static/paper.pdf).
|
||||
|
||||
A video presentation of the case study is available on [YouTube <img height="12" style="display: inline-block" src="static/link/to_yt.png">](https://www.youtube.com/watch?v=VSeGseoJsNA).
|
||||
|
||||
|
||||
### Table of Contents
|
||||
|
||||
The analyses are presented in four notebooks that may be interactively worked
|
||||
with by following these links:
|
||||
- [Data Cleaning](https://mybinder.org/v2/gh/webartifex/ames-housing/master?urlpath=lab/tree/01_data_cleaning.ipynb)
|
||||
- [Correlations](https://mybinder.org/v2/gh/webartifex/ames-housing/master?urlpath=lab/tree/02_pairwise_correlations.ipynb)
|
||||
- [Visualizations](https://mybinder.org/v2/gh/webartifex/ames-housing/master?urlpath=lab/tree/03_descriptive_visualizations.ipynb)
|
||||
- [Predictions](https://mybinder.org/v2/gh/webartifex/ames-housing/master?urlpath=lab/tree/04_predictive_models.ipynb)
|
||||
|
||||
A video presentation of the case study is available on
|
||||
[YouTube <img height="12" style="display: inline-block" src="link_to_yt.png">](https://www.youtube.com/watch?v=VSeGseoJsNA).
|
||||
- *Notebook 0*: [Data Cleaning](https://mybinder.org/v2/gh/webartifex/ames-housing/main?urlpath=lab/tree/00_data_cleaning.ipynb)
|
||||
- *Notebook 1*: [Correlations](https://mybinder.org/v2/gh/webartifex/ames-housing/main?urlpath=lab/tree/01_pairwise_correlations.ipynb)
|
||||
- *Notebook 2*: [Visualizations](https://mybinder.org/v2/gh/webartifex/ames-housing/main?urlpath=lab/tree/02_descriptive_visualizations.ipynb)
|
||||
- *Notebook 3*: [Predictions](https://mybinder.org/v2/gh/webartifex/ames-housing/main?urlpath=lab/tree/03_predictive_models.ipynb)
|
||||
|
||||
|
||||
## Installation
|
||||
### Objective
|
||||
|
||||
The project can be cloned and may be worked with under the MIT open source
|
||||
license.
|
||||
Python 3.7 was used to prepare and test the provided code.
|
||||
Albeit the [poetry](https://python-poetry.org/) tool was used to manage the
|
||||
dependencies, a [requirements.txt](requirements.txt) file is also provided as
|
||||
an alternative.
|
||||
The **main goal** is to **show** students
|
||||
how **Python** can be used to solve a typical **data science** task.
|
||||
|
||||
On a Unix system, run:
|
||||
- `git clone https://github.com/webartifex/ames-housing.git` (or use HTTPS
|
||||
instead)
|
||||
- either `poetry install` or `pip install -r requirements.txt` (in the latter
|
||||
case, it is suggested that a virtual environment be used)
|
||||
- after installation, `jupyter lab` opens a new tab in one's web browser where
|
||||
the notebooks and data files may be opened
|
||||
|
||||
Alternatively, the project should also be runnable with the
|
||||
[Anaconda Distribution](https://www.anaconda.com/products/individual).
|
||||
### Prerequisites
|
||||
|
||||
To be suitable for *beginners*, there are *no* formal prerequisites.
|
||||
It is only expected that the student has:
|
||||
- a *solid* understanding of the **English** language and
|
||||
- knowledge of **basic mathematics** from high school.
|
||||
|
||||
Some background knowledge in Python is still helpful.
|
||||
To learn about Python and programming in detail,
|
||||
this [introductory course <img height="12" style="display: inline-block" src="static/link/to_gh.png">](https://github.com/webartifex/intro-to-python) is recommended.
|
||||
|
||||
|
||||
### Getting started & Installation
|
||||
|
||||
To follow this workshop, an installation of **Python 3.8** or higher is expected.
|
||||
|
||||
A popular and beginner friendly way is
|
||||
to install the [Anaconda Distribution](https://www.anaconda.com/products/individual)
|
||||
that not only ships Python itself
|
||||
but also comes pre-packaged with a lot of third-party libraries
|
||||
including [Python's scientific stack](https://scipy.org/about.html).
|
||||
|
||||
Detailed instructions can be found [here <img height="12" style="display: inline-block" src="static/link/to_gh.png">](https://github.com/webartifex/intro-to-python#installation).
|
||||
|
||||
As this project assumes a couple of third-party packages
|
||||
that are *not* part of the Anaconda Distribution,
|
||||
it is most likely necessary
|
||||
to run the command `pip install -r requirements.txt`
|
||||
before working with the notebook files.
|
||||
|
||||
|
||||
## Contributing
|
||||
|
||||
Feedback **is highly encouraged** and will be incorporated.
|
||||
Open an issue in the [issues tracker <img height="12" style="display: inline-block" src="static/link/to_gh.png">](https://github.com/webartifex/ames-housing/issues)
|
||||
or initiate a [pull request <img height="12" style="display: inline-block" src="static/link/to_gh.png">](https://help.github.com/en/articles/about-pull-requests)
|
||||
if you are familiar with the concept.
|
||||
Simple issues that *anyone* can **help fix** are, for example,
|
||||
**spelling mistakes** or **broken links**.
|
||||
If you feel that some topic is missing entirely, you may also mention that.
|
||||
The materials here are considered a **permanent work-in-progress**.
|
||||
|
||||
|
||||
## About the Author
|
||||
|
||||
Alexander Hess is a PhD student at the Chair of Logistics Management at the
|
||||
[WHU - Otto Beisheim School of Management](https://www.whu.edu) where he
|
||||
conducts research on urban delivery platforms and teaches an introductory
|
||||
course on Python (cf., [Fall Term 2019](https://vlv.whu.edu/campus/all/event.asp?objgguid=0xE57C2715B01B441AAFD3E79AA05CACCF&from=vvz&gguid=0x6A2B0ED5B2B949E69957A2099E7DE2F1&mode=own&tguid=0x3980A9BBC3BF4A638E977F2DC163F44B&lang=en),
|
||||
[Spring Term 2020](https://vlv.whu.edu/campus/all/event.asp?objgguid=0x3354F4C108FF4E959CDD692A325D9AFE&from=vvz&gguid=0x262E29795DD742CFBDE72B12B69CEFD6&mode=own&lang=en&tguid=0x2E4A7D1FF3C34AD08FF07685461781C9)).
|
||||
|
||||
Connect him on [LinkedIn](https://www.linkedin.com/in/webartifex).
|
||||
Alexander Hess is a PhD student
|
||||
at the Chair of Logistics Management at [WHU - Otto Beisheim School of Management](https://www.whu.edu)
|
||||
where he conducts research on urban delivery platforms
|
||||
and teaches coding courses based on Python in the BSc and MBA programs.
|
||||
|
||||
Connect with him on [LinkedIn](https://www.linkedin.com/in/webartifex).
|
||||
|
|
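Reviewer note: the updated README expects Python 3.8 or higher and asks readers to run `pip install -r requirements.txt` before opening the notebooks. A minimal sketch of how one might check both conditions from Python is given here. The helper names `python_is_recent_enough` and `installed_versions` are hypothetical and not part of this repository, and the package list is only a subset of the names in the new `[tool.poetry.dependencies]` table.

```python
import sys
from importlib import metadata  # standard library since Python 3.8

# Subset of the package names from the new [tool.poetry.dependencies] table.
REQUIRED = ["jupyterlab", "matplotlib", "numpy", "pandas", "scikit-learn", "seaborn"]


def python_is_recent_enough(minimum=(3, 8), current=None):
    """Return True if the (major, minor) version is at least `minimum`."""
    current = sys.version_info[:2] if current is None else tuple(current)
    return current >= tuple(minimum)


def installed_versions(names):
    """Map each distribution name to its installed version, or None if missing."""
    found = {}
    for name in names:
        try:
            found[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            found[name] = None
    return found


if __name__ == "__main__":
    print("Python OK:", python_is_recent_enough())
    for name, version in installed_versions(REQUIRED).items():
        print(f"{name}: {version or 'NOT INSTALLED'}")
```

Any package reported as not installed would suggest that the `pip install -r requirements.txt` step from the README has not been run in the active environment.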
|
@@ -428,7 +428,6 @@ Order,PID,1st Flr SF,1st Flr SF (box-cox-0),2nd Flr SF,3Ssn Porch,Bedroom AbvGr,
|
|||
430,528108130,2020.0,7.61085279039525,0.0,0.0,3,TA,Av,1,0,Ex,788.0,1232.0,0.0,GLQ,Unf,SBrkr,0.0,NA,Gd,1,2,Typ,896.0,3,TA,RFn,TA,2020.0,7.61085279039525,0,1,Ex,Gtl,12350.0,15.654685543608181,Reg,0.0,450.0,0.0,5,98.0,5,9,Y,0.0,NA,0.0,7,3.0,2020.0,290.0,4040.0,21.31759972987615,AllPub,192.0,0,1,1,0,0,0,0,1,0,1,1,1,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,1,1,0,1,1,404000.0,12.909170156943286
|
||||
431,528108140,2020.0,7.61085279039525,0.0,0.0,3,TA,No,1,0,Ex,570.0,1436.0,0.0,GLQ,Unf,SBrkr,0.0,NA,Gd,1,2,Typ,900.0,3,TA,Fin,TA,2020.0,7.61085279039525,1,1,Ex,Gtl,12220.0,15.627551834008008,Reg,0.0,305.0,0.0,9,54.0,5,10,Y,0.0,NA,0.0,9,3.5,2006.0,210.0,4026.0,21.29933447351643,AllPub,156.0,0,1,1,0,0,0,1,0,0,1,1,1,0,1,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,1,0,1,1,0,0,0,402861.0,12.906346868281233
|
||||
432,528110010,1728.0,7.454719949364001,568.0,0.0,3,TA,Gd,1,0,Ex,384.0,1338.0,0.0,GLQ,Unf,SBrkr,0.0,NA,Gd,1,2,Typ,842.0,3,TA,RFn,TA,2296.0,7.738923757439457,1,1,Ex,Gtl,13478.0,15.879897099046154,IR1,0.0,420.0,0.0,6,274.0,5,10,Y,0.0,NA,0.0,10,3.5,1722.0,656.0,4018.0,21.28887435905055,AllPub,382.0,0,1,1,0,0,0,0,1,1,1,1,1,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,1,1,0,1,1,451950.0,13.021326833226556
|
||||
433,528110020,2674.0,7.891330757661889,0.0,0.0,2,TA,Gd,2,0,Ex,342.0,2288.0,0.0,GLQ,Unf,SBrkr,0.0,NA,Gd,2,2,Typ,762.0,3,TA,Fin,TA,2674.0,7.891330757661889,1,1,Ex,Gtl,13693.0,15.920887120524267,Reg,0.0,472.0,0.0,3,50.0,5,10,Y,0.0,NA,0.0,8,4.5,2630.0,410.0,5304.0,22.790138257062,AllPub,360.0,0,1,1,0,0,0,0,1,0,1,1,1,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,1,0,0,1,1,0,2,2,610000.0,13.321214236149494
|
||||
434,528110090,1734.0,7.458186157340487,1088.0,0.0,4,TA,Gd,0,0,Ex,1734.0,0.0,0.0,Unf,Unf,SBrkr,0.0,NA,Gd,1,3,Typ,1020.0,3,TA,RFn,TA,2822.0,7.945201132412759,1,1,Ex,Gtl,13891.0,15.958126895168148,Reg,0.0,424.0,0.0,1,170.0,5,9,Y,0.0,NA,192.0,12,3.5,1734.0,414.0,4556.0,21.95794279603217,AllPub,52.0,0,1,1,0,0,0,0,1,1,1,1,1,0,1,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,1,0,1,1,1,1,0,582933.0,13.275827535915461
|
||||
435,528112040,1736.0,7.459338895220296,0.0,0.0,3,TA,No,0,0,Ex,1736.0,0.0,0.0,Unf,Unf,SBrkr,0.0,NA,Gd,1,2,Typ,834.0,3,TA,RFn,TA,1736.0,7.459338895220296,0,1,Ex,Gtl,11578.0,15.489619617350394,Reg,0.0,302.0,0.0,7,90.0,5,9,Y,0.0,NA,0.0,7,2.0,1736.0,409.0,3472.0,20.53206536701657,AllPub,319.0,0,1,1,0,0,0,0,1,0,1,1,1,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,1,1,0,1,1,360000.0,12.793859310432293
|
||||
436,528116010,1782.0,7.485491608030754,0.0,0.0,3,TA,Gd,1,0,Ex,251.0,1531.0,0.0,GLQ,Unf,SBrkr,0.0,NA,Gd,2,2,Typ,932.0,3,TA,Fin,TA,1782.0,7.485491608030754,0,1,Gd,Gtl,16870.0,16.46741385670273,IR1,0.0,238.0,0.0,4,82.0,5,8,Y,0.0,NA,0.0,7,3.0,1782.0,181.0,3564.0,20.66596179182348,AllPub,99.0,0,1,1,0,0,0,0,1,0,1,1,1,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,1,1,1,5,4,296000.0,12.598114733306197
|
||||
|
@@ -700,7 +699,6 @@ Order,PID,1st Flr SF,1st Flr SF (box-cox-0),2nd Flr SF,3Ssn Porch,Bedroom AbvGr,
|
|||
706,902201110,768.0,6.643789733147672,560.0,0.0,3,TA,No,0,0,TA,384.0,384.0,0.0,BLQ,Unf,FuseA,0.0,MnPrv,NA,0,1,Typ,308.0,1,TA,Unf,TA,1328.0,7.191429330036379,1,1,TA,Gtl,6000.0,13.86795031058074,Reg,0.0,0.0,0.0,3,12.0,7,5,Y,0.0,NA,0.0,6,1.5,768.0,12.0,2096.0,18.080661941177674,AllPub,0.0,0,1,1,0,0,0,1,0,1,1,0,1,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,1,59,39,129500.0,11.771436160121729
|
||||
707,902202150,811.0,6.698268054115413,576.0,0.0,3,TA,No,0,0,TA,811.0,0.0,0.0,Unf,Unf,FuseA,0.0,NA,NA,0,2,Typ,256.0,1,TA,Unf,TA,1387.0,7.234898420314831,0,2,Gd,Gtl,6000.0,13.86795031058074,Reg,0.0,0.0,0.0,11,0.0,4,5,Y,0.0,NA,0.0,7,2.0,811.0,0.0,2198.0,18.30105269110471,AllPub,0.0,0,1,0,1,0,0,1,0,1,1,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,55,55,93000.0,11.440354772135393
|
||||
708,902204080,861.0,6.7580945044277305,0.0,0.0,2,TA,No,0,0,Fa,861.0,0.0,0.0,Unf,Unf,SBrkr,128.0,NA,NA,0,1,Typ,288.0,2,TA,Unf,TA,861.0,6.7580945044277305,0,1,TA,Gtl,7404.0,14.375113185074957,Reg,0.0,0.0,0.0,11,0.0,6,4,N,0.0,NA,0.0,5,1.0,861.0,128.0,1722.0,17.190987754073017,AllPub,0.0,0,1,1,0,0,1,0,0,0,1,0,1,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,1,89,59,80000.0,11.289781913656018
|
||||
709,902205010,612.0,6.416732282512326,0.0,0.0,1,NA,NA,0,0,NA,0.0,0.0,0.0,NA,NA,FuseA,25.0,NA,NA,0,1,Typ,308.0,1,Fa,Unf,TA,612.0,6.416732282512326,0,1,TA,Gtl,5925.0,13.837946210420888,Reg,0.0,0.0,0.0,10,0.0,4,2,N,0.0,NA,0.0,4,1.0,0.0,25.0,612.0,13.043479427110126,AllPub,0.0,0,0,1,0,0,1,0,0,0,0,0,1,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,1,69,59,45000.0,10.714417768752456
|
||||
710,902205030,600.0,6.396929655216146,368.0,0.0,2,TA,No,0,0,TA,600.0,0.0,0.0,Unf,Unf,SBrkr,0.0,GdWo,NA,0,1,Typ,0.0,0,NA,NA,NA,968.0,6.875232087276577,0,1,TA,Gtl,5925.0,13.837946210420888,Reg,0.0,0.0,0.0,5,0.0,6,3,Y,0.0,NA,0.0,6,1.0,600.0,0.0,1568.0,16.7790642448278,AllPub,0.0,1,0,1,0,0,1,0,0,1,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,1,99,59,37900.0,10.542706391070517
|
||||
711,902206260,886.0,6.786716950605081,0.0,0.0,2,TA,No,0,0,Fa,190.0,0.0,0.0,Unf,Unf,FuseA,80.0,NA,NA,0,1,Typ,273.0,1,TA,Unf,TA,886.0,6.786716950605081,0,1,TA,Gtl,5784.0,13.780601121178693,Reg,0.0,0.0,0.0,12,20.0,8,5,Y,0.0,NA,0.0,4,1.0,190.0,244.0,1076.0,15.19912043380998,AllPub,144.0,0,1,1,0,0,1,0,0,0,1,0,1,0,1,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,1,71,13,91300.0,11.421906066583059
|
||||
712,902207170,792.0,6.674561391814426,0.0,0.0,2,TA,No,0,0,Fa,624.0,0.0,0.0,Unf,Unf,SBrkr,81.0,GdWo,NA,0,1,Typ,287.0,1,TA,Unf,TA,792.0,6.674561391814426,0,1,TA,Gtl,8520.0,14.719743466735071,Reg,0.0,0.0,0.0,2,0.0,8,5,Y,0.0,NA,0.0,5,1.0,624.0,81.0,1416.0,16.33942215073927,AllPub,0.0,0,1,1,0,0,0,0,1,0,1,0,1,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,1,86,59,99500.0,11.507912923146684
|
||||
|
@@ -1745,6 +1743,7 @@ Order,PID,1st Flr SF,1st Flr SF (box-cox-0),2nd Flr SF,3Ssn Porch,Bedroom AbvGr,
|
|||
1770,528354110,1383.0,7.232010331664759,1015.0,0.0,3,TA,No,1,0,Gd,660.0,719.0,0.0,GLQ,Unf,SBrkr,0.0,NA,TA,1,2,Typ,834.0,3,TA,Fin,TA,2398.0,7.7823903355874595,1,1,Gd,Gtl,11787.0,15.53526259606626,IR1,0.0,594.0,0.0,8,60.0,5,7,Y,0.0,NA,0.0,8,3.5,1379.0,299.0,3777.0,20.96566210764097,AllPub,239.0,0,1,1,0,0,0,0,1,1,1,1,1,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,1,1,11,10,315750.0,12.662706040212738
|
||||
1771,528358040,1214.0,7.101675971619444,1306.0,0.0,4,TA,No,0,0,Gd,638.0,565.0,0.0,GLQ,Unf,SBrkr,0.0,NA,TA,1,2,Typ,721.0,3,TA,RFn,TA,2520.0,7.832014180505469,1,1,Gd,Gtl,9950.0,15.106276534404287,IR1,0.0,290.0,0.0,6,114.0,5,7,Y,0.0,NA,0.0,9,2.5,1203.0,338.0,3723.0,20.8909872937205,AllPub,224.0,1,1,1,0,0,0,0,1,1,1,1,1,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,12,12,290000.0,12.577636201962656
|
||||
1772,528364110,1328.0,7.191429330036379,1203.0,0.0,4,TA,Av,0,0,Gd,1198.0,56.0,64.0,LwQ,ALQ,SBrkr,0.0,NA,TA,1,2,Typ,752.0,3,TA,RFn,TA,2531.0,7.836369760545124,1,1,Gd,Gtl,12257.0,15.635300851369571,IR1,0.0,513.0,0.0,11,98.0,5,8,Y,0.0,NA,0.0,9,2.5,1318.0,320.0,3849.0,21.06391115400039,AllPub,222.0,0,1,1,0,0,0,0,1,1,1,1,1,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,12,12,290000.0,12.577636201962656
|
||||
1773,528366050,3228.0,8.07961802938984,0.0,0.0,4,TA,No,1,0,Gd,1969.0,1231.0,0.0,GLQ,Unf,SBrkr,291.0,NA,Gd,1,3,Typ,546.0,2,TA,RFn,TA,3228.0,8.07961802938984,0,1,Gd,Gtl,12692.0,15.724859195735945,IR1,0.0,0.0,0.0,5,75.0,5,8,Y,0.0,NA,0.0,10,4.0,3200.0,630.0,6428.0,23.879201299438567,AllPub,264.0,0,1,1,0,0,0,0,1,0,1,1,1,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,15,14,430000.0,12.971540487669746
|
||||
1774,528366070,1105.0,7.007600613951853,1097.0,0.0,4,TA,No,1,0,Ex,770.0,335.0,0.0,GLQ,Unf,SBrkr,0.0,NA,TA,1,2,Typ,517.0,2,TA,RFn,TA,2202.0,7.697121317282625,1,1,Gd,Gtl,11762.0,15.529841439857076,Reg,0.0,309.0,0.0,9,65.0,5,8,Y,0.0,NA,144.0,9,3.5,1105.0,209.0,3307.0,20.284644002311033,AllPub,0.0,0,1,1,0,0,0,0,1,1,1,1,1,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,15,14,263000.0,12.479909311159902
|
||||
1775,528376010,1335.0,7.19668657083435,1203.0,0.0,4,TA,No,0,0,Gd,100.0,1225.0,0.0,GLQ,Unf,SBrkr,0.0,NA,TA,1,2,Typ,933.0,3,TA,RFn,TA,2538.0,7.839131648274333,1,1,Gd,Gtl,9044.0,14.867724959619174,IR1,0.0,526.0,0.0,5,92.0,5,8,Y,0.0,NA,0.0,8,2.5,1325.0,290.0,3863.0,21.08284412453433,AllPub,198.0,0,1,1,0,0,0,0,1,1,1,1,1,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,1,1,11,10,330000.0,12.706847933442663
|
||||
1776,528427040,1369.0,7.221835825288449,0.0,0.0,3,TA,No,0,0,Gd,1369.0,0.0,0.0,Unf,Unf,SBrkr,0.0,NA,NA,0,2,Typ,605.0,2,TA,Unf,TA,1369.0,7.221835825288449,0,1,Gd,Gtl,9910.0,15.096165253640338,Reg,0.0,0.0,0.0,9,203.0,6,7,Y,0.0,NA,0.0,5,2.0,1369.0,203.0,2738.0,19.34762998557629,AllPub,0.0,0,1,1,0,0,0,0,1,0,1,0,1,0,1,1,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,1,0,1,1,0,0,0,213133.0,12.269671662972327
|
||||
|
@@ -2621,6 +2620,7 @@ Order,PID,1st Flr SF,1st Flr SF (box-cox-0),2nd Flr SF,3Ssn Porch,Bedroom AbvGr,
|
|||
2664,902329070,684.0,6.52795791762255,684.0,0.0,3,TA,No,0,0,TA,684.0,0.0,0.0,Unf,Unf,FuseA,0.0,NA,NA,0,1,Typ,216.0,1,Fa,Unf,TA,1368.0,7.221105098182496,0,1,TA,Gtl,3600.0,12.679331552660544,Reg,0.0,0.0,0.0,10,158.0,7,6,N,0.0,NA,0.0,7,1.0,684.0,158.0,2052.0,17.982934302945672,AllPub,0.0,0,0,1,0,0,0,1,0,1,1,0,1,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,1,96,13,114504.0,11.648365035864053
|
||||
2665,902329090,998.0,6.905753276311464,764.0,0.0,4,TA,No,1,0,Fa,596.0,0.0,0.0,Unf,Unf,SBrkr,221.0,NA,NA,0,1,Typ,576.0,2,TA,Unf,TA,1762.0,7.474204806496124,1,1,Gd,Gtl,7200.0,14.307105706203597,Reg,0.0,0.0,0.0,10,0.0,7,7,N,0.0,NA,0.0,8,2.5,596.0,257.0,2358.0,18.630818497230923,AllPub,36.0,0,1,1,0,0,1,0,0,1,1,0,1,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,1,86,56,157000.0,11.964001084330445
|
||||
2666,902400090,1200.0,7.090076835776092,0.0,0.0,4,TA,No,0,0,TA,1200.0,0.0,0.0,Unf,Unf,FuseA,228.0,NA,NA,0,1,Typ,312.0,1,Fa,Unf,Fa,1200.0,7.090076835776092,0,1,TA,Gtl,11340.0,15.43673148163438,Reg,0.0,0.0,0.0,3,0.0,5,6,Y,0.0,NA,0.0,7,1.0,1200.0,228.0,2400.0,18.71440609779312,AllPub,0.0,0,1,1,0,0,1,0,0,0,1,0,1,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,1,83,56,125000.0,11.736069016284437
|
||||
2667,902400110,1518.0,7.325148957955575,1518.0,0.0,4,TA,Mn,0,0,TA,1107.0,0.0,0.0,Unf,Unf,SBrkr,0.0,GdPrv,TA,2,2,Typ,840.0,3,TA,Unf,Ex,3608.0,8.190908881182514,1,1,Ex,Gtl,22950.0,17.294696462432224,IR2,572.0,0.0,0.0,6,260.0,9,10,Y,0.0,NA,410.0,12,2.5,1107.0,670.0,4715.0,22.143531063375193,AllPub,0.0,0,1,1,0,0,1,0,0,1,1,1,1,0,1,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,1,114,13,475000.0,13.071070083016778
|
||||
2668,902401090,624.0,6.436150368369428,624.0,0.0,2,TA,No,0,0,TA,624.0,0.0,0.0,Unf,Unf,FuseA,256.0,NA,NA,0,2,Typ,0.0,0,NA,NA,NA,1248.0,7.129297548929373,0,2,TA,Gtl,5976.0,13.858385901870495,Reg,0.0,0.0,0.0,12,130.0,7,5,N,0.0,NA,0.0,8,2.0,624.0,386.0,1872.0,17.56478361538355,AllPub,0.0,0,0,0,1,0,0,0,1,1,1,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,1,86,56,93500.0,11.44571671527678
|
||||
2669,902406030,960.0,6.866933284461882,0.0,0.0,3,TA,No,0,0,TA,960.0,0.0,0.0,Unf,Unf,SBrkr,0.0,NA,NA,0,1,Typ,624.0,2,TA,Unf,TA,960.0,6.866933284461882,0,1,TA,Gtl,9750.0,15.055349293727405,Reg,0.0,0.0,4500.0,7,0.0,5,5,Y,0.0,NA,0.0,5,1.0,960.0,0.0,1920.0,17.679331552660543,AllPub,0.0,0,1,1,0,0,0,1,0,0,1,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,48,48,125000.0,11.736069016284437
|
||||
2670,902456015,1020.0,6.927557906278317,0.0,0.0,2,Fa,No,0,0,TA,1020.0,0.0,0.0,Unf,Unf,FuseP,105.0,NA,NA,0,1,Typ,0.0,0,NA,NA,NA,1020.0,6.927557906278317,0,1,Fa,Gtl,4761.0,13.322216261602419,Reg,0.0,0.0,0.0,10,0.0,3,3,N,0.0,NA,0.0,5,1.0,1020.0,105.0,2040.0,17.95599057784113,AllPub,0.0,0,0,1,0,0,1,0,0,0,1,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,88,56,64500.0,11.074420502783864
|
||||
|
|
Can't render this file because it is too large.
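Reviewer note: the CSV header in the hunks above pairs `1st Flr SF` with a column named `1st Flr SF (box-cox-0)`. Assuming the suffix denotes a Box-Cox transformation with λ = 0 (an interpretation of the column name, not something stated in this diff), the transformed value is simply the natural logarithm of the raw value. The sketch below uses a hypothetical helper `box_cox` to reproduce two pairs of values from row 430 of the data shown above.

```python
import math


def box_cox(x, lmbda):
    """One-parameter Box-Cox transform of a single positive value."""
    if x <= 0:
        raise ValueError("Box-Cox is only defined for positive values")
    if lmbda == 0:
        # The limit of (x**lmbda - 1) / lmbda as lmbda -> 0 is ln(x).
        return math.log(x)
    return (x ** lmbda - 1.0) / lmbda


# Row 430: the raw value 2020.0 is listed next to 7.61085279039525.
first_floor = box_cox(2020.0, 0)

# Row 430: the last two fields are 404000.0 and 12.909170156943286.
sale_price = box_cox(404000.0, 0)

print(first_floor, sale_price)
```

Other transformed columns in the file may use a different λ; only the λ = 0 case is checked here.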
|
2261
poetry.lock
generated
File diff suppressed because it is too large
|
@@ -1,14 +1,15 @@
|
|||
[build-system]
|
||||
build-backend = "poetry.masonry.api"
|
||||
requires = ["poetry>=0.12"]
|
||||
|
||||
[tool.black]
|
||||
line-length = 79
|
||||
requires = ["poetry-core>=1.0.0"]
|
||||
build-backend = "poetry.core.masonry.api"
|
||||
|
||||
[tool.poetry]
|
||||
authors = ["Alexander Hess <alexander@webartifex.biz>"]
|
||||
name = "ames-housing"
|
||||
version = "0.1.0"
|
||||
|
||||
authors = [
|
||||
"Alexander Hess <alexander@webartifex.biz>",
|
||||
]
|
||||
description = "A case study on predicting house prices in Ames, Iowa"
|
||||
homepage = "https://github.com/webartifex/ames-housing"
|
||||
keywords = [
|
||||
"data-science",
|
||||
"data-cleaning",
|
||||
|
@@ -18,27 +19,26 @@ keywords = [
|
|||
"predictive-analytics",
|
||||
]
|
||||
license = "MIT"
|
||||
name = "ames-housing"
|
||||
|
||||
readme = "README.md"
|
||||
homepage = "https://github.com/webartifex/ames-housing"
|
||||
repository = "https://github.com/webartifex/ames-housing"
|
||||
version = "0.1.0"
|
||||
|
||||
[tool.poetry.dependencies]
|
||||
jupyterlab = "^2.1.5"
|
||||
matplotlib = "^3.2.2"
|
||||
python = "^3.8"
|
||||
jupyterlab = "^3.0.16"
|
||||
missingno = "^0.4.2"
|
||||
numpy = "^1.19.0"
|
||||
pandas = "^1.0.5"
|
||||
python = "^3.7"
|
||||
requests = "^2.24.0"
|
||||
seaborn = "^0.10.1"
|
||||
sklearn = "^0.0"
|
||||
statsmodels = "^0.11.1"
|
||||
tabulate = "^0.8.7"
|
||||
tqdm = "^4.47.0"
|
||||
xlrd = "^1.2.0"
|
||||
xlwt = "^1.3.0"
|
||||
matplotlib = "^3.4.2"
|
||||
numpy = "^1.20.3"
|
||||
pandas = "^1.2.4"
|
||||
requests = "^2.25.1"
|
||||
scikit-learn = "^0.24.2"
|
||||
seaborn = "^0.11.1"
|
||||
statsmodels = "^0.12.2"
|
||||
tabulate = "^0.8.9"
|
||||
tqdm = "^4.61.0"
|
||||
xlrd = "^2.0.1"
|
||||
|
||||
[tool.poetry.dev-dependencies]
|
||||
black = "^19.10b0"
|
||||
pylint = "^2.5.3"
|
||||
black = "^21.5b1"
|
||||
pylint = "^2.8.2"
|
||||
|
|
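Reviewer note: the dependency entries above use Poetry's caret constraints, for example `python = "^3.8"` and `scikit-learn = "^0.24.2"`. A caret constraint allows updates that leave the leftmost non-zero version component unchanged. The sketch below uses a hypothetical helper `caret_bounds` to compute the implied half-open range. It is a simplification that handles purely numeric versions only, so pre-release strings such as `^21.5b1` are out of scope.

```python
def caret_bounds(spec):
    """Return (inclusive lower, exclusive upper) bounds for a caret constraint.

    Simplified sketch: only dotted numeric versions such as "^1.2.3".
    """
    if not spec.startswith("^"):
        raise ValueError(f"not a caret constraint: {spec!r}")
    parts = [int(part) for part in spec[1:].split(".")]

    for index, value in enumerate(parts):
        if value != 0:
            # Bump the leftmost non-zero component and drop the rest.
            upper = parts[:index] + [value + 1]
            break
    else:
        # All components are zero: bump the last one that was given.
        upper = parts[:-1] + [parts[-1] + 1]

    # Pad with zeros so both bounds have the same number of components.
    upper += [0] * (len(parts) - len(upper))

    lower_text = ".".join(str(part) for part in parts)
    upper_text = ".".join(str(part) for part in upper)
    return lower_text, upper_text


for spec in ["^3.8", "^0.24.2", "^1.20.3"]:
    lower, upper = caret_bounds(spec)
    print(f"{spec}: >={lower},<{upper}")
```

Read this way, `^3.8` admits any Python 3 release from 3.8 onward, while `^0.24.2` keeps scikit-learn within the 0.24 series.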
165
requirements.txt
|
@@ -1,82 +1,85 @@
|
|||
appdirs==1.4.4
|
||||
astroid==2.4.2
|
||||
attrs==19.3.0
|
||||
backcall==0.2.0
|
||||
black==19.10b0
|
||||
bleach==3.1.5
|
||||
certifi==2020.6.20
|
||||
chardet==3.0.4
|
||||
click==7.1.2
|
||||
cycler==0.10.0
|
||||
decorator==4.4.2
|
||||
defusedxml==0.6.0
|
||||
entrypoints==0.3
|
||||
idna==2.10
|
||||
importlib-metadata==1.7.0
|
||||
ipykernel==5.3.0
|
||||
ipython==7.16.1
|
||||
ipython-genutils==0.2.0
|
||||
isort==4.3.21
|
||||
jedi==0.17.1
|
||||
Jinja2==2.11.2
|
||||
joblib==0.15.1
|
||||
json5==0.9.5
|
||||
jsonschema==3.2.0
|
||||
jupyter-client==6.1.3
|
||||
jupyter-core==4.6.3
|
||||
jupyterlab==2.1.5
|
||||
jupyterlab-server==1.1.5
|
||||
kiwisolver==1.2.0
|
||||
lazy-object-proxy==1.4.3
|
||||
MarkupSafe==1.1.1
|
||||
matplotlib==3.2.2
|
||||
mccabe==0.6.1
|
||||
anyio==3.1.0; python_full_version >= "3.6.2" and python_version >= "3.6"
|
||||
appnope==0.1.2; sys_platform == "darwin" and python_version >= "3.7" and platform_system == "Darwin"
|
||||
argon2-cffi==20.1.0; python_version >= "3.6"
|
||||
async-generator==1.10; python_full_version >= "3.6.1" and python_version >= "3.6"
|
||||
attrs==21.2.0; python_version >= "3.6" and python_full_version < "3.0.0" or python_full_version >= "3.5.0" and python_version >= "3.6"
|
||||
babel==2.9.1; python_version >= "3.6" and python_full_version < "3.0.0" or python_full_version >= "3.4.0" and python_version >= "3.6"
|
||||
backcall==0.2.0; python_version >= "3.7"
|
||||
bleach==3.3.0; python_version >= "3.6" and python_full_version < "3.0.0" or python_full_version >= "3.5.0" and python_version >= "3.6"
|
||||
certifi==2020.12.5; python_version >= "3.6" and python_full_version < "3.0.0" or python_full_version >= "3.5.0" and python_version >= "3.6"
|
||||
cffi==1.14.5; implementation_name == "pypy" and python_version >= "3.6"
|
||||
chardet==4.0.0; python_version >= "3.6" and python_full_version < "3.0.0" or python_full_version >= "3.5.0" and python_version >= "3.6"
|
||||
colorama==0.4.4; python_version >= "3.7" and python_full_version < "3.0.0" and sys_platform == "win32" or sys_platform == "win32" and python_version >= "3.7" and python_full_version >= "3.5.0"
|
||||
cycler==0.10.0; python_version >= "3.7"
|
||||
decorator==5.0.9; python_version >= "3.7"
|
||||
defusedxml==0.7.1; python_version >= "3.6" and python_full_version < "3.0.0" or python_full_version >= "3.5.0" and python_version >= "3.6"
|
||||
entrypoints==0.3; python_version >= "3.6"
|
||||
idna==2.10; python_full_version >= "3.6.2" and python_version >= "3.6" and (python_version >= "3.6" and python_full_version < "3.0.0" or python_full_version >= "3.5.0" and python_version >= "3.6")
|
||||
ipykernel==5.5.5; python_version >= "3.6"
|
||||
ipython-genutils==0.2.0; python_version >= "3.7"
|
||||
ipython==7.23.1; python_version >= "3.7"
|
||||
jedi==0.18.0; python_version >= "3.7"
|
||||
jinja2==3.0.1; python_version >= "3.6"
|
||||
joblib==1.0.1; python_version >= "3.6"
|
||||
json5==0.9.5; python_version >= "3.6"
|
||||
jsonschema==3.2.0; python_version >= "3.6"
|
||||
jupyter-client==6.2.0; python_full_version >= "3.6.1" and python_version >= "3.6"
|
||||
jupyter-core==4.7.1; python_full_version >= "3.6.1" and python_version >= "3.6"
|
||||
jupyter-server==1.8.0; python_version >= "3.6"
|
||||
jupyterlab-pygments==0.1.2; python_version >= "3.6"
|
||||
jupyterlab-server==2.5.2; python_version >= "3.6"
|
||||
jupyterlab==3.0.16; python_version >= "3.6"
|
||||
kiwisolver==1.3.1; python_version >= "3.7"
|
||||
markupsafe==2.0.1; python_version >= "3.6"
|
||||
matplotlib-inline==0.1.2; python_version >= "3.7"
|
||||
matplotlib==3.4.2; python_version >= "3.7"
|
||||
missingno==0.4.2
|
||||
mistune==0.8.4
|
||||
nbconvert==5.6.1
|
||||
nbformat==5.0.7
|
||||
notebook==6.0.3
|
||||
numpy==1.19.0
|
||||
packaging==20.4
|
||||
pandas==1.0.5
|
||||
pandocfilters==1.4.2
|
||||
parso==0.7.0
|
||||
pathspec==0.8.0
|
||||
patsy==0.5.1
|
||||
pexpect==4.8.0
|
||||
pickleshare==0.7.5
|
||||
prometheus-client==0.8.0
|
||||
prompt-toolkit==3.0.5
|
||||
ptyprocess==0.6.0
|
||||
Pygments==2.6.1
|
||||
pylint==2.5.3
|
||||
pyparsing==2.4.7
|
||||
pyrsistent==0.16.0
|
||||
python-dateutil==2.8.1
|
||||
pytz==2020.1
|
||||
pyzmq==19.0.1
|
||||
regex==2020.6.8
|
||||
requests==2.24.0
|
||||
scikit-learn==0.23.1
|
||||
scipy==1.5.0
|
||||
seaborn==0.10.1
|
||||
Send2Trash==1.5.0
|
||||
six==1.15.0
|
||||
sklearn==0.0
|
||||
statsmodels==0.11.1
|
||||
tabulate==0.8.7
|
||||
terminado==0.8.3
|
||||
testpath==0.4.4
|
||||
threadpoolctl==2.1.0
|
||||
toml==0.10.1
|
||||
tornado==6.0.4
|
||||
tqdm==4.47.0
|
||||
traitlets==4.3.3
|
||||
typed-ast==1.4.1
|
||||
urllib3==1.25.9
|
||||
wcwidth==0.2.5
|
||||
webencodings==0.5.1
|
||||
wrapt==1.12.1
|
||||
xlrd==1.2.0
|
||||
xlwt==1.3.0
|
||||
zipp==3.1.0
|
||||
mistune==0.8.4; python_version >= "3.6"
|
||||
nbclassic==0.3.1; python_version >= "3.6"
|
||||
nbclient==0.5.3; python_full_version >= "3.6.1" and python_version >= "3.6"
|
||||
nbconvert==6.0.7; python_version >= "3.6"
|
||||
nbformat==5.1.3; python_full_version >= "3.6.1" and python_version >= "3.6"
|
||||
nest-asyncio==1.5.1; python_full_version >= "3.6.1" and python_version >= "3.6"
|
||||
notebook==6.4.0; python_version >= "3.6"
|
||||
numpy==1.20.3; python_version >= "3.7"
|
||||
packaging==20.9; python_version >= "3.6" and python_full_version < "3.0.0" or python_full_version >= "3.5.0" and python_version >= "3.6"
|
||||
pandas==1.2.4; python_full_version >= "3.7.1"
|
||||
pandocfilters==1.4.3; python_version >= "3.6" and python_full_version < "3.0.0" or python_full_version >= "3.4.0" and python_version >= "3.6"
|
||||
parso==0.8.2; python_version >= "3.7"
|
||||
patsy==0.5.1; python_version >= "3.6"
|
||||
pexpect==4.8.0; sys_platform != "win32" and python_version >= "3.7"
|
||||
pickleshare==0.7.5; python_version >= "3.7"
|
||||
pillow==8.2.0; python_version >= "3.7"
|
||||
prometheus-client==0.10.1; python_version >= "3.6" and python_full_version < "3.0.0" or python_full_version >= "3.4.0" and python_version >= "3.6"
|
||||
prompt-toolkit==3.0.18; python_full_version >= "3.6.1" and python_version >= "3.7"
|
||||
ptyprocess==0.7.0; sys_platform != "win32" and python_version >= "3.7" and os_name != "nt"
|
||||
py==1.10.0; python_version >= "3.6" and python_full_version < "3.0.0" and implementation_name == "pypy" or implementation_name == "pypy" and python_version >= "3.6" and python_full_version >= "3.4.0"
|
||||
pycparser==2.20; python_version >= "3.6" and python_full_version < "3.0.0" or python_full_version >= "3.4.0" and python_version >= "3.6"
|
||||
pygments==2.9.0; python_version >= "3.7"
|
||||
pyparsing==2.4.7; python_version >= "3.7" and python_full_version < "3.0.0" or python_full_version >= "3.4.0" and python_version >= "3.7"
|
||||
pyrsistent==0.17.3; python_version >= "3.6"
|
||||
python-dateutil==2.8.1; python_full_version >= "3.7.1" and python_version >= "3.7" and (python_version >= "3.7" and python_full_version < "3.0.0" or python_full_version >= "3.3.0" and python_version >= "3.7")
|
||||
pytz==2021.1; python_full_version >= "3.7.1" and python_version >= "3.6" and (python_version >= "3.6" and python_full_version < "3.0.0" or python_full_version >= "3.4.0" and python_version >= "3.6")
|
||||
pywin32==300; sys_platform == "win32" and python_version >= "3.6"
|
||||
pywinpty==1.1.1; os_name == "nt" and python_version >= "3.6"
|
||||
pyzmq==22.0.3; python_full_version >= "3.6.1" and python_version >= "3.6"
|
||||
requests==2.25.1; (python_version >= "2.7" and python_full_version < "3.0.0") or (python_full_version >= "3.5.0")
|
||||
scikit-learn==0.24.2; python_version >= "3.6"
|
||||
scipy==1.6.1; python_version >= "3.7"
|
||||
seaborn==0.11.1; python_version >= "3.6"
|
||||
send2trash==1.5.0; python_version >= "3.6"
|
||||
six==1.16.0; python_version >= "3.7" and python_full_version < "3.0.0" or python_full_version >= "3.5.0" and python_version >= "3.7"
|
||||
sniffio==1.2.0; python_full_version >= "3.6.2" and python_version >= "3.6"
|
||||
statsmodels==0.12.2; python_version >= "3.6"
|
||||
tabulate==0.8.9
|
||||
terminado==0.10.0; python_version >= "3.6"
|
||||
testpath==0.5.0; python_version >= "3.6"
|
||||
threadpoolctl==2.1.0; python_version >= "3.6"
|
||||
tornado==6.1; python_full_version >= "3.6.1" and python_version >= "3.6"
|
||||
tqdm==4.61.0; (python_version >= "2.7" and python_full_version < "3.0.0") or (python_full_version >= "3.4.0")
|
||||
traitlets==5.0.5; python_full_version >= "3.6.1" and python_version >= "3.7"
|
||||
urllib3==1.26.4; python_version >= "3.6" and python_full_version < "3.0.0" or python_full_version >= "3.5.0" and python_version < "4" and python_version >= "3.6"
|
||||
wcwidth==0.2.5; python_full_version >= "3.6.1" and python_version >= "3.7"
|
||||
webencodings==0.5.1; python_version >= "3.6" and python_full_version < "3.0.0" or python_full_version >= "3.5.0" and python_version >= "3.6"
|
||||
websocket-client==1.0.1; python_version >= "3.6"
|
||||
xlrd==2.0.1; (python_version >= "2.7" and python_full_version < "3.0.0") or (python_full_version >= "3.6.0")
|
||||
|
|
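Reviewer note: the regenerated `requirements.txt` switches from bare pins (`numpy==1.19.0`) to pins followed by an environment marker (`numpy==1.20.3; python_version >= "3.7"`). A minimal sketch of splitting such a line into its parts with the standard library alone follows. The helper `parse_pin` is hypothetical and only understands the `name==version; marker` shape seen in this diff. For general requirement strings, the `packaging` library's `Requirement` class would likely be the more robust choice.

```python
def parse_pin(line):
    """Split 'name==version; marker' into (name, version, marker or None)."""
    # Split off the marker first, because markers may themselves contain '=='.
    requirement, _, marker = line.partition(";")
    name, separator, version = requirement.partition("==")
    if not separator:
        raise ValueError(f"not an exact pin: {line!r}")
    return name.strip(), version.strip(), marker.strip() or None


# Lines copied from the new side of the requirements.txt hunk.
lines = [
    'numpy==1.20.3; python_version >= "3.7"',
    "tabulate==0.8.9",
    'pywin32==300; sys_platform == "win32" and python_version >= "3.6"',
]

for line in lines:
    print(parse_pin(line))
```

The marker is kept as an opaque string here; evaluating it against the running interpreter is left out of this sketch.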
2
static/link/README.md
Normal file
|
@@ -0,0 +1,2 @@
|
|||
This folder contains small images
|
||||
that are used to enhance the links in the notebooks and markdown files.
|
BIN
static/link/to_gh.png
Normal file
Binary file not shown.
After Width: | Height: | Size: 1.2 KiB |
Before Width: | Height: | Size: 912 B After Width: | Height: | Size: 912 B |