Merge branch 'chapter-08-mfr' into develop

This commit is contained in:
Alexander Hess 2020-10-21 19:25:05 +02:00
commit bdaa86ede2
Signed by: alexander
GPG key ID: 344EA5AB10D868E0
12 changed files with 7079 additions and 41 deletions

View file

@ -771,7 +771,7 @@
" - *Chapter 5*: [Numbers & Bits <img height=\"12\" style=\"display: inline-block\" src=\"../static/link/to_nb.png\">](https://nbviewer.jupyter.org/github/webartifex/intro-to-python/blob/develop/05_numbers/00_content.ipynb)\n", " - *Chapter 5*: [Numbers & Bits <img height=\"12\" style=\"display: inline-block\" src=\"../static/link/to_nb.png\">](https://nbviewer.jupyter.org/github/webartifex/intro-to-python/blob/develop/05_numbers/00_content.ipynb)\n",
" - *Chapter 6*: [Text & Bytes <img height=\"12\" style=\"display: inline-block\" src=\"../static/link/to_nb.png\">](https://nbviewer.jupyter.org/github/webartifex/intro-to-python/blob/develop/06_text/00_content.ipynb)\n", " - *Chapter 6*: [Text & Bytes <img height=\"12\" style=\"display: inline-block\" src=\"../static/link/to_nb.png\">](https://nbviewer.jupyter.org/github/webartifex/intro-to-python/blob/develop/06_text/00_content.ipynb)\n",
" - *Chapter 7*: [Sequential Data <img height=\"12\" style=\"display: inline-block\" src=\"../static/link/to_nb.png\">](https://nbviewer.jupyter.org/github/webartifex/intro-to-python/blob/develop/07_sequences/00_content.ipynb)\n", " - *Chapter 7*: [Sequential Data <img height=\"12\" style=\"display: inline-block\" src=\"../static/link/to_nb.png\">](https://nbviewer.jupyter.org/github/webartifex/intro-to-python/blob/develop/07_sequences/00_content.ipynb)\n",
" - *Chapter 8*: Map, Filter, & Reduce\n", " - *Chapter 8*: [Map, Filter, & Reduce <img height=\"12\" style=\"display: inline-block\" src=\"../static/link/to_nb.png\">](https://nbviewer.jupyter.org/github/webartifex/intro-to-python/blob/develop/08_mfr/00_content.ipynb)\n",
" - *Chapter 9*: Mappings & Sets\n", " - *Chapter 9*: Mappings & Sets\n",
" - *Chapter 10*: Arrays & Dataframes\n", " - *Chapter 10*: Arrays & Dataframes\n",
"- How can we create custom data types?\n", "- How can we create custom data types?\n",

1381
08_mfr/00_content.ipynb Normal file

File diff suppressed because it is too large Load diff

2384
08_mfr/01_content.ipynb Normal file

File diff suppressed because it is too large Load diff

760
08_mfr/02_exercises.ipynb Normal file
View file

@ -0,0 +1,760 @@
{
"cells": [
{
"cell_type": "markdown",
"metadata": {},
"source": [
"**Note**: Click on \"*Kernel*\" > \"*Restart Kernel and Run All*\" in [JupyterLab](https://jupyterlab.readthedocs.io/en/stable/) *after* finishing the exercises to ensure that your solution runs top to bottom *without* any errors. If you cannot run this file on your machine, you may want to open it [in the cloud <img height=\"12\" style=\"display: inline-block\" src=\"../static/link/to_mb.png\">](https://mybinder.org/v2/gh/webartifex/intro-to-python/develop?urlpath=lab/tree/08_mfr/02_exercises.ipynb)."
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# Chapter 8: Map, Filter, & Reduce (Coding Exercises)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"The exercises below assume that you have read the [first <img height=\"12\" style=\"display: inline-block\" src=\"../static/link/to_nb.png\">](https://nbviewer.jupyter.org/github/webartifex/intro-to-python/blob/develop/08_mfr/00_content.ipynb) and [second <img height=\"12\" style=\"display: inline-block\" src=\"../static/link/to_nb.png\">](https://nbviewer.jupyter.org/github/webartifex/intro-to-python/blob/develop/08_mfr/01_content.ipynb) part of Chapter 8.\n",
"\n",
"The `...`'s in the code cells indicate where you need to fill in code snippets. The number of `...`'s within a code cell give you a rough idea of how many lines of code are needed to solve the task. You should not need to create any additional code cells for your final solution. However, you may want to use temporary code cells to try out some ideas."
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Removing Outliers in Streaming Data"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Let's say we are given a `list` object with random integers like `sample` below, and we want to calculate some basic statistics on them."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"sample = [\n",
" 45, 46, 40, 49, 36, 53, 49, 42, 25, 40, 39, 36, 38, 40, 40, 52, 36, 52, 40, 41,\n",
" 35, 29, 48, 43, 42, 30, 29, 33, 55, 33, 38, 50, 39, 56, 52, 28, 37, 56, 45, 37,\n",
" 41, 41, 37, 30, 51, 32, 23, 40, 53, 40, 45, 39, 99, 42, 34, 42, 34, 39, 39, 53,\n",
" 43, 37, 46, 36, 45, 42, 32, 38, 57, 34, 36, 44, 47, 51, 46, 39, 28, 40, 35, 46,\n",
" 41, 51, 41, 23, 46, 40, 40, 51, 50, 32, 47, 36, 38, 29, 32, 53, 34, 43, 39, 41,\n",
" 40, 34, 44, 40, 41, 43, 47, 57, 50, 42, 38, 25, 45, 41, 58, 37, 45, 55, 44, 53,\n",
" 82, 31, 45, 33, 32, 39, 46, 48, 42, 47, 40, 45, 51, 35, 31, 46, 40, 44, 61, 57,\n",
" 40, 36, 35, 55, 40, 56, 36, 35, 86, 36, 51, 40, 54, 50, 49, 36, 41, 37, 48, 41,\n",
" 42, 44, 40, 43, 51, 47, 46, 50, 40, 23, 40, 39, 28, 38, 42, 46, 46, 42, 46, 31,\n",
" 32, 40, 48, 27, 40, 40, 30, 32, 25, 31, 30, 43, 44, 29, 45, 41, 63, 32, 33, 58,\n",
"]"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"len(sample)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"**Q1**: `list` objects are **sequences**. What *four* behaviors do they always come with?"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
" < your answer >"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"**Q2**: Write a function `mean()` that calculates the simple arithmetic mean of a given `sequence` with numbers!\n",
"\n",
"Hints: You can solve this task with [built-in functions <img height=\"12\" style=\"display: inline-block\" src=\"../static/link/to_py.png\">](https://docs.python.org/3/library/functions.html) only. A `for`-loop is *not* needed."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"def mean(sequence):\n",
" ..."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"sample_mean = mean(sample)"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"sample_mean"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"**Q3**: Write a function `std()` that calculates the [standard deviation <img height=\"12\" style=\"display: inline-block\" src=\"../static/link/to_wiki.png\">](https://en.wikipedia.org/wiki/Standard_deviation) of a `sequence` of numbers! Integrate your `mean()` version from before and the [sqrt() <img height=\"12\" style=\"display: inline-block\" src=\"../static/link/to_py.png\">](https://docs.python.org/3/library/math.html#math.sqrt) function from the [math <img height=\"12\" style=\"display: inline-block\" src=\"../static/link/to_py.png\">](https://docs.python.org/3/library/math.html) module in the [standard library <img height=\"12\" style=\"display: inline-block\" src=\"../static/link/to_py.png\">](https://docs.python.org/3/library/index.html) provided to you below. Make sure `std()` calls `mean()` only *once* internally! Repeated calls to `mean()` would be a waste of computational resources.\n",
"\n",
"Hints: Parts of the code are probably too long to fit within the suggested 79 characters per line. So, use *temporary variables* inside your function. Instead of a `for`-loop, you may want to use a `list` comprehension or, even better, a memoryless `generator` expression."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"from math import sqrt"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"def std(sequence):\n",
" ..."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"sample_std = std(sample)"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"sample_std"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"**Q4**: Complete `standardize()` below that takes a `sequence` of numbers and returns a `list` object with the **[z-scores <img height=\"12\" style=\"display: inline-block\" src=\"../static/link/to_wiki.png\">](https://en.wikipedia.org/wiki/Standard_score)** of these numbers! A z-score is calculated by subtracting the mean and dividing by the standard deviation. Re-use `mean()` and `std()` from before. Again, ensure that `standardize()` calls `mean()` and `std()` only *once*! Further, round all z-scores with the built-in [round() <img height=\"12\" style=\"display: inline-block\" src=\"../static/link/to_py.png\">](https://docs.python.org/3/library/functions.html#round) function and pass on the keyword-only argument `digits` to it.\n",
"\n",
"Hint: You may want to use a `list` comprehension instead of a `for`-loop."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"def standardize(sequence, *, digits=3):\n",
" ..."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"z_scores = standardize(sample)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"The [pprint() <img height=\"12\" style=\"display: inline-block\" src=\"../static/link/to_py.png\">](https://docs.python.org/3/library/pprint.html#pprint.pprint) function from the [pprint <img height=\"12\" style=\"display: inline-block\" src=\"../static/link/to_py.png\">](https://docs.python.org/3/library/pprint.html) module in the [standard library <img height=\"12\" style=\"display: inline-block\" src=\"../static/link/to_py.png\">](https://docs.python.org/3/library/index.html) allows us to \"pretty print\" long `list` objects compactly."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"from pprint import pprint"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"pprint(z_scores, compact=True)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"We know that `standardize()` works correctly if the resulting z-scores' mean and standard deviation approach `0` and `1` for a long enough `sequence`."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"mean(z_scores), std(z_scores)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Even though `standardize()` calls `mean()` and `std()` only once each, `mean()` is called *twice*! That is so because `std()` internally also re-uses `mean()`!\n",
"\n",
"**Q5.1**: Rewrite `std()` to take an optional keyword-only argument `seq_mean`, defaulting to `None`. If provided, `seq_mean` is used instead of the result of calling `mean()`. Otherwise, the latter is called.\n",
"\n",
"Hint: You must check if `seq_mean` is still the default value."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"def std(sequence, *, seq_mean=None):\n",
" ..."
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"`std()` continues to work as before."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"sample_std = std(sample)"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"sample_std"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"**Q5.2**: Now, rewrite `standardize()` to pass on the return value of `mean()` to `std()`! In summary, `standardize()` calculates the z-scores for the numbers in the `sequence` with as few computational steps as possible."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"def standardize(sequence, *, digits=3):\n",
" ..."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"z_scores = standardize(sample)"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"mean(z_scores), std(z_scores)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"**Q6**: With both `sample` and `z_scores` being materialized `list` objects, we can loop over pairs consisting of a number from `sample` and its corresponding z-score. Write a `for`-loop that prints out all the \"outliers,\" as which we define numbers with an absolute z-score above `1.96`. There are *four* of them in the `sample`.\n",
"\n",
"Hint: Use the [abs() <img height=\"12\" style=\"display: inline-block\" src=\"../static/link/to_py.png\">](https://docs.python.org/3/library/functions.html#abs) and [zip() <img height=\"12\" style=\"display: inline-block\" src=\"../static/link/to_py.png\">](https://docs.python.org/3/library/functions.html#zip) built-ins."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"..."
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"We provide a `stream` module with a `data` object that models an *infinite* **stream** of data (cf., the [*stream.py* <img height=\"12\" style=\"display: inline-block\" src=\"../static/link/to_gh.png\">](https://github.com/webartifex/intro-to-python/blob/develop/08_mfr/stream.py) file in the repository)."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"from stream import data"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"data"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"`data` is of type `generator` and has *no* length."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"type(data)"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"len(data)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"So, the only thing we can do with it is to pass it to the built-in [next() <img height=\"12\" style=\"display: inline-block\" src=\"../static/link/to_py.png\">](https://docs.python.org/3/library/functions.html#next) function and go over the numbers it streams one by one."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"next(data)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"**Q7**: What happens if you call `mean()` with `data` as the argument? What is the problem?\n",
"\n",
"Hints: If you try it out, you may have to press the \"Stop\" button in the toolbar at the top. Your computer should *not* crash, but you will *have to* restart this Jupyter notebook with \"Kernel\" > \"Restart\" and import `data` again."
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
" < your answer >"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"mean(data)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"**Q8**: Write a function `take_sample()` that takes an `iterable` as its argument, like `data`, and creates a *materialized* `list` object out of its first `n` elements, defaulting to `1_000`!\n",
"\n",
"Hints: [next() <img height=\"12\" style=\"display: inline-block\" src=\"../static/link/to_py.png\">](https://docs.python.org/3/library/functions.html#next) and the [range() <img height=\"12\" style=\"display: inline-block\" src=\"../static/link/to_py.png\">](https://docs.python.org/3/library/functions.html#func-range) built-in may be helpful. You may want to use a `list` comprehension instead of a `for`-loop and write a one-liner. Audacious students may want to look at [isclice() <img height=\"12\" style=\"display: inline-block\" src=\"../static/link/to_py.png\">](https://docs.python.org/3/library/itertools.html#itertools.islice) in the [itertools <img height=\"12\" style=\"display: inline-block\" src=\"../static/link/to_py.png\">](https://docs.python.org/3/library/itertools.html) module in the [standard library <img height=\"12\" style=\"display: inline-block\" src=\"../static/link/to_py.png\">](https://docs.python.org/3/library/index.html)."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"def take_sample(iterable, *, n=1_000):\n",
" ..."
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"We take a `new_sample` from the stream of `data`, and its statistics are similar to the initial `sample`."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"new_sample = take_sample(data)"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"len(new_sample)"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"mean(new_sample)"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"std(new_sample)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"**Q9**: Convert `standardize()` into a *new* function `standardized()` that implements the *same* logic but works on a possibly *infinite* stream of data, provided as an `iterable`, instead of a *finite* `sequence`.\n",
"\n",
"To calculate a z-score, we need the stream's overall mean and standard deviation, and that is *impossible* to calculate if we do not know how long the stream is, and, in particular, if it is *infinite*. So, `standardized()` first takes a sample from the `iterable` internally, and uses the sample's mean and standard deviation to calculate the z-scores.\n",
"\n",
"Hint: `standardized()` *must* return a `generator` object. So, use a `generator` expression as the return value; unless you know about the `yield` statement already (cf., [reference <img height=\"12\" style=\"display: inline-block\" src=\"../static/link/to_py.png\">](https://docs.python.org/3/reference/simple_stmts.html#the-yield-statement))."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"def standardized(iterable, *, digits=3):\n",
" ..."
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"`standardized()` works almost like `standardize()` except that we use it with [next() <img height=\"12\" style=\"display: inline-block\" src=\"../static/link/to_py.png\">](https://docs.python.org/3/library/functions.html#next) to obtain the z-scores one by one."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"z_scores = standardized(data)"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"z_scores"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"type(z_scores)"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"next(z_scores)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"**Q10.1**: `standardized()` allows us to go over an *infinite* stream of z-scores. What we want to do instead is to loop over the stream's raw numbers and skip the outliers. In the remainder of this exercise, you look at the parts that make up the `skip_outliers()` function below to achieve precisely that.\n",
"\n",
"The first steps in `skip_outliers()` are the same as in `standardized()`: We take a `sample` from the stream of `data` and calculate its statistics."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"sample = ...\n",
"seq_mean = ...\n",
"seq_std = ..."
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"**Q10.2**: Just as in `standardized()`, write a `generator` expression that produces z-scores one by one! However, instead of just generating a z-score, the resulting `generator` object should produce `tuple` objects consisting of a \"raw\" number from `data` and its z-score.\n",
"\n",
"Hint: Look at the revisited \"*Averaging Even Numbers*\" example in [Chapter 7 <img height=\"12\" style=\"display: inline-block\" src=\"../static/link/to_nb.png\">](https://nbviewer.jupyter.org/github/webartifex/intro-to-python/blob/develop/07_sequences/01_content.ipynb#Example:-Averaging-all-even-Numbers-in-a-List-%28revisited%29) for some inspiration, which also contains a `generator` expression producing `tuple` objects."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"standardizer = (... for ... in data)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"`standardizer` should produce `tuple` objects."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"next(standardizer)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"**Q10.3**: Write another `generator` expression that loops over `standardizer`. It contains an `if`-clause that keeps only numbers with an absolute z-score below the `threshold_z`. If you fancy, use `tuple` unpacking."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"threshold_z = 1.96"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"no_outliers = (... for ... in standardizer if ...)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"`no_outliers` should produce `int` objects."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"next(no_outliers)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"**Q10.4**: Lastly, put everything together in the `skip_outliers()` function! Make sure you refer to `iterable` inside the function and not the global `data`."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"def skip_outliers(iterable, *, threshold_z=1.96):\n",
" sample = ...\n",
" seq_mean = ...\n",
" seq_std = ...\n",
" standardizer = ...\n",
" no_outliers = ...\n",
" return no_outliers"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Now, we can create a `generator` object and loop over the `data` in the stream with outliers skipped. Instead of the default `1.96`, we use a `threshold_z` of only `0.05`: That filters out all numbers except `42`."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"skipper = skip_outliers(data, threshold_z=0.05)"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"skipper"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"type(skipper)"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"next(skipper)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"**Q11**: You implemented the functions `mean()`, `std()`, `standardize()`, `standardized()`, and `skip_outliers()`. Which of them are **eager**, and which are **lazy**? How do these two concepts relate to **finite** and **infinite** data?"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
" < your answer >"
]
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.8.6"
},
"toc": {
"base_numbering": 1,
"nav_menu": {},
"number_sections": false,
"sideBar": true,
"skip_h1_title": true,
"title_cell": "Table of Contents",
"title_sidebar": "Contents",
"toc_cell": false,
"toc_position": {},
"toc_section_display": false,
"toc_window_display": false
}
},
"nbformat": 4,
"nbformat_minor": 4
}

455
08_mfr/03_exercises.ipynb Normal file
View file

@ -0,0 +1,455 @@
{
"cells": [
{
"cell_type": "markdown",
"metadata": {},
"source": [
"**Note**: Click on \"*Kernel*\" > \"*Restart Kernel and Run All*\" in [JupyterLab](https://jupyterlab.readthedocs.io/en/stable/) *after* finishing the exercises to ensure that your solution runs top to bottom *without* any errors. If you cannot run this file on your machine, you may want to open it [in the cloud <img height=\"12\" style=\"display: inline-block\" src=\"../static/link/to_mb.png\">](https://mybinder.org/v2/gh/webartifex/intro-to-python/develop?urlpath=lab/tree/08_mfr/03_exercises.ipynb)."
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# Chapter 8: Map, Filter, & Reduce (Coding Exercises)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"The exercises below assume that you have read the [first <img height=\"12\" style=\"display: inline-block\" src=\"../static/link/to_nb.png\">](https://nbviewer.jupyter.org/github/webartifex/intro-to-python/blob/develop/08_mfr/00_content.ipynb) and [second <img height=\"12\" style=\"display: inline-block\" src=\"../static/link/to_nb.png\">](https://nbviewer.jupyter.org/github/webartifex/intro-to-python/blob/develop/08_mfr/01_content.ipynb) part of Chapter 8.\n",
"\n",
"The `...`'s in the code cells indicate where you need to fill in code snippets. The number of `...`'s within a code cell give you a rough idea of how many lines of code are needed to solve the task. You should not need to create any additional code cells for your final solution. However, you may want to use temporary code cells to try out some ideas."
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Packing & Unpacking with Functions (continued)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"**Q1**: Copy your solution to **Q10** from the \"*Packing & Unpacking with Functions*\" exercise in [Chapter 7 <img height=\"12\" style=\"display: inline-block\" src=\"../static/link/to_nb.png\">](https://nbviewer.jupyter.org/github/webartifex/intro-to-python/blob/develop/07_sequences/04_exercises.ipynb) into the code cell below!"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"import collections.abc as abc"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"def product(*args, ...):\n",
" \"\"\"Multiply all arguments.\"\"\"\n",
" ...\n",
" ...\n",
" ...\n",
" ...\n",
"\n",
" ...\n",
" ...\n",
"\n",
" return ..."
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"**Q2**: Verify that all test cases below work (i.e., the `assert` statements must *not* raise an `AssertionError`)!"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"assert product(42) == 42"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"assert product(2, 5, 10) == 100"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"assert product(2, 5, 10, start=2) == 200"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"one_hundred = [2, 5, 10]"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"assert product(one_hundred) == 100"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"assert product(*one_hundred) == 100"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"**Q3**: Verify that `product()` raises a `TypeError` when called without any arguments!"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"product()"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"This implementation of `product()` is convenient to use, in particular, because we can pass it any *collection* object with or without *unpacking* it.\n",
"\n",
"However, `product()` suffers from one last flaw: We cannot pass it a **stream** of data, as modeled, for example, with a `generator` object that produces elements on a one-by-one basis.\n",
"\n",
"**Q4**: Click through the following code cells and observe what they do!\n",
"\n",
"The [*stream.py* <img height=\"12\" style=\"display: inline-block\" src=\"../static/link/to_gh.png\">](https://github.com/webartifex/intro-to-python/blob/develop/08_mfr/stream.py) module in the book's repository provides a `make_finite_stream()` function. It is a *factory* function creating objects of type `generator` that we use to model *streaming* data."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"from stream import make_finite_stream"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"data = make_finite_stream()"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"data"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"type(data)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"`generator` objects are good for only *one* thing: Giving us the \"next\" element in a series of possibly *infinitely* many objects. While the `data` object is finite (i.e., execute the next code cell until you see a `StopIteration` exception), ..."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"next(data)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"... it has *no* concept of a \"length:\" The built-in [len() <img height=\"12\" style=\"display: inline-block\" src=\"../static/link/to_py.png\">](https://docs.python.org/3/library/functions.html#len) function raises a `TypeError`."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"len(data)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"We can use the built-in [list() <img height=\"12\" style=\"display: inline-block\" src=\"../static/link/to_py.png\">](https://docs.python.org/3/library/functions.html#func-list) constructor to *materialize* all elements. However, in a real-world scenario, these may *not* fit into our machine's memory! If you get an empty `list` object below, you have to create a *new* `data` object above again."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"list(data)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"To be more realistic, `make_finite_stream()` creates `generator` objects producing a varying number of elements."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"list(make_finite_stream())"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"list(make_finite_stream())"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"list(make_finite_stream())"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Let's see what happens if we pass a `generator` object, as created by `make_finite_stream()`, instead of a materialized *collection*, like `one_hundred`, to `product()`."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"product(make_finite_stream())"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"**Q5**: What line causes the `TypeError`? What line is really the problem in `product()`? Hint: These may be different lines. Describe what happens on each line in the function's body until the exception is raised!"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
" < your answer >"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"**Q6**: Adapt `product()` one last time to make it work with `generator` objects, or more generallz *iterators*, as well!\n",
"\n",
"Hints: This task is as easy as replacing `Collection` with something else. Which of the three behaviors of *collections* do `generator` objects also exhibit? You may want to look at the documentations on the built-in [max() <img height=\"12\" style=\"display: inline-block\" src=\"../static/link/to_py.png\">](https://docs.python.org/3/library/functions.html#max), [min() <img height=\"12\" style=\"display: inline-block\" src=\"../static/link/to_py.png\">](https://docs.python.org/3/library/functions.html#min), and [sum() <img height=\"12\" style=\"display: inline-block\" src=\"../static/link/to_py.png\">](https://docs.python.org/3/library/functions.html#sum) functions: What kind of argument do they take?"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"def product(*args, ...):\n",
" \"\"\"Multiply all arguments.\"\"\"\n",
" ...\n",
" ...\n",
" ...\n",
" ...\n",
"\n",
" ...\n",
" ...\n",
"\n",
" return ..."
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"The final version of `product()` behaves like built-ins in edge cases (i.e., `sum()` also raises a `TypeError` when called without arguments), ..."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"product()"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"... works with the arguments passed either separately as *positional* arguments, *packed* together into a single *collection* argument, or *unpacked*, ..."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"product(42)"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"product(2, 5, 10)"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"product([2, 5, 10])"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"product(*[2, 5, 10])"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"... and can handle *streaming* data with *indefinite* \"length.\""
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"product(make_finite_stream())"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"In real-world projects, the data science practitioner must decide if it is worthwhile to make a function usable in various different forms as we do in this exercise. This may be over-engineered.\n",
"\n",
"Yet, two lessons are important to take away:\n",
"- It is a good idea to *mimic* the behavior of *built-ins* when accepting arguments, and\n",
"- make functions capable of working with *streaming* data."
]
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.8.6"
},
"toc": {
"base_numbering": 1,
"nav_menu": {},
"number_sections": false,
"sideBar": true,
"skip_h1_title": true,
"title_cell": "Table of Contents",
"title_sidebar": "Contents",
"toc_cell": false,
"toc_position": {},
"toc_section_display": false,
"toc_window_display": false
}
},
"nbformat": 4,
"nbformat_minor": 4
}

1672
08_mfr/04_content.ipynb Normal file

File diff suppressed because it is too large Load diff

75
08_mfr/05_summary.ipynb Normal file
View file

@ -0,0 +1,75 @@
{
"cells": [
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "slide"
}
},
"source": [
"# Chapter 8: Map, Filter, & Reduce (TL;DR)"
]
},
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "skip"
}
},
"source": [
"The operations we do with sequential data commonly follow the **map-filter-reduce paradigm**: We apply the same transformation to all elements, filter some of them out, and calculate summary statistics from the remaining ones.\n",
"\n",
"An essential idea in this chapter is that, in many situations, we need *not* have all the data **materialized** in memory. Instead, **iterators** allow us to process sequential data on a one-by-one basis.\n",
"\n",
"Examples for iterators are the `map`, `filter`, and `generator` types."
]
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.8.6"
},
"livereveal": {
"auto_select": "code",
"auto_select_fragment": true,
"scroll": true,
"theme": "serif"
},
"toc": {
"base_numbering": 1,
"nav_menu": {},
"number_sections": false,
"sideBar": true,
"skip_h1_title": true,
"title_cell": "Table of Contents",
"title_sidebar": "Contents",
"toc_cell": false,
"toc_position": {
"height": "calc(100% - 180px)",
"left": "10px",
"top": "150px",
"width": "384px"
},
"toc_section_display": false,
"toc_window_display": false
}
},
"nbformat": 4,
"nbformat_minor": 4
}

214
08_mfr/06_review.ipynb Normal file
View file

@ -0,0 +1,214 @@
{
"cells": [
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# Chapter 8: Map, Filter, & Reduce (Review Questions)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"The questions below assume that you have read the [first <img height=\"12\" style=\"display: inline-block\" src=\"../static/link/to_nb.png\">](https://nbviewer.jupyter.org/github/webartifex/intro-to-python/blob/develop/08_mfr/00_content.ipynb), [second <img height=\"12\" style=\"display: inline-block\" src=\"../static/link/to_nb.png\">](https://nbviewer.jupyter.org/github/webartifex/intro-to-python/blob/develop/08_mfr/01_content.ipynb), and [third <img height=\"12\" style=\"display: inline-block\" src=\"../static/link/to_nb.png\">](https://nbviewer.jupyter.org/github/webartifex/intro-to-python/blob/develop/08_mfr/04_content.ipynb) part in Chapter 8.\n",
"\n",
"Be concise in your answers! Most questions can be answered in *one* sentence."
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Essay Questions "
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"**Q1**: With the [map() <img height=\"12\" style=\"display: inline-block\" src=\"../static/link/to_py.png\">](https://docs.python.org/3/library/functions.html#map) and [filter() <img height=\"12\" style=\"display: inline-block\" src=\"../static/link/to_py.png\">](https://docs.python.org/3/library/functions.html#filter) built-ins and the [reduce() <img height=\"12\" style=\"display: inline-block\" src=\"../static/link/to_py.png\">](https://docs.python.org/3/library/functools.html#functools.reduce) function from the [functools <img height=\"12\" style=\"display: inline-block\" src=\"../static/link/to_py.png\">](https://docs.python.org/3/library/functools.html) module in the [standard library <img height=\"12\" style=\"display: inline-block\" src=\"../static/link/to_py.png\">](https://docs.python.org/3/library/index.html), we can replace many tedious `for`-loops and `if` statements. What are some advantages of doing so?"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
" < your answer >"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"**Q2**: Looking at the `lambda` expression inside [reduce() <img height=\"12\" style=\"display: inline-block\" src=\"../static/link/to_py.png\">](https://docs.python.org/3/library/functools.html#functools.reduce) below, what [built-in function <img height=\"12\" style=\"display: inline-block\" src=\"../static/link/to_py.png\">](https://docs.python.org/3/library/functions.html) is mimicked here?"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"```python\n",
"from functools import reduce\n",
"\n",
"numbers = [7, 11, 8, 5, 3, 12, 2, 6, 9, 10, 1, 4]\n",
"\n",
"reduce(lambda x, y: x if x > y else y, numbers)\n",
"```"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
" < your answer >"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"**Q3**: What is the primary use case of **`list` comprehensions**? Why do we describe them as **eager**?"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
" < your answer >"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"**Q4**: **`generator` expressions** may replace `list` objects and list comprehensions in many scenarios. When evaluated, they create a **lazy** `generator` object that does *not* **materialize** its elements right away. What do we mean by that? What does it mean for a `generator` object to be **exhausted**?"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
" < your answer >"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"**Q5**: What does it mean for the **boolean reducers**, the built-in [all() <img height=\"12\" style=\"display: inline-block\" src=\"../static/link/to_py.png\">](https://docs.python.org/3/library/functions.html#all) and [any() <img height=\"12\" style=\"display: inline-block\" src=\"../static/link/to_py.png\">](https://docs.python.org/3/library/functions.html#any) functions, to follow the **short-circuiting** strategy?"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
" < your answer >"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"**Q6**: What is an **iterator**? How does it relate to an **iterable**?"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
" < your answer >"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## True / False Questions"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Motivate your answer with *one short* sentence!"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"**Q7**: `lambda` expressions are useful in the context of the **map-filter-reduce** paradigm, where we often do *not* re-use a `function` object more than once."
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
" < your answer >"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"**Q8**: Using **`generator` expressions** in place of **`list` comprehensions** wherever possible is a good practice as it makes our programs use memory more efficiently."
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
" < your answer >"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"**Q9**: Just as **`list` comprehensions** create `list` objects, **`tuple` comprehensions** create `tuple` objects."
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
" < your answer >"
]
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.8.6"
},
"toc": {
"base_numbering": 1,
"nav_menu": {},
"number_sections": false,
"sideBar": true,
"skip_h1_title": true,
"title_cell": "Table of Contents",
"title_sidebar": "Contents",
"toc_cell": false,
"toc_position": {},
"toc_section_display": false,
"toc_window_display": false
}
},
"nbformat": 4,
"nbformat_minor": 4
}

51
08_mfr/stream.py Normal file
View file

@ -0,0 +1,51 @@
"""Simulation of random streams of data.
This module defines:
- a generator object `data` modeling an infinite stream of integers
- a function `make_finite_stream()` that creates finite streams of data
The probability distribution underlying the integers is Gaussian-like with a
mean of 42 and a standard deviation of 8. The left tail of the distribution is
cut off meaning that the streams only produce non-negative numbers. Further,
one in a hundred random numbers has an increased chance to be an outlier.
"""
import itertools as _itertools
import random as _random
_random.seed(87)
def _infinite_stream():
"""Internal generator function to simulate an infinite stream of data."""
while True:
number = max(0, int(_random.gauss(42, 8)))
if _random.randint(1, 100) == 1:
number *= 2
yield number
def make_finite_stream(min_=5, max_=15):
"""Simulate a finite stream of data.
The returned stream is finite, but the number of elements to be produced
by it is still random. This default behavior may be turned off by passing
in `min_` and `max_` arguments with `min_ == max_`.
Args:
min_ (optional, int): minimum numbers in the stream; defaults to 5
max_ (optional, int): maximum numbers in the stream; defaults to 15
Returns:
finite_stream (generator)
Raises:
ValueError: if max_ < min_
"""
stream = _infinite_stream()
n = _random.randint(min_, max_)
yield from _itertools.islice(stream, n)
data = _infinite_stream()

View file

@ -2,6 +2,21 @@
The materials are designed to resemble an *interactive* book. The materials are designed to resemble an *interactive* book.
The files come
primarily in the [Jupyter Notebook](https://jupyter-notebook.readthedocs.io/en/stable/)
format (i.e., \*.ipynb)
but also as [modules and packages <img height="12" style="display: inline-block" src="static/link/to_py.png">](https://docs.python.org/3/tutorial/modules.html)
(i.e., \*.py).
Together with some other static files (e.g., images),
they are stored in one folder per chapter in this repository.
They are to be opened
from within the [JupyterLab](https://jupyterlab.readthedocs.io/en/stable/) application,
even though other ways are certainly possible as well.
Both the files and the folders
are appropriately named with prefixes
indicating the order in which they should be read
and starting with "00_".
It is recommended It is recommended
to follow the [installation instructions](https://github.com/webartifex/intro-to-python#installation) to follow the [installation instructions](https://github.com/webartifex/intro-to-python#installation)
in the [README.md](README.md) file in the [README.md](README.md) file
@ -189,3 +204,28 @@ If this is not possible,
(`namedtuple` Type) (`namedtuple` Type)
- [summary <img height="12" style="display: inline-block" src="static/link/to_nb.png">](https://nbviewer.jupyter.org/github/webartifex/intro-to-python/blob/develop/07_sequences/06_summary.ipynb) - [summary <img height="12" style="display: inline-block" src="static/link/to_nb.png">](https://nbviewer.jupyter.org/github/webartifex/intro-to-python/blob/develop/07_sequences/06_summary.ipynb)
- [review questions <img height="12" style="display: inline-block" src="static/link/to_nb.png">](https://nbviewer.jupyter.org/github/webartifex/intro-to-python/blob/develop/07_sequences/07_review.ipynb) - [review questions <img height="12" style="display: inline-block" src="static/link/to_nb.png">](https://nbviewer.jupyter.org/github/webartifex/intro-to-python/blob/develop/07_sequences/07_review.ipynb)
- *Chapter 8*: Map, Filter, & Reduce
- [content <img height="12" style="display: inline-block" src="static/link/to_nb.png">](https://nbviewer.jupyter.org/github/webartifex/intro-to-python/blob/develop/08_mfr/00_content.ipynb)
[<img height="12" style="display: inline-block" src="static/link/to_mb.png">](https://mybinder.org/v2/gh/webartifex/intro-to-python/develop?urlpath=lab/tree/08_mfr/00_content.ipynb)
(Mapping;
Filtering;
Reducing;
`lambda` Expression)
- [content <img height="12" style="display: inline-block" src="static/link/to_nb.png">](https://nbviewer.jupyter.org/github/webartifex/intro-to-python/blob/develop/08_mfr/01_content.ipynb)
[<img height="12" style="display: inline-block" src="static/link/to_mb.png">](https://mybinder.org/v2/gh/webartifex/intro-to-python/develop?urlpath=lab/tree/08_mfr/01_content.ipynb)
(`list` Comprehension;
`generator` Expression;
Streams of Data;
Boolean Reducers)
- [exercises <img height="12" style="display: inline-block" src="static/link/to_nb.png">](https://nbviewer.jupyter.org/github/webartifex/intro-to-python/blob/develop/08_mfr/02_exercises.ipynb)
[<img height="12" style="display: inline-block" src="static/link/to_mb.png">](https://mybinder.org/v2/gh/webartifex/intro-to-python/develop?urlpath=lab/tree/08_mfr/02_exercises.ipynb)
(Removing Outliers in Streaming Data)
- [exercises <img height="12" style="display: inline-block" src="static/link/to_nb.png">](https://nbviewer.jupyter.org/github/webartifex/intro-to-python/blob/develop/08_mfr/03_exercises.ipynb)
[<img height="12" style="display: inline-block" src="static/link/to_mb.png">](https://mybinder.org/v2/gh/webartifex/intro-to-python/develop?urlpath=lab/tree/08_mfr/03_exercises.ipynb)
(Packing & Unpacking with Functions, continued)
- [content <img height="12" style="display: inline-block" src="static/link/to_nb.png">](https://nbviewer.jupyter.org/github/webartifex/intro-to-python/blob/develop/08_mfr/04_content.ipynb)
[<img height="12" style="display: inline-block" src="static/link/to_mb.png">](https://mybinder.org/v2/gh/webartifex/intro-to-python/develop?urlpath=lab/tree/08_mfr/04_content.ipynb)
(Iterators vs. Iterables;
Example: `sorted()` vs. `reversed()`)
- [summary <img height="12" style="display: inline-block" src="static/link/to_nb.png">](https://nbviewer.jupyter.org/github/webartifex/intro-to-python/blob/develop/08_mfr/05_summary.ipynb)
- [review questions <img height="12" style="display: inline-block" src="static/link/to_nb.png">](https://nbviewer.jupyter.org/github/webartifex/intro-to-python/blob/develop/08_mfr/06_review.ipynb)

View file

@ -20,6 +20,7 @@ For a more *detailed version* with **clickable links**
- *Chapter 5*: Numbers & Bits - *Chapter 5*: Numbers & Bits
- *Chapter 6*: Text & Bytes - *Chapter 6*: Text & Bytes
- *Chapter 7*: Sequential Data - *Chapter 7*: Sequential Data
- *Chapter 8*: Map, Filter, & Reduce
#### Videos #### Videos

85
poetry.lock generated
View file

@ -475,7 +475,7 @@ test = ["pytest", "requests"]
[[package]] [[package]]
name = "lxml" name = "lxml"
version = "4.5.2" version = "4.6.1"
description = "Powerful and Pythonic XML processing library combining libxml2/libxslt with the ElementTree API." description = "Powerful and Pythonic XML processing library combining libxml2/libxslt with the ElementTree API."
category = "dev" category = "dev"
optional = false optional = false
@ -505,8 +505,8 @@ python-versions = "*"
[[package]] [[package]]
name = "nbclient" name = "nbclient"
version = "0.5.0" version = "0.5.1"
description = "A client library for executing notebooks. Formally nbconvert's ExecutePreprocessor." description = "A client library for executing notebooks. Formerly nbconvert's ExecutePreprocessor."
category = "main" category = "main"
optional = false optional = false
python-versions = ">=3.6" python-versions = ">=3.6"
@ -920,7 +920,7 @@ test = ["pytest"]
[[package]] [[package]]
name = "urllib3" name = "urllib3"
version = "1.25.10" version = "1.25.11"
description = "HTTP library with thread-safe connection pooling, file post, and more." description = "HTTP library with thread-safe connection pooling, file post, and more."
category = "main" category = "main"
optional = false optional = false
@ -928,7 +928,7 @@ python-versions = ">=2.7, !=3.0.*, !=3.1.*, !=3.2.*, !=3.3.*, !=3.4.*, <4"
[package.extras] [package.extras]
brotli = ["brotlipy (>=0.6.0)"] brotli = ["brotlipy (>=0.6.0)"]
secure = ["certifi", "cryptography (>=1.3.4)", "idna (>=2.0.0)", "pyOpenSSL (>=0.14)", "ipaddress"] secure = ["pyOpenSSL (>=0.14)", "cryptography (>=1.3.4)", "idna (>=2.0.0)", "certifi", "ipaddress"]
socks = ["PySocks (>=1.5.6,<1.5.7 || >1.5.7,<2.0)"] socks = ["PySocks (>=1.5.6,<1.5.7 || >1.5.7,<2.0)"]
[[package]] [[package]]
@ -1169,37 +1169,42 @@ jupyterlab-server = [
{file = "jupyterlab_server-1.2.0.tar.gz", hash = "sha256:5431d9dde96659364b7cc877693d5d21e7b80cea7ae3959ecc2b87518e5f5d8c"}, {file = "jupyterlab_server-1.2.0.tar.gz", hash = "sha256:5431d9dde96659364b7cc877693d5d21e7b80cea7ae3959ecc2b87518e5f5d8c"},
] ]
lxml = [ lxml = [
{file = "lxml-4.5.2-cp27-cp27m-macosx_10_9_x86_64.whl", hash = "sha256:74f48ec98430e06c1fa8949b49ebdd8d27ceb9df8d3d1c92e1fdc2773f003f20"}, {file = "lxml-4.6.1-cp27-cp27m-macosx_10_9_x86_64.whl", hash = "sha256:4b7572145054330c8e324a72d808c8c8fbe12be33368db28c39a255ad5f7fb51"},
{file = "lxml-4.5.2-cp27-cp27m-manylinux1_i686.whl", hash = "sha256:e70d4e467e243455492f5de463b72151cc400710ac03a0678206a5f27e79ddef"}, {file = "lxml-4.6.1-cp27-cp27m-manylinux1_i686.whl", hash = "sha256:302160eb6e9764168e01d8c9ec6becddeb87776e81d3fcb0d97954dd51d48e0a"},
{file = "lxml-4.5.2-cp27-cp27m-manylinux1_x86_64.whl", hash = "sha256:7ad7906e098ccd30d8f7068030a0b16668ab8aa5cda6fcd5146d8d20cbaa71b5"}, {file = "lxml-4.6.1-cp27-cp27m-manylinux1_x86_64.whl", hash = "sha256:d4ad7fd3269281cb471ad6c7bafca372e69789540d16e3755dd717e9e5c9d82f"},
{file = "lxml-4.5.2-cp27-cp27m-win32.whl", hash = "sha256:92282c83547a9add85ad658143c76a64a8d339028926d7dc1998ca029c88ea6a"}, {file = "lxml-4.6.1-cp27-cp27m-win32.whl", hash = "sha256:189ad47203e846a7a4951c17694d845b6ade7917c47c64b29b86526eefc3adf5"},
{file = "lxml-4.5.2-cp27-cp27m-win_amd64.whl", hash = "sha256:05a444b207901a68a6526948c7cc8f9fe6d6f24c70781488e32fd74ff5996e3f"}, {file = "lxml-4.6.1-cp27-cp27m-win_amd64.whl", hash = "sha256:56eff8c6fb7bc4bcca395fdff494c52712b7a57486e4fbde34c31bb9da4c6cc4"},
{file = "lxml-4.5.2-cp27-cp27mu-manylinux1_i686.whl", hash = "sha256:94150231f1e90c9595ccc80d7d2006c61f90a5995db82bccbca7944fd457f0f6"}, {file = "lxml-4.6.1-cp27-cp27mu-manylinux1_i686.whl", hash = "sha256:23c83112b4dada0b75789d73f949dbb4e8f29a0a3511647024a398ebd023347b"},
{file = "lxml-4.5.2-cp27-cp27mu-manylinux1_x86_64.whl", hash = "sha256:bea760a63ce9bba566c23f726d72b3c0250e2fa2569909e2d83cda1534c79443"}, {file = "lxml-4.6.1-cp27-cp27mu-manylinux1_x86_64.whl", hash = "sha256:0e89f5d422988c65e6936e4ec0fe54d6f73f3128c80eb7ecc3b87f595523607b"},
{file = "lxml-4.5.2-cp35-cp35m-manylinux1_i686.whl", hash = "sha256:c3f511a3c58676147c277eff0224c061dd5a6a8e1373572ac817ac6324f1b1e0"}, {file = "lxml-4.6.1-cp35-cp35m-manylinux1_i686.whl", hash = "sha256:2358809cc64394617f2719147a58ae26dac9e21bae772b45cfb80baa26bfca5d"},
{file = "lxml-4.5.2-cp35-cp35m-manylinux1_x86_64.whl", hash = "sha256:59daa84aef650b11bccd18f99f64bfe44b9f14a08a28259959d33676554065a1"}, {file = "lxml-4.6.1-cp35-cp35m-manylinux1_x86_64.whl", hash = "sha256:be1ebf9cc25ab5399501c9046a7dcdaa9e911802ed0e12b7d620cd4bbf0518b3"},
{file = "lxml-4.5.2-cp35-cp35m-manylinux2014_aarch64.whl", hash = "sha256:c9d317efde4bafbc1561509bfa8a23c5cab66c44d49ab5b63ff690f5159b2304"}, {file = "lxml-4.6.1-cp35-cp35m-manylinux2014_aarch64.whl", hash = "sha256:4fff34721b628cce9eb4538cf9a73d02e0f3da4f35a515773cce6f5fe413b360"},
{file = "lxml-4.5.2-cp35-cp35m-win32.whl", hash = "sha256:9dc9006dcc47e00a8a6a029eb035c8f696ad38e40a27d073a003d7d1443f5d88"}, {file = "lxml-4.6.1-cp35-cp35m-win32.whl", hash = "sha256:475325e037fdf068e0c2140b818518cf6bc4aa72435c407a798b2db9f8e90810"},
{file = "lxml-4.5.2-cp35-cp35m-win_amd64.whl", hash = "sha256:08fc93257dcfe9542c0a6883a25ba4971d78297f63d7a5a26ffa34861ca78730"}, {file = "lxml-4.6.1-cp35-cp35m-win_amd64.whl", hash = "sha256:f98b6f256be6cec8dd308a8563976ddaff0bdc18b730720f6f4bee927ffe926f"},
{file = "lxml-4.5.2-cp36-cp36m-macosx_10_9_x86_64.whl", hash = "sha256:121b665b04083a1e85ff1f5243d4a93aa1aaba281bc12ea334d5a187278ceaf1"}, {file = "lxml-4.6.1-cp36-cp36m-macosx_10_9_x86_64.whl", hash = "sha256:be7c65e34d1b50ab7093b90427cbc488260e4b3a38ef2435d65b62e9fa3d798a"},
{file = "lxml-4.5.2-cp36-cp36m-manylinux1_i686.whl", hash = "sha256:5591c4164755778e29e69b86e425880f852464a21c7bb53c7ea453bbe2633bbe"}, {file = "lxml-4.6.1-cp36-cp36m-manylinux1_i686.whl", hash = "sha256:d18331ea905a41ae71596502bd4c9a2998902328bbabd29e3d0f5f8569fabad1"},
{file = "lxml-4.5.2-cp36-cp36m-manylinux1_x86_64.whl", hash = "sha256:cc411ad324a4486b142c41d9b2b6a722c534096963688d879ea6fa8a35028258"}, {file = "lxml-4.6.1-cp36-cp36m-manylinux1_x86_64.whl", hash = "sha256:3d9b2b72eb0dbbdb0e276403873ecfae870599c83ba22cadff2db58541e72856"},
{file = "lxml-4.5.2-cp36-cp36m-manylinux2014_aarch64.whl", hash = "sha256:1fa21263c3aba2b76fd7c45713d4428dbcc7644d73dcf0650e9d344e433741b3"}, {file = "lxml-4.6.1-cp36-cp36m-manylinux2014_aarch64.whl", hash = "sha256:d20d32cbb31d731def4b1502294ca2ee99f9249b63bc80e03e67e8f8e126dea8"},
{file = "lxml-4.5.2-cp36-cp36m-win32.whl", hash = "sha256:786aad2aa20de3dbff21aab86b2fb6a7be68064cbbc0219bde414d3a30aa47ae"}, {file = "lxml-4.6.1-cp36-cp36m-win32.whl", hash = "sha256:d182eada8ea0de61a45a526aa0ae4bcd222f9673424e65315c35820291ff299c"},
{file = "lxml-4.5.2-cp36-cp36m-win_amd64.whl", hash = "sha256:e1cacf4796b20865789083252186ce9dc6cc59eca0c2e79cca332bdff24ac481"}, {file = "lxml-4.6.1-cp36-cp36m-win_amd64.whl", hash = "sha256:c0dac835c1a22621ffa5e5f999d57359c790c52bbd1c687fe514ae6924f65ef5"},
{file = "lxml-4.5.2-cp37-cp37m-macosx_10_9_x86_64.whl", hash = "sha256:80a38b188d20c0524fe8959c8ce770a8fdf0e617c6912d23fc97c68301bb9aba"}, {file = "lxml-4.6.1-cp37-cp37m-macosx_10_9_x86_64.whl", hash = "sha256:d84d741c6e35c9f3e7406cb7c4c2e08474c2a6441d59322a00dcae65aac6315d"},
{file = "lxml-4.5.2-cp37-cp37m-manylinux1_i686.whl", hash = "sha256:ecc930ae559ea8a43377e8b60ca6f8d61ac532fc57efb915d899de4a67928efd"}, {file = "lxml-4.6.1-cp37-cp37m-manylinux1_i686.whl", hash = "sha256:8862d1c2c020cb7a03b421a9a7b4fe046a208db30994fc8ff68c627a7915987f"},
{file = "lxml-4.5.2-cp37-cp37m-manylinux1_x86_64.whl", hash = "sha256:a76979f728dd845655026ab991df25d26379a1a8fc1e9e68e25c7eda43004bed"}, {file = "lxml-4.6.1-cp37-cp37m-manylinux1_x86_64.whl", hash = "sha256:3a7a380bfecc551cfd67d6e8ad9faa91289173bdf12e9cfafbd2bdec0d7b1ec1"},
{file = "lxml-4.5.2-cp37-cp37m-manylinux2014_aarch64.whl", hash = "sha256:cfd7c5dd3c35c19cec59c63df9571c67c6d6e5c92e0fe63517920e97f61106d1"}, {file = "lxml-4.6.1-cp37-cp37m-manylinux2014_aarch64.whl", hash = "sha256:2d6571c48328be4304aee031d2d5046cbc8aed5740c654575613c5a4f5a11311"},
{file = "lxml-4.5.2-cp37-cp37m-win32.whl", hash = "sha256:5a9c8d11aa2c8f8b6043d845927a51eb9102eb558e3f936df494e96393f5fd3e"}, {file = "lxml-4.6.1-cp37-cp37m-win32.whl", hash = "sha256:803a80d72d1f693aa448566be46ffd70882d1ad8fc689a2e22afe63035eb998a"},
{file = "lxml-4.5.2-cp37-cp37m-win_amd64.whl", hash = "sha256:4b4a111bcf4b9c948e020fd207f915c24a6de3f1adc7682a2d92660eb4e84f1a"}, {file = "lxml-4.6.1-cp37-cp37m-win_amd64.whl", hash = "sha256:24e811118aab6abe3ce23ff0d7d38932329c513f9cef849d3ee88b0f848f2aa9"},
{file = "lxml-4.5.2-cp38-cp38-macosx_10_9_x86_64.whl", hash = "sha256:5dd20538a60c4cc9a077d3b715bb42307239fcd25ef1ca7286775f95e9e9a46d"}, {file = "lxml-4.6.1-cp38-cp38-macosx_10_9_x86_64.whl", hash = "sha256:2e311a10f3e85250910a615fe194839a04a0f6bc4e8e5bb5cac221344e3a7891"},
{file = "lxml-4.5.2-cp38-cp38-manylinux1_i686.whl", hash = "sha256:2b30aa2bcff8e958cd85d907d5109820b01ac511eae5b460803430a7404e34d7"}, {file = "lxml-4.6.1-cp38-cp38-manylinux1_i686.whl", hash = "sha256:a71400b90b3599eb7bf241f947932e18a066907bf84617d80817998cee81e4bf"},
{file = "lxml-4.5.2-cp38-cp38-manylinux1_x86_64.whl", hash = "sha256:aa8eba3db3d8761db161003e2d0586608092e217151d7458206e243be5a43843"}, {file = "lxml-4.6.1-cp38-cp38-manylinux1_x86_64.whl", hash = "sha256:211b3bcf5da70c2d4b84d09232534ad1d78320762e2c59dedc73bf01cb1fc45b"},
{file = "lxml-4.5.2-cp38-cp38-manylinux2014_aarch64.whl", hash = "sha256:8f0ec6b9b3832e0bd1d57af41f9238ea7709bbd7271f639024f2fc9d3bb01293"}, {file = "lxml-4.6.1-cp38-cp38-manylinux2014_aarch64.whl", hash = "sha256:e65c221b2115a91035b55a593b6eb94aa1206fa3ab374f47c6dc10d364583ff9"},
{file = "lxml-4.5.2-cp38-cp38-win32.whl", hash = "sha256:107781b213cf7201ec3806555657ccda67b1fccc4261fb889ef7fc56976db81f"}, {file = "lxml-4.6.1-cp38-cp38-win32.whl", hash = "sha256:d6f8c23f65a4bfe4300b85f1f40f6c32569822d08901db3b6454ab785d9117cc"},
{file = "lxml-4.5.2-cp38-cp38-win_amd64.whl", hash = "sha256:f161af26f596131b63b236372e4ce40f3167c1b5b5d459b29d2514bd8c9dc9ee"}, {file = "lxml-4.6.1-cp38-cp38-win_amd64.whl", hash = "sha256:573b2f5496c7e9f4985de70b9bbb4719ffd293d5565513e04ac20e42e6e5583f"},
{file = "lxml-4.5.2.tar.gz", hash = "sha256:cdc13a1682b2a6241080745b1953719e7fe0850b40a5c71ca574f090a1391df6"}, {file = "lxml-4.6.1-cp39-cp39-manylinux1_i686.whl", hash = "sha256:1d87936cb5801c557f3e981c9c193861264c01209cb3ad0964a16310ca1b3301"},
{file = "lxml-4.6.1-cp39-cp39-manylinux1_x86_64.whl", hash = "sha256:2d5896ddf5389560257bbe89317ca7bcb4e54a02b53a3e572e1ce4226512b51b"},
{file = "lxml-4.6.1-cp39-cp39-manylinux2014_aarch64.whl", hash = "sha256:9b06690224258db5cd39a84e993882a6874676f5de582da57f3df3a82ead9174"},
{file = "lxml-4.6.1-cp39-cp39-win32.whl", hash = "sha256:bb252f802f91f59767dcc559744e91efa9df532240a502befd874b54571417bd"},
{file = "lxml-4.6.1-cp39-cp39-win_amd64.whl", hash = "sha256:7ecaef52fd9b9535ae5f01a1dd2651f6608e4ec9dc136fc4dfe7ebe3c3ddb230"},
{file = "lxml-4.6.1.tar.gz", hash = "sha256:c152b2e93b639d1f36ec5a8ca24cde4a8eefb2b6b83668fcd8e83a67badcb367"},
] ]
markupsafe = [ markupsafe = [
{file = "MarkupSafe-1.1.1-cp27-cp27m-macosx_10_6_intel.whl", hash = "sha256:09027a7803a62ca78792ad89403b1b7a73a01c8cb65909cd876f7fcebd79b161"}, {file = "MarkupSafe-1.1.1-cp27-cp27m-macosx_10_6_intel.whl", hash = "sha256:09027a7803a62ca78792ad89403b1b7a73a01c8cb65909cd876f7fcebd79b161"},
@ -1241,8 +1246,8 @@ mistune = [
{file = "mistune-0.8.4.tar.gz", hash = "sha256:59a3429db53c50b5c6bcc8a07f8848cb00d7dc8bdb431a4ab41920d201d4756e"}, {file = "mistune-0.8.4.tar.gz", hash = "sha256:59a3429db53c50b5c6bcc8a07f8848cb00d7dc8bdb431a4ab41920d201d4756e"},
] ]
nbclient = [ nbclient = [
{file = "nbclient-0.5.0-py3-none-any.whl", hash = "sha256:8a6e27ff581cee50895f44c41936ce02369674e85e2ad58643d8d4a6c36771b0"}, {file = "nbclient-0.5.1-py3-none-any.whl", hash = "sha256:4d6b116187c795c99b9dba13d46e764d596574b14c296d60670c8dfe454db364"},
{file = "nbclient-0.5.0.tar.gz", hash = "sha256:8ad52d27ba144fca1402db014857e53c5a864a2f407be66ca9d74c3a56d6591d"}, {file = "nbclient-0.5.1.tar.gz", hash = "sha256:01e2d726d16eaf2cde6db74a87e2451453547e8832d142f73f72fddcd4fe0250"},
] ]
nbconvert = [ nbconvert = [
{file = "nbconvert-6.0.7-py3-none-any.whl", hash = "sha256:39e9f977920b203baea0be67eea59f7b37a761caa542abe80f5897ce3cf6311d"}, {file = "nbconvert-6.0.7-py3-none-any.whl", hash = "sha256:39e9f977920b203baea0be67eea59f7b37a761caa542abe80f5897ce3cf6311d"},
@ -1467,8 +1472,8 @@ traitlets = [
{file = "traitlets-5.0.5.tar.gz", hash = "sha256:178f4ce988f69189f7e523337a3e11d91c786ded9360174a3d9ca83e79bc5396"}, {file = "traitlets-5.0.5.tar.gz", hash = "sha256:178f4ce988f69189f7e523337a3e11d91c786ded9360174a3d9ca83e79bc5396"},
] ]
urllib3 = [ urllib3 = [
{file = "urllib3-1.25.10-py2.py3-none-any.whl", hash = "sha256:e7983572181f5e1522d9c98453462384ee92a0be7fac5f1413a1e35c56cc0461"}, {file = "urllib3-1.25.11-py2.py3-none-any.whl", hash = "sha256:f5321fbe4bf3fefa0efd0bfe7fb14e90909eb62a48ccda331726b4319897dd5e"},
{file = "urllib3-1.25.10.tar.gz", hash = "sha256:91056c15fa70756691db97756772bb1eb9678fa585d9184f24534b100dc60f4a"}, {file = "urllib3-1.25.11.tar.gz", hash = "sha256:8d7eaa5a82a1cac232164990f04874c594c9453ec55eef02eab885aa02fc17a2"},
] ]
virtualenv = [ virtualenv = [
{file = "virtualenv-20.0.35-py2.py3-none-any.whl", hash = "sha256:0ebc633426d7468664067309842c81edab11ae97fcaf27e8ad7f5748c89b431b"}, {file = "virtualenv-20.0.35-py2.py3-none-any.whl", hash = "sha256:0ebc633426d7468664067309842c81edab11ae97fcaf27e8ad7f5748c89b431b"},