intro-to-python/08_mfr/04_content.ipynb

1672 lines
44 KiB
Text

{
"cells": [
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "skip"
}
},
"source": [
"**Note**: Click on \"*Kernel*\" > \"*Restart Kernel and Clear All Outputs*\" in [JupyterLab](https://jupyterlab.readthedocs.io/en/stable/) *before* reading this notebook to reset its output. If you cannot run this file on your machine, you may want to open it [in the cloud <img height=\"12\" style=\"display: inline-block\" src=\"../static/link/to_mb.png\">](https://mybinder.org/v2/gh/webartifex/intro-to-python/develop?urlpath=lab/tree/08_mfr/04_content.ipynb)."
]
},
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "slide"
}
},
"source": [
"# Chapter 8: Map, Filter, & Reduce"
]
},
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "skip"
}
},
"source": [
"Similarly to how we classify different *concrete* data types like `list` or `str` by how they behave *abstractly* in a given context in [Chapter 7 <img height=\"12\" style=\"display: inline-block\" src=\"../static/link/to_nb.png\">](https://nbviewer.jupyter.org/github/webartifex/intro-to-python/blob/develop/07_sequences/00_content.ipynb#Collections-vs.-Sequences), we also do so for the data types we have introduced in this chapter."
]
},
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "slide"
}
},
"source": [
"## Iterators vs. Iterables"
]
},
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "skip"
}
},
"source": [
"Here, the `map`, `filter`, and `generator` types all behave like \"rules\" in memory that govern how objects are produced \"on the fly.\" Their main commonality is their support for the built-in [next() <img height=\"12\" style=\"display: inline-block\" src=\"../static/link/to_py.png\">](https://docs.python.org/3/library/functions.html#next) function. In computer science terminology, such data types are called **[iterators <img height=\"12\" style=\"display: inline-block\" src=\"../static/link/to_wiki.png\">](https://en.wikipedia.org/wiki/Iterator)**, and the [collections.abc <img height=\"12\" style=\"display: inline-block\" src=\"../static/link/to_py.png\">](https://docs.python.org/3/library/collections.abc.html) module formalizes them with the `Iterator` ABC in Python.\n",
"\n",
"So, one example of an iterator is `evens_transformed` below, an object of type `generator`."
]
},
{
"cell_type": "code",
"execution_count": 1,
"metadata": {
"slideshow": {
"slide_type": "slide"
}
},
"outputs": [],
"source": [
"numbers = [7, 11, 8, 5, 3, 12, 2, 6, 9, 10, 1, 4]"
]
},
{
"cell_type": "code",
"execution_count": 2,
"metadata": {
"slideshow": {
"slide_type": "skip"
}
},
"outputs": [],
"source": [
"evens_transformed = ((x ** 2) + 1 for x in numbers if x % 2 == 0)"
]
},
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "skip"
}
},
"source": [
"Let's first confirm that `evens_transformed` is indeed an `Iterator`, \"abstractly speaking.\""
]
},
{
"cell_type": "code",
"execution_count": 3,
"metadata": {
"slideshow": {
"slide_type": "skip"
}
},
"outputs": [],
"source": [
"import collections.abc as abc"
]
},
{
"cell_type": "code",
"execution_count": 4,
"metadata": {
"slideshow": {
"slide_type": "skip"
}
},
"outputs": [
{
"data": {
"text/plain": [
"True"
]
},
"execution_count": 4,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"isinstance(evens_transformed, abc.Iterator)"
]
},
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "skip"
}
},
"source": [
"In Python, iterators are *always* also iterables. The reverse is *not* true! To be precise, iterators are *specializations* of iterables. That is what the \"Inherits from\" column means in the [collections.abc <img height=\"12\" style=\"display: inline-block\" src=\"../static/link/to_py.png\">](https://docs.python.org/3/library/collections.abc.html) module's documentation."
]
},
{
"cell_type": "code",
"execution_count": 5,
"metadata": {
"slideshow": {
"slide_type": "skip"
}
},
"outputs": [
{
"data": {
"text/plain": [
"True"
]
},
"execution_count": 5,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"isinstance(evens_transformed, abc.Iterable)"
]
},
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "skip"
}
},
"source": [
"Furthermore, we sharpen our definition of an *iterable* from [Chapter 7 <img height=\"12\" style=\"display: inline-block\" src=\"../static/link/to_nb.png\">](https://nbviewer.jupyter.org/github/webartifex/intro-to-python/blob/develop/07_sequences/00_content.ipynb#Collections-vs.-Sequences): Just as we define an *iterator* to be any object that supports the [next() <img height=\"12\" style=\"display: inline-block\" src=\"../static/link/to_py.png\">](https://docs.python.org/3/library/functions.html#next) function, we define an *iterable* to be any object that supports the built-in [iter() <img height=\"12\" style=\"display: inline-block\" src=\"../static/link/to_py.png\">](https://docs.python.org/3/library/functions.html#iter) function.\n",
"\n",
"The confused reader may now be wondering how the two concepts relate to each other.\n",
"\n",
"In short, the [iter() <img height=\"12\" style=\"display: inline-block\" src=\"../static/link/to_py.png\">](https://docs.python.org/3/library/functions.html#iter) function is the general way to create an *iterator* object out of a given *iterable* object. Then, the *iterator* object manages the iteration over the *iterable* object. In real-world code, we hardly ever see [iter() <img height=\"12\" style=\"display: inline-block\" src=\"../static/link/to_py.png\">](https://docs.python.org/3/library/functions.html#iter) as Python calls it for us in the background. \n",
"\n",
"For illustration, let's do that ourselves and create *two* iterators out of the iterable `numbers` and see what we can do with them."
]
},
{
"cell_type": "code",
"execution_count": 6,
"metadata": {
"slideshow": {
"slide_type": "fragment"
}
},
"outputs": [],
"source": [
"iterator1 = iter(numbers)"
]
},
{
"cell_type": "code",
"execution_count": 7,
"metadata": {
"slideshow": {
"slide_type": "fragment"
}
},
"outputs": [],
"source": [
"iterator2 = iter(numbers)"
]
},
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "skip"
}
},
"source": [
"`iterator1` and `iterator2` are of type `list_iterator`."
]
},
{
"cell_type": "code",
"execution_count": 8,
"metadata": {
"slideshow": {
"slide_type": "fragment"
}
},
"outputs": [
{
"data": {
"text/plain": [
"list_iterator"
]
},
"execution_count": 8,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"type(iterator1)"
]
},
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "skip"
}
},
"source": [
"*Iterators* are useful for only *one* thing: Get the next object from the associated *iterable*.\n",
"\n",
"By calling [next() <img height=\"12\" style=\"display: inline-block\" src=\"../static/link/to_py.png\">](https://docs.python.org/3/library/functions.html#next) three times with `iterator1` as the argument, we obtain the first three elements of `numbers`."
]
},
{
"cell_type": "code",
"execution_count": 9,
"metadata": {
"slideshow": {
"slide_type": "fragment"
}
},
"outputs": [
{
"data": {
"text/plain": [
"(7, 11, 8)"
]
},
"execution_count": 9,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"next(iterator1), next(iterator1), next(iterator1)"
]
},
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "skip"
}
},
"source": [
"`iterator1` and `iterator2` keep their *states* separate. So, we could loop over the same *iterable* several times in parallel."
]
},
{
"cell_type": "code",
"execution_count": 10,
"metadata": {
"slideshow": {
"slide_type": "fragment"
}
},
"outputs": [
{
"data": {
"text/plain": [
"(5, 7)"
]
},
"execution_count": 10,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"next(iterator1), next(iterator2)"
]
},
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "skip"
}
},
"source": [
"We can also play a \"trick\" and exchange some elements in `numbers`. `iterator1` and `iterator2` do *not* see these changes and present us with the new elements. So, *iterators* not only have state on their own but also keep this separate from the underlying *iterable*."
]
},
{
"cell_type": "code",
"execution_count": 11,
"metadata": {
"slideshow": {
"slide_type": "skip"
}
},
"outputs": [],
"source": [
"numbers[1], numbers[4] = 99, 99"
]
},
{
"cell_type": "code",
"execution_count": 12,
"metadata": {
"slideshow": {
"slide_type": "skip"
}
},
"outputs": [
{
"data": {
"text/plain": [
"(99, 99)"
]
},
"execution_count": 12,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"next(iterator1), next(iterator2)"
]
},
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "skip"
}
},
"source": [
"Let's re-assign the elements in `numbers` so that they are in order. After that, the numbers returned from [next() <img height=\"12\" style=\"display: inline-block\" src=\"../static/link/to_py.png\">](https://docs.python.org/3/library/functions.html#next) also tell us how often [next() <img height=\"12\" style=\"display: inline-block\" src=\"../static/link/to_py.png\">](https://docs.python.org/3/library/functions.html#next) was called with `iterator1` or `iterator2`. We conclude that `list_iterator` objects work by simply keeping track of the *last* index obtained from the underlying *iterable*."
]
},
{
"cell_type": "code",
"execution_count": 13,
"metadata": {
"slideshow": {
"slide_type": "skip"
}
},
"outputs": [],
"source": [
"numbers[:] = list(range(1, 13))"
]
},
{
"cell_type": "code",
"execution_count": 14,
"metadata": {
"slideshow": {
"slide_type": "skip"
}
},
"outputs": [
{
"data": {
"text/plain": [
"[1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12]"
]
},
"execution_count": 14,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"numbers"
]
},
{
"cell_type": "code",
"execution_count": 15,
"metadata": {
"slideshow": {
"slide_type": "skip"
}
},
"outputs": [
{
"data": {
"text/plain": [
"(6, 3)"
]
},
"execution_count": 15,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"next(iterator1), next(iterator2)"
]
},
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "skip"
}
},
"source": [
"With the concepts introduced in this section, we can now understand the first sentence in the documentation on the [zip() <img height=\"12\" style=\"display: inline-block\" src=\"../static/link/to_py.png\">](https://docs.python.org/3/library/functions.html#zip) built-in better: \"Make an *iterator* that aggregates elements from each of the *iterables*.\"\n",
"\n",
"Because *iterators* are always also *iterables*, we may pass `iterator1` and `iterator2` as arguments to [zip() <img height=\"12\" style=\"display: inline-block\" src=\"../static/link/to_py.png\">](https://docs.python.org/3/library/functions.html#zip).\n",
"\n",
"The returned `zipper` object is of type `zip` and, \"abstractly speaking,\" an `Iterator` as well."
]
},
{
"cell_type": "code",
"execution_count": 16,
"metadata": {
"slideshow": {
"slide_type": "slide"
}
},
"outputs": [],
"source": [
"zipper = zip(iterator1, iterator2)"
]
},
{
"cell_type": "code",
"execution_count": 17,
"metadata": {
"slideshow": {
"slide_type": "fragment"
}
},
"outputs": [
{
"data": {
"text/plain": [
"<zip at 0x7f2788118b00>"
]
},
"execution_count": 17,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"zipper"
]
},
{
"cell_type": "code",
"execution_count": 18,
"metadata": {
"slideshow": {
"slide_type": "fragment"
}
},
"outputs": [
{
"data": {
"text/plain": [
"zip"
]
},
"execution_count": 18,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"type(zipper)"
]
},
{
"cell_type": "code",
"execution_count": 19,
"metadata": {
"slideshow": {
"slide_type": "skip"
}
},
"outputs": [
{
"data": {
"text/plain": [
"True"
]
},
"execution_count": 19,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"isinstance(zipper, abc.Iterator)"
]
},
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "skip"
}
},
"source": [
"So far, we have always used [zip() <img height=\"12\" style=\"display: inline-block\" src=\"../static/link/to_py.png\">](https://docs.python.org/3/library/functions.html#zip) in a `for`-loop. That was our earlier definition of an *iterable*. Our revised definition in this section states that an *iterable* is an object that supports the [iter() <img height=\"12\" style=\"display: inline-block\" src=\"../static/link/to_py.png\">](https://docs.python.org/3/library/functions.html#iter) function. So, let's see what happens if we pass `zipper` to [iter() <img height=\"12\" style=\"display: inline-block\" src=\"../static/link/to_py.png\">](https://docs.python.org/3/library/functions.html#iter)."
]
},
{
"cell_type": "code",
"execution_count": 20,
"metadata": {
"slideshow": {
"slide_type": "skip"
}
},
"outputs": [],
"source": [
"zipper_iterator = iter(zipper)"
]
},
{
"cell_type": "code",
"execution_count": 21,
"metadata": {
"slideshow": {
"slide_type": "skip"
}
},
"outputs": [
{
"data": {
"text/plain": [
"<zip at 0x7f2788118b00>"
]
},
"execution_count": 21,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"zipper_iterator"
]
},
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "skip"
}
},
"source": [
"`zipper_iterator` references the *same* object as `zipper`! That is true for *iterators* in general: Any *iterator* created from an existing *iterator* with [iter() <img height=\"12\" style=\"display: inline-block\" src=\"../static/link/to_py.png\">](https://docs.python.org/3/library/functions.html#iter) is the *iterator* itself."
]
},
{
"cell_type": "code",
"execution_count": 22,
"metadata": {
"slideshow": {
"slide_type": "skip"
}
},
"outputs": [
{
"data": {
"text/plain": [
"True"
]
},
"execution_count": 22,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"zipper is zipper_iterator"
]
},
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "skip"
}
},
"source": [
"The Python core developers made that design decision so that *iterators* may also be looped over.\n",
"\n",
"The `for`-loop below prints out only *six* more `tuple` objects derived from the now ordered `numbers` because the `iterator1` object hidden inside `zipper` already returned its first *six* elements. So, the respective first elements of the `tuple` objects printed range from `7` to `12`. Similarly, as `iterator2` already returned its first *three* elements from `numbers`, we see the respective second elements in the range from `4` to `9`."
]
},
{
"cell_type": "code",
"execution_count": 23,
"metadata": {
"slideshow": {
"slide_type": "fragment"
}
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"7 and 4 8 and 5 9 and 6 10 and 7 11 and 8 12 and 9 "
]
}
],
"source": [
"for x, y in zipper:\n",
" print(x, \"and\", y, end=\" \")"
]
},
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "skip"
}
},
"source": [
"`zipper` is now *exhausted*. So, the `for`-loop below does *not* make any iteration at all."
]
},
{
"cell_type": "code",
"execution_count": 24,
"metadata": {
"slideshow": {
"slide_type": "skip"
}
},
"outputs": [],
"source": [
"for x, y in zipper:\n",
" raise RuntimeError(\"We won't see this error\")"
]
},
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "skip"
}
},
"source": [
"We verify that `iterator1` is exhausted by passing it to [next() <img height=\"12\" style=\"display: inline-block\" src=\"../static/link/to_py.png\">](https://docs.python.org/3/library/functions.html#next) again, which raises a `StopIteration` exception."
]
},
{
"cell_type": "code",
"execution_count": 25,
"metadata": {
"slideshow": {
"slide_type": "slide"
}
},
"outputs": [
{
"ename": "StopIteration",
"evalue": "",
"output_type": "error",
"traceback": [
"\u001b[0;31m---------------------------------------------------------------------------\u001b[0m",
"\u001b[0;31mStopIteration\u001b[0m Traceback (most recent call last)",
"Cell \u001b[0;32mIn[25], line 1\u001b[0m\n\u001b[0;32m----> 1\u001b[0m \u001b[38;5;28;43mnext\u001b[39;49m\u001b[43m(\u001b[49m\u001b[43miterator1\u001b[49m\u001b[43m)\u001b[49m\n",
"\u001b[0;31mStopIteration\u001b[0m: "
]
}
],
"source": [
"next(iterator1)"
]
},
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "skip"
}
},
"source": [
"On the contrary, `iterator2` is *not* yet exhausted."
]
},
{
"cell_type": "code",
"execution_count": 26,
"metadata": {
"slideshow": {
"slide_type": "fragment"
}
},
"outputs": [
{
"data": {
"text/plain": [
"10"
]
},
"execution_count": 26,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"next(iterator2)"
]
},
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "skip"
}
},
"source": [
"Understanding *iterators* and *iterables* is helpful for any data science practitioner that deals with large amounts of data. Even without that, these two terms occur everywhere in Python-related texts and documentation. So, a beginner should regularly review this section until it becomes second nature."
]
},
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "slide"
}
},
"source": [
"## The `for` Statement (revisited)"
]
},
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "skip"
}
},
"source": [
"In [Chapter 4 <img height=\"12\" style=\"display: inline-block\" src=\"../static/link/to_nb.png\">](https://nbviewer.jupyter.org/github/webartifex/intro-to-python/blob/develop/04_iteration/02_content.ipynb#The-for-Statement), we argue that the `for` statement is syntactic sugar, replacing the `while` statement in many common scenarios. In particular, a `for`-loop saves us two tasks: Managing an index variable *and* obtaining the individual elements by indexing. In this sub-section, we look at a more realistic picture, using the new terminology as well.\n",
"\n",
"Let's print out the elements of a `list` object as the *iterable* being looped over."
]
},
{
"cell_type": "code",
"execution_count": 27,
"metadata": {
"slideshow": {
"slide_type": "slide"
}
},
"outputs": [],
"source": [
"iterable = [0, 1, 2, 3, 4]"
]
},
{
"cell_type": "code",
"execution_count": 28,
"metadata": {
"slideshow": {
"slide_type": "fragment"
}
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"0 1 2 3 4 "
]
}
],
"source": [
"for element in iterable:\n",
" print(element, end=\" \")"
]
},
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "skip"
}
},
"source": [
"Our previous and equivalent formulation with a `while` statement is like so."
]
},
{
"cell_type": "code",
"execution_count": 29,
"metadata": {
"slideshow": {
"slide_type": "skip"
}
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"0 1 2 3 4 "
]
}
],
"source": [
"index = 0\n",
"\n",
"while index < len(iterable):\n",
" element = iterable[index]\n",
" print(element, end=\" \")\n",
" index += 1\n",
"\n",
"del index"
]
},
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "skip"
}
},
"source": [
"What actually happens behind the scenes in the Python interpreter is shown below.\n",
"\n",
"First, Python calls [iter() <img height=\"12\" style=\"display: inline-block\" src=\"../static/link/to_py.png\">](https://docs.python.org/3/library/functions.html#iter) with the `iterable` to be looped over as the argument. The returned `iterator` contains the entire logic of how the `iterable` is looped over. In particular, the `iterator` may or may not pick the `iterable`'s elements in a predictable order. That is up to the \"rule\" it models.\n",
"\n",
"Second, Python enters an *indefinite* `while`-loop. It tries to obtain the next element with [next() <img height=\"12\" style=\"display: inline-block\" src=\"../static/link/to_py.png\">](https://docs.python.org/3/library/functions.html#next). If that succeeds, the `for`-loop's code block is executed. Below, that code is placed within the `else`-clause that runs only if *no* exception is raised in the `try`-clause. Then, Python jumps into the next iteration and tries to obtain the next element from the `iterator`, and so on. Once the `iterator` is exhausted, it raises a `StopIteration` exception, and Python stops the `while`-loop with the `break` statement."
]
},
{
"cell_type": "code",
"execution_count": 30,
"metadata": {
"slideshow": {
"slide_type": "fragment"
}
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"0 1 2 3 4 "
]
}
],
"source": [
"iterator = iter(iterable)\n",
"\n",
"while True:\n",
" try:\n",
" element = next(iterator)\n",
" except StopIteration:\n",
" break\n",
" else:\n",
" print(element, end=\" \")\n",
"\n",
"del iterator"
]
},
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "slide"
}
},
"source": [
"## sorted() vs. reversed()"
]
},
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "skip"
}
},
"source": [
"Now that we know the concept of an *iterator*, let's compare some of the built-ins introduced in [Chapter 7 <img height=\"12\" style=\"display: inline-block\" src=\"../static/link/to_nb.png\">](https://nbviewer.jupyter.org/github/webartifex/intro-to-python/blob/develop/07_sequences/00_content.ipynb) in detail and make sure we understand what is going on in memory. This section also serves as a summary of all the concepts in this chapter.\n",
"\n",
"We use two simple examples, `numbers` and `memoryless`: `numbers` creates *thirteen* objects in memory and `memoryless` only *one* (cf., [PythonTutor <img height=\"12\" style=\"display: inline-block\" src=\"../static/link/to_py.png\">](http://pythontutor.com/visualize.html#code=numbers%20%3D%20%5B7,%2011,%208,%205,%203,%2012,%202,%206,%209,%2010,%201,%204%5D%0Amemoryless%20%3D%20range%281,%2013%29&cumulative=false&curstr=2&heapPrimitives=nevernest&mode=display&origin=opt-frontend.js&py=3&rawInputLstJSON=%5B%5D&textReferences=false))."
]
},
{
"cell_type": "code",
"execution_count": 31,
"metadata": {
"slideshow": {
"slide_type": "slide"
}
},
"outputs": [],
"source": [
"numbers = [7, 11, 8, 5, 3, 12, 2, 6, 9, 10, 1, 4]"
]
},
{
"cell_type": "code",
"execution_count": 32,
"metadata": {
"slideshow": {
"slide_type": "fragment"
}
},
"outputs": [],
"source": [
"memoryless = range(1, 13)"
]
},
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "skip"
}
},
"source": [
"The [sorted() <img height=\"12\" style=\"display: inline-block\" src=\"../static/link/to_py.png\">](https://docs.python.org/3/library/functions.html#sorted) function takes a *finite* `iterable` argument and *materializes* its elements into a *new* `list` object that is returned.\n",
"\n",
"The argument may already be materialized, as is the case with `numbers`, but may also be an *iterable* without any objects in it, such as `memoryless`. In both cases, we end up with materialized `list` objects with the elements sorted in *forward* order (cf., [PythonTutor <img height=\"12\" style=\"display: inline-block\" src=\"../static/link/to_py.png\">](http://pythontutor.com/visualize.html#code=numbers%20%3D%20%5B7,%2011,%208,%205,%203,%2012,%202,%206,%209,%2010,%201,%204%5D%0Amemoryless%20%3D%20range%281,%2013%29%0Aresult1%20%3D%20sorted%28numbers%29%0Aresult2%20%3D%20sorted%28memoryless%29&cumulative=false&curstr=4&heapPrimitives=nevernest&mode=display&origin=opt-frontend.js&py=3&rawInputLstJSON=%5B%5D&textReferences=false))."
]
},
{
"cell_type": "code",
"execution_count": 33,
"metadata": {
"slideshow": {
"slide_type": "slide"
}
},
"outputs": [
{
"data": {
"text/plain": [
"[1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12]"
]
},
"execution_count": 33,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"sorted(numbers)"
]
},
{
"cell_type": "code",
"execution_count": 34,
"metadata": {
"slideshow": {
"slide_type": "fragment"
}
},
"outputs": [
{
"data": {
"text/plain": [
"[1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12]"
]
},
"execution_count": 34,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"sorted(memoryless)"
]
},
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "skip"
}
},
"source": [
"By adding a keyword-only argument `reverse=True`, the materialized `list` objects are sorted in *reverse* order (cf., [PythonTutor <img height=\"12\" style=\"display: inline-block\" src=\"../static/link/to_py.png\">](http://pythontutor.com/visualize.html#code=numbers%20%3D%20%5B7,%2011,%208,%205,%203,%2012,%202,%206,%209,%2010,%201,%204%5D%0Amemoryless%20%3D%20range%281,%2013%29%0Aresult1%20%3D%20sorted%28numbers,%20reverse%3DTrue%29%0Aresult2%20%3D%20sorted%28memoryless,%20reverse%3DTrue%29&cumulative=false&curstr=4&heapPrimitives=nevernest&mode=display&origin=opt-frontend.js&py=3&rawInputLstJSON=%5B%5D&textReferences=false))."
]
},
{
"cell_type": "code",
"execution_count": 35,
"metadata": {
"slideshow": {
"slide_type": "skip"
}
},
"outputs": [
{
"data": {
"text/plain": [
"[12, 11, 10, 9, 8, 7, 6, 5, 4, 3, 2, 1]"
]
},
"execution_count": 35,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"sorted(numbers, reverse=True)"
]
},
{
"cell_type": "code",
"execution_count": 36,
"metadata": {
"slideshow": {
"slide_type": "skip"
}
},
"outputs": [
{
"data": {
"text/plain": [
"[12, 11, 10, 9, 8, 7, 6, 5, 4, 3, 2, 1]"
]
},
"execution_count": 36,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"sorted(memoryless, reverse=True)"
]
},
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "skip"
}
},
"source": [
"The order in `numbers` remains *unchanged*, and `memoryless` is still *not* materialized."
]
},
{
"cell_type": "code",
"execution_count": 37,
"metadata": {
"slideshow": {
"slide_type": "skip"
}
},
"outputs": [
{
"data": {
"text/plain": [
"[7, 11, 8, 5, 3, 12, 2, 6, 9, 10, 1, 4]"
]
},
"execution_count": 37,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"numbers"
]
},
{
"cell_type": "code",
"execution_count": 38,
"metadata": {
"slideshow": {
"slide_type": "skip"
}
},
"outputs": [
{
"data": {
"text/plain": [
"range(1, 13)"
]
},
"execution_count": 38,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"memoryless"
]
},
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "skip"
}
},
"source": [
"The [reversed() <img height=\"12\" style=\"display: inline-block\" src=\"../static/link/to_py.png\">](https://docs.python.org/3/library/functions.html#reversed) built-in takes a `sequence` argument and returns an *iterator*. The argument must be *finite* and *reversible* (i.e., *iterable* in *reverse* order) as otherwise [reversed() <img height=\"12\" style=\"display: inline-block\" src=\"../static/link/to_py.png\">](https://docs.python.org/3/library/functions.html#reversed) could neither determine the last element that becomes the first nor loop in a *predictable* backward fashion. [PythonTutor <img height=\"12\" style=\"display: inline-block\" src=\"../static/link/to_py.png\">](http://pythontutor.com/visualize.html#code=numbers%20%3D%20%5B7,%2011,%208,%205,%203,%2012,%202,%206,%209,%2010,%201,%204%5D%0Amemoryless%20%3D%20range%281,%2013%29%0Aiterator1%20%3D%20reversed%28numbers%29%0Aiterator2%20%3D%20reversed%28memoryless%29&cumulative=false&curstr=4&heapPrimitives=nevernest&mode=display&origin=opt-frontend.js&py=3&rawInputLstJSON=%5B%5D&textReferences=false) confirms that [reversed() <img height=\"12\" style=\"display: inline-block\" src=\"../static/link/to_py.png\">](https://docs.python.org/3/library/functions.html#reversed) does *not* materialize any elements but only returns an *iterator*.\n",
"\n",
"**Side Note**: Even though `range` objects, like `memoryless` here, do *not* \"contain\" references to other objects, they count as *sequence* types, and as such, they are also *container* types. The `in` operator works with `range` objects because we can always cast the object to be checked as an `int` and check if that lies within the `range` object's `start` and `stop` values, taking a potential `step` value into account (cf., this [blog post](https://treyhunner.com/2018/02/python-range-is-not-an-iterator/) for more details on the [range() <img height=\"12\" style=\"display: inline-block\" src=\"../static/link/to_py.png\">](https://docs.python.org/3/library/functions.html#func-range) built-in)."
]
},
{
"cell_type": "code",
"execution_count": 39,
"metadata": {
"slideshow": {
"slide_type": "slide"
}
},
"outputs": [
{
"data": {
"text/plain": [
"<list_reverseiterator at 0x7f2788158880>"
]
},
"execution_count": 39,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"reversed(numbers)"
]
},
{
"cell_type": "code",
"execution_count": 40,
"metadata": {
"slideshow": {
"slide_type": "fragment"
}
},
"outputs": [
{
"data": {
"text/plain": [
"<range_iterator at 0x7f2788158fc0>"
]
},
"execution_count": 40,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"reversed(memoryless)"
]
},
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "skip"
}
},
"source": [
"To materialize the elements, we can pass the returned *iterators* to, for example, the [list() <img height=\"12\" style=\"display: inline-block\" src=\"../static/link/to_py.png\">](https://docs.python.org/3/library/functions.html#func-list) or [tuple() <img height=\"12\" style=\"display: inline-block\" src=\"../static/link/to_py.png\">](https://docs.python.org/3/library/functions.html#func-tuple) constructors. That creates *new* `list` and `tuple` objects (cf., [PythonTutor <img height=\"12\" style=\"display: inline-block\" src=\"../static/link/to_py.png\">](http://pythontutor.com/visualize.html#code=numbers%20%3D%20%5B7,%2011,%208,%205,%203,%2012,%202,%206,%209,%2010,%201,%204%5D%0Amemoryless%20%3D%20range%281,%2013%29%0Aresult1%20%3D%20list%28reversed%28numbers%29%29%0Aresult2%20%3D%20tuple%28reversed%28memoryless%29%29&cumulative=false&curstr=4&heapPrimitives=nevernest&mode=display&origin=opt-frontend.js&py=3&rawInputLstJSON=%5B%5D&textReferences=false)).\n",
"\n",
"To reiterate some more new terminology from this chapter, we describe [reversed() <img height=\"12\" style=\"display: inline-block\" src=\"../static/link/to_py.png\">](https://docs.python.org/3/library/functions.html#reversed) as *lazy* whereas [list() <img height=\"12\" style=\"display: inline-block\" src=\"../static/link/to_py.png\">](https://docs.python.org/3/library/functions.html#func-list) and [tuple() <img height=\"12\" style=\"display: inline-block\" src=\"../static/link/to_py.png\">](https://docs.python.org/3/library/functions.html#func-tuple) are *eager*. The former has no significant side effect in memory, while the latter may require a lot of memory."
]
},
{
"cell_type": "code",
"execution_count": 41,
"metadata": {
"slideshow": {
"slide_type": "fragment"
}
},
"outputs": [
{
"data": {
"text/plain": [
"[4, 1, 10, 9, 6, 2, 12, 3, 5, 8, 11, 7]"
]
},
"execution_count": 41,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"list(reversed(numbers))"
]
},
{
"cell_type": "code",
"execution_count": 42,
"metadata": {
"slideshow": {
"slide_type": "fragment"
}
},
"outputs": [
{
"data": {
"text/plain": [
"(12, 11, 10, 9, 8, 7, 6, 5, 4, 3, 2, 1)"
]
},
"execution_count": 42,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"tuple(reversed(memoryless))"
]
},
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "skip"
}
},
"source": [
"Of course, we can also loop over the returned *iterators* instead.\n",
"\n",
"That works because *iterators* are always *iterable*; in particular, as the previous \"*The for Statement (revisited)*\" sub-section explains, the `for`-loops below call `iter(reversed(numbers))` and `iter(reversed(memoryless))` behind the scenes. However, the *iterators* returned by [iter() <img height=\"12\" style=\"display: inline-block\" src=\"../static/link/to_py.png\">](https://docs.python.org/3/library/functions.html#iter) are the *same* as the `reversed(numbers)` and `reversed(memoryless)` iterators passed in! In summary, the `for`-loops below involve many subtleties that together make Python the expressive language it is."
]
},
{
"cell_type": "code",
"execution_count": 43,
"metadata": {
"slideshow": {
"slide_type": "slide"
}
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"4 1 10 9 6 2 12 3 5 8 11 7 "
]
}
],
"source": [
"for number in reversed(numbers):\n",
" print(number, end=\" \")"
]
},
{
"cell_type": "code",
"execution_count": 44,
"metadata": {
"slideshow": {
"slide_type": "fragment"
}
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"12 11 10 9 8 7 6 5 4 3 2 1 "
]
}
],
"source": [
"for element in reversed(memoryless):\n",
" print(element, end=\" \")"
]
},
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "skip"
}
},
"source": [
"As with [sorted() <img height=\"12\" style=\"display: inline-block\" src=\"../static/link/to_py.png\">](https://docs.python.org/3/library/functions.html#sorted), the [reversed() <img height=\"12\" style=\"display: inline-block\" src=\"../static/link/to_py.png\">](https://docs.python.org/3/library/functions.html#reversed) built-in does *not* mutate its argument."
]
},
{
"cell_type": "code",
"execution_count": 45,
"metadata": {
"slideshow": {
"slide_type": "slide"
}
},
"outputs": [
{
"data": {
"text/plain": [
"[7, 11, 8, 5, 3, 12, 2, 6, 9, 10, 1, 4]"
]
},
"execution_count": 45,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"numbers"
]
},
{
"cell_type": "code",
"execution_count": 46,
"metadata": {
"slideshow": {
"slide_type": "fragment"
}
},
"outputs": [
{
"data": {
"text/plain": [
"range(1, 13)"
]
},
"execution_count": 46,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"memoryless"
]
},
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "skip"
}
},
"source": [
"To point out the potentially obvious, we compare the results of *sorting* `numbers` in *reverse* order with *reversing* it: These are *different* concepts!"
]
},
{
"cell_type": "code",
"execution_count": 47,
"metadata": {
"slideshow": {
"slide_type": "skip"
}
},
"outputs": [
{
"data": {
"text/plain": [
"[7, 11, 8, 5, 3, 12, 2, 6, 9, 10, 1, 4]"
]
},
"execution_count": 47,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"numbers"
]
},
{
"cell_type": "code",
"execution_count": 48,
"metadata": {
"slideshow": {
"slide_type": "skip"
}
},
"outputs": [
{
"data": {
"text/plain": [
"[12, 11, 10, 9, 8, 7, 6, 5, 4, 3, 2, 1]"
]
},
"execution_count": 48,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"sorted(numbers, reverse=True)"
]
},
{
"cell_type": "code",
"execution_count": 49,
"metadata": {
"slideshow": {
"slide_type": "skip"
}
},
"outputs": [
{
"data": {
"text/plain": [
"[4, 1, 10, 9, 6, 2, 12, 3, 5, 8, 11, 7]"
]
},
"execution_count": 49,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"list(reversed(numbers))"
]
},
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "skip"
}
},
"source": [
"Whereas both [sorted() <img height=\"12\" style=\"display: inline-block\" src=\"../static/link/to_py.png\">](https://docs.python.org/3/library/functions.html#sorted) and [reversed() <img height=\"12\" style=\"display: inline-block\" src=\"../static/link/to_py.png\">](https://docs.python.org/3/library/functions.html#reversed) do *not* mutate their arguments, the *mutable* `list` type comes with two methods, [sort() <img height=\"12\" style=\"display: inline-block\" src=\"../static/link/to_py.png\">](https://docs.python.org/3/library/stdtypes.html#list.sort) and `reverse()`, that implement the same logic but mutate an object, like `numbers` below, *in place*. To indicate that all changes occur *in place*, the [sort() <img height=\"12\" style=\"display: inline-block\" src=\"../static/link/to_py.png\">](https://docs.python.org/3/library/stdtypes.html#list.sort) and `reverse()` methods always return `None`, which is not shown in JupyterLab."
]
},
{
"cell_type": "code",
"execution_count": 50,
"metadata": {
"slideshow": {
"slide_type": "skip"
}
},
"outputs": [
{
"data": {
"text/plain": [
"[7, 11, 8, 5, 3, 12, 2, 6, 9, 10, 1, 4]"
]
},
"execution_count": 50,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"numbers"
]
},
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "skip"
}
},
"source": [
"The `reverse()` method on the `list` type is *eager*, as opposed to the *lazy* [reversed() <img height=\"12\" style=\"display: inline-block\" src=\"../static/link/to_py.png\">](https://docs.python.org/3/library/functions.html#reversed) built-in. That means the mutations caused by the `reverse()` method are written into memory right away."
]
},
{
"cell_type": "code",
"execution_count": 51,
"metadata": {
"slideshow": {
"slide_type": "skip"
}
},
"outputs": [],
"source": [
"numbers.reverse()"
]
},
{
"cell_type": "code",
"execution_count": 52,
"metadata": {
"slideshow": {
"slide_type": "skip"
}
},
"outputs": [
{
"data": {
"text/plain": [
"[4, 1, 10, 9, 6, 2, 12, 3, 5, 8, 11, 7]"
]
},
"execution_count": 52,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"numbers"
]
},
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "skip"
}
},
"source": [
"*Sorting* `numbers` in place occurs eagerly."
]
},
{
"cell_type": "code",
"execution_count": 53,
"metadata": {
"slideshow": {
"slide_type": "skip"
}
},
"outputs": [],
"source": [
"numbers.sort(reverse=True)"
]
},
{
"cell_type": "code",
"execution_count": 54,
"metadata": {
"slideshow": {
"slide_type": "skip"
}
},
"outputs": [
{
"data": {
"text/plain": [
"[12, 11, 10, 9, 8, 7, 6, 5, 4, 3, 2, 1]"
]
},
"execution_count": 54,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"numbers"
]
},
{
"cell_type": "code",
"execution_count": 55,
"metadata": {
"slideshow": {
"slide_type": "skip"
}
},
"outputs": [],
"source": [
"numbers.sort()"
]
},
{
"cell_type": "code",
"execution_count": 56,
"metadata": {
"slideshow": {
"slide_type": "skip"
}
},
"outputs": [
{
"data": {
"text/plain": [
"[1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12]"
]
},
"execution_count": 56,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"numbers"
]
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.12.2"
},
"livereveal": {
"auto_select": "code",
"auto_select_fragment": true,
"scroll": true,
"theme": "serif"
},
"toc": {
"base_numbering": 1,
"nav_menu": {},
"number_sections": false,
"sideBar": true,
"skip_h1_title": true,
"title_cell": "Table of Contents",
"title_sidebar": "Contents",
"toc_cell": false,
"toc_position": {
"height": "calc(100% - 180px)",
"left": "10px",
"top": "150px",
"width": "384px"
},
"toc_section_display": false,
"toc_window_display": false
}
},
"nbformat": 4,
"nbformat_minor": 4
}