{ "cells": [ { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "skip" } }, "source": [ "**Note**: Click on \"*Kernel*\" > \"*Restart Kernel and Clear All Outputs*\" in [JupyterLab](https://jupyterlab.readthedocs.io/en/stable/) *before* reading this notebook to reset its output. If you cannot run this file on your machine, you may want to open it [in the cloud ](https://mybinder.org/v2/gh/webartifex/intro-to-python/main?urlpath=lab/tree/08_mfr/04_content.ipynb)." ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "slide" } }, "source": [ "# Chapter 8: Map, Filter, & Reduce" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "skip" } }, "source": [ "Similarly to how we classify different *concrete* data types like `list` or `str` by how they behave *abstractly* in a given context in [Chapter 7 ](https://nbviewer.jupyter.org/github/webartifex/intro-to-python/blob/main/07_sequences/00_content.ipynb#Collections-vs.-Sequences), we also do so for the data types we have introduced in this chapter." ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "slide" } }, "source": [ "## Iterators vs. Iterables" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "skip" } }, "source": [ "Here, the `map`, `filter`, and `generator` types all behave like \"rules\" in memory that govern how objects are produced \"on the fly.\" Their main commonality is their support for the built-in [next() ](https://docs.python.org/3/library/functions.html#next) function. In computer science terminology, such data types are called **[iterators ](https://en.wikipedia.org/wiki/Iterator)**, and the [collections.abc ](https://docs.python.org/3/library/collections.abc.html) module formalizes them with the `Iterator` ABC in Python.\n", "\n", "So, one example of an iterator is `evens_transformed` below, an object of type `generator`." ] }, { "cell_type": "code", "execution_count": 1, "metadata": { "slideshow": { "slide_type": "slide" } }, "outputs": [], "source": [ "numbers = [7, 11, 8, 5, 3, 12, 2, 6, 9, 10, 1, 4]" ] }, { "cell_type": "code", "execution_count": 2, "metadata": { "slideshow": { "slide_type": "skip" } }, "outputs": [], "source": [ "evens_transformed = ((x ** 2) + 1 for x in numbers if x % 2 == 0)" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "skip" } }, "source": [ "Let's first confirm that `evens_transformed` is indeed an `Iterator`, \"abstractly speaking.\"" ] }, { "cell_type": "code", "execution_count": 3, "metadata": { "slideshow": { "slide_type": "skip" } }, "outputs": [], "source": [ "import collections.abc as abc" ] }, { "cell_type": "code", "execution_count": 4, "metadata": { "slideshow": { "slide_type": "skip" } }, "outputs": [ { "data": { "text/plain": [ "True" ] }, "execution_count": 4, "metadata": {}, "output_type": "execute_result" } ], "source": [ "isinstance(evens_transformed, abc.Iterator)" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "skip" } }, "source": [ "In Python, iterators are *always* also iterables. The reverse is *not* true! To be precise, iterators are *specializations* of iterables. That is what the \"Inherits from\" column means in the [collections.abc ](https://docs.python.org/3/library/collections.abc.html) module's documentation." ] }, { "cell_type": "code", "execution_count": 5, "metadata": { "slideshow": { "slide_type": "skip" } }, "outputs": [ { "data": { "text/plain": [ "True" ] }, "execution_count": 5, "metadata": {}, "output_type": "execute_result" } ], "source": [ "isinstance(evens_transformed, abc.Iterable)" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "skip" } }, "source": [ "Furthermore, we sharpen our definition of an *iterable* from [Chapter 7 ](https://nbviewer.jupyter.org/github/webartifex/intro-to-python/blob/main/07_sequences/00_content.ipynb#Collections-vs.-Sequences): Just as we define an *iterator* to be any object that supports the [next() ](https://docs.python.org/3/library/functions.html#next) function, we define an *iterable* to be any object that supports the built-in [iter() ](https://docs.python.org/3/library/functions.html#iter) function.\n", "\n", "The confused reader may now be wondering how the two concepts relate to each other.\n", "\n", "In short, the [iter() ](https://docs.python.org/3/library/functions.html#iter) function is the general way to create an *iterator* object out of a given *iterable* object. Then, the *iterator* object manages the iteration over the *iterable* object. In real-world code, we hardly ever see [iter() ](https://docs.python.org/3/library/functions.html#iter) as Python calls it for us in the background. \n", "\n", "For illustration, let's do that ourselves and create *two* iterators out of the iterable `numbers` and see what we can do with them." ] }, { "cell_type": "code", "execution_count": 6, "metadata": { "slideshow": { "slide_type": "fragment" } }, "outputs": [], "source": [ "iterator1 = iter(numbers)" ] }, { "cell_type": "code", "execution_count": 7, "metadata": { "slideshow": { "slide_type": "fragment" } }, "outputs": [], "source": [ "iterator2 = iter(numbers)" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "skip" } }, "source": [ "`iterator1` and `iterator2` are of type `list_iterator`." ] }, { "cell_type": "code", "execution_count": 8, "metadata": { "slideshow": { "slide_type": "fragment" } }, "outputs": [ { "data": { "text/plain": [ "list_iterator" ] }, "execution_count": 8, "metadata": {}, "output_type": "execute_result" } ], "source": [ "type(iterator1)" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "skip" } }, "source": [ "*Iterators* are useful for only *one* thing: Get the next object from the associated *iterable*.\n", "\n", "By calling [next() ](https://docs.python.org/3/library/functions.html#next) three times with `iterator1` as the argument, we obtain the first three elements of `numbers`." ] }, { "cell_type": "code", "execution_count": 9, "metadata": { "slideshow": { "slide_type": "fragment" } }, "outputs": [ { "data": { "text/plain": [ "(7, 11, 8)" ] }, "execution_count": 9, "metadata": {}, "output_type": "execute_result" } ], "source": [ "next(iterator1), next(iterator1), next(iterator1)" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "skip" } }, "source": [ "`iterator1` and `iterator2` keep their *states* separate. So, we could loop over the same *iterable* several times in parallel." ] }, { "cell_type": "code", "execution_count": 10, "metadata": { "slideshow": { "slide_type": "fragment" } }, "outputs": [ { "data": { "text/plain": [ "(5, 7)" ] }, "execution_count": 10, "metadata": {}, "output_type": "execute_result" } ], "source": [ "next(iterator1), next(iterator2)" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "skip" } }, "source": [ "We can also play a \"trick\" and exchange some elements in `numbers`. `iterator1` and `iterator2` do *not* see these changes and present us with the new elements. So, *iterators* not only have state on their own but also keep this separate from the underlying *iterable*." ] }, { "cell_type": "code", "execution_count": 11, "metadata": { "slideshow": { "slide_type": "skip" } }, "outputs": [], "source": [ "numbers[1], numbers[4] = 99, 99" ] }, { "cell_type": "code", "execution_count": 12, "metadata": { "slideshow": { "slide_type": "skip" } }, "outputs": [ { "data": { "text/plain": [ "(99, 99)" ] }, "execution_count": 12, "metadata": {}, "output_type": "execute_result" } ], "source": [ "next(iterator1), next(iterator2)" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "skip" } }, "source": [ "Let's re-assign the elements in `numbers` so that they are in order. After that, the numbers returned from [next() ](https://docs.python.org/3/library/functions.html#next) also tell us how often [next() ](https://docs.python.org/3/library/functions.html#next) was called with `iterator1` or `iterator2`. We conclude that `list_iterator` objects work by simply keeping track of the *last* index obtained from the underlying *iterable*." ] }, { "cell_type": "code", "execution_count": 13, "metadata": { "slideshow": { "slide_type": "skip" } }, "outputs": [], "source": [ "numbers[:] = list(range(1, 13))" ] }, { "cell_type": "code", "execution_count": 14, "metadata": { "slideshow": { "slide_type": "skip" } }, "outputs": [ { "data": { "text/plain": [ "[1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12]" ] }, "execution_count": 14, "metadata": {}, "output_type": "execute_result" } ], "source": [ "numbers" ] }, { "cell_type": "code", "execution_count": 15, "metadata": { "slideshow": { "slide_type": "skip" } }, "outputs": [ { "data": { "text/plain": [ "(6, 3)" ] }, "execution_count": 15, "metadata": {}, "output_type": "execute_result" } ], "source": [ "next(iterator1), next(iterator2)" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "skip" } }, "source": [ "With the concepts introduced in this section, we can now understand the first sentence in the documentation on the [zip() ](https://docs.python.org/3/library/functions.html#zip) built-in better: \"Make an *iterator* that aggregates elements from each of the *iterables*.\"\n", "\n", "Because *iterators* are always also *iterables*, we may pass `iterator1` and `iterator2` as arguments to [zip() ](https://docs.python.org/3/library/functions.html#zip).\n", "\n", "The returned `zipper` object is of type `zip` and, \"abstractly speaking,\" an `Iterator` as well." ] }, { "cell_type": "code", "execution_count": 16, "metadata": { "slideshow": { "slide_type": "slide" } }, "outputs": [], "source": [ "zipper = zip(iterator1, iterator2)" ] }, { "cell_type": "code", "execution_count": 17, "metadata": { "slideshow": { "slide_type": "fragment" } }, "outputs": [ { "data": { "text/plain": [ "" ] }, "execution_count": 17, "metadata": {}, "output_type": "execute_result" } ], "source": [ "zipper" ] }, { "cell_type": "code", "execution_count": 18, "metadata": { "slideshow": { "slide_type": "fragment" } }, "outputs": [ { "data": { "text/plain": [ "zip" ] }, "execution_count": 18, "metadata": {}, "output_type": "execute_result" } ], "source": [ "type(zipper)" ] }, { "cell_type": "code", "execution_count": 19, "metadata": { "slideshow": { "slide_type": "skip" } }, "outputs": [ { "data": { "text/plain": [ "True" ] }, "execution_count": 19, "metadata": {}, "output_type": "execute_result" } ], "source": [ "isinstance(zipper, abc.Iterator)" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "skip" } }, "source": [ "So far, we have always used [zip() ](https://docs.python.org/3/library/functions.html#zip) in a `for`-loop. That was our earlier definition of an *iterable*. Our revised definition in this section states that an *iterable* is an object that supports the [iter() ](https://docs.python.org/3/library/functions.html#iter) function. So, let's see what happens if we pass `zipper` to [iter() ](https://docs.python.org/3/library/functions.html#iter)." ] }, { "cell_type": "code", "execution_count": 20, "metadata": { "slideshow": { "slide_type": "skip" } }, "outputs": [], "source": [ "zipper_iterator = iter(zipper)" ] }, { "cell_type": "code", "execution_count": 21, "metadata": { "slideshow": { "slide_type": "skip" } }, "outputs": [ { "data": { "text/plain": [ "" ] }, "execution_count": 21, "metadata": {}, "output_type": "execute_result" } ], "source": [ "zipper_iterator" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "skip" } }, "source": [ "`zipper_iterator` references the *same* object as `zipper`! That is true for *iterators* in general: Any *iterator* created from an existing *iterator* with [iter() ](https://docs.python.org/3/library/functions.html#iter) is the *iterator* itself." ] }, { "cell_type": "code", "execution_count": 22, "metadata": { "slideshow": { "slide_type": "skip" } }, "outputs": [ { "data": { "text/plain": [ "True" ] }, "execution_count": 22, "metadata": {}, "output_type": "execute_result" } ], "source": [ "zipper is zipper_iterator" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "skip" } }, "source": [ "The Python core developers made that design decision so that *iterators* may also be looped over.\n", "\n", "The `for`-loop below prints out only *six* more `tuple` objects derived from the now ordered `numbers` because the `iterator1` object hidden inside `zipper` already returned its first *six* elements. So, the respective first elements of the `tuple` objects printed range from `7` to `12`. Similarly, as `iterator2` already returned its first *three* elements from `numbers`, we see the respective second elements in the range from `4` to `9`." ] }, { "cell_type": "code", "execution_count": 23, "metadata": { "slideshow": { "slide_type": "fragment" } }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "7 and 4 8 and 5 9 and 6 10 and 7 11 and 8 12 and 9 " ] } ], "source": [ "for x, y in zipper:\n", " print(x, \"and\", y, end=\" \")" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "skip" } }, "source": [ "`zipper` is now *exhausted*. So, the `for`-loop below does *not* make any iteration at all." ] }, { "cell_type": "code", "execution_count": 24, "metadata": { "slideshow": { "slide_type": "skip" } }, "outputs": [], "source": [ "for x, y in zipper:\n", " raise RuntimeError(\"We won't see this error\")" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "skip" } }, "source": [ "We verify that `iterator1` is exhausted by passing it to [next() ](https://docs.python.org/3/library/functions.html#next) again, which raises a `StopIteration` exception." ] }, { "cell_type": "code", "execution_count": 25, "metadata": { "slideshow": { "slide_type": "slide" } }, "outputs": [ { "ename": "StopIteration", "evalue": "", "output_type": "error", "traceback": [ "\u001b[0;31m---------------------------------------------------------------------------\u001b[0m", "\u001b[0;31mStopIteration\u001b[0m Traceback (most recent call last)", "Cell \u001b[0;32mIn[25], line 1\u001b[0m\n\u001b[0;32m----> 1\u001b[0m \u001b[38;5;28;43mnext\u001b[39;49m\u001b[43m(\u001b[49m\u001b[43miterator1\u001b[49m\u001b[43m)\u001b[49m\n", "\u001b[0;31mStopIteration\u001b[0m: " ] } ], "source": [ "next(iterator1)" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "skip" } }, "source": [ "On the contrary, `iterator2` is *not* yet exhausted." ] }, { "cell_type": "code", "execution_count": 26, "metadata": { "slideshow": { "slide_type": "fragment" } }, "outputs": [ { "data": { "text/plain": [ "10" ] }, "execution_count": 26, "metadata": {}, "output_type": "execute_result" } ], "source": [ "next(iterator2)" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "skip" } }, "source": [ "Understanding *iterators* and *iterables* is helpful for any data science practitioner that deals with large amounts of data. Even without that, these two terms occur everywhere in Python-related texts and documentation. So, a beginner should regularly review this section until it becomes second nature." ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "slide" } }, "source": [ "## The `for` Statement (revisited)" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "skip" } }, "source": [ "In [Chapter 4 ](https://nbviewer.jupyter.org/github/webartifex/intro-to-python/blob/main/04_iteration/02_content.ipynb#The-for-Statement), we argue that the `for` statement is syntactic sugar, replacing the `while` statement in many common scenarios. In particular, a `for`-loop saves us two tasks: Managing an index variable *and* obtaining the individual elements by indexing. In this sub-section, we look at a more realistic picture, using the new terminology as well.\n", "\n", "Let's print out the elements of a `list` object as the *iterable* being looped over." ] }, { "cell_type": "code", "execution_count": 27, "metadata": { "slideshow": { "slide_type": "slide" } }, "outputs": [], "source": [ "iterable = [0, 1, 2, 3, 4]" ] }, { "cell_type": "code", "execution_count": 28, "metadata": { "slideshow": { "slide_type": "fragment" } }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "0 1 2 3 4 " ] } ], "source": [ "for element in iterable:\n", " print(element, end=\" \")" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "skip" } }, "source": [ "Our previous and equivalent formulation with a `while` statement is like so." ] }, { "cell_type": "code", "execution_count": 29, "metadata": { "slideshow": { "slide_type": "skip" } }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "0 1 2 3 4 " ] } ], "source": [ "index = 0\n", "\n", "while index < len(iterable):\n", " element = iterable[index]\n", " print(element, end=\" \")\n", " index += 1\n", "\n", "del index" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "skip" } }, "source": [ "What actually happens behind the scenes in the Python interpreter is shown below.\n", "\n", "First, Python calls [iter() ](https://docs.python.org/3/library/functions.html#iter) with the `iterable` to be looped over as the argument. The returned `iterator` contains the entire logic of how the `iterable` is looped over. In particular, the `iterator` may or may not pick the `iterable`'s elements in a predictable order. That is up to the \"rule\" it models.\n", "\n", "Second, Python enters an *indefinite* `while`-loop. It tries to obtain the next element with [next() ](https://docs.python.org/3/library/functions.html#next). If that succeeds, the `for`-loop's code block is executed. Below, that code is placed within the `else`-clause that runs only if *no* exception is raised in the `try`-clause. Then, Python jumps into the next iteration and tries to obtain the next element from the `iterator`, and so on. Once the `iterator` is exhausted, it raises a `StopIteration` exception, and Python stops the `while`-loop with the `break` statement." ] }, { "cell_type": "code", "execution_count": 30, "metadata": { "slideshow": { "slide_type": "fragment" } }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "0 1 2 3 4 " ] } ], "source": [ "iterator = iter(iterable)\n", "\n", "while True:\n", " try:\n", " element = next(iterator)\n", " except StopIteration:\n", " break\n", " else:\n", " print(element, end=\" \")\n", "\n", "del iterator" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "slide" } }, "source": [ "## sorted() vs. reversed()" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "skip" } }, "source": [ "Now that we know the concept of an *iterator*, let's compare some of the built-ins introduced in [Chapter 7 ](https://nbviewer.jupyter.org/github/webartifex/intro-to-python/blob/main/07_sequences/00_content.ipynb) in detail and make sure we understand what is going on in memory. This section also serves as a summary of all the concepts in this chapter.\n", "\n", "We use two simple examples, `numbers` and `memoryless`: `numbers` creates *thirteen* objects in memory and `memoryless` only *one* (cf., [PythonTutor ](http://pythontutor.com/visualize.html#code=numbers%20%3D%20%5B7,%2011,%208,%205,%203,%2012,%202,%206,%209,%2010,%201,%204%5D%0Amemoryless%20%3D%20range%281,%2013%29&cumulative=false&curstr=2&heapPrimitives=nevernest&mode=display&origin=opt-frontend.js&py=3&rawInputLstJSON=%5B%5D&textReferences=false))." ] }, { "cell_type": "code", "execution_count": 31, "metadata": { "slideshow": { "slide_type": "slide" } }, "outputs": [], "source": [ "numbers = [7, 11, 8, 5, 3, 12, 2, 6, 9, 10, 1, 4]" ] }, { "cell_type": "code", "execution_count": 32, "metadata": { "slideshow": { "slide_type": "fragment" } }, "outputs": [], "source": [ "memoryless = range(1, 13)" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "skip" } }, "source": [ "The [sorted() ](https://docs.python.org/3/library/functions.html#sorted) function takes a *finite* `iterable` argument and *materializes* its elements into a *new* `list` object that is returned.\n", "\n", "The argument may already be materialized, as is the case with `numbers`, but may also be an *iterable* without any objects in it, such as `memoryless`. In both cases, we end up with materialized `list` objects with the elements sorted in *forward* order (cf., [PythonTutor ](http://pythontutor.com/visualize.html#code=numbers%20%3D%20%5B7,%2011,%208,%205,%203,%2012,%202,%206,%209,%2010,%201,%204%5D%0Amemoryless%20%3D%20range%281,%2013%29%0Aresult1%20%3D%20sorted%28numbers%29%0Aresult2%20%3D%20sorted%28memoryless%29&cumulative=false&curstr=4&heapPrimitives=nevernest&mode=display&origin=opt-frontend.js&py=3&rawInputLstJSON=%5B%5D&textReferences=false))." ] }, { "cell_type": "code", "execution_count": 33, "metadata": { "slideshow": { "slide_type": "slide" } }, "outputs": [ { "data": { "text/plain": [ "[1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12]" ] }, "execution_count": 33, "metadata": {}, "output_type": "execute_result" } ], "source": [ "sorted(numbers)" ] }, { "cell_type": "code", "execution_count": 34, "metadata": { "slideshow": { "slide_type": "fragment" } }, "outputs": [ { "data": { "text/plain": [ "[1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12]" ] }, "execution_count": 34, "metadata": {}, "output_type": "execute_result" } ], "source": [ "sorted(memoryless)" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "skip" } }, "source": [ "By adding a keyword-only argument `reverse=True`, the materialized `list` objects are sorted in *reverse* order (cf., [PythonTutor ](http://pythontutor.com/visualize.html#code=numbers%20%3D%20%5B7,%2011,%208,%205,%203,%2012,%202,%206,%209,%2010,%201,%204%5D%0Amemoryless%20%3D%20range%281,%2013%29%0Aresult1%20%3D%20sorted%28numbers,%20reverse%3DTrue%29%0Aresult2%20%3D%20sorted%28memoryless,%20reverse%3DTrue%29&cumulative=false&curstr=4&heapPrimitives=nevernest&mode=display&origin=opt-frontend.js&py=3&rawInputLstJSON=%5B%5D&textReferences=false))." ] }, { "cell_type": "code", "execution_count": 35, "metadata": { "slideshow": { "slide_type": "skip" } }, "outputs": [ { "data": { "text/plain": [ "[12, 11, 10, 9, 8, 7, 6, 5, 4, 3, 2, 1]" ] }, "execution_count": 35, "metadata": {}, "output_type": "execute_result" } ], "source": [ "sorted(numbers, reverse=True)" ] }, { "cell_type": "code", "execution_count": 36, "metadata": { "slideshow": { "slide_type": "skip" } }, "outputs": [ { "data": { "text/plain": [ "[12, 11, 10, 9, 8, 7, 6, 5, 4, 3, 2, 1]" ] }, "execution_count": 36, "metadata": {}, "output_type": "execute_result" } ], "source": [ "sorted(memoryless, reverse=True)" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "skip" } }, "source": [ "The order in `numbers` remains *unchanged*, and `memoryless` is still *not* materialized." ] }, { "cell_type": "code", "execution_count": 37, "metadata": { "slideshow": { "slide_type": "skip" } }, "outputs": [ { "data": { "text/plain": [ "[7, 11, 8, 5, 3, 12, 2, 6, 9, 10, 1, 4]" ] }, "execution_count": 37, "metadata": {}, "output_type": "execute_result" } ], "source": [ "numbers" ] }, { "cell_type": "code", "execution_count": 38, "metadata": { "slideshow": { "slide_type": "skip" } }, "outputs": [ { "data": { "text/plain": [ "range(1, 13)" ] }, "execution_count": 38, "metadata": {}, "output_type": "execute_result" } ], "source": [ "memoryless" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "skip" } }, "source": [ "The [reversed() ](https://docs.python.org/3/library/functions.html#reversed) built-in takes a `sequence` argument and returns an *iterator*. The argument must be *finite* and *reversible* (i.e., *iterable* in *reverse* order) as otherwise [reversed() ](https://docs.python.org/3/library/functions.html#reversed) could neither determine the last element that becomes the first nor loop in a *predictable* backward fashion. [PythonTutor ](http://pythontutor.com/visualize.html#code=numbers%20%3D%20%5B7,%2011,%208,%205,%203,%2012,%202,%206,%209,%2010,%201,%204%5D%0Amemoryless%20%3D%20range%281,%2013%29%0Aiterator1%20%3D%20reversed%28numbers%29%0Aiterator2%20%3D%20reversed%28memoryless%29&cumulative=false&curstr=4&heapPrimitives=nevernest&mode=display&origin=opt-frontend.js&py=3&rawInputLstJSON=%5B%5D&textReferences=false) confirms that [reversed() ](https://docs.python.org/3/library/functions.html#reversed) does *not* materialize any elements but only returns an *iterator*.\n", "\n", "**Side Note**: Even though `range` objects, like `memoryless` here, do *not* \"contain\" references to other objects, they count as *sequence* types, and as such, they are also *container* types. The `in` operator works with `range` objects because we can always cast the object to be checked as an `int` and check if that lies within the `range` object's `start` and `stop` values, taking a potential `step` value into account (cf., this [blog post](https://treyhunner.com/2018/02/python-range-is-not-an-iterator/) for more details on the [range() ](https://docs.python.org/3/library/functions.html#func-range) built-in)." ] }, { "cell_type": "code", "execution_count": 39, "metadata": { "slideshow": { "slide_type": "slide" } }, "outputs": [ { "data": { "text/plain": [ "" ] }, "execution_count": 39, "metadata": {}, "output_type": "execute_result" } ], "source": [ "reversed(numbers)" ] }, { "cell_type": "code", "execution_count": 40, "metadata": { "slideshow": { "slide_type": "fragment" } }, "outputs": [ { "data": { "text/plain": [ "" ] }, "execution_count": 40, "metadata": {}, "output_type": "execute_result" } ], "source": [ "reversed(memoryless)" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "skip" } }, "source": [ "To materialize the elements, we can pass the returned *iterators* to, for example, the [list() ](https://docs.python.org/3/library/functions.html#func-list) or [tuple() ](https://docs.python.org/3/library/functions.html#func-tuple) constructors. That creates *new* `list` and `tuple` objects (cf., [PythonTutor ](http://pythontutor.com/visualize.html#code=numbers%20%3D%20%5B7,%2011,%208,%205,%203,%2012,%202,%206,%209,%2010,%201,%204%5D%0Amemoryless%20%3D%20range%281,%2013%29%0Aresult1%20%3D%20list%28reversed%28numbers%29%29%0Aresult2%20%3D%20tuple%28reversed%28memoryless%29%29&cumulative=false&curstr=4&heapPrimitives=nevernest&mode=display&origin=opt-frontend.js&py=3&rawInputLstJSON=%5B%5D&textReferences=false)).\n", "\n", "To reiterate some more new terminology from this chapter, we describe [reversed() ](https://docs.python.org/3/library/functions.html#reversed) as *lazy* whereas [list() ](https://docs.python.org/3/library/functions.html#func-list) and [tuple() ](https://docs.python.org/3/library/functions.html#func-tuple) are *eager*. The former has no significant side effect in memory, while the latter may require a lot of memory." ] }, { "cell_type": "code", "execution_count": 41, "metadata": { "slideshow": { "slide_type": "fragment" } }, "outputs": [ { "data": { "text/plain": [ "[4, 1, 10, 9, 6, 2, 12, 3, 5, 8, 11, 7]" ] }, "execution_count": 41, "metadata": {}, "output_type": "execute_result" } ], "source": [ "list(reversed(numbers))" ] }, { "cell_type": "code", "execution_count": 42, "metadata": { "slideshow": { "slide_type": "fragment" } }, "outputs": [ { "data": { "text/plain": [ "(12, 11, 10, 9, 8, 7, 6, 5, 4, 3, 2, 1)" ] }, "execution_count": 42, "metadata": {}, "output_type": "execute_result" } ], "source": [ "tuple(reversed(memoryless))" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "skip" } }, "source": [ "Of course, we can also loop over the returned *iterators* instead.\n", "\n", "That works because *iterators* are always *iterable*; in particular, as the previous \"*The for Statement (revisited)*\" sub-section explains, the `for`-loops below call `iter(reversed(numbers))` and `iter(reversed(memoryless))` behind the scenes. However, the *iterators* returned by [iter() ](https://docs.python.org/3/library/functions.html#iter) are the *same* as the `reversed(numbers)` and `reversed(memoryless)` iterators passed in! In summary, the `for`-loops below involve many subtleties that together make Python the expressive language it is." ] }, { "cell_type": "code", "execution_count": 43, "metadata": { "slideshow": { "slide_type": "slide" } }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "4 1 10 9 6 2 12 3 5 8 11 7 " ] } ], "source": [ "for number in reversed(numbers):\n", " print(number, end=\" \")" ] }, { "cell_type": "code", "execution_count": 44, "metadata": { "slideshow": { "slide_type": "fragment" } }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "12 11 10 9 8 7 6 5 4 3 2 1 " ] } ], "source": [ "for element in reversed(memoryless):\n", " print(element, end=\" \")" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "skip" } }, "source": [ "As with [sorted() ](https://docs.python.org/3/library/functions.html#sorted), the [reversed() ](https://docs.python.org/3/library/functions.html#reversed) built-in does *not* mutate its argument." ] }, { "cell_type": "code", "execution_count": 45, "metadata": { "slideshow": { "slide_type": "slide" } }, "outputs": [ { "data": { "text/plain": [ "[7, 11, 8, 5, 3, 12, 2, 6, 9, 10, 1, 4]" ] }, "execution_count": 45, "metadata": {}, "output_type": "execute_result" } ], "source": [ "numbers" ] }, { "cell_type": "code", "execution_count": 46, "metadata": { "slideshow": { "slide_type": "fragment" } }, "outputs": [ { "data": { "text/plain": [ "range(1, 13)" ] }, "execution_count": 46, "metadata": {}, "output_type": "execute_result" } ], "source": [ "memoryless" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "skip" } }, "source": [ "To point out the potentially obvious, we compare the results of *sorting* `numbers` in *reverse* order with *reversing* it: These are *different* concepts!" ] }, { "cell_type": "code", "execution_count": 47, "metadata": { "slideshow": { "slide_type": "skip" } }, "outputs": [ { "data": { "text/plain": [ "[7, 11, 8, 5, 3, 12, 2, 6, 9, 10, 1, 4]" ] }, "execution_count": 47, "metadata": {}, "output_type": "execute_result" } ], "source": [ "numbers" ] }, { "cell_type": "code", "execution_count": 48, "metadata": { "slideshow": { "slide_type": "skip" } }, "outputs": [ { "data": { "text/plain": [ "[12, 11, 10, 9, 8, 7, 6, 5, 4, 3, 2, 1]" ] }, "execution_count": 48, "metadata": {}, "output_type": "execute_result" } ], "source": [ "sorted(numbers, reverse=True)" ] }, { "cell_type": "code", "execution_count": 49, "metadata": { "slideshow": { "slide_type": "skip" } }, "outputs": [ { "data": { "text/plain": [ "[4, 1, 10, 9, 6, 2, 12, 3, 5, 8, 11, 7]" ] }, "execution_count": 49, "metadata": {}, "output_type": "execute_result" } ], "source": [ "list(reversed(numbers))" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "skip" } }, "source": [ "Whereas both [sorted() ](https://docs.python.org/3/library/functions.html#sorted) and [reversed() ](https://docs.python.org/3/library/functions.html#reversed) do *not* mutate their arguments, the *mutable* `list` type comes with two methods, [sort() ](https://docs.python.org/3/library/stdtypes.html#list.sort) and `reverse()`, that implement the same logic but mutate an object, like `numbers` below, *in place*. To indicate that all changes occur *in place*, the [sort() ](https://docs.python.org/3/library/stdtypes.html#list.sort) and `reverse()` methods always return `None`, which is not shown in JupyterLab." ] }, { "cell_type": "code", "execution_count": 50, "metadata": { "slideshow": { "slide_type": "skip" } }, "outputs": [ { "data": { "text/plain": [ "[7, 11, 8, 5, 3, 12, 2, 6, 9, 10, 1, 4]" ] }, "execution_count": 50, "metadata": {}, "output_type": "execute_result" } ], "source": [ "numbers" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "skip" } }, "source": [ "The `reverse()` method on the `list` type is *eager*, as opposed to the *lazy* [reversed() ](https://docs.python.org/3/library/functions.html#reversed) built-in. That means the mutations caused by the `reverse()` method are written into memory right away." ] }, { "cell_type": "code", "execution_count": 51, "metadata": { "slideshow": { "slide_type": "skip" } }, "outputs": [], "source": [ "numbers.reverse()" ] }, { "cell_type": "code", "execution_count": 52, "metadata": { "slideshow": { "slide_type": "skip" } }, "outputs": [ { "data": { "text/plain": [ "[4, 1, 10, 9, 6, 2, 12, 3, 5, 8, 11, 7]" ] }, "execution_count": 52, "metadata": {}, "output_type": "execute_result" } ], "source": [ "numbers" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "skip" } }, "source": [ "*Sorting* `numbers` in place occurs eagerly." ] }, { "cell_type": "code", "execution_count": 53, "metadata": { "slideshow": { "slide_type": "skip" } }, "outputs": [], "source": [ "numbers.sort(reverse=True)" ] }, { "cell_type": "code", "execution_count": 54, "metadata": { "slideshow": { "slide_type": "skip" } }, "outputs": [ { "data": { "text/plain": [ "[12, 11, 10, 9, 8, 7, 6, 5, 4, 3, 2, 1]" ] }, "execution_count": 54, "metadata": {}, "output_type": "execute_result" } ], "source": [ "numbers" ] }, { "cell_type": "code", "execution_count": 55, "metadata": { "slideshow": { "slide_type": "skip" } }, "outputs": [], "source": [ "numbers.sort()" ] }, { "cell_type": "code", "execution_count": 56, "metadata": { "slideshow": { "slide_type": "skip" } }, "outputs": [ { "data": { "text/plain": [ "[1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12]" ] }, "execution_count": 56, "metadata": {}, "output_type": "execute_result" } ], "source": [ "numbers" ] } ], "metadata": { "kernelspec": { "display_name": "Python 3", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.12.2" }, "livereveal": { "auto_select": "code", "auto_select_fragment": true, "scroll": true, "theme": "serif" }, "toc": { "base_numbering": 1, "nav_menu": {}, "number_sections": false, "sideBar": true, "skip_h1_title": true, "title_cell": "Table of Contents", "title_sidebar": "Contents", "toc_cell": false, "toc_position": { "height": "calc(100% - 180px)", "left": "10px", "top": "150px", "width": "384px" }, "toc_section_display": false, "toc_window_display": false } }, "nbformat": 4, "nbformat_minor": 4 }