diff --git a/00_intro/00_content.ipynb b/00_intro/00_content.ipynb index de293c8..a5d9e15 100644 --- a/00_intro/00_content.ipynb +++ b/00_intro/00_content.ipynb @@ -770,7 +770,7 @@ "- How is data stored in memory?\n", " - *Chapter 5*: [Numbers & Bits ](https://nbviewer.jupyter.org/github/webartifex/intro-to-python/blob/develop/05_numbers/00_content.ipynb)\n", " - *Chapter 6*: [Text & Bytes ](https://nbviewer.jupyter.org/github/webartifex/intro-to-python/blob/develop/06_text/00_content.ipynb)\n", - " - *Chapter 7*: Sequential Data\n", + " - *Chapter 7*: [Sequential Data ](https://nbviewer.jupyter.org/github/webartifex/intro-to-python/blob/develop/07_sequences/00_content.ipynb)\n", " - *Chapter 8*: Map, Filter, & Reduce\n", " - *Chapter 9*: Mappings & Sets\n", " - *Chapter 10*: Arrays & Dataframes\n", diff --git a/01_elements/00_content.ipynb b/01_elements/00_content.ipynb index 6003bd9..4d6c792 100644 --- a/01_elements/00_content.ipynb +++ b/01_elements/00_content.ipynb @@ -1340,9 +1340,9 @@ } }, "source": [ - "Different types imply different behaviors for the objects. The `b` object, for example, may be \"asked\" if it is a whole number with the [is_integer() ](https://docs.python.org/3/library/stdtypes.html#float.is_integer) \"functionality\" that comes with *every* `float` object.\n", + "Different types imply different behaviors for the objects. The `b` object, for example, may be \"asked\" if it is a whole number with the [.is_integer() ](https://docs.python.org/3/library/stdtypes.html#float.is_integer) \"functionality\" that comes with *every* `float` object.\n", "\n", - "Formally, we call such type-specific functionalities **methods** (i.e., as opposed to functions) and we look at them in detail in [Chapter 10 ](https://nbviewer.jupyter.org/github/webartifex/intro-to-python/blob/develop/10_classes/00_content.ipynb). For now, it suffices to know that we access them with the **dot operator** `.` on the object. Of course, `b` is a whole number, which the boolean object `True` tells us." + "Formally, we call such type-specific functionalities **methods** (i.e., as opposed to functions) and we look at them in detail in [Chapter 11 ](https://nbviewer.jupyter.org/github/webartifex/intro-to-python/blob/develop/11_classes/00_content.ipynb). For now, it suffices to know that we access them with the **dot operator** `.` on the object. Of course, `b` is a whole number, which the boolean object `True` tells us." ] }, { @@ -1377,7 +1377,7 @@ } }, "source": [ - "For an `int` object, this [is_integer() ](https://docs.python.org/3/library/stdtypes.html#float.is_integer) check does *not* make sense as we already know it is an `int`: We see the `AttributeError` below as `a` does not even know what `is_integer()` means." + "For an `int` object, this [.is_integer() ](https://docs.python.org/3/library/stdtypes.html#float.is_integer) check does *not* make sense as we already know it is an `int`: We see the `AttributeError` below as `a` does not even know what `is_integer()` means." ] }, { diff --git a/04_iteration/03_content.ipynb b/04_iteration/03_content.ipynb index 8cfb19f..dafbe08 100644 --- a/04_iteration/03_content.ipynb +++ b/04_iteration/03_content.ipynb @@ -788,7 +788,7 @@ "\n", "First, we divide the business logic into two functions `get_guess()` and `toss_coin()` that are controlled from within a `while`-loop.\n", "\n", - "`get_guess()` not only reads in the user's input but also implements a simple input validation pattern in that the [strip() ](https://docs.python.org/3/library/stdtypes.html?highlight=__contains__#str.strip) and [lower() ](https://docs.python.org/3/library/stdtypes.html?highlight=__contains__#str.lower) methods remove preceding and trailing whitespace and lower case the input ensuring that the user may spell the input in any possible way (e.g., all upper or lower case). Also, `get_guess()` checks if the user entered one of the two valid options. If so, it returns either `\"heads\"` or `\"tails\"`; if not, it returns `None`." + "`get_guess()` not only reads in the user's input but also implements a simple input validation pattern in that the [.strip() ](https://docs.python.org/3/library/stdtypes.html?highlight=__contains__#str.strip) and [.lower() ](https://docs.python.org/3/library/stdtypes.html?highlight=__contains__#str.lower) methods remove preceding and trailing whitespace and lower case the input ensuring that the user may spell the input in any possible way (e.g., all upper or lower case). Also, `get_guess()` checks if the user entered one of the two valid options. If so, it returns either `\"heads\"` or `\"tails\"`; if not, it returns `None`." ] }, { diff --git a/05_numbers/01_content.ipynb b/05_numbers/01_content.ipynb index af18f07..00157ad 100644 --- a/05_numbers/01_content.ipynb +++ b/05_numbers/01_content.ipynb @@ -1827,7 +1827,7 @@ "\n", "The Python [documentation ](https://docs.python.org/3/tutorial/floatingpoint.html) provides another good discussion of floats and the goodness of their approximations.\n", "\n", - "If we are interested in the exact bits behind a `float` object, we use the [hex() ](https://docs.python.org/3/library/stdtypes.html#float.hex) method that returns a `str` object beginning with `\"0x1.\"` followed by the $fraction$ in hexadecimal notation and the $exponent$ as an integer after subtraction of $1023$ and separated by a `\"p\"`." + "If we are interested in the exact bits behind a `float` object, we use the [.hex() ](https://docs.python.org/3/library/stdtypes.html#float.hex) method that returns a `str` object beginning with `\"0x1.\"` followed by the $fraction$ in hexadecimal notation and the $exponent$ as an integer after subtraction of $1023$ and separated by a `\"p\"`." ] }, { @@ -1875,7 +1875,7 @@ } }, "source": [ - "Also, the [as_integer_ratio() ](https://docs.python.org/3/library/stdtypes.html#float.as_integer_ratio) method returns the two smallest integers whose ratio best approximates a `float` object." + "Also, the [.as_integer_ratio() ](https://docs.python.org/3/library/stdtypes.html#float.as_integer_ratio) method returns the two smallest integers whose ratio best approximates a `float` object." ] }, { @@ -2030,7 +2030,7 @@ } }, "source": [ - "As seen in [Chapter 1 ](https://nbviewer.jupyter.org/github/webartifex/intro-to-python/blob/develop/01_elements/00_content.ipynb#%28Data%29-Type-%2F-%22Behavior%22), the [is_integer() ](https://docs.python.org/3/library/stdtypes.html#float.is_integer) method tells us if a `float` can be casted as an `int` object without any loss in precision." + "As seen in [Chapter 1 ](https://nbviewer.jupyter.org/github/webartifex/intro-to-python/blob/develop/01_elements/00_content.ipynb#%28Data%29-Type-%2F-%22Behavior%22), the [.is_integer() ](https://docs.python.org/3/library/stdtypes.html#float.is_integer) method tells us if a `float` can be casted as an `int` object without any loss in precision." ] }, { @@ -2091,7 +2091,7 @@ } }, "source": [ - "As the exact implementation of floats may vary and be dependent on a particular Python installation, we look up the [float_info ](https://docs.python.org/3/library/sys.html#sys.float_info) attribute in the [sys ](https://docs.python.org/3/library/sys.html) module in the [standard library ](https://docs.python.org/3/library/index.html) to check the details. Usually, this is not necessary." + "As the exact implementation of floats may vary and be dependent on a particular Python installation, we look up the [.float_info ](https://docs.python.org/3/library/sys.html#sys.float_info) attribute in the [sys ](https://docs.python.org/3/library/sys.html) module in the [standard library ](https://docs.python.org/3/library/index.html) to check the details. Usually, this is not necessary." ] }, { diff --git a/05_numbers/02_content.ipynb b/05_numbers/02_content.ipynb index e430e90..ae1b01d 100644 --- a/05_numbers/02_content.ipynb +++ b/05_numbers/02_content.ipynb @@ -581,7 +581,7 @@ } }, "source": [ - "A `complex` number comes with two **attributes** `real` and `imag` that return the two parts as `float` objects on their own." + "A `complex` number comes with two **attributes** `.real` and `.imag` that return the two parts as `float` objects on their own." ] }, { @@ -640,7 +640,7 @@ } }, "source": [ - "Also, a `conjugate()` method is bound to every `complex` object. The [complex conjugate ](https://en.wikipedia.org/wiki/Complex_conjugate) is defined to be the complex number with identical real part but an imaginary part reversed in sign." + "Also, a `.conjugate()` method is bound to every `complex` object. The [complex conjugate ](https://en.wikipedia.org/wiki/Complex_conjugate) is defined to be the complex number with identical real part but an imaginary part reversed in sign." ] }, { diff --git a/06_text/00_content.ipynb b/06_text/00_content.ipynb index 7c9fff8..236942b 100644 --- a/06_text/00_content.ipynb +++ b/06_text/00_content.ipynb @@ -2370,7 +2370,7 @@ "source": [ "Objects of type `str` come with many **methods** bound on them (cf., the [documentation ](https://docs.python.org/3/library/stdtypes.html#string-methods) for a full list). As seen before, they work like *normal* functions and are accessed via the **dot operator** `.`. Calling a method is also referred to as **method invocation**.\n", "\n", - "The [find() ](https://docs.python.org/3/library/stdtypes.html#str.find) method returns the index of the first occurrence of a character or a substring. If no match is found, it returns `-1`. A mirrored version searching from the right called [rfind() ](https://docs.python.org/3/library/stdtypes.html#str.rfind) exists as well. The [index() ](https://docs.python.org/3/library/stdtypes.html#str.index) and [rindex() ](https://docs.python.org/3/library/stdtypes.html#str.rindex) methods work in the same way but raise a `ValueError` if no match is found. So, we can control if a search fails *silently* or *loudly*." + "The [.find() ](https://docs.python.org/3/library/stdtypes.html#str.find) method returns the index of the first occurrence of a character or a substring. If no match is found, it returns `-1`. A mirrored version searching from the right called [.rfind() ](https://docs.python.org/3/library/stdtypes.html#str.rfind) exists as well. The [.index() ](https://docs.python.org/3/library/stdtypes.html#str.index) and [.rindex() ](https://docs.python.org/3/library/stdtypes.html#str.rindex) methods work in the same way but raise a `ValueError` if no match is found. So, we can control if a search fails *silently* or *loudly*." ] }, { @@ -2477,7 +2477,7 @@ } }, "source": [ - "[find() ](https://docs.python.org/3/library/stdtypes.html#str.find) takes optional *start* and *end* arguments that allow us to find occurrences other than the first one." + "[.find() ](https://docs.python.org/3/library/stdtypes.html#str.find) takes optional *start* and *end* arguments that allow us to find occurrences other than the first one." ] }, { @@ -2560,7 +2560,7 @@ } }, "source": [ - "The [count() ](https://docs.python.org/3/library/stdtypes.html#str.count) method does what we expect." + "The [.count() ](https://docs.python.org/3/library/stdtypes.html#str.count) method does what we expect." ] }, { @@ -2619,7 +2619,7 @@ } }, "source": [ - "As [count() ](https://docs.python.org/3/library/stdtypes.html#str.count) is *case-sensitive*, we must **chain** it with the [lower() ](https://docs.python.org/3/library/stdtypes.html#str.lower) method to get the count of all `\"L\"`s and `\"l\"`s." + "As [.count() ](https://docs.python.org/3/library/stdtypes.html#str.count) is *case-sensitive*, we must **chain** it with the [.lower() ](https://docs.python.org/3/library/stdtypes.html#str.lower) method to get the count of all `\"L\"`s and `\"l\"`s." ] }, { @@ -2654,7 +2654,7 @@ } }, "source": [ - "Alternatively, we can use the [upper() ](https://docs.python.org/3/library/stdtypes.html#str.upper) method and search for `\"L\"`s." + "Alternatively, we can use the [.upper() ](https://docs.python.org/3/library/stdtypes.html#str.upper) method and search for `\"L\"`s." ] }, { @@ -2689,7 +2689,7 @@ } }, "source": [ - "Because `str` objects are *immutable*, [upper() ](https://docs.python.org/3/library/stdtypes.html#str.upper) and [lower() ](https://docs.python.org/3/library/stdtypes.html#str.lower) return *new* `str` objects, even if they do *not* change the value of the original `str` object." + "Because `str` objects are *immutable*, [.upper() ](https://docs.python.org/3/library/stdtypes.html#str.upper) and [.lower() ](https://docs.python.org/3/library/stdtypes.html#str.lower) return *new* `str` objects, even if they do *not* change the value of the original `str` object." ] }, { @@ -2833,7 +2833,7 @@ } }, "source": [ - "Besides [upper() ](https://docs.python.org/3/library/stdtypes.html#str.upper) and [lower() ](https://docs.python.org/3/library/stdtypes.html#str.lower) there exist also [title() ](https://docs.python.org/3/library/stdtypes.html#str.title) and [swapcase() ](https://docs.python.org/3/library/stdtypes.html#str.swapcase) methods." + "Besides [.upper() ](https://docs.python.org/3/library/stdtypes.html#str.upper) and [.lower() ](https://docs.python.org/3/library/stdtypes.html#str.lower) there exist also [.title() ](https://docs.python.org/3/library/stdtypes.html#str.title) and [.swapcase() ](https://docs.python.org/3/library/stdtypes.html#str.swapcase) methods." ] }, { @@ -2940,9 +2940,9 @@ } }, "source": [ - "Another popular string method is [split() ](https://docs.python.org/3/library/stdtypes.html#str.split): It separates a longer `str` object into smaller ones collected in a `list` object. By default, groups of contiguous whitespace characters are used as the *separator*.\n", + "Another popular string method is [.split() ](https://docs.python.org/3/library/stdtypes.html#str.split): It separates a longer `str` object into smaller ones collected in a `list` object. By default, groups of contiguous whitespace characters are used as the *separator*.\n", "\n", - "As an example, we use [split() ](https://docs.python.org/3/library/stdtypes.html#str.split) to print out the individual words in `text` with more whitespace in between them." + "As an example, we use [.split() ](https://docs.python.org/3/library/stdtypes.html#str.split) to print out the individual words in `text` with more whitespace in between them." ] }, { @@ -2999,7 +2999,7 @@ } }, "source": [ - "The opposite of splitting is done with the [join() ](https://docs.python.org/3/library/stdtypes.html#str.join) method. It is typically invoked on a `str` object that represents a separator (e.g., `\" \"` or `\", \"`) and connects the elements provided by an *iterable* argument (e.g., `words` below) into one *new* `str` object." + "The opposite of splitting is done with the [.join() ](https://docs.python.org/3/library/stdtypes.html#str.join) method. It is typically invoked on a `str` object that represents a separator (e.g., `\" \"` or `\", \"`) and connects the elements provided by an *iterable* argument (e.g., `words` below) into one *new* `str` object." ] }, { @@ -3095,7 +3095,7 @@ } }, "source": [ - "The [replace() ](https://docs.python.org/3/library/stdtypes.html#str.replace) method creates a *new* `str` object with parts of the original `str` object potentially replaced." + "The [.replace() ](https://docs.python.org/3/library/stdtypes.html#str.replace) method creates a *new* `str` object with parts of the original `str` object potentially replaced." ] }, { @@ -3130,7 +3130,7 @@ } }, "source": [ - "Note how `sentence` itself remains unchanged. Bound to an immutable object, [replace() ](https://docs.python.org/3/library/stdtypes.html#str.replace) must create *new* objects." + "Note how `sentence` itself remains unchanged. Bound to an immutable object, [.replace() ](https://docs.python.org/3/library/stdtypes.html#str.replace) must create *new* objects." ] }, { @@ -3165,7 +3165,7 @@ } }, "source": [ - "As seen previously, the [strip() ](https://docs.python.org/3/library/stdtypes.html#str.strip) method is often helpful in cleaning text data from unreliable sources like user input from unnecessary leading and trailing whitespace. The [lstrip() ](https://docs.python.org/3/library/stdtypes.html#str.lstrip) and [rstrip() ](https://docs.python.org/3/library/stdtypes.html#str.rstrip) methods are specialized versions of it." + "As seen previously, the [.strip() ](https://docs.python.org/3/library/stdtypes.html#str.strip) method is often helpful in cleaning text data from unreliable sources like user input from unnecessary leading and trailing whitespace. The [.lstrip() ](https://docs.python.org/3/library/stdtypes.html#str.lstrip) and [.rstrip() ](https://docs.python.org/3/library/stdtypes.html#str.rstrip) methods are specialized versions of it." ] }, { @@ -3248,7 +3248,7 @@ } }, "source": [ - "When justifying a `str` object for output, the [ljust() ](https://docs.python.org/3/library/stdtypes.html#str.ljust) and [rjust() ](https://docs.python.org/3/library/stdtypes.html#str.rjust) methods may be helpful." + "When justifying a `str` object for output, the [.ljust() ](https://docs.python.org/3/library/stdtypes.html#str.ljust) and [.rjust() ](https://docs.python.org/3/library/stdtypes.html#str.rjust) methods may be helpful." ] }, { @@ -3307,7 +3307,7 @@ } }, "source": [ - "Similarly, the [zfill() ](https://docs.python.org/3/library/stdtypes.html#str.zfill) method can be used to pad a `str` representation of a number with leading `0`s for justified output." + "Similarly, the [.zfill() ](https://docs.python.org/3/library/stdtypes.html#str.zfill) method can be used to pad a `str` representation of a number with leading `0`s for justified output." ] }, { @@ -3706,7 +3706,7 @@ } }, "source": [ - "`str` objects also provide a [format() ](https://docs.python.org/3/library/stdtypes.html#str.format) method that accepts an arbitrary number of *positional* arguments that are inserted into the `str` object in the same order replacing empty curly brackets `{}`. String interpolation with the [format() ](https://docs.python.org/3/library/stdtypes.html#str.format) method is a more traditional and probably the most common way as of today. While f-strings are the recommended way going forward, usage of the [format() ](https://docs.python.org/3/library/stdtypes.html#str.format) method is likely not declining any time soon." + "`str` objects also provide a [.format() ](https://docs.python.org/3/library/stdtypes.html#str.format) method that accepts an arbitrary number of *positional* arguments that are inserted into the `str` object in the same order replacing empty curly brackets `{}`. String interpolation with the [.format() ](https://docs.python.org/3/library/stdtypes.html#str.format) method is a more traditional and probably the most common way as of today. While f-strings are the recommended way going forward, usage of the [.format() ](https://docs.python.org/3/library/stdtypes.html#str.format) method is likely not declining any time soon." ] }, { @@ -3776,7 +3776,7 @@ } }, "source": [ - "The [format() ](https://docs.python.org/3/library/stdtypes.html#str.format) method may alternatively be used with *keyword* arguments as well. Then, we must put the keywords' names within the curly brackets." + "The [.format() ](https://docs.python.org/3/library/stdtypes.html#str.format) method may alternatively be used with *keyword* arguments as well. Then, we must put the keywords' names within the curly brackets." ] }, { diff --git a/07_sequences/00_content.ipynb b/07_sequences/00_content.ipynb new file mode 100644 index 0000000..43ec596 --- /dev/null +++ b/07_sequences/00_content.ipynb @@ -0,0 +1,958 @@ +{ + "cells": [ + { + "cell_type": "markdown", + "metadata": { + "slideshow": { + "slide_type": "skip" + } + }, + "source": [ + "**Note**: Click on \"*Kernel*\" > \"*Restart Kernel and Clear All Outputs*\" in [JupyterLab](https://jupyterlab.readthedocs.io/en/stable/) *before* reading this notebook to reset its output. If you cannot run this file on your machine, you may want to open it [in the cloud ](https://mybinder.org/v2/gh/webartifex/intro-to-python/develop?urlpath=lab/tree/07_sequences/00_content.ipynb)." + ] + }, + { + "cell_type": "markdown", + "metadata": { + "slideshow": { + "slide_type": "slide" + } + }, + "source": [ + "# Chapter 7: Sequential Data" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "slideshow": { + "slide_type": "skip" + } + }, + "source": [ + "We studied numbers (cf., [Chapter 5 ](https://nbviewer.jupyter.org/github/webartifex/intro-to-python/blob/develop/05_numbers/00_content.ipynb)) and textual data (cf., [Chapter 6 ](https://nbviewer.jupyter.org/github/webartifex/intro-to-python/blob/develop/06_text/00_content.ipynb)) first mainly because objects of the presented data types are \"simple.\" That is so for two reasons: First, they are *immutable*, and, as we saw in the \"*Who am I? And how many?*\" section in [Chapter 1 ](https://nbviewer.jupyter.org/github/webartifex/intro-to-python/blob/develop/01_elements/03_content.ipynb#Who-am-I?-And-how-many?), mutable objects can quickly become hard to reason about. Second, they are \"flat\" in the sense that they are *not* composed of other objects.\n", + "\n", + "The `str` type is a bit of a corner case in this regard. While one could argue that a longer `str` object, for example, `\"text\"`, is composed of individual characters, this is *not* the case in memory as the literal `\"text\"` only creates *one* object (i.e., one \"bag\" of $0$s and $1$s modeling all characters).\n", + "\n", + "This chapter, [Chapter 8 ](https://nbviewer.jupyter.org/github/webartifex/intro-to-python/blob/develop/08_mfr/00_content.ipynb), [Chapter 9 ](https://nbviewer.jupyter.org/github/webartifex/intro-to-python/blob/develop/09_mappings/00_content.ipynb), and [Chapter 10 ](https://nbviewer.jupyter.org/github/webartifex/intro-to-python/blob/develop/10_arrays/00_content.ipynb) introduce various \"complex\" data types. While some are mutable and others are not, they all share that they are primarily used to \"manage,\" or structure, the memory in a program (i.e., they provide references to other objects). Unsurprisingly, computer scientists refer to the ideas behind these data types as **[data structures ](https://en.wikipedia.org/wiki/Data_structure)**.\n", + "\n", + "In this chapter, we focus on data types that model all kinds of sequential data. Examples of such data are [spreadsheets ](https://en.wikipedia.org/wiki/Spreadsheet) or [matrices ](https://en.wikipedia.org/wiki/Matrix_%28mathematics%29) and [vectors ](https://en.wikipedia.org/wiki/Vector_%28mathematics_and_physics%29). These formats share the property that they are composed of smaller units that come in a sequence of, for example, rows/columns/cells or elements/entries." + ] + }, + { + "cell_type": "markdown", + "metadata": { + "slideshow": { + "slide_type": "slide" + } + }, + "source": [ + "## Collections vs. Sequences" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "slideshow": { + "slide_type": "skip" + } + }, + "source": [ + "[Chapter 6 ](https://nbviewer.jupyter.org/github/webartifex/intro-to-python/blob/develop/06_text/00_content.ipynb#A-\"String\"-of-Characters) already describes the **sequence** properties of `str` objects. In this section, we take a step back and study these properties one by one.\n", + "\n", + "The [collections.abc ](https://docs.python.org/3/library/collections.abc.html) module in the [standard library ](https://docs.python.org/3/library/index.html) defines a variety of **abstract base classes** (ABCs). We saw ABCs already in [Chapter 5 ](https://nbviewer.jupyter.org/github/webartifex/intro-to-python/blob/develop/05_numbers/02_content.ipynb#The-Numerical-Tower), where we use the ones from the [numbers ](https://docs.python.org/3/library/numbers.html) module in the [standard library ](https://docs.python.org/3/library/index.html) to classify Python's numeric data types according to mathematical ideas. Now, we take the ABCs from the [collections.abc ](https://docs.python.org/3/library/collections.abc.html) module to classify the data types in this chapter according to their behavior in various contexts.\n", + "\n", + "As an illustration, consider `numbers` and `text` below, two objects of *different* types." + ] + }, + { + "cell_type": "code", + "execution_count": 1, + "metadata": { + "slideshow": { + "slide_type": "slide" + } + }, + "outputs": [], + "source": [ + "numbers = [7, 11, 8, 5, 3, 12, 2, 6, 9, 10, 1, 4]\n", + "text = \"Lorem ipsum dolor sit amet.\"" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "slideshow": { + "slide_type": "skip" + } + }, + "source": [ + "Among others, one commonality between the two is that we may loop over them with the `for` statement. So, in the context of iteration, both exhibit the *same* behavior." + ] + }, + { + "cell_type": "code", + "execution_count": 2, + "metadata": { + "slideshow": { + "slide_type": "fragment" + } + }, + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "7 11 8 5 3 12 2 6 9 10 1 4 " + ] + } + ], + "source": [ + "for number in numbers:\n", + " print(number, end=\" \")" + ] + }, + { + "cell_type": "code", + "execution_count": 3, + "metadata": { + "slideshow": { + "slide_type": "fragment" + } + }, + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "L o r e m i p s u m d o l o r s i t a m e t . " + ] + } + ], + "source": [ + "for character in text:\n", + " print(character, end=\" \")" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "slideshow": { + "slide_type": "skip" + } + }, + "source": [ + "In [Chapter 4 ](https://nbviewer.jupyter.org/github/webartifex/intro-to-python/blob/develop/04_iteration/02_content.ipynb#Containers-vs.-Iterables), we referred to such types as *iterables*. That is *not* a proper [English](https://dictionary.cambridge.org/spellcheck/english-german/?q=iterable) word, even if it may sound like one at first sight. Yet, it is an official term in the Python world formalized with the `Iterable` ABC in the [collections.abc ](https://docs.python.org/3/library/collections.abc.html) module.\n", + "\n", + "For the data science practitioner, it is worthwhile to know such terms as, for example, the documentation on the [built-ins ](https://docs.python.org/3/library/functions.html) uses them extensively: In simple words, any built-in that takes an argument called \"*iterable*\" may be called with *any* object that supports being looped over. Already familiar [built-ins ](https://docs.python.org/3/library/functions.html) include [enumerate() ](https://docs.python.org/3/library/functions.html#enumerate), [sum() ](https://docs.python.org/3/library/functions.html#sum), or [zip() ](https://docs.python.org/3/library/functions.html#zip). So, they do *not* require the argument to be of a certain data type (e.g., `list`); instead, any *iterable* type works." + ] + }, + { + "cell_type": "code", + "execution_count": 4, + "metadata": { + "slideshow": { + "slide_type": "slide" + } + }, + "outputs": [], + "source": [ + "import collections.abc as abc" + ] + }, + { + "cell_type": "code", + "execution_count": 5, + "metadata": { + "slideshow": { + "slide_type": "fragment" + } + }, + "outputs": [ + { + "data": { + "text/plain": [ + "collections.abc.Iterable" + ] + }, + "execution_count": 5, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "abc.Iterable" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "slideshow": { + "slide_type": "skip" + } + }, + "source": [ + "As seen in [Chapter 5 ](https://nbviewer.jupyter.org/github/webartifex/intro-to-python/blob/develop/05_numbers/02_content.ipynb#Goose-Typing), we can use ABCs with the built-in [isinstance() ](https://docs.python.org/3/library/functions.html#isinstance) function to check if an object supports a behavior.\n", + "\n", + "So, let's \"ask\" Python if it can loop over `numbers` or `text`." + ] + }, + { + "cell_type": "code", + "execution_count": 6, + "metadata": { + "slideshow": { + "slide_type": "slide" + } + }, + "outputs": [ + { + "data": { + "text/plain": [ + "True" + ] + }, + "execution_count": 6, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "isinstance(numbers, abc.Iterable)" + ] + }, + { + "cell_type": "code", + "execution_count": 7, + "metadata": { + "slideshow": { + "slide_type": "fragment" + } + }, + "outputs": [ + { + "data": { + "text/plain": [ + "True" + ] + }, + "execution_count": 7, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "isinstance(text, abc.Iterable)" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "slideshow": { + "slide_type": "skip" + } + }, + "source": [ + "Contrary to `list` or `str` objects, numeric objects are *not* iterable." + ] + }, + { + "cell_type": "code", + "execution_count": 8, + "metadata": { + "slideshow": { + "slide_type": "skip" + } + }, + "outputs": [ + { + "data": { + "text/plain": [ + "False" + ] + }, + "execution_count": 8, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "isinstance(999, abc.Iterable)" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "slideshow": { + "slide_type": "skip" + } + }, + "source": [ + "Instead of asking, we could try to loop over `999`, but this results in a `TypeError`." + ] + }, + { + "cell_type": "code", + "execution_count": 9, + "metadata": { + "slideshow": { + "slide_type": "skip" + } + }, + "outputs": [ + { + "ename": "TypeError", + "evalue": "'int' object is not iterable", + "output_type": "error", + "traceback": [ + "\u001b[0;31m---------------------------------------------------------------------------\u001b[0m", + "\u001b[0;31mTypeError\u001b[0m Traceback (most recent call last)", + "\u001b[0;32m\u001b[0m in \u001b[0;36m\u001b[0;34m\u001b[0m\n\u001b[0;32m----> 1\u001b[0;31m \u001b[0;32mfor\u001b[0m \u001b[0mdigit\u001b[0m \u001b[0;32min\u001b[0m \u001b[0;36m999\u001b[0m\u001b[0;34m:\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0m\u001b[1;32m 2\u001b[0m \u001b[0mprint\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0mdigit\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n", + "\u001b[0;31mTypeError\u001b[0m: 'int' object is not iterable" + ] + } + ], + "source": [ + "for digit in 999:\n", + " print(digit)" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "slideshow": { + "slide_type": "skip" + } + }, + "source": [ + "Most of the data types in this chapter and [Chapter 9 ](https://nbviewer.jupyter.org/github/webartifex/intro-to-python/blob/develop/09_mappings/00_content.ipynb) and [Chapter 10 ](https://nbviewer.jupyter.org/github/webartifex/intro-to-python/blob/develop/10_arrays/00_content.ipynb) exhibit three [orthogonal ](https://en.wikipedia.org/wiki/Orthogonality) (i.e., \"independent\") behaviors, formalized by ABCs in the [collections.abc ](https://docs.python.org/3/library/collections.abc.html) module as:\n", + "- `Iterable`: An object may be looped over.\n", + "- `Container`: An object \"contains\" references to other objects; a \"whole\" is composed of many \"parts.\"\n", + "- `Sized`: The number of references to other objects, the \"parts,\" is *finite*.\n", + "\n", + "The characteristical operation supported by `Container` types is the `in` operator for membership testing." + ] + }, + { + "cell_type": "code", + "execution_count": 10, + "metadata": { + "slideshow": { + "slide_type": "slide" + } + }, + "outputs": [ + { + "data": { + "text/plain": [ + "False" + ] + }, + "execution_count": 10, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "0 in numbers" + ] + }, + { + "cell_type": "code", + "execution_count": 11, + "metadata": { + "slideshow": { + "slide_type": "fragment" + } + }, + "outputs": [ + { + "data": { + "text/plain": [ + "True" + ] + }, + "execution_count": 11, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "\"l\" in text" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "slideshow": { + "slide_type": "skip" + } + }, + "source": [ + "Alternatively, we could also check if `numbers` and `text` are `Container` types with [isinstance() ](https://docs.python.org/3/library/functions.html#isinstance)." + ] + }, + { + "cell_type": "code", + "execution_count": 12, + "metadata": { + "slideshow": { + "slide_type": "fragment" + } + }, + "outputs": [ + { + "data": { + "text/plain": [ + "True" + ] + }, + "execution_count": 12, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "isinstance(numbers, abc.Container)" + ] + }, + { + "cell_type": "code", + "execution_count": 13, + "metadata": { + "slideshow": { + "slide_type": "fragment" + } + }, + "outputs": [ + { + "data": { + "text/plain": [ + "True" + ] + }, + "execution_count": 13, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "isinstance(text, abc.Container)" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "slideshow": { + "slide_type": "skip" + } + }, + "source": [ + "Numeric objects do *not* \"contain\" references to other objects, and that is why they are considered \"flat\" data types. The `in` operator raises a `TypeError`. Conceptually speaking, Python views numeric types as \"wholes\" without any \"parts.\"" + ] + }, + { + "cell_type": "code", + "execution_count": 14, + "metadata": { + "slideshow": { + "slide_type": "skip" + } + }, + "outputs": [ + { + "data": { + "text/plain": [ + "False" + ] + }, + "execution_count": 14, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "isinstance(999, abc.Container)" + ] + }, + { + "cell_type": "code", + "execution_count": 15, + "metadata": { + "slideshow": { + "slide_type": "skip" + } + }, + "outputs": [ + { + "ename": "TypeError", + "evalue": "argument of type 'int' is not iterable", + "output_type": "error", + "traceback": [ + "\u001b[0;31m---------------------------------------------------------------------------\u001b[0m", + "\u001b[0;31mTypeError\u001b[0m Traceback (most recent call last)", + "\u001b[0;32m\u001b[0m in \u001b[0;36m\u001b[0;34m\u001b[0m\n\u001b[0;32m----> 1\u001b[0;31m \u001b[0;36m9\u001b[0m \u001b[0;32min\u001b[0m \u001b[0;36m999\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0m", + "\u001b[0;31mTypeError\u001b[0m: argument of type 'int' is not iterable" + ] + } + ], + "source": [ + "9 in 999" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "slideshow": { + "slide_type": "skip" + } + }, + "source": [ + "Analogously, being `Sized` types, we can pass `numbers` and `text` as the argument to the built-in [len() ](https://docs.python.org/3/library/functions.html#len) function and obtain \"meaningful\" results. The exact meaning depends on the data type: For `numbers`, [len() ](https://docs.python.org/3/library/functions.html#len) tells us how many elements are in the `list` object; for `text`, it tells us how many [Unicode characters ](https://en.wikipedia.org/wiki/Unicode) make up the `str` object. *Abstractly* speaking, both data types exhibit the *same* behavior of *finiteness*." + ] + }, + { + "cell_type": "code", + "execution_count": 16, + "metadata": { + "slideshow": { + "slide_type": "slide" + } + }, + "outputs": [ + { + "data": { + "text/plain": [ + "12" + ] + }, + "execution_count": 16, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "len(numbers)" + ] + }, + { + "cell_type": "code", + "execution_count": 17, + "metadata": { + "slideshow": { + "slide_type": "fragment" + } + }, + "outputs": [ + { + "data": { + "text/plain": [ + "27" + ] + }, + "execution_count": 17, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "len(text)" + ] + }, + { + "cell_type": "code", + "execution_count": 18, + "metadata": { + "slideshow": { + "slide_type": "fragment" + } + }, + "outputs": [ + { + "data": { + "text/plain": [ + "True" + ] + }, + "execution_count": 18, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "isinstance(numbers, abc.Sized)" + ] + }, + { + "cell_type": "code", + "execution_count": 19, + "metadata": { + "slideshow": { + "slide_type": "fragment" + } + }, + "outputs": [ + { + "data": { + "text/plain": [ + "True" + ] + }, + "execution_count": 19, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "isinstance(text, abc.Sized)" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "slideshow": { + "slide_type": "skip" + } + }, + "source": [ + "On the contrary, even though `999` consists of three digits for humans, numeric objects in Python have no concept of a \"size\" or \"length,\" and the [len() ](https://docs.python.org/3/library/functions.html#len) function raises a `TypeError`." + ] + }, + { + "cell_type": "code", + "execution_count": 20, + "metadata": { + "slideshow": { + "slide_type": "skip" + } + }, + "outputs": [ + { + "data": { + "text/plain": [ + "False" + ] + }, + "execution_count": 20, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "isinstance(999, abc.Sized)" + ] + }, + { + "cell_type": "code", + "execution_count": 21, + "metadata": { + "slideshow": { + "slide_type": "skip" + } + }, + "outputs": [ + { + "ename": "TypeError", + "evalue": "object of type 'int' has no len()", + "output_type": "error", + "traceback": [ + "\u001b[0;31m---------------------------------------------------------------------------\u001b[0m", + "\u001b[0;31mTypeError\u001b[0m Traceback (most recent call last)", + "\u001b[0;32m\u001b[0m in \u001b[0;36m\u001b[0;34m\u001b[0m\n\u001b[0;32m----> 1\u001b[0;31m \u001b[0mlen\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0;36m999\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0m", + "\u001b[0;31mTypeError\u001b[0m: object of type 'int' has no len()" + ] + } + ], + "source": [ + "len(999)" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "slideshow": { + "slide_type": "skip" + } + }, + "source": [ + "These three behaviors are so essential that whenever they coincide for a data type, it is called a **collection**, formalized with the `Collection` ABC. That is where the [collections.abc ](https://docs.python.org/3/library/collections.abc.html) module got its name from: It summarizes all ABCs related to collections; in particular, it defines a hierarchy of specialized kinds of collections.\n", + "\n", + "Without going into too much detail, one way to read the summary table at the beginning of the [collections.abc ](https://docs.python.org/3/library/collections.abc.html) module's documention is as follows: The first column, titled \"ABC\", lists all collection-related ABCs in Python. The second column, titled \"Inherits from,\" indicates if the idea behind the ABC is *original* (e.g., the first row with the `Container` ABC has an empty \"Inherits from\" column) or a *combination* (e.g., the row with the `Collection` ABC has `Sized`, `Iterable`, and `Container` in the \"Inherits from\" column). The third and fourth columns list the methods that come with a data type following an ABC. We keep ignoring the methods named in the dunder style for now.\n", + "\n", + "So, let's confirm that both `numbers` and `text` are collections." + ] + }, + { + "cell_type": "code", + "execution_count": 22, + "metadata": { + "slideshow": { + "slide_type": "slide" + } + }, + "outputs": [ + { + "data": { + "text/plain": [ + "True" + ] + }, + "execution_count": 22, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "isinstance(numbers, abc.Collection)" + ] + }, + { + "cell_type": "code", + "execution_count": 23, + "metadata": { + "slideshow": { + "slide_type": "fragment" + } + }, + "outputs": [ + { + "data": { + "text/plain": [ + "True" + ] + }, + "execution_count": 23, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "isinstance(text, abc.Collection)" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "slideshow": { + "slide_type": "skip" + } + }, + "source": [ + "They share one more common behavior: When looping over them, we can *predict* the *order* of the elements or characters. The ABC in the [collections.abc ](https://docs.python.org/3/library/collections.abc.html) module corresponding to this behavior is `Reversible`. While sounding unintuitive at first, it is evident that if something is reversible, it must have a forward order, to begin with.\n", + "\n", + "The [reversed() ](https://docs.python.org/3/library/functions.html#reversed) built-in allows us to loop over the elements or characters in reverse order." + ] + }, + { + "cell_type": "code", + "execution_count": 24, + "metadata": { + "slideshow": { + "slide_type": "slide" + } + }, + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "4 1 10 9 6 2 12 3 5 8 11 7 " + ] + } + ], + "source": [ + "for number in reversed(numbers):\n", + " print(number, end=\" \")" + ] + }, + { + "cell_type": "code", + "execution_count": 25, + "metadata": { + "slideshow": { + "slide_type": "fragment" + } + }, + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + ". t e m a t i s r o l o d m u s p i m e r o L " + ] + } + ], + "source": [ + "for character in reversed(text):\n", + " print(character, end=\" \")" + ] + }, + { + "cell_type": "code", + "execution_count": 26, + "metadata": { + "slideshow": { + "slide_type": "fragment" + } + }, + "outputs": [ + { + "data": { + "text/plain": [ + "True" + ] + }, + "execution_count": 26, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "isinstance(numbers, abc.Reversible)" + ] + }, + { + "cell_type": "code", + "execution_count": 27, + "metadata": { + "slideshow": { + "slide_type": "fragment" + } + }, + "outputs": [ + { + "data": { + "text/plain": [ + "True" + ] + }, + "execution_count": 27, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "isinstance(text, abc.Reversible)" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "slideshow": { + "slide_type": "skip" + } + }, + "source": [ + "Collections that exhibit this fourth behavior are referred to as **sequences**, formalized with the `Sequence` ABC in the [collections.abc ](https://docs.python.org/3/library/collections.abc.html) module." + ] + }, + { + "cell_type": "code", + "execution_count": 28, + "metadata": { + "slideshow": { + "slide_type": "slide" + } + }, + "outputs": [ + { + "data": { + "text/plain": [ + "True" + ] + }, + "execution_count": 28, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "isinstance(numbers, abc.Sequence)" + ] + }, + { + "cell_type": "code", + "execution_count": 29, + "metadata": { + "slideshow": { + "slide_type": "fragment" + } + }, + "outputs": [ + { + "data": { + "text/plain": [ + "True" + ] + }, + "execution_count": 29, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "isinstance(text, abc.Sequence)" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "slideshow": { + "slide_type": "skip" + } + }, + "source": [ + "The data types introduced in this chapter are sequences. Nevertheless, we also look at some data types that are neither collections nor sequences but are still useful to model sequential data in practice in [Chapter 8 ](https://nbviewer.jupyter.org/github/webartifex/intro-to-python/blob/develop/08_mfr/00_content.ipynb).\n", + "\n", + "In Python-related documentations, the terms collection and sequence are heavily used, and the data science practitioner should always think of them in terms of the three or four behaviors they exhibit.\n", + "\n", + "Data types that are collections but not sequences are covered in [Chapter 9 ](https://nbviewer.jupyter.org/github/webartifex/intro-to-python/blob/develop/09_mappings/00_content.ipynb)." + ] + } + ], + "metadata": { + "kernelspec": { + "display_name": "Python 3", + "language": "python", + "name": "python3" + }, + "language_info": { + "codemirror_mode": { + "name": "ipython", + "version": 3 + }, + "file_extension": ".py", + "mimetype": "text/x-python", + "name": "python", + "nbconvert_exporter": "python", + "pygments_lexer": "ipython3", + "version": "3.8.6" + }, + "livereveal": { + "auto_select": "code", + "auto_select_fragment": true, + "scroll": true, + "theme": "serif" + }, + "toc": { + "base_numbering": 1, + "nav_menu": {}, + "number_sections": false, + "sideBar": true, + "skip_h1_title": true, + "title_cell": "Table of Contents", + "title_sidebar": "Contents", + "toc_cell": false, + "toc_position": { + "height": "calc(100% - 180px)", + "left": "10px", + "top": "150px", + "width": "384px" + }, + "toc_section_display": false, + "toc_window_display": false + } + }, + "nbformat": 4, + "nbformat_minor": 4 +} diff --git a/07_sequences/01_content.ipynb b/07_sequences/01_content.ipynb new file mode 100644 index 0000000..1dd590a --- /dev/null +++ b/07_sequences/01_content.ipynb @@ -0,0 +1,2999 @@ +{ + "cells": [ + { + "cell_type": "markdown", + "metadata": { + "slideshow": { + "slide_type": "skip" + } + }, + "source": [ + "**Note**: Click on \"*Kernel*\" > \"*Restart Kernel and Clear All Outputs*\" in [JupyterLab](https://jupyterlab.readthedocs.io/en/stable/) *before* reading this notebook to reset its output. If you cannot run this file on your machine, you may want to open it [in the cloud ](https://mybinder.org/v2/gh/webartifex/intro-to-python/develop?urlpath=lab/tree/07_sequences/01_content.ipynb)." + ] + }, + { + "cell_type": "markdown", + "metadata": { + "slideshow": { + "slide_type": "slide" + } + }, + "source": [ + "# Chapter 7: Sequential Data (continued)" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "slideshow": { + "slide_type": "skip" + } + }, + "source": [ + "In this second part of the chapter, we look closely at the built-in `list` type, which is probably the most commonly used sequence data type in practice." + ] + }, + { + "cell_type": "markdown", + "metadata": { + "slideshow": { + "slide_type": "slide" + } + }, + "source": [ + "## The `list` Type" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "slideshow": { + "slide_type": "skip" + } + }, + "source": [ + "As already seen multiple times, to create a `list` object, we use the *literal notation* and list all elements within brackets `[` and `]`." + ] + }, + { + "cell_type": "code", + "execution_count": 1, + "metadata": { + "slideshow": { + "slide_type": "slide" + } + }, + "outputs": [], + "source": [ + "empty = []" + ] + }, + { + "cell_type": "code", + "execution_count": 2, + "metadata": { + "slideshow": { + "slide_type": "fragment" + } + }, + "outputs": [], + "source": [ + "simple = [40, 50]" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "slideshow": { + "slide_type": "skip" + } + }, + "source": [ + "The elements do *not* need to be of the *same* type, and `list` objects may also be **nested**." + ] + }, + { + "cell_type": "code", + "execution_count": 3, + "metadata": { + "slideshow": { + "slide_type": "fragment" + } + }, + "outputs": [], + "source": [ + "nested = [empty, 10, 20.0, \"Thirty\", simple]" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "slideshow": { + "slide_type": "skip" + } + }, + "source": [ + "[PythonTutor ](http://pythontutor.com/visualize.html#code=empty%20%3D%20%5B%5D%0Asimple%20%3D%20%5B40,%2050%5D%0Anested%20%3D%20%5Bempty,%2010,%2020.0,%20%22Thirty%22,%20simple%5D&cumulative=false&curInstr=0&heapPrimitives=nevernest&mode=display&origin=opt-frontend.js&py=3&rawInputLstJSON=%5B%5D&textReferences=false) shows how `nested` holds references to the `empty` and `simple` objects. Technically, it holds three more references to the `10`, `20.0`, and `\"Thirty\"` objects as well. However, to simplify the visualization, these three objects are shown right inside the `nested` object. That may be done because they are immutable and \"flat\" data types. In general, the $0$s and $1$s inside a `list` object in memory always constitute references to other objects only." + ] + }, + { + "cell_type": "code", + "execution_count": 4, + "metadata": { + "slideshow": { + "slide_type": "fragment" + } + }, + "outputs": [ + { + "data": { + "text/plain": [ + "[[], 10, 20.0, 'Thirty', [40, 50]]" + ] + }, + "execution_count": 4, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "nested" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "slideshow": { + "slide_type": "skip" + } + }, + "source": [ + "Let's not forget that `nested` is an object on its own with an *identity* and *data type*." + ] + }, + { + "cell_type": "code", + "execution_count": 5, + "metadata": { + "slideshow": { + "slide_type": "skip" + } + }, + "outputs": [ + { + "data": { + "text/plain": [ + "139688113982848" + ] + }, + "execution_count": 5, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "id(nested)" + ] + }, + { + "cell_type": "code", + "execution_count": 6, + "metadata": { + "slideshow": { + "slide_type": "skip" + } + }, + "outputs": [ + { + "data": { + "text/plain": [ + "list" + ] + }, + "execution_count": 6, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "type(nested)" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "slideshow": { + "slide_type": "skip" + } + }, + "source": [ + "Alternatively, we use the built-in [list() ](https://docs.python.org/3/library/functions.html#func-list) constructor to create a `list` object out of any (finite) *iterable* we pass to it as the argument.\n", + "\n", + "For example, we can wrap the [range() ](https://docs.python.org/3/library/functions.html#func-range) built-in with [list() ](https://docs.python.org/3/library/functions.html#func-list): As described in [Chapter 4 ](https://nbviewer.jupyter.org/github/webartifex/intro-to-python/blob/develop/04_iteration/02_content.ipynb#Containers-vs.-Iterables), `range` objects, like `range(1, 13)` below, are iterable and generate `int` objects \"on the fly\" (i.e., one by one). The [list() ](https://docs.python.org/3/library/functions.html#func-list) around it acts like a `for`-loop and **materializes** twelve `int` objects in memory that then become the elements of the newly created `list` object. [PythonTutor ](http://pythontutor.com/visualize.html#code=r%20%3D%20range%281,%2013%29%0Al%20%3D%20list%28range%281,%2013%29%29&cumulative=false&curInstr=0&heapPrimitives=nevernest&mode=display&origin=opt-frontend.js&py=3&rawInputLstJSON=%5B%5D&textReferences=false) shows this difference visually." + ] + }, + { + "cell_type": "code", + "execution_count": 7, + "metadata": { + "slideshow": { + "slide_type": "slide" + } + }, + "outputs": [ + { + "data": { + "text/plain": [ + "[1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12]" + ] + }, + "execution_count": 7, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "list(range(1, 13))" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "slideshow": { + "slide_type": "skip" + } + }, + "source": [ + "Beware of passing a `range` object over a \"big\" horizon as the argument to [list() ](https://docs.python.org/3/library/functions.html#func-list) as that may lead to a `MemoryError` and the computer crashing." + ] + }, + { + "cell_type": "code", + "execution_count": 8, + "metadata": { + "slideshow": { + "slide_type": "fragment" + } + }, + "outputs": [ + { + "ename": "MemoryError", + "evalue": "", + "output_type": "error", + "traceback": [ + "\u001b[0;31m---------------------------------------------------------------------------\u001b[0m", + "\u001b[0;31mMemoryError\u001b[0m Traceback (most recent call last)", + "\u001b[0;32m\u001b[0m in \u001b[0;36m\u001b[0;34m\u001b[0m\n\u001b[0;32m----> 1\u001b[0;31m \u001b[0mlist\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0mrange\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0;36m999_999_999_999\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0m", + "\u001b[0;31mMemoryError\u001b[0m: " + ] + } + ], + "source": [ + "list(range(999_999_999_999))" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "slideshow": { + "slide_type": "skip" + } + }, + "source": [ + "As another example, we create a `list` object from a `str` object, which is iterable, as well. Then, the individual characters become the elements of the new `list` object!" + ] + }, + { + "cell_type": "code", + "execution_count": 9, + "metadata": { + "slideshow": { + "slide_type": "fragment" + } + }, + "outputs": [ + { + "data": { + "text/plain": [ + "['i', 't', 'e', 'r', 'a', 'b', 'l', 'e']" + ] + }, + "execution_count": 9, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "list(\"iterable\")" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "slideshow": { + "slide_type": "slide" + } + }, + "source": [ + "## Sequence Behaviors" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "slideshow": { + "slide_type": "skip" + } + }, + "source": [ + "`list` objects are *sequences*. To reiterate that, we briefly summarize the *four* behaviors of a sequence and provide some more `list`-specific details below:" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "slideshow": { + "slide_type": "slide" + } + }, + "source": [ + "- `Container`:\n", + " - holds references to other objects in memory (with their own *identity* and *type*)\n", + " - implements membership testing via the `in` operator\n", + "- `Iterable`:\n", + " - supports being looped over\n", + " - works with the `for` or `while` statements\n", + "- `Reversible`:\n", + " - the elements come in a *predictable* order that we may loop over in a forward or backward fashion\n", + " - works with the [reversed() ](https://docs.python.org/3/library/functions.html#reversed) built-in\n", + "- `Sized`:\n", + " - the number of elements is finite *and* known in advance\n", + " - works with the built-in [len() ](https://docs.python.org/3/library/functions.html#len) function" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "slideshow": { + "slide_type": "skip" + } + }, + "source": [ + "The \"length\" of `nested` is *five* because `empty` and `simple` count as *one* element each. In other words, `nested` holds five references to other objects, two of which are `list` objects." + ] + }, + { + "cell_type": "code", + "execution_count": 10, + "metadata": { + "slideshow": { + "slide_type": "slide" + } + }, + "outputs": [ + { + "data": { + "text/plain": [ + "5" + ] + }, + "execution_count": 10, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "len(nested)" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "slideshow": { + "slide_type": "skip" + } + }, + "source": [ + "With a `for`-loop, we can iterate over all elements in a *predictable* order, forward or backward. As `list` objects hold *references* to other *objects*, these have an *indentity* and may even be of *different* types; however, the latter observation is rarely, if ever, useful in practice." + ] + }, + { + "cell_type": "code", + "execution_count": 11, + "metadata": { + "slideshow": { + "slide_type": "slide" + } + }, + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "[] 139688113991744 \n", + "10 94291567220768 \n", + "20.0 139688114016656 \n", + "Thirty 139688114205808 \n", + "[40, 50] 139688113982208 \n" + ] + } + ], + "source": [ + "for element in nested:\n", + " print(str(element).ljust(10), str(id(element)).ljust(18), type(element))" + ] + }, + { + "cell_type": "code", + "execution_count": 12, + "metadata": { + "slideshow": { + "slide_type": "slide" + } + }, + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "[40, 50] Thirty 20.0 10 [] " + ] + } + ], + "source": [ + "for element in reversed(nested):\n", + " print(element, end=\" \")" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "slideshow": { + "slide_type": "skip" + } + }, + "source": [ + "The `in` operator checks if a given object is \"contained\" in a `list` object. It uses the `==` operator behind the scenes (i.e., *not* the `is` operator) conducting a **[linear search ](https://en.wikipedia.org/wiki/Linear_search)**: So, Python implicitly loops over *all* elements and only stops prematurely if an element evaluates equal to the searched object. A linear search may, therefore, be relatively *slow* for big `list` objects." + ] + }, + { + "cell_type": "code", + "execution_count": 13, + "metadata": { + "slideshow": { + "slide_type": "slide" + } + }, + "outputs": [ + { + "data": { + "text/plain": [ + "True" + ] + }, + "execution_count": 13, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "10 in nested" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "slideshow": { + "slide_type": "skip" + } + }, + "source": [ + "`20` compares equal to the `20.0` in `nested`." + ] + }, + { + "cell_type": "code", + "execution_count": 14, + "metadata": { + "slideshow": { + "slide_type": "fragment" + } + }, + "outputs": [ + { + "data": { + "text/plain": [ + "True" + ] + }, + "execution_count": 14, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "20 in nested" + ] + }, + { + "cell_type": "code", + "execution_count": 15, + "metadata": { + "slideshow": { + "slide_type": "fragment" + } + }, + "outputs": [ + { + "data": { + "text/plain": [ + "False" + ] + }, + "execution_count": 15, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "30 in nested" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "slideshow": { + "slide_type": "slide" + } + }, + "source": [ + "## Indexing" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "slideshow": { + "slide_type": "skip" + } + }, + "source": [ + "Because of the *predictable* order and the *finiteness*, each element in a sequence can be labeled with a unique *index*, an `int` object in the range $0 \\leq \\text{index} < \\lvert \\text{sequence} \\rvert$.\n", + "\n", + "Brackets, `[` and `]`, are the literal syntax for accessing individual elements of any sequence type. In this book, we also call them the *indexing operator* in this context." + ] + }, + { + "cell_type": "code", + "execution_count": 16, + "metadata": { + "slideshow": { + "slide_type": "slide" + } + }, + "outputs": [ + { + "data": { + "text/plain": [ + "[]" + ] + }, + "execution_count": 16, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "nested[0]" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "slideshow": { + "slide_type": "skip" + } + }, + "source": [ + "The last index is one less than `len(nested)`, and Python raises an `IndexError` if we look up an index that is not in the range." + ] + }, + { + "cell_type": "code", + "execution_count": 17, + "metadata": { + "slideshow": { + "slide_type": "skip" + } + }, + "outputs": [ + { + "ename": "IndexError", + "evalue": "list index out of range", + "output_type": "error", + "traceback": [ + "\u001b[0;31m---------------------------------------------------------------------------\u001b[0m", + "\u001b[0;31mIndexError\u001b[0m Traceback (most recent call last)", + "\u001b[0;32m\u001b[0m in \u001b[0;36m\u001b[0;34m\u001b[0m\n\u001b[0;32m----> 1\u001b[0;31m \u001b[0mnested\u001b[0m\u001b[0;34m[\u001b[0m\u001b[0;36m5\u001b[0m\u001b[0;34m]\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0m", + "\u001b[0;31mIndexError\u001b[0m: list index out of range" + ] + } + ], + "source": [ + "nested[5]" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "slideshow": { + "slide_type": "skip" + } + }, + "source": [ + "Negative indices are used to count in reverse order from the end of a sequence, and brackets may be chained to access nested objects. So, to access the `50` inside `simple` via the `nested` object, we write `nested[-1][1]`." + ] + }, + { + "cell_type": "code", + "execution_count": 18, + "metadata": { + "slideshow": { + "slide_type": "fragment" + } + }, + "outputs": [ + { + "data": { + "text/plain": [ + "[40, 50]" + ] + }, + "execution_count": 18, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "nested[-1]" + ] + }, + { + "cell_type": "code", + "execution_count": 19, + "metadata": { + "slideshow": { + "slide_type": "fragment" + } + }, + "outputs": [ + { + "data": { + "text/plain": [ + "50" + ] + }, + "execution_count": 19, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "nested[-1][1]" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "slideshow": { + "slide_type": "slide" + } + }, + "source": [ + "## Slicing" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "slideshow": { + "slide_type": "skip" + } + }, + "source": [ + "Slicing `list` objects works analogously to slicing `str` objects: We use the literal syntax with either one or two colons `:` inside the brackets `[]` to separate the *start*, *stop*, and *step* values. Slicing creates a *new* `list` object with the elements chosen from the original one.\n", + "\n", + "For example, to obtain the three elements in the \"middle\" of `nested`, we slice from `1` (including) to `4` (excluding)." + ] + }, + { + "cell_type": "code", + "execution_count": 20, + "metadata": { + "slideshow": { + "slide_type": "slide" + } + }, + "outputs": [ + { + "data": { + "text/plain": [ + "[10, 20.0, 'Thirty']" + ] + }, + "execution_count": 20, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "nested[1:4]" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "slideshow": { + "slide_type": "skip" + } + }, + "source": [ + "To obtain \"every other\" element, we slice from beginning to end, defaulting to `0` and `len(nested)` when omitted, in steps of `2`." + ] + }, + { + "cell_type": "code", + "execution_count": 21, + "metadata": { + "slideshow": { + "slide_type": "fragment" + } + }, + "outputs": [ + { + "data": { + "text/plain": [ + "[[], 20.0, [40, 50]]" + ] + }, + "execution_count": 21, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "nested[::2]" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "slideshow": { + "slide_type": "skip" + } + }, + "source": [ + "The literal notation with the colons `:` is *syntactic sugar*. It saves us from using the [slice() ](https://docs.python.org/3/library/functions.html#slice) built-in to create `slice` objects. [slice() ](https://docs.python.org/3/library/functions.html#slice) takes *start*, *stop*, and *step* arguments in the same way as the familiar [range() ](https://docs.python.org/3/library/functions.html#func-range), and the `slice` objects it creates are used just as *indexes* above.\n", + "\n", + "In most cases, the literal notation is more convenient to use; however, with `slice` objects, we can give names to slices and reuse them across several sequences." + ] + }, + { + "cell_type": "code", + "execution_count": 22, + "metadata": { + "slideshow": { + "slide_type": "skip" + } + }, + "outputs": [], + "source": [ + "middle = slice(1, 4)" + ] + }, + { + "cell_type": "code", + "execution_count": 23, + "metadata": { + "slideshow": { + "slide_type": "skip" + } + }, + "outputs": [ + { + "data": { + "text/plain": [ + "slice" + ] + }, + "execution_count": 23, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "type(middle)" + ] + }, + { + "cell_type": "code", + "execution_count": 24, + "metadata": { + "slideshow": { + "slide_type": "skip" + } + }, + "outputs": [ + { + "data": { + "text/plain": [ + "[10, 20.0, 'Thirty']" + ] + }, + "execution_count": 24, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "nested[middle]" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "slideshow": { + "slide_type": "skip" + } + }, + "source": [ + "`slice` objects come with three read-only attributes `start`, `stop`, and `step` on them." + ] + }, + { + "cell_type": "code", + "execution_count": 25, + "metadata": { + "slideshow": { + "slide_type": "skip" + } + }, + "outputs": [ + { + "data": { + "text/plain": [ + "1" + ] + }, + "execution_count": 25, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "middle.start" + ] + }, + { + "cell_type": "code", + "execution_count": 26, + "metadata": { + "slideshow": { + "slide_type": "skip" + } + }, + "outputs": [ + { + "data": { + "text/plain": [ + "4" + ] + }, + "execution_count": 26, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "middle.stop" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "slideshow": { + "slide_type": "skip" + } + }, + "source": [ + "If not passed to [slice() ](https://docs.python.org/3/library/functions.html#slice), these attributes default to `None`. That is why the cell below has no output." + ] + }, + { + "cell_type": "code", + "execution_count": 27, + "metadata": { + "slideshow": { + "slide_type": "skip" + } + }, + "outputs": [], + "source": [ + "middle.step" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "slideshow": { + "slide_type": "skip" + } + }, + "source": [ + "A good trick to know is taking a \"full\" slice: This copies *all* elements of a `list` object into a *new* `list` object." + ] + }, + { + "cell_type": "code", + "execution_count": 28, + "metadata": { + "slideshow": { + "slide_type": "slide" + } + }, + "outputs": [], + "source": [ + "nested_copy = nested[:]" + ] + }, + { + "cell_type": "code", + "execution_count": 29, + "metadata": { + "slideshow": { + "slide_type": "fragment" + } + }, + "outputs": [ + { + "data": { + "text/plain": [ + "[[], 10, 20.0, 'Thirty', [40, 50]]" + ] + }, + "execution_count": 29, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "nested_copy" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "slideshow": { + "slide_type": "skip" + } + }, + "source": [ + "At first glance, `nested` and `nested_copy` seem to cause no pain. For `list` objects, the comparison operator `==` goes over the elements in both operands in a pairwise fashion and checks if they all evaluate equal (cf., the \"*List Comparison*\" section below for more details).\n", + "\n", + "We confirm that `nested` and `nested_copy` compare equal as could be expected but also that they are *different* objects." + ] + }, + { + "cell_type": "code", + "execution_count": 30, + "metadata": { + "slideshow": { + "slide_type": "fragment" + } + }, + "outputs": [ + { + "data": { + "text/plain": [ + "True" + ] + }, + "execution_count": 30, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "nested == nested_copy" + ] + }, + { + "cell_type": "code", + "execution_count": 31, + "metadata": { + "slideshow": { + "slide_type": "fragment" + } + }, + "outputs": [ + { + "data": { + "text/plain": [ + "False" + ] + }, + "execution_count": 31, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "nested is nested_copy" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "slideshow": { + "slide_type": "skip" + } + }, + "source": [ + "However, as [PythonTutor ](http://pythontutor.com/visualize.html#code=nested%20%3D%20%5B%5B%5D,%2010,%2020.0,%20%22Thirty%22,%20%5B40,%2050%5D%5D%0Anested_copy%20%3D%20nested%5B%3A%5D&cumulative=false&curInstr=0&heapPrimitives=nevernest&mode=display&origin=opt-frontend.js&py=3&rawInputLstJSON=%5B%5D&textReferences=false) reveals, only the *references* to the elements are copied, and not the objects in `nested` themselves! Because of that, `nested_copy` is a so-called **[shallow copy ](https://en.wikipedia.org/wiki/Object_copying#Shallow_copy)** of `nested`.\n", + "\n", + "We could also see this with the [id() ](https://docs.python.org/3/library/functions.html#id) function: The respective first elements in both `nested` and `nested_copy` are the *same* object, namely `empty`. So, we have three ways of accessing the *same* address in memory. Also, we say that `nested` and `nested_copy` partially share the *same* state." + ] + }, + { + "cell_type": "code", + "execution_count": 32, + "metadata": { + "slideshow": { + "slide_type": "skip" + } + }, + "outputs": [ + { + "data": { + "text/plain": [ + "True" + ] + }, + "execution_count": 32, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "nested[0] is nested_copy[0]" + ] + }, + { + "cell_type": "code", + "execution_count": 33, + "metadata": { + "slideshow": { + "slide_type": "skip" + } + }, + "outputs": [ + { + "data": { + "text/plain": [ + "139688113991744" + ] + }, + "execution_count": 33, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "id(nested[0])" + ] + }, + { + "cell_type": "code", + "execution_count": 34, + "metadata": { + "slideshow": { + "slide_type": "skip" + } + }, + "outputs": [ + { + "data": { + "text/plain": [ + "139688113991744" + ] + }, + "execution_count": 34, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "id(nested_copy[0])" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "slideshow": { + "slide_type": "skip" + } + }, + "source": [ + "Knowing this becomes critical if the elements in a `list` object are mutable objects (i.e., we can change them *in place*), and this is the case with `nested` and `nested_copy`, as we see in the next section on \"*Mutability*\".\n", + "\n", + "As both the original `nested` object and its copy reference the *same* `list` objects in memory, any changes made to them are visible to both! Because of that, working with shallow copies can easily become confusing.\n", + "\n", + "Instead of a shallow copy, we could also create a so-called **[deep copy ](https://en.wikipedia.org/wiki/Object_copying#Deep_copy)** of `nested`: Then, the copying process recursively follows every reference in a nested data structure and creates copies of *every* object found.\n", + "\n", + "To explicitly create shallow or deep copies, the [copy ](https://docs.python.org/3/library/copy.html) module in the [standard library ](https://docs.python.org/3/library/index.html) provides two functions, [copy() ](https://docs.python.org/3/library/copy.html#copy.copy) and [deepcopy() ](https://docs.python.org/3/library/copy.html#copy.deepcopy). We must always remember that slicing creates *shallow* copies only." + ] + }, + { + "cell_type": "code", + "execution_count": 35, + "metadata": { + "slideshow": { + "slide_type": "skip" + } + }, + "outputs": [], + "source": [ + "import copy" + ] + }, + { + "cell_type": "code", + "execution_count": 36, + "metadata": { + "slideshow": { + "slide_type": "skip" + } + }, + "outputs": [], + "source": [ + "nested_deep_copy = copy.deepcopy(nested)" + ] + }, + { + "cell_type": "code", + "execution_count": 37, + "metadata": { + "slideshow": { + "slide_type": "skip" + } + }, + "outputs": [ + { + "data": { + "text/plain": [ + "True" + ] + }, + "execution_count": 37, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "nested == nested_deep_copy" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "slideshow": { + "slide_type": "skip" + } + }, + "source": [ + "Now, the first elements of `nested` and `nested_deep_copy` are *different* objects, and [PythonTutor ](http://pythontutor.com/visualize.html#code=import%20copy%0Anested%20%3D%20%5B%5B%5D,%2010,%2020.0,%20%22Thirty%22,%20%5B40,%2050%5D%5D%0Anested_deep_copy%20%3D%20copy.deepcopy%28nested%29&cumulative=false&curInstr=0&heapPrimitives=nevernest&mode=display&origin=opt-frontend.js&py=3&rawInputLstJSON=%5B%5D&textReferences=false) shows that there are *six* `list` objects in memory." + ] + }, + { + "cell_type": "code", + "execution_count": 38, + "metadata": { + "slideshow": { + "slide_type": "skip" + } + }, + "outputs": [ + { + "data": { + "text/plain": [ + "False" + ] + }, + "execution_count": 38, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "nested[0] is nested_deep_copy[0]" + ] + }, + { + "cell_type": "code", + "execution_count": 39, + "metadata": { + "slideshow": { + "slide_type": "skip" + } + }, + "outputs": [ + { + "data": { + "text/plain": [ + "139688113991744" + ] + }, + "execution_count": 39, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "id(nested[0])" + ] + }, + { + "cell_type": "code", + "execution_count": 40, + "metadata": { + "slideshow": { + "slide_type": "skip" + } + }, + "outputs": [ + { + "data": { + "text/plain": [ + "139688113233152" + ] + }, + "execution_count": 40, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "id(nested_deep_copy[0])" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "slideshow": { + "slide_type": "skip" + } + }, + "source": [ + "As this [StackOverflow question ](https://stackoverflow.com/questions/184710/what-is-the-difference-between-a-deep-copy-and-a-shallow-copy) shows, understanding shallow and deep copies is a common source of confusion, independent of the programming language." + ] + }, + { + "cell_type": "markdown", + "metadata": { + "slideshow": { + "slide_type": "slide" + } + }, + "source": [ + "## Mutability" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "slideshow": { + "slide_type": "skip" + } + }, + "source": [ + "In contrast to `str` objects, `list` objects are *mutable*: We may assign new elements to indices or slices and also remove elements. That changes the *references* in a `list` object. In general, if an object is *mutable*, we say that it may be changed *in place*." + ] + }, + { + "cell_type": "code", + "execution_count": 41, + "metadata": { + "slideshow": { + "slide_type": "slide" + } + }, + "outputs": [], + "source": [ + "nested[0] = 0" + ] + }, + { + "cell_type": "code", + "execution_count": 42, + "metadata": { + "slideshow": { + "slide_type": "fragment" + } + }, + "outputs": [ + { + "data": { + "text/plain": [ + "[0, 10, 20.0, 'Thirty', [40, 50]]" + ] + }, + "execution_count": 42, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "nested" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "slideshow": { + "slide_type": "skip" + } + }, + "source": [ + "When we re-assign a slice, we can even change the size of the `list` object." + ] + }, + { + "cell_type": "code", + "execution_count": 43, + "metadata": { + "slideshow": { + "slide_type": "fragment" + } + }, + "outputs": [], + "source": [ + "nested[:4] = [100, 100, 100]" + ] + }, + { + "cell_type": "code", + "execution_count": 44, + "metadata": { + "slideshow": { + "slide_type": "fragment" + } + }, + "outputs": [ + { + "data": { + "text/plain": [ + "[100, 100, 100, [40, 50]]" + ] + }, + "execution_count": 44, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "nested" + ] + }, + { + "cell_type": "code", + "execution_count": 45, + "metadata": { + "slideshow": { + "slide_type": "fragment" + } + }, + "outputs": [ + { + "data": { + "text/plain": [ + "4" + ] + }, + "execution_count": 45, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "len(nested)" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "slideshow": { + "slide_type": "skip" + } + }, + "source": [ + "The `list` object's identity does *not* change. That is the main point behind mutable objects." + ] + }, + { + "cell_type": "code", + "execution_count": 46, + "metadata": { + "slideshow": { + "slide_type": "skip" + } + }, + "outputs": [ + { + "data": { + "text/plain": [ + "139688113982848" + ] + }, + "execution_count": 46, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "id(nested) # same memory location as before" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "slideshow": { + "slide_type": "skip" + } + }, + "source": [ + "`nested_copy` is unchanged!" + ] + }, + { + "cell_type": "code", + "execution_count": 47, + "metadata": { + "slideshow": { + "slide_type": "slide" + } + }, + "outputs": [ + { + "data": { + "text/plain": [ + "[[], 10, 20.0, 'Thirty', [40, 50]]" + ] + }, + "execution_count": 47, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "nested_copy" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "slideshow": { + "slide_type": "skip" + } + }, + "source": [ + "Let's change the nested `[40, 50]` via `nested_copy` into `[1, 2, 3]` by replacing all its elements." + ] + }, + { + "cell_type": "code", + "execution_count": 48, + "metadata": { + "slideshow": { + "slide_type": "fragment" + } + }, + "outputs": [], + "source": [ + "nested_copy[-1][:] = [1, 2, 3]" + ] + }, + { + "cell_type": "code", + "execution_count": 49, + "metadata": { + "slideshow": { + "slide_type": "fragment" + } + }, + "outputs": [ + { + "data": { + "text/plain": [ + "[[], 10, 20.0, 'Thirty', [1, 2, 3]]" + ] + }, + "execution_count": 49, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "nested_copy" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "slideshow": { + "slide_type": "skip" + } + }, + "source": [ + "That has a surprising side effect on `nested`." + ] + }, + { + "cell_type": "code", + "execution_count": 50, + "metadata": { + "slideshow": { + "slide_type": "fragment" + } + }, + "outputs": [ + { + "data": { + "text/plain": [ + "[100, 100, 100, [1, 2, 3]]" + ] + }, + "execution_count": 50, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "nested" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "slideshow": { + "slide_type": "skip" + } + }, + "source": [ + "That is precisely the confusion we talked about above when we said that `nested_copy` is a *shallow* copy of `nested`. [PythonTutor ](http://pythontutor.com/visualize.html#code=nested%20%3D%20%5B%5B%5D,%2010,%2020.0,%20%22Thirty%22,%20%5B40,%2050%5D%5D%0Anested_copy%20%3D%20nested%5B%3A%5D%0Anested%5B%3A4%5D%20%3D%20%5B100,%20100,%20100%5D%0Anested_copy%5B-1%5D%5B%3A%5D%20%3D%20%5B1,%202,%203%5D&cumulative=false&curInstr=0&heapPrimitives=nevernest&mode=display&origin=opt-frontend.js&py=3&rawInputLstJSON=%5B%5D&textReferences=false) shows how both reference the *same* nested `list` object that is changed *in place* from `[40, 50]` into `[1, 2, 3]`.\n", + "\n", + "Lastly, we use the `del` statement to remove an element." + ] + }, + { + "cell_type": "code", + "execution_count": 51, + "metadata": { + "slideshow": { + "slide_type": "slide" + } + }, + "outputs": [], + "source": [ + "del nested[-1]" + ] + }, + { + "cell_type": "code", + "execution_count": 52, + "metadata": { + "slideshow": { + "slide_type": "fragment" + } + }, + "outputs": [ + { + "data": { + "text/plain": [ + "[100, 100, 100]" + ] + }, + "execution_count": 52, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "nested" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "slideshow": { + "slide_type": "skip" + } + }, + "source": [ + "The `del` statement also works for slices. Here, we remove all references `nested` holds." + ] + }, + { + "cell_type": "code", + "execution_count": 53, + "metadata": { + "slideshow": { + "slide_type": "fragment" + } + }, + "outputs": [], + "source": [ + "del nested[:]" + ] + }, + { + "cell_type": "code", + "execution_count": 54, + "metadata": { + "slideshow": { + "slide_type": "fragment" + } + }, + "outputs": [ + { + "data": { + "text/plain": [ + "[]" + ] + }, + "execution_count": 54, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "nested" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "slideshow": { + "slide_type": "skip" + } + }, + "source": [ + "Mutability for sequences is formalized by the `MutableSequence` ABC in the [collections.abc ](https://docs.python.org/3/library/collections.abc.html) module." + ] + }, + { + "cell_type": "markdown", + "metadata": { + "slideshow": { + "slide_type": "slide" + } + }, + "source": [ + "## List Methods" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "slideshow": { + "slide_type": "skip" + } + }, + "source": [ + "The `list` type is an essential data structure in any real-world Python application, and many typical `list`-related algorithms from computer science theory are already built into it at the C level (cf., the [documentation ](https://docs.python.org/3/library/stdtypes.html#mutable-sequence-types) or the [tutorial ](https://docs.python.org/3/tutorial/datastructures.html#more-on-lists) for a full overview; unfortunately, not all methods have direct links). So, understanding and applying the built-in methods of the `list` type not only speeds up the development process but also makes programs significantly faster.\n", + "\n", + "In contrast to the `str` type's methods in [Chapter 6 ](https://nbviewer.jupyter.org/github/webartifex/intro-to-python/blob/develop/06_text/00_content.ipynb#String-Methods) (e.g., [.upper() ](https://docs.python.org/3/library/stdtypes.html#str.upper) or [.lower() ](https://docs.python.org/3/library/stdtypes.html#str.lower)), the `list` type's methods that mutate an object do so *in place*. That means they *never* create *new* `list` objects and return `None` to indicate that. So, we must *never* assign the return value of `list` methods to the variable holding the list!\n", + "\n", + "Let's look at the following `names` example." + ] + }, + { + "cell_type": "code", + "execution_count": 55, + "metadata": { + "slideshow": { + "slide_type": "slide" + } + }, + "outputs": [], + "source": [ + "names = [\"Carl\", \"Peter\"]" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "slideshow": { + "slide_type": "skip" + } + }, + "source": [ + "To add an object to the end of `names`, we use the `.append()` method. The code cell shows no output indicating that `None` must be the return value." + ] + }, + { + "cell_type": "code", + "execution_count": 56, + "metadata": { + "slideshow": { + "slide_type": "fragment" + } + }, + "outputs": [], + "source": [ + "names.append(\"Eckardt\")" + ] + }, + { + "cell_type": "code", + "execution_count": 57, + "metadata": { + "slideshow": { + "slide_type": "fragment" + } + }, + "outputs": [ + { + "data": { + "text/plain": [ + "['Carl', 'Peter', 'Eckardt']" + ] + }, + "execution_count": 57, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "names" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "slideshow": { + "slide_type": "skip" + } + }, + "source": [ + "With the `.extend()` method, we may also append multiple elements provided by an iterable. Here, the iterable is a `list` object itself holding two `str` objects." + ] + }, + { + "cell_type": "code", + "execution_count": 58, + "metadata": { + "slideshow": { + "slide_type": "slide" + } + }, + "outputs": [], + "source": [ + "names.extend([\"Karl\", \"Oliver\"])" + ] + }, + { + "cell_type": "code", + "execution_count": 59, + "metadata": { + "slideshow": { + "slide_type": "fragment" + } + }, + "outputs": [ + { + "data": { + "text/plain": [ + "['Carl', 'Peter', 'Eckardt', 'Karl', 'Oliver']" + ] + }, + "execution_count": 59, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "names" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "slideshow": { + "slide_type": "skip" + } + }, + "source": [ + "Similar to `.append()`, we may add a new element at an arbitrary position with the `.insert()` method. `.insert()` takes two arguments, an *index* and the element to be inserted." + ] + }, + { + "cell_type": "code", + "execution_count": 60, + "metadata": { + "slideshow": { + "slide_type": "slide" + } + }, + "outputs": [], + "source": [ + "names.insert(1, \"Berthold\")" + ] + }, + { + "cell_type": "code", + "execution_count": 61, + "metadata": { + "slideshow": { + "slide_type": "fragment" + } + }, + "outputs": [ + { + "data": { + "text/plain": [ + "['Carl', 'Berthold', 'Peter', 'Eckardt', 'Karl', 'Oliver']" + ] + }, + "execution_count": 61, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "names" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "slideshow": { + "slide_type": "skip" + } + }, + "source": [ + "`list` objects may be sorted *in place* with the [.sort() ](https://docs.python.org/3/library/stdtypes.html#list.sort) method. That is different from the built-in [sorted() ](https://docs.python.org/3/library/functions.html#sorted) function that takes any *finite* and *iterable* object and returns a *new* `list` object with the iterable's elements sorted!" + ] + }, + { + "cell_type": "code", + "execution_count": 62, + "metadata": { + "slideshow": { + "slide_type": "slide" + } + }, + "outputs": [ + { + "data": { + "text/plain": [ + "['Berthold', 'Carl', 'Eckardt', 'Karl', 'Oliver', 'Peter']" + ] + }, + "execution_count": 62, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "sorted(names)" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "slideshow": { + "slide_type": "skip" + } + }, + "source": [ + "As the previous code cell created a *new* `list` object, `names` is still unsorted." + ] + }, + { + "cell_type": "code", + "execution_count": 63, + "metadata": { + "slideshow": { + "slide_type": "fragment" + } + }, + "outputs": [ + { + "data": { + "text/plain": [ + "['Carl', 'Berthold', 'Peter', 'Eckardt', 'Karl', 'Oliver']" + ] + }, + "execution_count": 63, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "names" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "slideshow": { + "slide_type": "skip" + } + }, + "source": [ + "Let's sort the elements in `names` instead." + ] + }, + { + "cell_type": "code", + "execution_count": 64, + "metadata": { + "slideshow": { + "slide_type": "fragment" + } + }, + "outputs": [], + "source": [ + "names.sort()" + ] + }, + { + "cell_type": "code", + "execution_count": 65, + "metadata": { + "slideshow": { + "slide_type": "fragment" + } + }, + "outputs": [ + { + "data": { + "text/plain": [ + "['Berthold', 'Carl', 'Eckardt', 'Karl', 'Oliver', 'Peter']" + ] + }, + "execution_count": 65, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "names" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "slideshow": { + "slide_type": "skip" + } + }, + "source": [ + "To sort in reverse order, we pass a keyword-only `reverse=True` argument to either the [.sort() ](https://docs.python.org/3/library/stdtypes.html#list.sort) method or the [sorted() ](https://docs.python.org/3/library/functions.html#sorted) function." + ] + }, + { + "cell_type": "code", + "execution_count": 66, + "metadata": { + "slideshow": { + "slide_type": "slide" + } + }, + "outputs": [], + "source": [ + "names.sort(reverse=True)" + ] + }, + { + "cell_type": "code", + "execution_count": 67, + "metadata": { + "slideshow": { + "slide_type": "fragment" + } + }, + "outputs": [ + { + "data": { + "text/plain": [ + "['Peter', 'Oliver', 'Karl', 'Eckardt', 'Carl', 'Berthold']" + ] + }, + "execution_count": 67, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "names" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "slideshow": { + "slide_type": "skip" + } + }, + "source": [ + "The [.sort() ](https://docs.python.org/3/library/stdtypes.html#list.sort) method and the [sorted() ](https://docs.python.org/3/library/functions.html#sorted) function sort the elements in `names` in alphabetical order, forward or backward. However, that does *not* hold in general.\n", + "\n", + "We mention above that `list` objects may contain objects of *any* type and even of *mixed* types. Because of that, the sorting is **[delegated ](https://en.wikipedia.org/wiki/Delegation_(object-oriented_programming))** to the elements in a `list` object. In a way, Python \"asks\" the elements in a `list` object to sort themselves. As `names` contains only `str` objects, they are sorted according the the comparison rules explained in [Chapter 6 ](https://nbviewer.jupyter.org/github/webartifex/intro-to-python/blob/develop/06_text/00_content.ipynb#String-Comparison).\n", + "\n", + "To customize the sorting, we pass a keyword-only `key` argument to [.sort() ](https://docs.python.org/3/library/stdtypes.html#list.sort) or [sorted() ](https://docs.python.org/3/library/functions.html#sorted), which must be a `function` object accepting *one* positional argument. Then, the elements in the `list` object are passed to that one by one, and the return values are used as the **sort keys**. The `key` argument is also a popular use case for `lambda` expressions.\n", + "\n", + "For example, to sort `names` not by alphabet but by the names' lengths, we pass in a reference to the built-in [len() ](https://docs.python.org/3/library/functions.html#len) function as `key=len`. Note that there are *no* parentheses after `len`!" + ] + }, + { + "cell_type": "code", + "execution_count": 68, + "metadata": { + "slideshow": { + "slide_type": "fragment" + } + }, + "outputs": [], + "source": [ + "names.sort(key=len)" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "slideshow": { + "slide_type": "skip" + } + }, + "source": [ + "If two names have the same length, their relative order is kept as is. That is why `\"Karl\"` comes before `\"Carl\" ` below. A [sorting algorithm ](https://en.wikipedia.org/wiki/Sorting_algorithm) with that property is called **[stable ](https://en.wikipedia.org/wiki/Sorting_algorithm#Stability)**.\n", + "\n", + "Sorting is an important topic in programming, and we refer to the official [HOWTO ](https://docs.python.org/3/howto/sorting.html) for a more comprehensive introduction." + ] + }, + { + "cell_type": "code", + "execution_count": 69, + "metadata": { + "slideshow": { + "slide_type": "fragment" + } + }, + "outputs": [ + { + "data": { + "text/plain": [ + "['Karl', 'Carl', 'Peter', 'Oliver', 'Eckardt', 'Berthold']" + ] + }, + "execution_count": 69, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "names" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "slideshow": { + "slide_type": "skip" + } + }, + "source": [ + "`.sort(reverse=True)` is different from the `.reverse()` method. Whereas the former applies some sorting rule in reverse order, the latter simply reverses the elements in a `list` object." + ] + }, + { + "cell_type": "code", + "execution_count": 70, + "metadata": { + "slideshow": { + "slide_type": "slide" + } + }, + "outputs": [], + "source": [ + "names.reverse()" + ] + }, + { + "cell_type": "code", + "execution_count": 71, + "metadata": { + "slideshow": { + "slide_type": "fragment" + } + }, + "outputs": [ + { + "data": { + "text/plain": [ + "['Berthold', 'Eckardt', 'Oliver', 'Peter', 'Carl', 'Karl']" + ] + }, + "execution_count": 71, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "names" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "slideshow": { + "slide_type": "skip" + } + }, + "source": [ + "The `.pop()` method removes the *last* element from a `list` object *and* returns it. Below we **capture** the `removed` element to show that the return value is not `None` as with all the methods introduced so far." + ] + }, + { + "cell_type": "code", + "execution_count": 72, + "metadata": { + "slideshow": { + "slide_type": "slide" + } + }, + "outputs": [], + "source": [ + "removed = names.pop()" + ] + }, + { + "cell_type": "code", + "execution_count": 73, + "metadata": { + "slideshow": { + "slide_type": "fragment" + } + }, + "outputs": [ + { + "data": { + "text/plain": [ + "'Karl'" + ] + }, + "execution_count": 73, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "removed" + ] + }, + { + "cell_type": "code", + "execution_count": 74, + "metadata": { + "slideshow": { + "slide_type": "fragment" + } + }, + "outputs": [ + { + "data": { + "text/plain": [ + "['Berthold', 'Eckardt', 'Oliver', 'Peter', 'Carl']" + ] + }, + "execution_count": 74, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "names" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "slideshow": { + "slide_type": "skip" + } + }, + "source": [ + "`.pop()` takes an optional *index* argument and removes that instead.\n", + "\n", + "So, to remove the second element, `\"Eckhardt\"`, from `names`, we write this." + ] + }, + { + "cell_type": "code", + "execution_count": 75, + "metadata": { + "slideshow": { + "slide_type": "fragment" + } + }, + "outputs": [], + "source": [ + "removed = names.pop(1)" + ] + }, + { + "cell_type": "code", + "execution_count": 76, + "metadata": { + "slideshow": { + "slide_type": "fragment" + } + }, + "outputs": [ + { + "data": { + "text/plain": [ + "'Eckardt'" + ] + }, + "execution_count": 76, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "removed" + ] + }, + { + "cell_type": "code", + "execution_count": 77, + "metadata": { + "slideshow": { + "slide_type": "fragment" + } + }, + "outputs": [ + { + "data": { + "text/plain": [ + "['Berthold', 'Oliver', 'Peter', 'Carl']" + ] + }, + "execution_count": 77, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "names" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "slideshow": { + "slide_type": "skip" + } + }, + "source": [ + "Instead of removing an element by its index, we can also remove it by its value with the `.remove()` method. Behind the scenes, Python then compares the object to be removed, `\"Peter\"` in the example, sequentially to each element with the `==` operator and removes the *first* one that evaluates equal to it. `.remove()` does *not* return the removed element." + ] + }, + { + "cell_type": "code", + "execution_count": 78, + "metadata": { + "slideshow": { + "slide_type": "slide" + } + }, + "outputs": [], + "source": [ + "names.remove(\"Peter\")" + ] + }, + { + "cell_type": "code", + "execution_count": 79, + "metadata": { + "slideshow": { + "slide_type": "fragment" + } + }, + "outputs": [ + { + "data": { + "text/plain": [ + "['Berthold', 'Oliver', 'Carl']" + ] + }, + "execution_count": 79, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "names" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "slideshow": { + "slide_type": "skip" + } + }, + "source": [ + "Also, `.remove()` raises a `ValueError` if the value is not found." + ] + }, + { + "cell_type": "code", + "execution_count": 80, + "metadata": { + "slideshow": { + "slide_type": "skip" + } + }, + "outputs": [ + { + "ename": "ValueError", + "evalue": "list.remove(x): x not in list", + "output_type": "error", + "traceback": [ + "\u001b[0;31m---------------------------------------------------------------------------\u001b[0m", + "\u001b[0;31mValueError\u001b[0m Traceback (most recent call last)", + "\u001b[0;32m\u001b[0m in \u001b[0;36m\u001b[0;34m\u001b[0m\n\u001b[0;32m----> 1\u001b[0;31m \u001b[0mnames\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0mremove\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0;34m\"Peter\"\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0m", + "\u001b[0;31mValueError\u001b[0m: list.remove(x): x not in list" + ] + } + ], + "source": [ + "names.remove(\"Peter\")" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "slideshow": { + "slide_type": "skip" + } + }, + "source": [ + "`list` objects implement an `.index()` method that returns the position of the first element that compares equal to its argument. It fails *loudly* with a `ValueError` if no element compares equal." + ] + }, + { + "cell_type": "code", + "execution_count": 81, + "metadata": { + "slideshow": { + "slide_type": "slide" + } + }, + "outputs": [ + { + "data": { + "text/plain": [ + "['Berthold', 'Oliver', 'Carl']" + ] + }, + "execution_count": 81, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "names" + ] + }, + { + "cell_type": "code", + "execution_count": 82, + "metadata": { + "slideshow": { + "slide_type": "fragment" + } + }, + "outputs": [ + { + "data": { + "text/plain": [ + "1" + ] + }, + "execution_count": 82, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "names.index(\"Oliver\")" + ] + }, + { + "cell_type": "code", + "execution_count": 83, + "metadata": { + "slideshow": { + "slide_type": "fragment" + } + }, + "outputs": [ + { + "ename": "ValueError", + "evalue": "'Karl' is not in list", + "output_type": "error", + "traceback": [ + "\u001b[0;31m---------------------------------------------------------------------------\u001b[0m", + "\u001b[0;31mValueError\u001b[0m Traceback (most recent call last)", + "\u001b[0;32m\u001b[0m in \u001b[0;36m\u001b[0;34m\u001b[0m\n\u001b[0;32m----> 1\u001b[0;31m \u001b[0mnames\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0mindex\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0;34m\"Karl\"\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0m", + "\u001b[0;31mValueError\u001b[0m: 'Karl' is not in list" + ] + } + ], + "source": [ + "names.index(\"Karl\")" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "slideshow": { + "slide_type": "skip" + } + }, + "source": [ + "The `.count()` method returns the number of elements that compare equal to its argument." + ] + }, + { + "cell_type": "code", + "execution_count": 84, + "metadata": { + "slideshow": { + "slide_type": "fragment" + } + }, + "outputs": [ + { + "data": { + "text/plain": [ + "1" + ] + }, + "execution_count": 84, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "names.count(\"Carl\")" + ] + }, + { + "cell_type": "code", + "execution_count": 85, + "metadata": { + "slideshow": { + "slide_type": "skip" + } + }, + "outputs": [ + { + "data": { + "text/plain": [ + "0" + ] + }, + "execution_count": 85, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "names.count(\"Karl\")" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "slideshow": { + "slide_type": "skip" + } + }, + "source": [ + "Two more methods, `.copy()` and `.clear()`, are *syntactic sugar* and replace working with slices.\n", + "\n", + "`.copy()` creates a *shallow* copy. So, `names.copy()` below does the same as taking a full slice with `names[:]`, and the caveats from above apply, too." + ] + }, + { + "cell_type": "code", + "execution_count": 86, + "metadata": { + "slideshow": { + "slide_type": "skip" + } + }, + "outputs": [], + "source": [ + "names_copy = names.copy()" + ] + }, + { + "cell_type": "code", + "execution_count": 87, + "metadata": { + "slideshow": { + "slide_type": "skip" + } + }, + "outputs": [ + { + "data": { + "text/plain": [ + "['Berthold', 'Oliver', 'Carl']" + ] + }, + "execution_count": 87, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "names_copy" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "slideshow": { + "slide_type": "skip" + } + }, + "source": [ + "`.clear()` removes all references from a `list` object. So, `names_copy.clear()` is the same as `del names_copy[:]`." + ] + }, + { + "cell_type": "code", + "execution_count": 88, + "metadata": { + "slideshow": { + "slide_type": "skip" + } + }, + "outputs": [], + "source": [ + "names_copy.clear()" + ] + }, + { + "cell_type": "code", + "execution_count": 89, + "metadata": { + "slideshow": { + "slide_type": "skip" + } + }, + "outputs": [ + { + "data": { + "text/plain": [ + "[]" + ] + }, + "execution_count": 89, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "names_copy" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "slideshow": { + "slide_type": "skip" + } + }, + "source": [ + "Many methods introduced in this section are mentioned in the [collections.abc ](https://docs.python.org/3/library/collections.abc.html) module's documentation as well: While the `.index()` and `.count()` methods come with any data type that is a `Sequence`, the `.append()`, `.extend()`, `.insert()`, `.reverse()`, `.pop()`, and `.remove()` methods are part of any `MutableSequence` type. The `.sort()`, `.copy()`, and `.clear()` methods are `list`-specific.\n", + "\n", + "So, being a sequence does not only imply the four *behaviors* specified above, but also means that a data type comes with certain standardized methods." + ] + }, + { + "cell_type": "markdown", + "metadata": { + "slideshow": { + "slide_type": "slide" + } + }, + "source": [ + "## List Operations" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "slideshow": { + "slide_type": "skip" + } + }, + "source": [ + "As with `str` objects, the `+` and `*` operators are overloaded for concatenation and always return a *new* `list` object. The references in this newly created `list` object reference the *same* objects as the two original `list` objects. So, the same caveat as with *shallow* copies from above applies!" + ] + }, + { + "cell_type": "code", + "execution_count": 90, + "metadata": { + "slideshow": { + "slide_type": "slide" + } + }, + "outputs": [ + { + "data": { + "text/plain": [ + "['Berthold', 'Oliver', 'Carl']" + ] + }, + "execution_count": 90, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "names" + ] + }, + { + "cell_type": "code", + "execution_count": 91, + "metadata": { + "slideshow": { + "slide_type": "fragment" + } + }, + "outputs": [ + { + "data": { + "text/plain": [ + "['Berthold', 'Oliver', 'Carl', 'Diedrich', 'Yves']" + ] + }, + "execution_count": 91, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "names + [\"Diedrich\", \"Yves\"]" + ] + }, + { + "cell_type": "code", + "execution_count": 92, + "metadata": { + "slideshow": { + "slide_type": "fragment" + } + }, + "outputs": [ + { + "data": { + "text/plain": [ + "['Berthold', 'Oliver', 'Carl', 'Berthold', 'Oliver', 'Carl']" + ] + }, + "execution_count": 92, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "2 * names" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "slideshow": { + "slide_type": "slide" + } + }, + "source": [ + "### Unpacking" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "slideshow": { + "slide_type": "skip" + } + }, + "source": [ + "Besides being an operator, the `*` symbol has a second syntactical use, as explained in [PEP 3132 ](https://www.python.org/dev/peps/pep-3132/) and [PEP 448 ](https://www.python.org/dev/peps/pep-0448/): It implements what is called **iterable unpacking**. It is *not* an operator syntactically but a notation that Python reads as a literal.\n", + "\n", + "In the example, Python interprets the expression as if the elements of the iterable `names` were placed between `\"Achim\"` and `\"Xavier\"` one by one. So, we do not obtain a nested but a *flat* list." + ] + }, + { + "cell_type": "code", + "execution_count": 93, + "metadata": { + "slideshow": { + "slide_type": "slide" + } + }, + "outputs": [ + { + "data": { + "text/plain": [ + "['Achim', 'Berthold', 'Oliver', 'Carl', 'Xavier']" + ] + }, + "execution_count": 93, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "[\"Achim\", *names, \"Xavier\"]" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "slideshow": { + "slide_type": "skip" + } + }, + "source": [ + "Effectively, Python reads that as if we wrote the following." + ] + }, + { + "cell_type": "code", + "execution_count": 94, + "metadata": { + "slideshow": { + "slide_type": "fragment" + } + }, + "outputs": [ + { + "data": { + "text/plain": [ + "['Achim', 'Berthold', 'Oliver', 'Carl', 'Xavier']" + ] + }, + "execution_count": 94, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "[\"Achim\", names[0], names[1], names[2], \"Xavier\"]" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "slideshow": { + "slide_type": "slide" + } + }, + "source": [ + "### List Comparison" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "slideshow": { + "slide_type": "skip" + } + }, + "source": [ + "The relational operators also work with `list` objects; yet another example of operator overloading.\n", + "\n", + "Comparison is made in a pairwise fashion until the first pair of elements does not evaluate equal or one of the `list` objects ends. The exact comparison rules depend on the elements and not the `list` objects. As with [.sort() ](https://docs.python.org/3/library/stdtypes.html#list.sort) or [sorted() ](https://docs.python.org/3/library/functions.html#sorted) above, comparison is *delegated* to the objects to be compared, and Python \"asks\" the elements in the two `list` objects to compare themselves. Usually, all elements are of the *same* type, and comparison is straightforward." + ] + }, + { + "cell_type": "code", + "execution_count": 95, + "metadata": { + "slideshow": { + "slide_type": "slide" + } + }, + "outputs": [ + { + "data": { + "text/plain": [ + "['Berthold', 'Oliver', 'Carl']" + ] + }, + "execution_count": 95, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "names" + ] + }, + { + "cell_type": "code", + "execution_count": 96, + "metadata": { + "slideshow": { + "slide_type": "fragment" + } + }, + "outputs": [ + { + "data": { + "text/plain": [ + "True" + ] + }, + "execution_count": 96, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "names == [\"Berthold\", \"Oliver\", \"Carl\"]" + ] + }, + { + "cell_type": "code", + "execution_count": 97, + "metadata": { + "slideshow": { + "slide_type": "fragment" + } + }, + "outputs": [ + { + "data": { + "text/plain": [ + "True" + ] + }, + "execution_count": 97, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "names != [\"Berthold\", \"Oliver\", \"Karl\"]" + ] + }, + { + "cell_type": "code", + "execution_count": 98, + "metadata": { + "slideshow": { + "slide_type": "fragment" + } + }, + "outputs": [ + { + "data": { + "text/plain": [ + "True" + ] + }, + "execution_count": 98, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "names < [\"Berthold\", \"Oliver\", \"Karl\"]" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "slideshow": { + "slide_type": "skip" + } + }, + "source": [ + "If two `list` objects have a different number of elements and all overlapping elements compare equal, the shorter `list` object is considered \"smaller.\"" + ] + }, + { + "cell_type": "code", + "execution_count": 99, + "metadata": { + "slideshow": { + "slide_type": "fragment" + } + }, + "outputs": [ + { + "data": { + "text/plain": [ + "True" + ] + }, + "execution_count": 99, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "[\"Berthold\", \"Oliver\"] < names < [\"Berthold\", \"Oliver\", \"Carl\", \"Xavier\"]" + ] + } + ], + "metadata": { + "kernelspec": { + "display_name": "Python 3", + "language": "python", + "name": "python3" + }, + "language_info": { + "codemirror_mode": { + "name": "ipython", + "version": 3 + }, + "file_extension": ".py", + "mimetype": "text/x-python", + "name": "python", + "nbconvert_exporter": "python", + "pygments_lexer": "ipython3", + "version": "3.8.6" + }, + "livereveal": { + "auto_select": "code", + "auto_select_fragment": true, + "scroll": true, + "theme": "serif" + }, + "toc": { + "base_numbering": 1, + "nav_menu": {}, + "number_sections": false, + "sideBar": true, + "skip_h1_title": true, + "title_cell": "Table of Contents", + "title_sidebar": "Contents", + "toc_cell": false, + "toc_position": { + "height": "calc(100% - 180px)", + "left": "10px", + "top": "150px", + "width": "384px" + }, + "toc_section_display": false, + "toc_window_display": false + } + }, + "nbformat": 4, + "nbformat_minor": 4 +} diff --git a/07_sequences/02_exercises.ipynb b/07_sequences/02_exercises.ipynb new file mode 100644 index 0000000..0ec7907 --- /dev/null +++ b/07_sequences/02_exercises.ipynb @@ -0,0 +1,257 @@ +{ + "cells": [ + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "**Note**: Click on \"*Kernel*\" > \"*Restart Kernel and Run All*\" in [JupyterLab](https://jupyterlab.readthedocs.io/en/stable/) *after* finishing the exercises to ensure that your solution runs top to bottom *without* any errors. If you cannot run this file on your machine, you may want to open it [in the cloud ](https://mybinder.org/v2/gh/webartifex/intro-to-python/develop?urlpath=lab/tree/07_sequences/02_exercises.ipynb)." + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "# Chapter 7: Sequential Data (Coding Exercises)" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "The exercises below assume that you have read the [second part ](https://nbviewer.jupyter.org/github/webartifex/intro-to-python/blob/develop/07_sequences/01_content.ipynb) of Chapter 7.\n", + "\n", + "The `...`'s in the code cells indicate where you need to fill in code snippets. The number of `...`'s within a code cell give you a rough idea of how many lines of code are needed to solve the task. You should not need to create any additional code cells for your final solution. However, you may want to use temporary code cells to try out some ideas." + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "## Working with Lists" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "**Q1**: Write a function `nested_sum()` that takes a `list` object as its argument, which contains other `list` objects with numbers, and adds up the numbers! Use `nested_numbers` below to test your function!\n", + "\n", + "Hint: You need at least one `for`-loop." + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "nested_numbers = [[1, 2, 3], [4], [5], [6, 7], [8], [9]]" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "def nested_sum(list_of_lists):\n", + " \"\"\"Add up numbers in nested lists.\n", + " \n", + " Args:\n", + " list_of_lists (list): A list containing the lists with the numbers\n", + " \n", + " Returns:\n", + " sum (int or float)\n", + " \"\"\"\n", + " ...\n", + " ...\n", + " ...\n", + "\n", + " return ..." + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "nested_sum(nested_numbers)" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "**Q2**: Generalize `nested_sum()` into a function `mixed_sum()` that can process a \"mixed\" `list` object, which contains numbers and other `list` objects with numbers! Use `mixed_numbers` below for testing!\n", + "\n", + "Hints: Use the built-in [isinstance() ](https://docs.python.org/3/library/functions.html#isinstance) function to check how an element is to be processed." + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "mixed_numbers = [[1, 2, 3], 4, 5, [6, 7], 8, [9]]" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "import collections.abc as abc" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "def mixed_sum(list_of_lists_or_numbers):\n", + " \"\"\"Add up numbers in nested lists.\n", + " \n", + " Args:\n", + " list_of_lists_or_numbers (list): A list containing both numbers and\n", + " lists with numbers\n", + " \n", + " Returns:\n", + " sum (int or float)\n", + " \"\"\"\n", + " ...\n", + " ...\n", + " ...\n", + " ...\n", + " ...\n", + " ...\n", + "\n", + " return ..." + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "mixed_sum(mixed_numbers)" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "**Q3.1**: Write a function `cum_sum()` that takes a `list` object with numbers as its argument and returns a *new* `list` object with the **cumulative sums** of these numbers! So, `sum_up` below, `[1, 2, 3, 4, 5]`, should return `[1, 3, 6, 10, 15]`.\n", + "\n", + "Hint: The idea behind is similar to the [cumulative distribution function ](https://en.wikipedia.org/wiki/Cumulative_distribution_function) from statistics." + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "sum_up = [1, 2, 3, 4, 5]" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "def cum_sum(numbers):\n", + " \"\"\"Create the cumulative sums for some numbers.\n", + "\n", + " Args:\n", + " numbers (list): A list with numbers for that the cumulative sums\n", + " are calculated\n", + " \n", + " Returns:\n", + " cum_sums (list): A list with all the cumulative sums\n", + " \"\"\"\n", + " ...\n", + " ...\n", + "\n", + " ...\n", + " ...\n", + " ...\n", + "\n", + " return ..." + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "cum_sum(sum_up)" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "**Q3.2**: We should always make sure that our functions also work in corner cases. What happens if your implementation of `cum_sum()` is called with an empty list `[]`? Make sure it handles that case *without* crashing! What would be a good return value in this corner case?\n", + "\n", + "Hint: It is possible to write this without any extra input validation." + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "cum_sum([])" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + " < your answer >" + ] + } + ], + "metadata": { + "kernelspec": { + "display_name": "Python 3", + "language": "python", + "name": "python3" + }, + "language_info": { + "codemirror_mode": { + "name": "ipython", + "version": 3 + }, + "file_extension": ".py", + "mimetype": "text/x-python", + "name": "python", + "nbconvert_exporter": "python", + "pygments_lexer": "ipython3", + "version": "3.8.6" + }, + "toc": { + "base_numbering": 1, + "nav_menu": {}, + "number_sections": false, + "sideBar": true, + "skip_h1_title": true, + "title_cell": "Table of Contents", + "title_sidebar": "Contents", + "toc_cell": false, + "toc_position": {}, + "toc_section_display": false, + "toc_window_display": false + } + }, + "nbformat": 4, + "nbformat_minor": 4 +} diff --git a/07_sequences/03_content.ipynb b/07_sequences/03_content.ipynb new file mode 100644 index 0000000..7b63f1e --- /dev/null +++ b/07_sequences/03_content.ipynb @@ -0,0 +1,2494 @@ +{ + "cells": [ + { + "cell_type": "markdown", + "metadata": { + "slideshow": { + "slide_type": "skip" + } + }, + "source": [ + "**Note**: Click on \"*Kernel*\" > \"*Restart Kernel and Clear All Outputs*\" in [JupyterLab](https://jupyterlab.readthedocs.io/en/stable/) *before* reading this notebook to reset its output. If you cannot run this file on your machine, you may want to open it [in the cloud ](https://mybinder.org/v2/gh/webartifex/intro-to-python/develop?urlpath=lab/tree/07_sequences/03_content.ipynb)." + ] + }, + { + "cell_type": "markdown", + "metadata": { + "slideshow": { + "slide_type": "slide" + } + }, + "source": [ + "# Chapter 7: Sequential Data (continued)" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "slideshow": { + "slide_type": "skip" + } + }, + "source": [ + "In this third part of the chapter, we first look at a major implication of the `list` type's mutability. Then, we see how its close relative, the `tuple` type, can mitigate this. Lastly, we see how Python's syntax assumes sequential data at various places: for example, when unpacking iterables during a `for`-loop or an assignment, or when working with `function` objects." + ] + }, + { + "cell_type": "markdown", + "metadata": { + "slideshow": { + "slide_type": "slide" + } + }, + "source": [ + "## Modifiers vs. Pure Functions" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "slideshow": { + "slide_type": "skip" + } + }, + "source": [ + "As `list` objects are mutable, the caller of a function can see the changes made to a `list` object passed to the function as an argument. That is often a surprising *side effect* and should be avoided.\n", + "\n", + "As an example, consider the `add_xyz()` function." + ] + }, + { + "cell_type": "code", + "execution_count": 1, + "metadata": { + "slideshow": { + "slide_type": "slide" + } + }, + "outputs": [], + "source": [ + "letters = [\"a\", \"b\", \"c\"]" + ] + }, + { + "cell_type": "code", + "execution_count": 2, + "metadata": { + "slideshow": { + "slide_type": "fragment" + } + }, + "outputs": [], + "source": [ + "def add_xyz(arg):\n", + " \"\"\"Append letters to a list.\"\"\"\n", + " arg.extend([\"x\", \"y\", \"z\"])\n", + " return arg" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "slideshow": { + "slide_type": "skip" + } + }, + "source": [ + "While this function is being executed, two variables, namely `letters` in the global scope and `arg` inside the function's local scope, reference the *same* `list` object in memory. Furthermore, the passed in `arg` is also the return value.\n", + "\n", + "So, after the function call, `letters_with_xyz` and `letters` are **aliases** as well, referencing the *same* object. We can also visualize that with [PythonTutor ](http://pythontutor.com/visualize.html#code=letters%20%3D%20%5B%22a%22,%20%22b%22,%20%22c%22%5D%0A%0Adef%20add_xyz%28arg%29%3A%0A%20%20%20%20arg.extend%28%5B%22x%22,%20%22y%22,%20%22z%22%5D%29%0A%20%20%20%20return%20arg%0A%0Aletters_with_xyz%20%3D%20add_xyz%28letters%29&cumulative=false&curInstr=0&heapPrimitives=nevernest&mode=display&origin=opt-frontend.js&py=3&rawInputLstJSON=%5B%5D&textReferences=false)." + ] + }, + { + "cell_type": "code", + "execution_count": 3, + "metadata": { + "slideshow": { + "slide_type": "fragment" + } + }, + "outputs": [], + "source": [ + "letters_with_xyz = add_xyz(letters)" + ] + }, + { + "cell_type": "code", + "execution_count": 4, + "metadata": { + "slideshow": { + "slide_type": "fragment" + } + }, + "outputs": [ + { + "data": { + "text/plain": [ + "['a', 'b', 'c', 'x', 'y', 'z']" + ] + }, + "execution_count": 4, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "letters_with_xyz" + ] + }, + { + "cell_type": "code", + "execution_count": 5, + "metadata": { + "slideshow": { + "slide_type": "fragment" + } + }, + "outputs": [ + { + "data": { + "text/plain": [ + "['a', 'b', 'c', 'x', 'y', 'z']" + ] + }, + "execution_count": 5, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "letters" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "slideshow": { + "slide_type": "skip" + } + }, + "source": [ + "A better practice is to first create a copy of `arg` within the function that is then modified and returned. If we are sure that `arg` contains immutable elements only, we get away with a shallow copy. The downside of this approach is the higher amount of memory necessary.\n", + "\n", + "The revised `add_xyz()` function below is more natural to reason about as it does *not* modify the passed in `arg` internally. [PythonTutor ](http://pythontutor.com/visualize.html#code=letters%20%3D%20%5B%22a%22,%20%22b%22,%20%22c%22%5D%0A%0Adef%20add_xyz%28arg%29%3A%0A%20%20%20%20new_arg%20%3D%20arg%5B%3A%5D%0A%20%20%20%20new_arg.extend%28%5B%22x%22,%20%22y%22,%20%22z%22%5D%29%0A%20%20%20%20return%20new_arg%0A%0Aletters_with_xyz%20%3D%20add_xyz%28letters%29&cumulative=false&curInstr=0&heapPrimitives=nevernest&mode=display&origin=opt-frontend.js&py=3&rawInputLstJSON=%5B%5D&textReferences=false) shows that as well. This approach is following the **[functional programming ](https://en.wikipedia.org/wiki/Functional_programming)** paradigm that is going through a \"renaissance\" currently. Two essential characteristics of functional programming are that a function *never* changes its inputs and *always* returns the same output given the same inputs.\n", + "\n", + "For a beginner, it is probably better to stick to this idea and not change any arguments as the original `add_xyz()` above. However, functions that modify and return the argument passed in are an important aspect of object-oriented programming, as explained in [Chapter 11 ](https://nbviewer.jupyter.org/github/webartifex/intro-to-python/blob/develop/11_classes/00_content.ipynb)." + ] + }, + { + "cell_type": "code", + "execution_count": 6, + "metadata": { + "slideshow": { + "slide_type": "slide" + } + }, + "outputs": [], + "source": [ + "letters = [\"a\", \"b\", \"c\"]" + ] + }, + { + "cell_type": "code", + "execution_count": 7, + "metadata": { + "slideshow": { + "slide_type": "fragment" + } + }, + "outputs": [], + "source": [ + "def add_xyz(arg):\n", + " \"\"\"Create a new list from an existing one.\"\"\"\n", + " new_arg = arg[:]\n", + " new_arg.extend([\"x\", \"y\", \"z\"])\n", + " return new_arg" + ] + }, + { + "cell_type": "code", + "execution_count": 8, + "metadata": { + "slideshow": { + "slide_type": "fragment" + } + }, + "outputs": [], + "source": [ + "letters_with_xyz = add_xyz(letters)" + ] + }, + { + "cell_type": "code", + "execution_count": 9, + "metadata": { + "scrolled": true, + "slideshow": { + "slide_type": "fragment" + } + }, + "outputs": [ + { + "data": { + "text/plain": [ + "['a', 'b', 'c', 'x', 'y', 'z']" + ] + }, + "execution_count": 9, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "letters_with_xyz" + ] + }, + { + "cell_type": "code", + "execution_count": 10, + "metadata": { + "slideshow": { + "slide_type": "fragment" + } + }, + "outputs": [ + { + "data": { + "text/plain": [ + "['a', 'b', 'c']" + ] + }, + "execution_count": 10, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "letters" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "slideshow": { + "slide_type": "skip" + } + }, + "source": [ + "If we want to modify the argument passed in, it is best to return `None` and not `arg`, as does the final version of `add_xyz()` below. Then, the user of our function cannot accidentally create two aliases to the same object. That is also why the list methods above all return `None`. [PythonTutor ](http://pythontutor.com/visualize.html#code=letters%20%3D%20%5B%22a%22,%20%22b%22,%20%22c%22%5D%0A%0Adef%20add_xyz%28arg%29%3A%0A%20%20%20%20arg.extend%28%5B%22x%22,%20%22y%22,%20%22z%22%5D%29%0A%20%20%20%20return%0A%0Aadd_xyz%28letters%29&cumulative=false&curInstr=0&heapPrimitives=nevernest&mode=display&origin=opt-frontend.js&py=3&rawInputLstJSON=%5B%5D&textReferences=false) shows how there is only *one* reference to `letters` after the function call." + ] + }, + { + "cell_type": "code", + "execution_count": 11, + "metadata": { + "slideshow": { + "slide_type": "slide" + } + }, + "outputs": [], + "source": [ + "letters = [\"a\", \"b\", \"c\"]" + ] + }, + { + "cell_type": "code", + "execution_count": 12, + "metadata": { + "slideshow": { + "slide_type": "fragment" + } + }, + "outputs": [], + "source": [ + "def add_xyz(arg):\n", + " \"\"\"Append letters to a list.\"\"\"\n", + " arg.extend([\"x\", \"y\", \"z\"])\n", + " return # None" + ] + }, + { + "cell_type": "code", + "execution_count": 13, + "metadata": { + "slideshow": { + "slide_type": "fragment" + } + }, + "outputs": [], + "source": [ + "add_xyz(letters)" + ] + }, + { + "cell_type": "code", + "execution_count": 14, + "metadata": { + "slideshow": { + "slide_type": "fragment" + } + }, + "outputs": [ + { + "data": { + "text/plain": [ + "['a', 'b', 'c', 'x', 'y', 'z']" + ] + }, + "execution_count": 14, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "letters" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "slideshow": { + "slide_type": "skip" + } + }, + "source": [ + "If we call `add_xyz()` with `letters` as the argument again, we end up with an even longer `list` object." + ] + }, + { + "cell_type": "code", + "execution_count": 15, + "metadata": { + "slideshow": { + "slide_type": "skip" + } + }, + "outputs": [], + "source": [ + "add_xyz(letters)" + ] + }, + { + "cell_type": "code", + "execution_count": 16, + "metadata": { + "slideshow": { + "slide_type": "skip" + } + }, + "outputs": [ + { + "data": { + "text/plain": [ + "['a', 'b', 'c', 'x', 'y', 'z', 'x', 'y', 'z']" + ] + }, + "execution_count": 16, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "letters" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "slideshow": { + "slide_type": "skip" + } + }, + "source": [ + "Functions that only work on the argument passed in are called **modifiers**. Their primary purpose is to change the **state** of the argument. On the contrary, functions that have *no* side effects on the arguments are said to be **pure**." + ] + }, + { + "cell_type": "markdown", + "metadata": { + "slideshow": { + "slide_type": "slide" + } + }, + "source": [ + "## The `tuple` Type" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "slideshow": { + "slide_type": "skip" + } + }, + "source": [ + "To create a `tuple` object, we can use the same literal notation as for `list` objects *without* the brackets and list all elements." + ] + }, + { + "cell_type": "code", + "execution_count": 17, + "metadata": { + "slideshow": { + "slide_type": "slide" + } + }, + "outputs": [], + "source": [ + "numbers = 7, 11, 8, 5, 3, 12, 2, 6, 9, 10, 1, 4" + ] + }, + { + "cell_type": "code", + "execution_count": 18, + "metadata": { + "slideshow": { + "slide_type": "fragment" + } + }, + "outputs": [ + { + "data": { + "text/plain": [ + "(7, 11, 8, 5, 3, 12, 2, 6, 9, 10, 1, 4)" + ] + }, + "execution_count": 18, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "numbers" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "slideshow": { + "slide_type": "skip" + } + }, + "source": [ + "However, to be clearer, many Pythonistas write out the optional parentheses `(` and `)`." + ] + }, + { + "cell_type": "code", + "execution_count": 19, + "metadata": { + "slideshow": { + "slide_type": "skip" + } + }, + "outputs": [], + "source": [ + "numbers = (7, 11, 8, 5, 3, 12, 2, 6, 9, 10, 1, 4)" + ] + }, + { + "cell_type": "code", + "execution_count": 20, + "metadata": { + "slideshow": { + "slide_type": "skip" + } + }, + "outputs": [ + { + "data": { + "text/plain": [ + "(7, 11, 8, 5, 3, 12, 2, 6, 9, 10, 1, 4)" + ] + }, + "execution_count": 20, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "numbers" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "slideshow": { + "slide_type": "skip" + } + }, + "source": [ + "As before, `numbers` is an object on its own." + ] + }, + { + "cell_type": "code", + "execution_count": 21, + "metadata": { + "slideshow": { + "slide_type": "fragment" + } + }, + "outputs": [ + { + "data": { + "text/plain": [ + "140248673535456" + ] + }, + "execution_count": 21, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "id(numbers)" + ] + }, + { + "cell_type": "code", + "execution_count": 22, + "metadata": { + "slideshow": { + "slide_type": "fragment" + } + }, + "outputs": [ + { + "data": { + "text/plain": [ + "tuple" + ] + }, + "execution_count": 22, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "type(numbers)" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "slideshow": { + "slide_type": "skip" + } + }, + "source": [ + "While we could use empty parentheses `()` to create an empty `tuple` object ..." + ] + }, + { + "cell_type": "code", + "execution_count": 23, + "metadata": { + "slideshow": { + "slide_type": "skip" + } + }, + "outputs": [], + "source": [ + "empty_tuple = ()" + ] + }, + { + "cell_type": "code", + "execution_count": 24, + "metadata": { + "slideshow": { + "slide_type": "skip" + } + }, + "outputs": [ + { + "data": { + "text/plain": [ + "()" + ] + }, + "execution_count": 24, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "empty_tuple" + ] + }, + { + "cell_type": "code", + "execution_count": 25, + "metadata": { + "slideshow": { + "slide_type": "skip" + } + }, + "outputs": [ + { + "data": { + "text/plain": [ + "tuple" + ] + }, + "execution_count": 25, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "type(empty_tuple)" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "slideshow": { + "slide_type": "skip" + } + }, + "source": [ + "... we must use a *trailing comma* to create a `tuple` object holding one element. If we forget the comma, the parentheses are interpreted as the grouping operator and effectively useless!" + ] + }, + { + "cell_type": "code", + "execution_count": 26, + "metadata": { + "slideshow": { + "slide_type": "skip" + } + }, + "outputs": [], + "source": [ + "one_tuple = (1,) # we could ommit the parentheses but not the comma" + ] + }, + { + "cell_type": "code", + "execution_count": 27, + "metadata": { + "slideshow": { + "slide_type": "skip" + } + }, + "outputs": [ + { + "data": { + "text/plain": [ + "(1,)" + ] + }, + "execution_count": 27, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "one_tuple" + ] + }, + { + "cell_type": "code", + "execution_count": 28, + "metadata": { + "slideshow": { + "slide_type": "skip" + } + }, + "outputs": [ + { + "data": { + "text/plain": [ + "tuple" + ] + }, + "execution_count": 28, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "type(one_tuple)" + ] + }, + { + "cell_type": "code", + "execution_count": 29, + "metadata": { + "slideshow": { + "slide_type": "skip" + } + }, + "outputs": [], + "source": [ + "no_tuple = (1)" + ] + }, + { + "cell_type": "code", + "execution_count": 30, + "metadata": { + "slideshow": { + "slide_type": "skip" + } + }, + "outputs": [ + { + "data": { + "text/plain": [ + "1" + ] + }, + "execution_count": 30, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "no_tuple" + ] + }, + { + "cell_type": "code", + "execution_count": 31, + "metadata": { + "slideshow": { + "slide_type": "skip" + } + }, + "outputs": [ + { + "data": { + "text/plain": [ + "int" + ] + }, + "execution_count": 31, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "type(no_tuple)" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "slideshow": { + "slide_type": "skip" + } + }, + "source": [ + "Alternatively, we may use the [tuple() ](https://docs.python.org/3/library/functions.html#func-tuple) built-in that takes any iterable as its argument and creates a new `tuple` from its elements." + ] + }, + { + "cell_type": "code", + "execution_count": 32, + "metadata": { + "slideshow": { + "slide_type": "slide" + } + }, + "outputs": [ + { + "data": { + "text/plain": [ + "(1,)" + ] + }, + "execution_count": 32, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "tuple([1])" + ] + }, + { + "cell_type": "code", + "execution_count": 33, + "metadata": { + "slideshow": { + "slide_type": "fragment" + } + }, + "outputs": [ + { + "data": { + "text/plain": [ + "('i', 't', 'e', 'r', 'a', 'b', 'l', 'e')" + ] + }, + "execution_count": 33, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "tuple(\"iterable\")" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "slideshow": { + "slide_type": "slide" + } + }, + "source": [ + "## Tuples are like \"Immutable Lists\"" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "slideshow": { + "slide_type": "skip" + } + }, + "source": [ + "Most operations involving `tuple` objects work in the same way as with `list` objects. The main difference is that `tuple` objects are *immutable*. So, if our program does not depend on mutability, we may and should use `tuple` and not `list` objects to model sequential data. That way, we avoid the pitfalls seen above.\n", + "\n", + "`tuple` objects are *sequences* exhibiting the familiar *four* behaviors. So, `numbers` holds a *finite* number of elements ..." + ] + }, + { + "cell_type": "code", + "execution_count": 34, + "metadata": { + "slideshow": { + "slide_type": "slide" + } + }, + "outputs": [ + { + "data": { + "text/plain": [ + "12" + ] + }, + "execution_count": 34, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "len(numbers)" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "slideshow": { + "slide_type": "skip" + } + }, + "source": [ + "... that we can obtain individually by looping over it in a predictable *forward* or *reverse* order." + ] + }, + { + "cell_type": "code", + "execution_count": 35, + "metadata": { + "slideshow": { + "slide_type": "fragment" + } + }, + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "7 11 8 5 3 12 2 6 9 10 1 4 " + ] + } + ], + "source": [ + "for number in numbers:\n", + " print(number, end=\" \")" + ] + }, + { + "cell_type": "code", + "execution_count": 36, + "metadata": { + "slideshow": { + "slide_type": "fragment" + } + }, + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "4 1 10 9 6 2 12 3 5 8 11 7 " + ] + } + ], + "source": [ + "for number in reversed(numbers):\n", + " print(number, end=\" \")" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "slideshow": { + "slide_type": "skip" + } + }, + "source": [ + "To check if a given object is *contained* in `numbers`, we use the `in` operator and conduct a linear search." + ] + }, + { + "cell_type": "code", + "execution_count": 37, + "metadata": { + "slideshow": { + "slide_type": "fragment" + } + }, + "outputs": [ + { + "data": { + "text/plain": [ + "False" + ] + }, + "execution_count": 37, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "0 in numbers" + ] + }, + { + "cell_type": "code", + "execution_count": 38, + "metadata": { + "slideshow": { + "slide_type": "skip" + } + }, + "outputs": [ + { + "data": { + "text/plain": [ + "True" + ] + }, + "execution_count": 38, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "1 in numbers" + ] + }, + { + "cell_type": "code", + "execution_count": 39, + "metadata": { + "slideshow": { + "slide_type": "skip" + } + }, + "outputs": [ + { + "data": { + "text/plain": [ + "True" + ] + }, + "execution_count": 39, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "1.0 in numbers # in relies on == behind the scenes" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "slideshow": { + "slide_type": "skip" + } + }, + "source": [ + "We may index and slice with the `[]` operator. The latter returns *new* `tuple` objects." + ] + }, + { + "cell_type": "code", + "execution_count": 40, + "metadata": { + "slideshow": { + "slide_type": "slide" + } + }, + "outputs": [ + { + "data": { + "text/plain": [ + "7" + ] + }, + "execution_count": 40, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "numbers[0]" + ] + }, + { + "cell_type": "code", + "execution_count": 41, + "metadata": { + "slideshow": { + "slide_type": "skip" + } + }, + "outputs": [ + { + "data": { + "text/plain": [ + "4" + ] + }, + "execution_count": 41, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "numbers[-1]" + ] + }, + { + "cell_type": "code", + "execution_count": 42, + "metadata": { + "slideshow": { + "slide_type": "fragment" + } + }, + "outputs": [ + { + "data": { + "text/plain": [ + "(2, 6, 9, 10, 1, 4)" + ] + }, + "execution_count": 42, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "numbers[6:]" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "slideshow": { + "slide_type": "skip" + } + }, + "source": [ + "Index assignment does *not* work as tuples are *immutable* and results in a `TypeError`." + ] + }, + { + "cell_type": "code", + "execution_count": 43, + "metadata": { + "slideshow": { + "slide_type": "fragment" + } + }, + "outputs": [ + { + "ename": "TypeError", + "evalue": "'tuple' object does not support item assignment", + "output_type": "error", + "traceback": [ + "\u001b[0;31m---------------------------------------------------------------------------\u001b[0m", + "\u001b[0;31mTypeError\u001b[0m Traceback (most recent call last)", + "\u001b[0;32m\u001b[0m in \u001b[0;36m\u001b[0;34m\u001b[0m\n\u001b[0;32m----> 1\u001b[0;31m \u001b[0mnumbers\u001b[0m\u001b[0;34m[\u001b[0m\u001b[0;34m-\u001b[0m\u001b[0;36m1\u001b[0m\u001b[0;34m]\u001b[0m \u001b[0;34m=\u001b[0m \u001b[0;36m99\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0m", + "\u001b[0;31mTypeError\u001b[0m: 'tuple' object does not support item assignment" + ] + } + ], + "source": [ + "numbers[-1] = 99" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "slideshow": { + "slide_type": "skip" + } + }, + "source": [ + "The `+` and `*` operators work with `tuple` objects as well: They always create *new* `tuple` objects." + ] + }, + { + "cell_type": "code", + "execution_count": 44, + "metadata": { + "slideshow": { + "slide_type": "skip" + } + }, + "outputs": [ + { + "data": { + "text/plain": [ + "(7, 11, 8, 5, 3, 12, 2, 6, 9, 10, 1, 4, 99)" + ] + }, + "execution_count": 44, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "numbers + (99,) " + ] + }, + { + "cell_type": "code", + "execution_count": 45, + "metadata": { + "slideshow": { + "slide_type": "skip" + } + }, + "outputs": [ + { + "data": { + "text/plain": [ + "(7, 11, 8, 5, 3, 12, 2, 6, 9, 10, 1, 4, 7, 11, 8, 5, 3, 12, 2, 6, 9, 10, 1, 4)" + ] + }, + "execution_count": 45, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "2 * numbers" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "slideshow": { + "slide_type": "skip" + } + }, + "source": [ + "Being immutable, `tuple` objects only provide the `.count()` and `.index()` methods of `Sequence` types. The `.append()`, `.extend()`, `.insert()`, `.reverse()`, `.pop()`, and `.remove()` methods of `MutableSequence` types are *not* available. The same holds for the `list`-specific `.sort()`, `.copy()`, and `.clear()` methods." + ] + }, + { + "cell_type": "code", + "execution_count": 46, + "metadata": { + "slideshow": { + "slide_type": "skip" + } + }, + "outputs": [ + { + "data": { + "text/plain": [ + "0" + ] + }, + "execution_count": 46, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "numbers.count(0)" + ] + }, + { + "cell_type": "code", + "execution_count": 47, + "metadata": { + "slideshow": { + "slide_type": "skip" + } + }, + "outputs": [ + { + "data": { + "text/plain": [ + "10" + ] + }, + "execution_count": 47, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "numbers.index(1)" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "slideshow": { + "slide_type": "skip" + } + }, + "source": [ + "The relational operators work in the *same* way as for `list` objects." + ] + }, + { + "cell_type": "code", + "execution_count": 48, + "metadata": { + "slideshow": { + "slide_type": "skip" + } + }, + "outputs": [ + { + "data": { + "text/plain": [ + "(7, 11, 8, 5, 3, 12, 2, 6, 9, 10, 1, 4)" + ] + }, + "execution_count": 48, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "numbers" + ] + }, + { + "cell_type": "code", + "execution_count": 49, + "metadata": { + "slideshow": { + "slide_type": "skip" + } + }, + "outputs": [ + { + "data": { + "text/plain": [ + "True" + ] + }, + "execution_count": 49, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "numbers == (7, 11, 8, 5, 3, 12, 2, 6, 9, 10, 1, 4)" + ] + }, + { + "cell_type": "code", + "execution_count": 50, + "metadata": { + "slideshow": { + "slide_type": "skip" + } + }, + "outputs": [ + { + "data": { + "text/plain": [ + "True" + ] + }, + "execution_count": 50, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "numbers != (99, 11, 8, 5, 3, 12, 2, 6, 9, 10, 1, 4)" + ] + }, + { + "cell_type": "code", + "execution_count": 51, + "metadata": { + "slideshow": { + "slide_type": "skip" + } + }, + "outputs": [ + { + "data": { + "text/plain": [ + "True" + ] + }, + "execution_count": 51, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "numbers < (99, 11, 8, 5, 3, 12, 2, 6, 9, 10, 1, 4)" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "slideshow": { + "slide_type": "skip" + } + }, + "source": [ + "While `tuple` objects are immutable, this only relates to the references they hold. If a `tuple` object contains references to mutable objects, the entire nested structure is *not* immutable as a whole!\n", + "\n", + "Consider the following stylized example `not_immutable`: It contains *three* elements, `1`, `[2, ..., 11]`, and `12`, and the elements of the nested `list` object may be changed. While it is not practical to mix data types in a `tuple` object that is used as an \"immutable list,\" we want to make the point that the mere usage of the `tuple` type does *not* guarantee a nested object to be immutable as a whole." + ] + }, + { + "cell_type": "code", + "execution_count": 52, + "metadata": { + "slideshow": { + "slide_type": "skip" + } + }, + "outputs": [], + "source": [ + "not_immutable = (1, [2, 3, 4, 5, 6, 7, 8, 9, 10, 11], 12)" + ] + }, + { + "cell_type": "code", + "execution_count": 53, + "metadata": { + "slideshow": { + "slide_type": "skip" + } + }, + "outputs": [ + { + "data": { + "text/plain": [ + "(1, [2, 3, 4, 5, 6, 7, 8, 9, 10, 11], 12)" + ] + }, + "execution_count": 53, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "not_immutable" + ] + }, + { + "cell_type": "code", + "execution_count": 54, + "metadata": { + "slideshow": { + "slide_type": "skip" + } + }, + "outputs": [], + "source": [ + "not_immutable[1][:] = [99, 99, 99]" + ] + }, + { + "cell_type": "code", + "execution_count": 55, + "metadata": { + "slideshow": { + "slide_type": "skip" + } + }, + "outputs": [ + { + "data": { + "text/plain": [ + "(1, [99, 99, 99], 12)" + ] + }, + "execution_count": 55, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "not_immutable" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "slideshow": { + "slide_type": "slide" + } + }, + "source": [ + "## Packing & Unpacking" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "slideshow": { + "slide_type": "skip" + } + }, + "source": [ + "In the \"*List Operations*\" section in the [second part ](https://nbviewer.jupyter.org/github/webartifex/intro-to-python/blob/develop/07_sequences/01_content.ipynb#List-Operations) of this chapter, the `*` symbol **unpacks** the elements of a `list` object into another one. This idea of *iterable unpacking* is built into Python at various places, even *without* the `*` symbol.\n", + "\n", + "For example, we may write variables on the left-hand side of a `=` statement in a literal `tuple` style. Then, any *finite* iterable on the right-hand side is unpacked. So, `numbers` is unpacked into *twelve* variables below." + ] + }, + { + "cell_type": "code", + "execution_count": 56, + "metadata": { + "slideshow": { + "slide_type": "slide" + } + }, + "outputs": [], + "source": [ + "n1, n2, n3, n4, n5, n6, n7, n8, n9, n10, n11, n12 = numbers" + ] + }, + { + "cell_type": "code", + "execution_count": 57, + "metadata": { + "slideshow": { + "slide_type": "fragment" + } + }, + "outputs": [ + { + "data": { + "text/plain": [ + "7" + ] + }, + "execution_count": 57, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "n1" + ] + }, + { + "cell_type": "code", + "execution_count": 58, + "metadata": { + "slideshow": { + "slide_type": "fragment" + } + }, + "outputs": [ + { + "data": { + "text/plain": [ + "11" + ] + }, + "execution_count": 58, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "n2" + ] + }, + { + "cell_type": "code", + "execution_count": 59, + "metadata": { + "slideshow": { + "slide_type": "fragment" + } + }, + "outputs": [ + { + "data": { + "text/plain": [ + "8" + ] + }, + "execution_count": 59, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "n3" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "slideshow": { + "slide_type": "skip" + } + }, + "source": [ + "Having to type twelve variables on the left is already tedious. Furthermore, if the iterable on the right yields a number of elements *different* from the number of variables, we get a `ValueError`." + ] + }, + { + "cell_type": "code", + "execution_count": 60, + "metadata": { + "slideshow": { + "slide_type": "skip" + } + }, + "outputs": [ + { + "ename": "ValueError", + "evalue": "too many values to unpack (expected 11)", + "output_type": "error", + "traceback": [ + "\u001b[0;31m---------------------------------------------------------------------------\u001b[0m", + "\u001b[0;31mValueError\u001b[0m Traceback (most recent call last)", + "\u001b[0;32m\u001b[0m in \u001b[0;36m\u001b[0;34m\u001b[0m\n\u001b[0;32m----> 1\u001b[0;31m \u001b[0mn1\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0mn2\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0mn3\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0mn4\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0mn5\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0mn6\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0mn7\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0mn8\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0mn9\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0mn10\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0mn11\u001b[0m \u001b[0;34m=\u001b[0m \u001b[0mnumbers\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0m", + "\u001b[0;31mValueError\u001b[0m: too many values to unpack (expected 11)" + ] + } + ], + "source": [ + "n1, n2, n3, n4, n5, n6, n7, n8, n9, n10, n11 = numbers" + ] + }, + { + "cell_type": "code", + "execution_count": 61, + "metadata": { + "slideshow": { + "slide_type": "skip" + } + }, + "outputs": [ + { + "ename": "ValueError", + "evalue": "not enough values to unpack (expected 13, got 12)", + "output_type": "error", + "traceback": [ + "\u001b[0;31m---------------------------------------------------------------------------\u001b[0m", + "\u001b[0;31mValueError\u001b[0m Traceback (most recent call last)", + "\u001b[0;32m\u001b[0m in \u001b[0;36m\u001b[0;34m\u001b[0m\n\u001b[0;32m----> 1\u001b[0;31m \u001b[0mn1\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0mn2\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0mn3\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0mn4\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0mn5\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0mn6\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0mn7\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0mn8\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0mn9\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0mn10\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0mn11\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0mn12\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0mn13\u001b[0m \u001b[0;34m=\u001b[0m \u001b[0mnumbers\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0m", + "\u001b[0;31mValueError\u001b[0m: not enough values to unpack (expected 13, got 12)" + ] + } + ], + "source": [ + "n1, n2, n3, n4, n5, n6, n7, n8, n9, n10, n11, n12, n13 = numbers" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "slideshow": { + "slide_type": "skip" + } + }, + "source": [ + "So, to make iterable unpacking useful, we prepend the `*` symbol to *one* of the variables on the left: That variable then becomes a `list` object holding the elements not captured by the other variables. We say that the excess elements from the iterable are **packed** into this variable.\n", + "\n", + "For example, let's get the `first` and `last` element of `numbers` and collect the rest in `middle`." + ] + }, + { + "cell_type": "code", + "execution_count": 62, + "metadata": { + "slideshow": { + "slide_type": "slide" + } + }, + "outputs": [], + "source": [ + "first, *middle, last = numbers" + ] + }, + { + "cell_type": "code", + "execution_count": 63, + "metadata": { + "slideshow": { + "slide_type": "fragment" + } + }, + "outputs": [ + { + "data": { + "text/plain": [ + "7" + ] + }, + "execution_count": 63, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "first" + ] + }, + { + "cell_type": "code", + "execution_count": 64, + "metadata": { + "slideshow": { + "slide_type": "fragment" + } + }, + "outputs": [ + { + "data": { + "text/plain": [ + "[11, 8, 5, 3, 12, 2, 6, 9, 10, 1]" + ] + }, + "execution_count": 64, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "middle # always a list!" + ] + }, + { + "cell_type": "code", + "execution_count": 65, + "metadata": { + "slideshow": { + "slide_type": "fragment" + } + }, + "outputs": [ + { + "data": { + "text/plain": [ + "4" + ] + }, + "execution_count": 65, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "last" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "slideshow": { + "slide_type": "skip" + } + }, + "source": [ + "We already used unpacking before this section without knowing it. Whenever we write a `for`-loop over the [zip() ](https://docs.python.org/3/library/functions.html#zip) built-in, that generates a new `tuple` object in each iteration that we unpack by listing several loop variables.\n", + "\n", + "So, the `name, position` below acts like a left-hand side of an `=` statement and unpacks the `tuple` objects generated from \"zipping\" the `names` list and the `positions` tuple together." + ] + }, + { + "cell_type": "code", + "execution_count": 66, + "metadata": {}, + "outputs": [], + "source": [ + "names = [\"Berthold\", \"Oliver\", \"Carl\"]" + ] + }, + { + "cell_type": "code", + "execution_count": 67, + "metadata": { + "slideshow": { + "slide_type": "skip" + } + }, + "outputs": [], + "source": [ + "positions = (\"goalkeeper\", \"defender\", \"midfielder\", \"striker\", \"coach\")" + ] + }, + { + "cell_type": "code", + "execution_count": 68, + "metadata": { + "slideshow": { + "slide_type": "skip" + } + }, + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "Berthold is a goalkeeper\n", + "Oliver is a defender\n", + "Carl is a midfielder\n" + ] + } + ], + "source": [ + "for name, position in zip(names, positions):\n", + " print(name, \"is a\", position)" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "slideshow": { + "slide_type": "skip" + } + }, + "source": [ + "Without unpacking, [zip() ](https://docs.python.org/3/library/functions.html#zip) generates a series of `tuple` objects." + ] + }, + { + "cell_type": "code", + "execution_count": 69, + "metadata": { + "slideshow": { + "slide_type": "skip" + } + }, + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + " ('Berthold', 'goalkeeper')\n", + " ('Oliver', 'defender')\n", + " ('Carl', 'midfielder')\n" + ] + } + ], + "source": [ + "for pair in zip(names, positions):\n", + " print(type(pair), pair, sep=\" \")" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "slideshow": { + "slide_type": "skip" + } + }, + "source": [ + "Unpacking also works for nested objects. Below, we wrap [zip() ](https://docs.python.org/3/library/functions.html#zip) with the [enumerate() ](https://docs.python.org/3/library/functions.html#enumerate) built-in to have an index variable `number` inside the `for`-loop. In each iteration, a `tuple` object consisting of `number` and another `tuple` object is created. The inner one then holds the `name` and `position`." + ] + }, + { + "cell_type": "code", + "execution_count": 70, + "metadata": { + "slideshow": { + "slide_type": "skip" + } + }, + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "Berthold (jersey #1) is a goalkeeper\n", + "Oliver (jersey #2) is a defender\n", + "Carl (jersey #3) is a midfielder\n" + ] + } + ], + "source": [ + "for number, (name, position) in enumerate(zip(names, positions), start=1):\n", + " print(f\"{name} (jersey #{number}) is a {position}\")" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "slideshow": { + "slide_type": "slide" + } + }, + "source": [ + "### Swapping Variables" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "slideshow": { + "slide_type": "skip" + } + }, + "source": [ + "A popular use case of unpacking is **swapping** two variables.\n", + "\n", + "Consider `a` and `b` below." + ] + }, + { + "cell_type": "code", + "execution_count": 71, + "metadata": { + "slideshow": { + "slide_type": "slide" + } + }, + "outputs": [], + "source": [ + "a = 0\n", + "b = 1" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "slideshow": { + "slide_type": "skip" + } + }, + "source": [ + "Without unpacking, we must use a temporary variable `temp` to swap `a` and `b`." + ] + }, + { + "cell_type": "code", + "execution_count": 72, + "metadata": { + "slideshow": { + "slide_type": "fragment" + } + }, + "outputs": [], + "source": [ + "temp = a\n", + "a = b\n", + "b = temp\n", + "\n", + "del temp" + ] + }, + { + "cell_type": "code", + "execution_count": 73, + "metadata": { + "slideshow": { + "slide_type": "fragment" + } + }, + "outputs": [ + { + "data": { + "text/plain": [ + "1" + ] + }, + "execution_count": 73, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "a" + ] + }, + { + "cell_type": "code", + "execution_count": 74, + "metadata": { + "slideshow": { + "slide_type": "fragment" + } + }, + "outputs": [ + { + "data": { + "text/plain": [ + "0" + ] + }, + "execution_count": 74, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "b" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "slideshow": { + "slide_type": "skip" + } + }, + "source": [ + "With unpacking, the solution is more elegant. *All* expressions on the right-hand side are evaluated *before* any assignment takes place." + ] + }, + { + "cell_type": "code", + "execution_count": 75, + "metadata": { + "slideshow": { + "slide_type": "slide" + } + }, + "outputs": [], + "source": [ + "a, b = 0, 1" + ] + }, + { + "cell_type": "code", + "execution_count": 76, + "metadata": { + "slideshow": { + "slide_type": "fragment" + } + }, + "outputs": [], + "source": [ + "a, b = b, a" + ] + }, + { + "cell_type": "code", + "execution_count": 77, + "metadata": { + "slideshow": { + "slide_type": "fragment" + } + }, + "outputs": [ + { + "data": { + "text/plain": [ + "(1, 0)" + ] + }, + "execution_count": 77, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "a, b" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "slideshow": { + "slide_type": "slide" + } + }, + "source": [ + "#### Example: [Fibonacci Numbers ](https://en.wikipedia.org/wiki/Fibonacci_number) (revisited)" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "slideshow": { + "slide_type": "skip" + } + }, + "source": [ + "Unpacking allows us to rewrite the iterative `fibonacci()` function from [Chapter 4 ](https://nbviewer.jupyter.org/github/webartifex/intro-to-python/blob/develop/04_iteration/02_content.ipynb#\"Hard-at-first-Glance\"-Example:-Fibonacci-Numbers-%28revisited%29) in a concise way." + ] + }, + { + "cell_type": "code", + "execution_count": 78, + "metadata": { + "slideshow": { + "slide_type": "slide" + } + }, + "outputs": [], + "source": [ + "def fibonacci(i):\n", + " \"\"\"Calculate the ith Fibonacci number.\n", + "\n", + " Args:\n", + " i (int): index of the Fibonacci number to calculate\n", + "\n", + " Returns:\n", + " ith_fibonacci (int)\n", + " \"\"\"\n", + " a, b = 0, 1\n", + "\n", + " for _ in range(i - 1):\n", + " a, b = b, a + b\n", + "\n", + " return b" + ] + }, + { + "cell_type": "code", + "execution_count": 79, + "metadata": { + "slideshow": { + "slide_type": "fragment" + } + }, + "outputs": [ + { + "data": { + "text/plain": [ + "144" + ] + }, + "execution_count": 79, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "fibonacci(12)" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "slideshow": { + "slide_type": "slide" + } + }, + "source": [ + "## Function Definitions & Calls" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "slideshow": { + "slide_type": "skip" + } + }, + "source": [ + "The concepts of packing and unpacking are also helpful when writing and using functions.\n", + "\n", + "For example, let's look at the `product()` function below. Its implementation suggests that `args` must be a sequence type. Otherwise, it would not make sense to index into it with `[0]` or take a slice with `[1:]`. In line with the function's name, the `for`-loop multiplies all elements of the `args` sequence. So, what does the `*` do in the header line, and what is the exact data type of `args`?\n", + "\n", + "The `*` is again *not* an operator in this context but a special syntax that makes Python *pack* all *positional* arguments passed to `product()` into a single `tuple` object called `args`." + ] + }, + { + "cell_type": "code", + "execution_count": 80, + "metadata": { + "slideshow": { + "slide_type": "slide" + } + }, + "outputs": [], + "source": [ + "def product(*args):\n", + " \"\"\"Multiply all arguments.\"\"\"\n", + " result = args[0]\n", + "\n", + " for arg in args[1:]:\n", + " result *= arg\n", + "\n", + " return result" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "slideshow": { + "slide_type": "skip" + } + }, + "source": [ + "So, we can pass an *arbitrary* (i.e., also none) number of *positional* arguments to `product()`.\n", + "\n", + "The product of just one number is the number itself." + ] + }, + { + "cell_type": "code", + "execution_count": 81, + "metadata": { + "slideshow": { + "slide_type": "fragment" + } + }, + "outputs": [ + { + "data": { + "text/plain": [ + "42" + ] + }, + "execution_count": 81, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "product(42)" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "slideshow": { + "slide_type": "skip" + } + }, + "source": [ + "Passing in several numbers works as expected." + ] + }, + { + "cell_type": "code", + "execution_count": 82, + "metadata": { + "slideshow": { + "slide_type": "fragment" + } + }, + "outputs": [ + { + "data": { + "text/plain": [ + "100" + ] + }, + "execution_count": 82, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "product(2, 5, 10)" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "slideshow": { + "slide_type": "skip" + } + }, + "source": [ + "However, this implementation of `product()` needs *at least* one argument passed in due to the expression `args[0]` used internally. Otherwise, we see a *runtime* error, namely an `IndexError`. We emphasize that this error is *not* caused in the header line." + ] + }, + { + "cell_type": "code", + "execution_count": 83, + "metadata": { + "slideshow": { + "slide_type": "skip" + } + }, + "outputs": [ + { + "ename": "IndexError", + "evalue": "tuple index out of range", + "output_type": "error", + "traceback": [ + "\u001b[0;31m---------------------------------------------------------------------------\u001b[0m", + "\u001b[0;31mIndexError\u001b[0m Traceback (most recent call last)", + "\u001b[0;32m\u001b[0m in \u001b[0;36m\u001b[0;34m\u001b[0m\n\u001b[0;32m----> 1\u001b[0;31m \u001b[0mproduct\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0m", + "\u001b[0;32m\u001b[0m in \u001b[0;36mproduct\u001b[0;34m(*args)\u001b[0m\n\u001b[1;32m 1\u001b[0m \u001b[0;32mdef\u001b[0m \u001b[0mproduct\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0;34m*\u001b[0m\u001b[0margs\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m:\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[1;32m 2\u001b[0m \u001b[0;34m\"\"\"Multiply all arguments.\"\"\"\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0;32m----> 3\u001b[0;31m \u001b[0mresult\u001b[0m \u001b[0;34m=\u001b[0m \u001b[0margs\u001b[0m\u001b[0;34m[\u001b[0m\u001b[0;36m0\u001b[0m\u001b[0;34m]\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0m\u001b[1;32m 4\u001b[0m \u001b[0;34m\u001b[0m\u001b[0m\n\u001b[1;32m 5\u001b[0m \u001b[0;32mfor\u001b[0m \u001b[0marg\u001b[0m \u001b[0;32min\u001b[0m \u001b[0margs\u001b[0m\u001b[0;34m[\u001b[0m\u001b[0;36m1\u001b[0m\u001b[0;34m:\u001b[0m\u001b[0;34m]\u001b[0m\u001b[0;34m:\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n", + "\u001b[0;31mIndexError\u001b[0m: tuple index out of range" + ] + } + ], + "source": [ + "product()" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "slideshow": { + "slide_type": "skip" + } + }, + "source": [ + "Another downside of this implementation is that we can easily generate *semantic* errors: For example, if we pass in an iterable object like the `one_hundred` list, *no* exception is raised. However, the return value is also not a numeric object as we expect. The reason for this is that during the function call, `args` becomes a `tuple` object holding *one* element, which is `one_hundred`, a `list` object. So, we created a nested structure by accident." + ] + }, + { + "cell_type": "code", + "execution_count": 84, + "metadata": { + "slideshow": { + "slide_type": "slide" + } + }, + "outputs": [], + "source": [ + "one_hundred = [2, 5, 10]" + ] + }, + { + "cell_type": "code", + "execution_count": 85, + "metadata": { + "slideshow": { + "slide_type": "fragment" + } + }, + "outputs": [ + { + "data": { + "text/plain": [ + "[2, 5, 10]" + ] + }, + "execution_count": 85, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "product(one_hundred) # a semantic error!" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "slideshow": { + "slide_type": "skip" + } + }, + "source": [ + "This error does not occur if we unpack `one_hundred` upon passing it as the argument." + ] + }, + { + "cell_type": "code", + "execution_count": 86, + "metadata": { + "slideshow": { + "slide_type": "fragment" + } + }, + "outputs": [ + { + "data": { + "text/plain": [ + "100" + ] + }, + "execution_count": 86, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "product(*one_hundred)" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "slideshow": { + "slide_type": "skip" + } + }, + "source": [ + "That is the equivalent of writing out the following tedious expression. Yet, that does *not* scale for iterables with many elements in them." + ] + }, + { + "cell_type": "code", + "execution_count": 87, + "metadata": { + "slideshow": { + "slide_type": "fragment" + } + }, + "outputs": [ + { + "data": { + "text/plain": [ + "100" + ] + }, + "execution_count": 87, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "product(one_hundred[0], one_hundred[1], one_hundred[2])" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "slideshow": { + "slide_type": "skip" + } + }, + "source": [ + "In the \"*Packing & Unpacking with Functions*\" [exercise ](https://nbviewer.jupyter.org/github/webartifex/intro-to-python/blob/develop/07_sequences/04_exercises.ipynb), we look at `product()` in more detail.\n", + "\n", + "While we needed to unpack `one_hundred` above to avoid the semantic error, unpacking an argument in a function call may also be a convenience in general. For example, to print the elements of `one_hundred` in one line, we need to use a `for` statement, until now. With unpacking, we get away *without* a loop." + ] + }, + { + "cell_type": "code", + "execution_count": 88, + "metadata": { + "slideshow": { + "slide_type": "skip" + } + }, + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "[2, 5, 10]\n" + ] + } + ], + "source": [ + "print(one_hundred) # prints the tuple; we do not want that" + ] + }, + { + "cell_type": "code", + "execution_count": 89, + "metadata": { + "slideshow": { + "slide_type": "skip" + } + }, + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "2 5 10 " + ] + } + ], + "source": [ + "for number in one_hundred:\n", + " print(number, end=\" \")" + ] + }, + { + "cell_type": "code", + "execution_count": 90, + "metadata": { + "slideshow": { + "slide_type": "skip" + } + }, + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "2 5 10\n" + ] + } + ], + "source": [ + "print(*one_hundred) # replaces the for-loop" + ] + } + ], + "metadata": { + "kernelspec": { + "display_name": "Python 3", + "language": "python", + "name": "python3" + }, + "language_info": { + "codemirror_mode": { + "name": "ipython", + "version": 3 + }, + "file_extension": ".py", + "mimetype": "text/x-python", + "name": "python", + "nbconvert_exporter": "python", + "pygments_lexer": "ipython3", + "version": "3.8.6" + }, + "livereveal": { + "auto_select": "code", + "auto_select_fragment": true, + "scroll": true, + "theme": "serif" + }, + "toc": { + "base_numbering": 1, + "nav_menu": {}, + "number_sections": false, + "sideBar": true, + "skip_h1_title": true, + "title_cell": "Table of Contents", + "title_sidebar": "Contents", + "toc_cell": false, + "toc_position": { + "height": "calc(100% - 180px)", + "left": "10px", + "top": "150px", + "width": "384px" + }, + "toc_section_display": false, + "toc_window_display": false + } + }, + "nbformat": 4, + "nbformat_minor": 4 +} diff --git a/07_sequences/04_exercises.ipynb b/07_sequences/04_exercises.ipynb new file mode 100644 index 0000000..1d46524 --- /dev/null +++ b/07_sequences/04_exercises.ipynb @@ -0,0 +1,663 @@ +{ + "cells": [ + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "**Note**: Click on \"*Kernel*\" > \"*Restart Kernel and Run All*\" in [JupyterLab](https://jupyterlab.readthedocs.io/en/stable/) *after* finishing the exercises to ensure that your solution runs top to bottom *without* any errors. If you cannot run this file on your machine, you may want to open it [in the cloud ](https://mybinder.org/v2/gh/webartifex/intro-to-python/develop?urlpath=lab/tree/07_sequences/04_exercises.ipynb)." + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "# Chapter 7: Sequential Data (Coding Exercises)" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "The exercises below assume that you have read the [third part ](https://nbviewer.jupyter.org/github/webartifex/intro-to-python/blob/develop/07_sequences/03_content.ipynb) of Chapter 7.\n", + "\n", + "The `...`'s in the code cells indicate where you need to fill in code snippets. The number of `...`'s within a code cell give you a rough idea of how many lines of code are needed to solve the task. You should not need to create any additional code cells for your final solution. However, you may want to use temporary code cells to try out some ideas." + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "## Packing & Unpacking with Functions" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "In the \"*Function Definitions & Calls*\" section in [Chapter 7 ](https://nbviewer.jupyter.org/github/webartifex/intro-to-python/blob/develop/07_sequences/03_content.ipynb#Function-Definitions-&-Calls), we define the following function `product()`. In this exercise, you will improve it by making it more \"user-friendly.\"" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "def product(*args):\n", + " \"\"\"Multiply all arguments.\"\"\"\n", + " result = args[0]\n", + "\n", + " for arg in args[1:]:\n", + " result *= arg\n", + "\n", + " return result" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "The `*` in the function's header line *packs* all *positional* arguments passed to `product()` into one *iterable* called `args`.\n", + "\n", + "**Q1**: What is the data type of `args` within the function's body?" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + " < your answer >" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "Because of the packing, we may call `product()` with an abitrary number of *positional* arguments: The product of just `42` remains `42`, while `2`, `5`, and `10` multiplied together result in `100`." + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "product(42)" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "product(2, 5, 10)" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "However, \"abitrary\" does not mean that we can pass *no* argument. If we do so, we get an `IndexError`." + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "product()" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "**Q2**: What line in the body of `product()` causes this exception? What is the exact problem?" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + " < your answer >" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "In [Chapter 7 ](https://nbviewer.jupyter.org/github/webartifex/intro-to-python/blob/develop/07_sequences/00_content.ipynb#Function-Definitions-&-Calls), we also pass a `list` object, like `one_hundred`, to `product()`, and *no* exception is raised." + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "one_hundred = [2, 5, 10]" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "product(one_hundred)" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "**Q3**: What is wrong with that? What *kind* of error (cf., [Chapter 1 ](https://nbviewer.jupyter.org/github/webartifex/intro-to-python/blob/develop/01_elements/00_content.ipynb#Formal-vs.-Natural-Languages)) is that conceptually? Describe precisely what happens to the passed in `one_hundred` in every line within `product()`!" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + " < your answer >" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "Of course, one solution is to *unpack* `one_hundred` with the `*` symbol. We look at another solution further below." + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "product(*one_hundred)" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "Let's continue with the issue when calling `product()` *without* any argument.\n", + "\n", + "This revised version of `product()` avoids the `IndexError` from before." + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "def product(*args):\n", + " \"\"\"Multiply all arguments.\"\"\"\n", + " result = None\n", + "\n", + " for arg in args:\n", + " result *= arg\n", + "\n", + " return result" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "product()" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "**Q4**: Describe why no error occurs by going over every line in `product()`!" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + " < your answer >" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "Unfortunately, the new version cannot process any arguments we pass in any more." + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "product(42)" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "product(2, 5, 10)" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "**Q5**: What line causes troubles now? What is the exact problem?" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + " < your answer >" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "**Q6**: Replace the `None` in `product()` above with something reasonable that does *not* cause exceptions! Ensure that `product(42)` and `product(2, 5, 10)` return a correct result.\n", + "\n", + "Hints: It is ok if `product()` returns a result *different* from the `None` above. Look at the documentation of the built-in [sum() ](https://docs.python.org/3/library/functions.html#sum) function for some inspiration." + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "def product(*args):\n", + " \"\"\"Multiply all arguments.\"\"\"\n", + " result = ...\n", + "\n", + " for arg in args:\n", + " result *= arg\n", + "\n", + " return result" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "product(42)" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "product(2, 5, 10)" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "Now, calling `product()` without any arguments returns what we would best describe as a *default* or *start* value. To be \"philosophical,\" what is the product of *no* numbers? We know that the product of *one* number is just the number itself, but what could be a reasonable result when multiplying *no* numbers? The answer is what you use as the initial value of `result` above, and there is only *one* way to make `product(42)` and `product(2, 5, 10)` work." + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "product()" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "**Q7**: Rewrite `product()` so that it takes a *keyword-only* argument `start`, defaulting to the above *default* or *start* value, and use `start` internally instead of `result`!\n", + "\n", + "Hint: Remember that a *keyword-only* argument is any parameter specified in a function's header line after the first and only `*` (cf., [Chapter 2 ](https://nbviewer.jupyter.org/github/webartifex/intro-to-python/blob/develop/02_functions/00_content.ipynb#Keyword-only-Arguments))." + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "def product(*args, ...):\n", + " \"\"\"Multiply all arguments.\"\"\"\n", + " ...\n", + " ...\n", + "\n", + " return ..." + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "Now, we can call `product()` with a truly arbitrary number of *positional* arguments." + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "product(42)" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "product(2, 5, 10)" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "product()" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "Without any *positional* arguments but only the *keyword* argument `start`, for example, `start=0`, we can adjust the answer to the \"philosophical\" problem of multiplying *no* numbers. Because of the *keyword-only* syntax, there is *no* way to pass in a `start` number *without* naming it." + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "product(start=0)" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "We could use `start` to inject a multiplier, for example, to double the outcomes." + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "product(42, start=2)" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "product(2, 5, 10, start=2)" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "There is still one issue left: Because of the function's name, a user of `product()` may assume that it is ok to pass a *collection* of numbers, like `one_hundred`, which are then multiplied." + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "product(one_hundred)" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "**Q8**: What is a **collection**? How is that different from a **sequence**?" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + " < your answer >" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "**Q9**: Rewrite the latest version of `product()` to check if the *only* positional argument is a *collection* type! If so, its elements are multiplied together. Otherwise, the logic remains the same.\n", + "\n", + "Hints: Use the built-in [len() ](https://docs.python.org/3/library/functions.html#len) and [isinstance() ](https://docs.python.org/3/library/functions.html#isinstance) functions to check if there is only *one* positional argument and if it is a *collection* type. Use the *abstract base class* `Collection` from the [collections.abc ](https://docs.python.org/3/library/collections.abc.html) module in the [standard library ](https://docs.python.org/3/library/index.html). You may want to *re-assign* `args` inside the body." + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "import collections.abc as abc" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "def product(*args, ...):\n", + " \"\"\"Multiply all arguments.\"\"\"\n", + " ...\n", + " ...\n", + "\n", + " ...\n", + " ...\n", + "\n", + " return ..." + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "All *five* code cells below now return correct results. We may unpack `one_hundred` or not." + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "product(42)" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "product(2, 5, 10)" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "product()" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "product(one_hundred)" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "product(*one_hundred)" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "**Side Note**: Above, we make `product()` work with a single *collection* type argument instead of a *sequence* type to keep it more generic: For example, we can pass in a `set` object, like `{2, 5, 10}` below, and `product()` continues to work correctly. The `set` type is introducted in [Chapter 9 ](https://nbviewer.jupyter.org/github/webartifex/intro-to-python/blob/develop/09_mappings/00_content.ipynb#The-set-Type), and one essential difference to the `list` type is that objects of type `set` have *no* order regarding their elements. So, even though `[2, 5, 10]` and `{2, 5, 10}` look almost the same, the order implied in the literal notation gets lost in memory!" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "product([2, 5, 10]) # the argument is a collection that is also a sequence" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "product({2, 5, 10}) # the argument is a collection that is NOT a sequence" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "isinstance({2, 5, 10}, abc.Sequence) # sets are NO sequences" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "Let's continue to improve `product()` and make it more Pythonic. It is always a good idea to mimic the behavior of built-ins when writing our own functions. And, [sum() ](https://docs.python.org/3/library/functions.html#sum), for example, raises a `TypeError` if called *without* any arguments. It does *not* return the \"philosophical\" answer to adding *no* numbers, which would be `0`." + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "sum()" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "**Q10**: Adapt the latest version of `product()` to also raise a `TypeError` if called *without* any *positional* arguments!" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "def product(*args, ...):\n", + " \"\"\"Multiply all arguments.\"\"\"\n", + " ...\n", + " ...\n", + " ...\n", + " ...\n", + "\n", + " ...\n", + " ...\n", + "\n", + " return ..." + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "product()" + ] + } + ], + "metadata": { + "kernelspec": { + "display_name": "Python 3", + "language": "python", + "name": "python3" + }, + "language_info": { + "codemirror_mode": { + "name": "ipython", + "version": 3 + }, + "file_extension": ".py", + "mimetype": "text/x-python", + "name": "python", + "nbconvert_exporter": "python", + "pygments_lexer": "ipython3", + "version": "3.8.6" + }, + "toc": { + "base_numbering": 1, + "nav_menu": {}, + "number_sections": false, + "sideBar": true, + "skip_h1_title": true, + "title_cell": "Table of Contents", + "title_sidebar": "Contents", + "toc_cell": false, + "toc_position": {}, + "toc_section_display": false, + "toc_window_display": false + } + }, + "nbformat": 4, + "nbformat_minor": 4 +} diff --git a/07_sequences/05_appendix.ipynb b/07_sequences/05_appendix.ipynb new file mode 100644 index 0000000..09ebd0c --- /dev/null +++ b/07_sequences/05_appendix.ipynb @@ -0,0 +1,609 @@ +{ + "cells": [ + { + "cell_type": "markdown", + "metadata": { + "slideshow": { + "slide_type": "skip" + } + }, + "source": [ + "**Note**: Click on \"*Kernel*\" > \"*Restart Kernel and Clear All Outputs*\" in [JupyterLab](https://jupyterlab.readthedocs.io/en/stable/) *before* reading this notebook to reset its output. If you cannot run this file on your machine, you may want to open it [in the cloud ](https://mybinder.org/v2/gh/webartifex/intro-to-python/develop?urlpath=lab/tree/07_sequences/05_appendix.ipynb)." + ] + }, + { + "cell_type": "markdown", + "metadata": { + "slideshow": { + "slide_type": "slide" + } + }, + "source": [ + "# Chapter 7: Sequential Data (Appendix)" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "slideshow": { + "slide_type": "skip" + } + }, + "source": [ + "In the [third part ](https://nbviewer.jupyter.org/github/webartifex/intro-to-python/blob/develop/07_sequences/03_content.ipynb#Tuples-are-like-\"Immutable-Lists\") of the chapter, we proposed the idea that `tuple` objects are like \"immutable lists.\" Often, however, we use `tuple` objects to represent a **record** of related **fields**. Then, each element has a *semantic* meaning (i.e., a descriptive name).\n", + "\n", + "As an example, think of a spreadsheet with information on students in a course. Each row represents a record and holds all the data associated with an individual student. The columns (e.g., matriculation number, first name, last name) are the fields that may come as *different* data types (e.g., `int` for the matriculation number, `str` for the names).\n", + "\n", + "A simple way of modeling a single student is as a `tuple` object, for example, `(123456, \"John\", \"Doe\")`. A disadvantage of this approach is that we must remember the order and meaning of the elements/fields in the `tuple` object.\n", + "\n", + "An example from a different domain is the representation of $(x, y)$-points in the $x$-$y$-plane. Again, we could use a `tuple` object like `current_position` below to model the point $(4, 2)$." + ] + }, + { + "cell_type": "code", + "execution_count": 1, + "metadata": { + "slideshow": { + "slide_type": "slide" + } + }, + "outputs": [], + "source": [ + "current_position = (4, 2)" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "slideshow": { + "slide_type": "skip" + } + }, + "source": [ + "We implicitly assume that the first element represents the $x$ and the second the $y$ coordinate. While that follows intuitively from convention in math, we should at least add comments somewhere in the code to document this assumption." + ] + }, + { + "cell_type": "markdown", + "metadata": { + "slideshow": { + "slide_type": "slide" + } + }, + "source": [ + "## The `namedtuple` Type" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "slideshow": { + "slide_type": "skip" + } + }, + "source": [ + "A better way is to create a *custom* data type. While that is covered in depth in [Chapter 11 ](https://nbviewer.jupyter.org/github/webartifex/intro-to-python/blob/develop/11_classes/00_content.ipynb), the [collections ](https://docs.python.org/3/library/collections.html) module in the [standard library ](https://docs.python.org/3/library/index.html) provides a [namedtuple() ](https://docs.python.org/3/library/collections.html#collections.namedtuple) **factory function** that creates \"simple\" custom data types on top of the standard `tuple` type." + ] + }, + { + "cell_type": "code", + "execution_count": 2, + "metadata": { + "slideshow": { + "slide_type": "slide" + } + }, + "outputs": [], + "source": [ + "from collections import namedtuple" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "slideshow": { + "slide_type": "skip" + } + }, + "source": [ + "[namedtuple() ](https://docs.python.org/3/library/collections.html#collections.namedtuple) takes two arguments. The first argument is the name of the data type. That could be different from the variable `Point` we use to refer to the new type, but in most cases it is best to keep them in sync. The second argument is a sequence with the field names as `str` objects. The names' order corresponds to the one assumed in `current_position`." + ] + }, + { + "cell_type": "code", + "execution_count": 3, + "metadata": { + "slideshow": { + "slide_type": "fragment" + } + }, + "outputs": [], + "source": [ + "Point = namedtuple(\"Point\", [\"x\", \"y\"])" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "slideshow": { + "slide_type": "skip" + } + }, + "source": [ + "The `Point` object is a so-called **class**. That is what it means if an object is of type `type`. It can be used as a **factory** to create *new* `tuple`-like objects of type `Point`. In a way, [namedtuple() ](https://docs.python.org/3/library/collections.html#collections.namedtuple) gives us a way to create our own custom **constructors**." + ] + }, + { + "cell_type": "code", + "execution_count": 4, + "metadata": { + "slideshow": { + "slide_type": "fragment" + } + }, + "outputs": [ + { + "data": { + "text/plain": [ + "94457911453856" + ] + }, + "execution_count": 4, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "id(Point)" + ] + }, + { + "cell_type": "code", + "execution_count": 5, + "metadata": { + "slideshow": { + "slide_type": "fragment" + } + }, + "outputs": [ + { + "data": { + "text/plain": [ + "type" + ] + }, + "execution_count": 5, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "type(Point)" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "slideshow": { + "slide_type": "skip" + } + }, + "source": [ + "The value of `Point` is just itself in a *literal notation*." + ] + }, + { + "cell_type": "code", + "execution_count": 6, + "metadata": { + "slideshow": { + "slide_type": "fragment" + } + }, + "outputs": [ + { + "data": { + "text/plain": [ + "__main__.Point" + ] + }, + "execution_count": 6, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "Point" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "slideshow": { + "slide_type": "skip" + } + }, + "source": [ + "We write `Point(4, 2)` to create a *new* object of type `Point`." + ] + }, + { + "cell_type": "code", + "execution_count": 7, + "metadata": { + "slideshow": { + "slide_type": "slide" + } + }, + "outputs": [], + "source": [ + "current_position = Point(4, 2)" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "slideshow": { + "slide_type": "skip" + } + }, + "source": [ + "Now, `current_position` has a somewhat nicer representation. In particular, the coordinates are named `x` and `y`." + ] + }, + { + "cell_type": "code", + "execution_count": 8, + "metadata": { + "slideshow": { + "slide_type": "fragment" + } + }, + "outputs": [ + { + "data": { + "text/plain": [ + "Point(x=4, y=2)" + ] + }, + "execution_count": 8, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "current_position" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "slideshow": { + "slide_type": "skip" + } + }, + "source": [ + "It is *not* a `tuple` any more but an object of type `Point`." + ] + }, + { + "cell_type": "code", + "execution_count": 9, + "metadata": { + "slideshow": { + "slide_type": "fragment" + } + }, + "outputs": [ + { + "data": { + "text/plain": [ + "140376178109184" + ] + }, + "execution_count": 9, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "id(current_position)" + ] + }, + { + "cell_type": "code", + "execution_count": 10, + "metadata": { + "slideshow": { + "slide_type": "fragment" + } + }, + "outputs": [ + { + "data": { + "text/plain": [ + "__main__.Point" + ] + }, + "execution_count": 10, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "type(current_position)" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "slideshow": { + "slide_type": "skip" + } + }, + "source": [ + "We use the dot operator `.` to access the defined attributes." + ] + }, + { + "cell_type": "code", + "execution_count": 11, + "metadata": { + "slideshow": { + "slide_type": "slide" + } + }, + "outputs": [ + { + "data": { + "text/plain": [ + "4" + ] + }, + "execution_count": 11, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "current_position.x" + ] + }, + { + "cell_type": "code", + "execution_count": 12, + "metadata": { + "slideshow": { + "slide_type": "fragment" + } + }, + "outputs": [ + { + "data": { + "text/plain": [ + "2" + ] + }, + "execution_count": 12, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "current_position.y" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "slideshow": { + "slide_type": "skip" + } + }, + "source": [ + "As before, we get an `AttributeError` if we try to access an undefined attribute." + ] + }, + { + "cell_type": "code", + "execution_count": 13, + "metadata": { + "slideshow": { + "slide_type": "skip" + } + }, + "outputs": [ + { + "ename": "AttributeError", + "evalue": "'Point' object has no attribute 'z'", + "output_type": "error", + "traceback": [ + "\u001b[0;31m---------------------------------------------------------------------------\u001b[0m", + "\u001b[0;31mAttributeError\u001b[0m Traceback (most recent call last)", + "\u001b[0;32m\u001b[0m in \u001b[0;36m\u001b[0;34m\u001b[0m\n\u001b[0;32m----> 1\u001b[0;31m \u001b[0mcurrent_position\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0mz\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0m", + "\u001b[0;31mAttributeError\u001b[0m: 'Point' object has no attribute 'z'" + ] + } + ], + "source": [ + "current_position.z" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "slideshow": { + "slide_type": "skip" + } + }, + "source": [ + "`current_position` continues to work like a `tuple` object! That is why we can use `namedtuple` as a replacement for `tuple`. The underlying implementations exhibit the *same* computational efficiencies and memory usages.\n", + "\n", + "For example, we can index into or loop over `current_position` as it is still a sequence with the familiar four properties." + ] + }, + { + "cell_type": "code", + "execution_count": 14, + "metadata": { + "slideshow": { + "slide_type": "slide" + } + }, + "outputs": [ + { + "data": { + "text/plain": [ + "4" + ] + }, + "execution_count": 14, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "current_position[0]" + ] + }, + { + "cell_type": "code", + "execution_count": 15, + "metadata": { + "slideshow": { + "slide_type": "skip" + } + }, + "outputs": [ + { + "data": { + "text/plain": [ + "2" + ] + }, + "execution_count": 15, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "current_position[1]" + ] + }, + { + "cell_type": "code", + "execution_count": 16, + "metadata": { + "slideshow": { + "slide_type": "fragment" + } + }, + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "4\n", + "2\n" + ] + } + ], + "source": [ + "for number in current_position:\n", + " print(number)" + ] + }, + { + "cell_type": "code", + "execution_count": 17, + "metadata": { + "slideshow": { + "slide_type": "fragment" + } + }, + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "2\n", + "4\n" + ] + } + ], + "source": [ + "for number in reversed(current_position):\n", + " print(number)" + ] + }, + { + "cell_type": "code", + "execution_count": 18, + "metadata": { + "slideshow": { + "slide_type": "fragment" + } + }, + "outputs": [ + { + "data": { + "text/plain": [ + "2" + ] + }, + "execution_count": 18, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "len(current_position)" + ] + } + ], + "metadata": { + "kernelspec": { + "display_name": "Python 3", + "language": "python", + "name": "python3" + }, + "language_info": { + "codemirror_mode": { + "name": "ipython", + "version": 3 + }, + "file_extension": ".py", + "mimetype": "text/x-python", + "name": "python", + "nbconvert_exporter": "python", + "pygments_lexer": "ipython3", + "version": "3.8.6" + }, + "livereveal": { + "auto_select": "code", + "auto_select_fragment": true, + "scroll": true, + "theme": "serif" + }, + "toc": { + "base_numbering": 1, + "nav_menu": {}, + "number_sections": false, + "sideBar": true, + "skip_h1_title": true, + "title_cell": "Table of Contents", + "title_sidebar": "Contents", + "toc_cell": false, + "toc_position": { + "height": "calc(100% - 180px)", + "left": "10px", + "top": "150px", + "width": "384px" + }, + "toc_section_display": false, + "toc_window_display": false + } + }, + "nbformat": 4, + "nbformat_minor": 4 +} diff --git a/07_sequences/06_summary.ipynb b/07_sequences/06_summary.ipynb new file mode 100644 index 0000000..8115139 --- /dev/null +++ b/07_sequences/06_summary.ipynb @@ -0,0 +1,85 @@ +{ + "cells": [ + { + "cell_type": "markdown", + "metadata": { + "slideshow": { + "slide_type": "slide" + } + }, + "source": [ + "# Chapter 7: Sequential Data (TL;DR)" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "slideshow": { + "slide_type": "skip" + } + }, + "source": [ + "**Sequences** are an *abstract* concept that summarizes *four* behaviors an object may or may not exhibit. Sequences are\n", + "- **finite** and\n", + "- **ordered**\n", + "- **containers** that we may\n", + "- **loop over**.\n", + "\n", + "Examples are the `list`, `tuple`, but also the `str` types.\n", + "\n", + "Objects that exhibit all behaviors *except* being ordered are referred to as **collections**.\n", + "\n", + "The objects inside a sequence are called its **elements** and may be labeled with a unique **index**, an `int` object in the range $0 \\leq \\text{index} < \\lvert \\text{sequence} \\rvert$.\n", + "\n", + "`list` objects are **mutable**. That means we can change the references to the other objects it contains, and, in particular, re-assign them.\n", + "\n", + "On the contrary, `tuple` objects are like **immutable** lists: We can use them in place of any `list` object as long as we do *not* need to mutate it. Often, `tuple` objects are also used to model **records** of related **fields**." + ] + } + ], + "metadata": { + "kernelspec": { + "display_name": "Python 3", + "language": "python", + "name": "python3" + }, + "language_info": { + "codemirror_mode": { + "name": "ipython", + "version": 3 + }, + "file_extension": ".py", + "mimetype": "text/x-python", + "name": "python", + "nbconvert_exporter": "python", + "pygments_lexer": "ipython3", + "version": "3.8.6" + }, + "livereveal": { + "auto_select": "code", + "auto_select_fragment": true, + "scroll": true, + "theme": "serif" + }, + "toc": { + "base_numbering": 1, + "nav_menu": {}, + "number_sections": false, + "sideBar": true, + "skip_h1_title": true, + "title_cell": "Table of Contents", + "title_sidebar": "Contents", + "toc_cell": false, + "toc_position": { + "height": "calc(100% - 180px)", + "left": "10px", + "top": "150px", + "width": "384px" + }, + "toc_section_display": false, + "toc_window_display": false + } + }, + "nbformat": 4, + "nbformat_minor": 4 +} diff --git a/07_sequences/07_review.ipynb b/07_sequences/07_review.ipynb new file mode 100644 index 0000000..de10156 --- /dev/null +++ b/07_sequences/07_review.ipynb @@ -0,0 +1,252 @@ +{ + "cells": [ + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "# Chapter 7: Sequential Data (Review Questions)" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "The questions below assume that you have read the [first ](https://nbviewer.jupyter.org/github/webartifex/intro-to-python/blob/develop/07_sequences/00_content.ipynb), [second ](https://nbviewer.jupyter.org/github/webartifex/intro-to-python/blob/develop/07_sequences/01_content.ipynb), and the [third ](https://nbviewer.jupyter.org/github/webartifex/intro-to-python/blob/develop/07_sequences/03_content.ipynb) part of Chapter 7. Some questions regard the [Appendix ](https://nbviewer.jupyter.org/github/webartifex/intro-to-python/blob/develop/07_sequences/05_appendix.ipynb); that is indicated with a **\\***.\n", + "\n", + "Be concise in your answers! Most questions can be answered in *one* sentence." + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "## Essay Questions " + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "Answer the following questions *briefly*!" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "**Q1**: We have seen **containers** and **iterables** before in [Chapter 4 ](https://nbviewer.jupyter.org/github/webartifex/intro-to-python/blob/develop/04_iteration/02_content.ipynb#Containers-vs.-Iterables). How do they relate to **sequences**? " + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + " < your answer >" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "**Q2**: What are **abstract base classes**? How can we make use of the ones from the [collections.abc ](https://docs.python.org/3/library/collections.abc.html) module in the [standard library ](https://docs.python.org/3/library/index.html)?" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + " < your answer >" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "**Q3**: How are the *abstract behaviors* of **reversibility** and **finiteness** essential for *indexing* and *slicing* sequences?" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + " < your answer >" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "**Q4**: Explain the difference between **mutable** and **immutable** objects in Python with the examples of the `list` and `tuple` types!" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + " < your answer >" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "**Q5**: What is the difference between a **shallow** and a **deep** copy of an object? How can one of them become a \"problem?\"" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + " < your answer >" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "**Q6**: Many **list methods** change `list` objects \"**in place**.\" What do we mean by that?" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + " < your answer >" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "**Q7.1**: `tuple` objects have *two* primary usages. First, they can be used in place of `list` objects where **mutability** is *not* required. Second, we use them to model data **records**.\n", + "\n", + "Describe why `tuple` objects are a suitable replacement for `list` objects in general!" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + " < your answer >" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "**Q7.2\\***: What do we mean by a **record**? How are `tuple` objects suitable to model records? How can we integrate a **semantic meaning** when working with records into our code?" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + " < your answer >" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "**Q8**: How is (iterable) **packing** and **unpacking** useful in the context of **function definitions** and **calls**?" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + " < your answer >" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "## True / False Questions" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "Motivate your answer with *one short* sentence!" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "**Q9**: `sequence` objects are *not* part of core Python but may be imported from the [standard library ](https://docs.python.org/3/library/index.html)." + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + " < your answer >" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "**Q10**: The built-in [.sort() ](https://docs.python.org/3/library/stdtypes.html#list.sort) function takes a *finite* **iterable** as its argument an returns a *new* `list` object. On the contrary, the [sorted() ](https://docs.python.org/3/library/functions.html#sorted) method on `list` objects *mutates* them *in place*." + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + " < your answer >" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "**Q11**: Passing **mutable** objects as arguments to functions is not problematic because functions operate in a **local** scope without affecting the **global** scope." + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + " < your answer >" + ] + } + ], + "metadata": { + "kernelspec": { + "display_name": "Python 3", + "language": "python", + "name": "python3" + }, + "language_info": { + "codemirror_mode": { + "name": "ipython", + "version": 3 + }, + "file_extension": ".py", + "mimetype": "text/x-python", + "name": "python", + "nbconvert_exporter": "python", + "pygments_lexer": "ipython3", + "version": "3.8.6" + }, + "toc": { + "base_numbering": 1, + "nav_menu": {}, + "number_sections": false, + "sideBar": true, + "skip_h1_title": true, + "title_cell": "Table of Contents", + "title_sidebar": "Contents", + "toc_cell": false, + "toc_position": {}, + "toc_section_display": false, + "toc_window_display": false + } + }, + "nbformat": 4, + "nbformat_minor": 4 +} diff --git a/CONTENTS.md b/CONTENTS.md index 50bf5f4..bbc1dee 100644 --- a/CONTENTS.md +++ b/CONTENTS.md @@ -139,7 +139,7 @@ If this is not possible, - [further resources ](https://nbviewer.jupyter.org/github/webartifex/intro-to-python/blob/develop/05_numbers/06_resources.ipynb) [](https://mybinder.org/v2/gh/webartifex/intro-to-python/develop?urlpath=lab/tree/05_numbers/06_resources.ipynb) - *Chapter 6*: Text & Bytes - - [content ](https://nbviewer.jupyter.org/github/webartifex/intro-to-python/blob/develop/06_numbers/00_content.ipynb) + - [content ](https://nbviewer.jupyter.org/github/webartifex/intro-to-python/blob/develop/06_text/00_content.ipynb) [](https://mybinder.org/v2/gh/webartifex/intro-to-python/develop?urlpath=lab/tree/06_text/00_content.ipynb) (`str` Type; Reading Files; @@ -160,4 +160,32 @@ If this is not possible, - [summary ](https://nbviewer.jupyter.org/github/webartifex/intro-to-python/blob/develop/06_text/03_summary.ipynb) - [review questions ](https://nbviewer.jupyter.org/github/webartifex/intro-to-python/blob/develop/06_text/04_review.ipynb) - [further resources ](https://nbviewer.jupyter.org/github/webartifex/intro-to-python/blob/develop/06_text/05_resources.ipynb) - [](https://mybinder.org/v2/gh/webartifex/intro-to-python/develop?urlpath=lab/tree/06_text/05_resources.ipynb) \ No newline at end of file + [](https://mybinder.org/v2/gh/webartifex/intro-to-python/develop?urlpath=lab/tree/06_text/05_resources.ipynb) + - *Chapter 7*: Sequential Data + - [content ](https://nbviewer.jupyter.org/github/webartifex/intro-to-python/blob/develop/07_sequences/00_content.ipynb) + [](https://mybinder.org/v2/gh/webartifex/intro-to-python/develop?urlpath=lab/tree/07_sequences/00_content.ipynb) + (Collections vs. Sequences; + ABCs: `Container`, `Iterable`, `Sized`, & `Reversible`) + - [content ](https://nbviewer.jupyter.org/github/webartifex/intro-to-python/blob/develop/07_sequences/01_content.ipynb) + [](https://mybinder.org/v2/gh/webartifex/intro-to-python/develop?urlpath=lab/tree/07_sequences/01_content.ipynb) + (`list` Type; + Indexing & Slicing; + Shallow vs. Deep Copies; + List Methods & Operations) + - [exercises ](https://nbviewer.jupyter.org/github/webartifex/intro-to-python/blob/develop/07_sequences/02_exercises.ipynb) + [](https://mybinder.org/v2/gh/webartifex/intro-to-python/develop?urlpath=lab/tree/07_sequences/02_exercises.ipynb) + (Working with Lists) + - [content ](https://nbviewer.jupyter.org/github/webartifex/intro-to-python/blob/develop/07_sequences/03_content.ipynb) + [](https://mybinder.org/v2/gh/webartifex/intro-to-python/develop?urlpath=lab/tree/07_sequences/03_content.ipynb) + (Modifiers vs. Pure Functions; + `tuple` Type; + Packing & Unpacking; + `*args` in Function Definitions) + - [exercises ](https://nbviewer.jupyter.org/github/webartifex/intro-to-python/blob/develop/07_sequences/04_exercises.ipynb) + [](https://mybinder.org/v2/gh/webartifex/intro-to-python/develop?urlpath=lab/tree/07_sequences/04_exercises.ipynb) + (Packing & Unpacking with Functions) + - [appendix ](https://nbviewer.jupyter.org/github/webartifex/intro-to-python/blob/develop/07_sequences/05_appendix.ipynb) + [](https://mybinder.org/v2/gh/webartifex/intro-to-python/develop?urlpath=lab/tree/07_sequences/05_appendix.ipynb) + (`namedtuple` Type) + - [summary ](https://nbviewer.jupyter.org/github/webartifex/intro-to-python/blob/develop/07_sequences/06_summary.ipynb) + - [review questions ](https://nbviewer.jupyter.org/github/webartifex/intro-to-python/blob/develop/07_sequences/07_review.ipynb) diff --git a/README.md b/README.md index db07011..e9b2add 100644 --- a/README.md +++ b/README.md @@ -19,6 +19,7 @@ For a more *detailed version* with **clickable links** - **Part B: Managing Data and Memory** - *Chapter 5*: Numbers & Bits - *Chapter 6*: Text & Bytes + - *Chapter 7*: Sequential Data #### Videos