From d6bec8afa2bb493d53e185865485398ae065b067 Mon Sep 17 00:00:00 2001 From: Alexander Hess Date: Mon, 19 Oct 2020 19:15:05 +0200 Subject: [PATCH] Add initial version of chapter 07, part 2 --- 07_sequences/01_content.ipynb | 2995 +++++++++++++++++++++++++++++++++ CONTENTS.md | 9 +- 2 files changed, 3003 insertions(+), 1 deletion(-) create mode 100644 07_sequences/01_content.ipynb diff --git a/07_sequences/01_content.ipynb b/07_sequences/01_content.ipynb new file mode 100644 index 0000000..5123ac8 --- /dev/null +++ b/07_sequences/01_content.ipynb @@ -0,0 +1,2995 @@ +{ + "cells": [ + { + "cell_type": "markdown", + "metadata": { + "slideshow": { + "slide_type": "skip" + } + }, + "source": [ + "**Note**: Click on \"*Kernel*\" > \"*Restart Kernel and Clear All Outputs*\" in [JupyterLab](https://jupyterlab.readthedocs.io/en/stable/) *before* reading this notebook to reset its output. If you cannot run this file on your machine, you may want to open it [in the cloud ](https://mybinder.org/v2/gh/webartifex/intro-to-python/develop?urlpath=lab/tree/07_sequences/01_content.ipynb)." + ] + }, + { + "cell_type": "markdown", + "metadata": { + "slideshow": { + "slide_type": "slide" + } + }, + "source": [ + "# Chapter 7: Sequential Data (continued)" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "slideshow": { + "slide_type": "skip" + } + }, + "source": [ + "In this second part of the chapter, we look closely at the built-in `list` type, which is probably the most commonly used sequence data type in practice." + ] + }, + { + "cell_type": "markdown", + "metadata": { + "slideshow": { + "slide_type": "slide" + } + }, + "source": [ + "## The `list` Type" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "slideshow": { + "slide_type": "skip" + } + }, + "source": [ + "As already seen multiple times, to create a `list` object, we use the *literal notation* and list all elements within brackets `[` and `]`." + ] + }, + { + "cell_type": "code", + "execution_count": 1, + "metadata": { + "slideshow": { + "slide_type": "slide" + } + }, + "outputs": [], + "source": [ + "empty = []" + ] + }, + { + "cell_type": "code", + "execution_count": 2, + "metadata": { + "slideshow": { + "slide_type": "fragment" + } + }, + "outputs": [], + "source": [ + "simple = [40, 50]" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "slideshow": { + "slide_type": "skip" + } + }, + "source": [ + "The elements do *not* need to be of the *same* type, and `list` objects may also be **nested**." + ] + }, + { + "cell_type": "code", + "execution_count": 3, + "metadata": { + "slideshow": { + "slide_type": "fragment" + } + }, + "outputs": [], + "source": [ + "nested = [empty, 10, 20.0, \"Thirty\", simple]" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "slideshow": { + "slide_type": "skip" + } + }, + "source": [ + "[PythonTutor ](http://pythontutor.com/visualize.html#code=empty%20%3D%20%5B%5D%0Asimple%20%3D%20%5B40,%2050%5D%0Anested%20%3D%20%5Bempty,%2010,%2020.0,%20%22Thirty%22,%20simple%5D&cumulative=false&curInstr=0&heapPrimitives=nevernest&mode=display&origin=opt-frontend.js&py=3&rawInputLstJSON=%5B%5D&textReferences=false) shows how `nested` holds references to the `empty` and `simple` objects. Technically, it holds three more references to the `10`, `20.0`, and `\"Thirty\"` objects as well. However, to simplify the visualization, these three objects are shown right inside the `nested` object. That may be done because they are immutable and \"flat\" data types. In general, the $0$s and $1$s inside a `list` object in memory always constitute references to other objects only." + ] + }, + { + "cell_type": "code", + "execution_count": 4, + "metadata": { + "slideshow": { + "slide_type": "fragment" + } + }, + "outputs": [ + { + "data": { + "text/plain": [ + "[[], 10, 20.0, 'Thirty', [40, 50]]" + ] + }, + "execution_count": 4, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "nested" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "slideshow": { + "slide_type": "skip" + } + }, + "source": [ + "Let's not forget that `nested` is an object on its own with an *identity* and *data type*." + ] + }, + { + "cell_type": "code", + "execution_count": 5, + "metadata": { + "slideshow": { + "slide_type": "skip" + } + }, + "outputs": [ + { + "data": { + "text/plain": [ + "139688113982848" + ] + }, + "execution_count": 5, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "id(nested)" + ] + }, + { + "cell_type": "code", + "execution_count": 6, + "metadata": { + "slideshow": { + "slide_type": "skip" + } + }, + "outputs": [ + { + "data": { + "text/plain": [ + "list" + ] + }, + "execution_count": 6, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "type(nested)" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "slideshow": { + "slide_type": "skip" + } + }, + "source": [ + "Alternatively, we use the built-in [list() ](https://docs.python.org/3/library/functions.html#func-list) constructor to create a `list` object out of any (finite) *iterable* we pass to it as the argument.\n", + "\n", + "For example, we can wrap the [range() ](https://docs.python.org/3/library/functions.html#func-range) built-in with [list() ](https://docs.python.org/3/library/functions.html#func-list): As described in [Chapter 4 ](https://nbviewer.jupyter.org/github/webartifex/intro-to-python/blob/develop/04_iteration/02_content.ipynb#Containers-vs.-Iterables), `range` objects, like `range(1, 13)` below, are iterable and generate `int` objects \"on the fly\" (i.e., one by one). The [list() ](https://docs.python.org/3/library/functions.html#func-list) around it acts like a `for`-loop and **materializes** twelve `int` objects in memory that then become the elements of the newly created `list` object. [PythonTutor ](http://pythontutor.com/visualize.html#code=r%20%3D%20range%281,%2013%29%0Al%20%3D%20list%28range%281,%2013%29%29&cumulative=false&curInstr=0&heapPrimitives=nevernest&mode=display&origin=opt-frontend.js&py=3&rawInputLstJSON=%5B%5D&textReferences=false) shows this difference visually." + ] + }, + { + "cell_type": "code", + "execution_count": 7, + "metadata": { + "slideshow": { + "slide_type": "slide" + } + }, + "outputs": [ + { + "data": { + "text/plain": [ + "[1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12]" + ] + }, + "execution_count": 7, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "list(range(1, 13))" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "slideshow": { + "slide_type": "skip" + } + }, + "source": [ + "Beware of passing a `range` object over a \"big\" horizon as the argument to [list() ](https://docs.python.org/3/library/functions.html#func-list) as that may lead to a `MemoryError` and the computer crashing." + ] + }, + { + "cell_type": "code", + "execution_count": 8, + "metadata": { + "slideshow": { + "slide_type": "fragment" + } + }, + "outputs": [ + { + "ename": "MemoryError", + "evalue": "", + "output_type": "error", + "traceback": [ + "\u001b[0;31m---------------------------------------------------------------------------\u001b[0m", + "\u001b[0;31mMemoryError\u001b[0m Traceback (most recent call last)", + "\u001b[0;32m\u001b[0m in \u001b[0;36m\u001b[0;34m\u001b[0m\n\u001b[0;32m----> 1\u001b[0;31m \u001b[0mlist\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0mrange\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0;36m999_999_999_999\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0m", + "\u001b[0;31mMemoryError\u001b[0m: " + ] + } + ], + "source": [ + "list(range(999_999_999_999))" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "slideshow": { + "slide_type": "skip" + } + }, + "source": [ + "As another example, we create a `list` object from a `str` object, which is iterable, as well. Then, the individual characters become the elements of the new `list` object!" + ] + }, + { + "cell_type": "code", + "execution_count": 9, + "metadata": { + "slideshow": { + "slide_type": "fragment" + } + }, + "outputs": [ + { + "data": { + "text/plain": [ + "['i', 't', 'e', 'r', 'a', 'b', 'l', 'e']" + ] + }, + "execution_count": 9, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "list(\"iterable\")" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "slideshow": { + "slide_type": "slide" + } + }, + "source": [ + "## Sequence Behaviors" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "slideshow": { + "slide_type": "skip" + } + }, + "source": [ + "`list` objects are *sequences*. To reiterate that, we briefly summarize the *four* behaviors of a sequence and provide some more `list`-specific details below:" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "slideshow": { + "slide_type": "slide" + } + }, + "source": [ + "- `Container`:\n", + " - holds references to other objects in memory (with their own *identity* and *type*)\n", + " - implements membership testing via the `in` operator\n", + "- `Iterable`:\n", + " - supports being looped over\n", + " - works with the `for` or `while` statements\n", + "- `Reversible`:\n", + " - the elements come in a *predictable* order that we may loop over in a forward or backward fashion\n", + " - works with the [reversed() ](https://docs.python.org/3/library/functions.html#reversed) built-in\n", + "- `Sized`:\n", + " - the number of elements is finite *and* known in advance\n", + " - works with the built-in [len() ](https://docs.python.org/3/library/functions.html#len) function" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "slideshow": { + "slide_type": "skip" + } + }, + "source": [ + "The \"length\" of `nested` is *five* because `empty` and `simple` count as *one* element each. In other words, `nested` holds five references to other objects, two of which are `list` objects." + ] + }, + { + "cell_type": "code", + "execution_count": 10, + "metadata": { + "slideshow": { + "slide_type": "slide" + } + }, + "outputs": [ + { + "data": { + "text/plain": [ + "5" + ] + }, + "execution_count": 10, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "len(nested)" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "slideshow": { + "slide_type": "skip" + } + }, + "source": [ + "With a `for`-loop, we can iterate over all elements in a *predictable* order, forward or backward. As `list` objects hold *references* to other *objects*, these have an *indentity* and may even be of *different* types; however, the latter observation is rarely, if ever, useful in practice." + ] + }, + { + "cell_type": "code", + "execution_count": 11, + "metadata": { + "slideshow": { + "slide_type": "slide" + } + }, + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "[] 139688113991744 \n", + "10 94291567220768 \n", + "20.0 139688114016656 \n", + "Thirty 139688114205808 \n", + "[40, 50] 139688113982208 \n" + ] + } + ], + "source": [ + "for element in nested:\n", + " print(str(element).ljust(10), str(id(element)).ljust(18), type(element))" + ] + }, + { + "cell_type": "code", + "execution_count": 12, + "metadata": { + "slideshow": { + "slide_type": "slide" + } + }, + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "[40, 50] Thirty 20.0 10 [] " + ] + } + ], + "source": [ + "for element in reversed(nested):\n", + " print(element, end=\" \")" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "slideshow": { + "slide_type": "skip" + } + }, + "source": [ + "The `in` operator checks if a given object is \"contained\" in a `list` object. It uses the `==` operator behind the scenes (i.e., *not* the `is` operator) conducting a **[linear search ](https://en.wikipedia.org/wiki/Linear_search)**: So, Python implicitly loops over *all* elements and only stops prematurely if an element evaluates equal to the searched object. A linear search may, therefore, be relatively *slow* for big `list` objects." + ] + }, + { + "cell_type": "code", + "execution_count": 13, + "metadata": { + "slideshow": { + "slide_type": "slide" + } + }, + "outputs": [ + { + "data": { + "text/plain": [ + "True" + ] + }, + "execution_count": 13, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "10 in nested" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "slideshow": { + "slide_type": "skip" + } + }, + "source": [ + "`20` compares equal to the `20.0` in `nested`." + ] + }, + { + "cell_type": "code", + "execution_count": 14, + "metadata": { + "slideshow": { + "slide_type": "fragment" + } + }, + "outputs": [ + { + "data": { + "text/plain": [ + "True" + ] + }, + "execution_count": 14, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "20 in nested" + ] + }, + { + "cell_type": "code", + "execution_count": 15, + "metadata": { + "slideshow": { + "slide_type": "fragment" + } + }, + "outputs": [ + { + "data": { + "text/plain": [ + "False" + ] + }, + "execution_count": 15, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "30 in nested" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "slideshow": { + "slide_type": "slide" + } + }, + "source": [ + "## Indexing" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "slideshow": { + "slide_type": "skip" + } + }, + "source": [ + "Because of the *predictable* order and the *finiteness*, each element in a sequence can be labeled with a unique *index*, an `int` object in the range $0 \\leq \\text{index} < \\lvert \\text{sequence} \\rvert$.\n", + "\n", + "Brackets, `[` and `]`, are the literal syntax for accessing individual elements of any sequence type. In this book, we also call them the *indexing operator* in this context." + ] + }, + { + "cell_type": "code", + "execution_count": 16, + "metadata": { + "slideshow": { + "slide_type": "slide" + } + }, + "outputs": [ + { + "data": { + "text/plain": [ + "[]" + ] + }, + "execution_count": 16, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "nested[0]" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "slideshow": { + "slide_type": "skip" + } + }, + "source": [ + "The last index is one less than `len(nested)`, and Python raises an `IndexError` if we look up an index that is not in the range." + ] + }, + { + "cell_type": "code", + "execution_count": 17, + "metadata": { + "slideshow": { + "slide_type": "skip" + } + }, + "outputs": [ + { + "ename": "IndexError", + "evalue": "list index out of range", + "output_type": "error", + "traceback": [ + "\u001b[0;31m---------------------------------------------------------------------------\u001b[0m", + "\u001b[0;31mIndexError\u001b[0m Traceback (most recent call last)", + "\u001b[0;32m\u001b[0m in \u001b[0;36m\u001b[0;34m\u001b[0m\n\u001b[0;32m----> 1\u001b[0;31m \u001b[0mnested\u001b[0m\u001b[0;34m[\u001b[0m\u001b[0;36m5\u001b[0m\u001b[0;34m]\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0m", + "\u001b[0;31mIndexError\u001b[0m: list index out of range" + ] + } + ], + "source": [ + "nested[5]" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "slideshow": { + "slide_type": "skip" + } + }, + "source": [ + "Negative indices are used to count in reverse order from the end of a sequence, and brackets may be chained to access nested objects. So, to access the `50` inside `simple` via the `nested` object, we write `nested[-1][1]`." + ] + }, + { + "cell_type": "code", + "execution_count": 18, + "metadata": { + "slideshow": { + "slide_type": "fragment" + } + }, + "outputs": [ + { + "data": { + "text/plain": [ + "[40, 50]" + ] + }, + "execution_count": 18, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "nested[-1]" + ] + }, + { + "cell_type": "code", + "execution_count": 19, + "metadata": { + "slideshow": { + "slide_type": "fragment" + } + }, + "outputs": [ + { + "data": { + "text/plain": [ + "50" + ] + }, + "execution_count": 19, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "nested[-1][1]" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "slideshow": { + "slide_type": "slide" + } + }, + "source": [ + "## Slicing" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "slideshow": { + "slide_type": "skip" + } + }, + "source": [ + "Slicing `list` objects works analogously to slicing `str` objects: We use the literal syntax with either one or two colons `:` inside the brackets `[]` to separate the *start*, *stop*, and *step* values. Slicing creates a *new* `list` object with the elements chosen from the original one.\n", + "\n", + "For example, to obtain the three elements in the \"middle\" of `nested`, we slice from `1` (including) to `4` (excluding)." + ] + }, + { + "cell_type": "code", + "execution_count": 20, + "metadata": { + "slideshow": { + "slide_type": "slide" + } + }, + "outputs": [ + { + "data": { + "text/plain": [ + "[10, 20.0, 'Thirty']" + ] + }, + "execution_count": 20, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "nested[1:4]" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "slideshow": { + "slide_type": "skip" + } + }, + "source": [ + "To obtain \"every other\" element, we slice from beginning to end, defaulting to `0` and `len(nested)` when omitted, in steps of `2`." + ] + }, + { + "cell_type": "code", + "execution_count": 21, + "metadata": { + "slideshow": { + "slide_type": "fragment" + } + }, + "outputs": [ + { + "data": { + "text/plain": [ + "[[], 20.0, [40, 50]]" + ] + }, + "execution_count": 21, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "nested[::2]" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "slideshow": { + "slide_type": "skip" + } + }, + "source": [ + "The literal notation with the colons `:` is *syntactic sugar*. It saves us from using the [slice() ](https://docs.python.org/3/library/functions.html#slice) built-in to create `slice` objects. [slice() ](https://docs.python.org/3/library/functions.html#slice) takes *start*, *stop*, and *step* arguments in the same way as the familiar [range() ](https://docs.python.org/3/library/functions.html#func-range), and the `slice` objects it creates are used just as *indexes* above.\n", + "\n", + "In most cases, the literal notation is more convenient to use; however, with `slice` objects, we can give names to slices and reuse them across several sequences." + ] + }, + { + "cell_type": "code", + "execution_count": 22, + "metadata": { + "slideshow": { + "slide_type": "skip" + } + }, + "outputs": [], + "source": [ + "middle = slice(1, 4)" + ] + }, + { + "cell_type": "code", + "execution_count": 23, + "metadata": { + "slideshow": { + "slide_type": "skip" + } + }, + "outputs": [ + { + "data": { + "text/plain": [ + "slice" + ] + }, + "execution_count": 23, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "type(middle)" + ] + }, + { + "cell_type": "code", + "execution_count": 24, + "metadata": { + "slideshow": { + "slide_type": "skip" + } + }, + "outputs": [ + { + "data": { + "text/plain": [ + "[10, 20.0, 'Thirty']" + ] + }, + "execution_count": 24, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "nested[middle]" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "slideshow": { + "slide_type": "skip" + } + }, + "source": [ + "`slice` objects come with three read-only attributes `start`, `stop`, and `step` on them." + ] + }, + { + "cell_type": "code", + "execution_count": 25, + "metadata": { + "slideshow": { + "slide_type": "skip" + } + }, + "outputs": [ + { + "data": { + "text/plain": [ + "1" + ] + }, + "execution_count": 25, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "middle.start" + ] + }, + { + "cell_type": "code", + "execution_count": 26, + "metadata": { + "slideshow": { + "slide_type": "skip" + } + }, + "outputs": [ + { + "data": { + "text/plain": [ + "4" + ] + }, + "execution_count": 26, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "middle.stop" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "slideshow": { + "slide_type": "skip" + } + }, + "source": [ + "If not passed to [slice() ](https://docs.python.org/3/library/functions.html#slice), these attributes default to `None`. That is why the cell below has no output." + ] + }, + { + "cell_type": "code", + "execution_count": 27, + "metadata": { + "slideshow": { + "slide_type": "skip" + } + }, + "outputs": [], + "source": [ + "middle.step" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "slideshow": { + "slide_type": "skip" + } + }, + "source": [ + "A good trick to know is taking a \"full\" slice: This copies *all* elements of a `list` object into a *new* `list` object." + ] + }, + { + "cell_type": "code", + "execution_count": 28, + "metadata": { + "slideshow": { + "slide_type": "slide" + } + }, + "outputs": [], + "source": [ + "nested_copy = nested[:]" + ] + }, + { + "cell_type": "code", + "execution_count": 29, + "metadata": { + "slideshow": { + "slide_type": "fragment" + } + }, + "outputs": [ + { + "data": { + "text/plain": [ + "[[], 10, 20.0, 'Thirty', [40, 50]]" + ] + }, + "execution_count": 29, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "nested_copy" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "slideshow": { + "slide_type": "skip" + } + }, + "source": [ + "At first glance, `nested` and `nested_copy` seem to cause no pain. For `list` objects, the comparison operator `==` goes over the elements in both operands in a pairwise fashion and checks if they all evaluate equal (cf., the \"*List Comparison*\" section below for more details).\n", + "\n", + "We confirm that `nested` and `nested_copy` compare equal as could be expected but also that they are *different* objects." + ] + }, + { + "cell_type": "code", + "execution_count": 30, + "metadata": { + "slideshow": { + "slide_type": "fragment" + } + }, + "outputs": [ + { + "data": { + "text/plain": [ + "True" + ] + }, + "execution_count": 30, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "nested == nested_copy" + ] + }, + { + "cell_type": "code", + "execution_count": 31, + "metadata": { + "slideshow": { + "slide_type": "fragment" + } + }, + "outputs": [ + { + "data": { + "text/plain": [ + "False" + ] + }, + "execution_count": 31, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "nested is nested_copy" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "slideshow": { + "slide_type": "skip" + } + }, + "source": [ + "However, as [PythonTutor ](http://pythontutor.com/visualize.html#code=nested%20%3D%20%5B%5B%5D,%2010,%2020.0,%20%22Thirty%22,%20%5B40,%2050%5D%5D%0Anested_copy%20%3D%20nested%5B%3A%5D&cumulative=false&curInstr=0&heapPrimitives=nevernest&mode=display&origin=opt-frontend.js&py=3&rawInputLstJSON=%5B%5D&textReferences=false) reveals, only the *references* to the elements are copied, and not the objects in `nested` themselves! Because of that, `nested_copy` is a so-called **[shallow copy ](https://en.wikipedia.org/wiki/Object_copying#Shallow_copy)** of `nested`.\n", + "\n", + "We could also see this with the [id() ](https://docs.python.org/3/library/functions.html#id) function: The respective first elements in both `nested` and `nested_copy` are the *same* object, namely `empty`. So, we have three ways of accessing the *same* address in memory. Also, we say that `nested` and `nested_copy` partially share the *same* state." + ] + }, + { + "cell_type": "code", + "execution_count": 32, + "metadata": { + "slideshow": { + "slide_type": "skip" + } + }, + "outputs": [ + { + "data": { + "text/plain": [ + "True" + ] + }, + "execution_count": 32, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "nested[0] is nested_copy[0]" + ] + }, + { + "cell_type": "code", + "execution_count": 33, + "metadata": { + "slideshow": { + "slide_type": "skip" + } + }, + "outputs": [ + { + "data": { + "text/plain": [ + "139688113991744" + ] + }, + "execution_count": 33, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "id(nested[0])" + ] + }, + { + "cell_type": "code", + "execution_count": 34, + "metadata": { + "slideshow": { + "slide_type": "skip" + } + }, + "outputs": [ + { + "data": { + "text/plain": [ + "139688113991744" + ] + }, + "execution_count": 34, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "id(nested_copy[0])" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "slideshow": { + "slide_type": "skip" + } + }, + "source": [ + "Knowing this becomes critical if the elements in a `list` object are mutable objects (i.e., we can change them *in place*), and this is the case with `nested` and `nested_copy`, as we see in the next section on \"*Mutability*\".\n", + "\n", + "As both the original `nested` object and its copy reference the *same* `list` objects in memory, any changes made to them are visible to both! Because of that, working with shallow copies can easily become confusing.\n", + "\n", + "Instead of a shallow copy, we could also create a so-called **[deep copy ](https://en.wikipedia.org/wiki/Object_copying#Deep_copy)** of `nested`: Then, the copying process recursively follows every reference in a nested data structure and creates copies of *every* object found.\n", + "\n", + "To explicitly create shallow or deep copies, the [copy ](https://docs.python.org/3/library/copy.html) module in the [standard library ](https://docs.python.org/3/library/index.html) provides two functions, [copy() ](https://docs.python.org/3/library/copy.html#copy.copy) and [deepcopy() ](https://docs.python.org/3/library/copy.html#copy.deepcopy). We must always remember that slicing creates *shallow* copies only." + ] + }, + { + "cell_type": "code", + "execution_count": 35, + "metadata": { + "slideshow": { + "slide_type": "skip" + } + }, + "outputs": [], + "source": [ + "import copy" + ] + }, + { + "cell_type": "code", + "execution_count": 36, + "metadata": { + "slideshow": { + "slide_type": "skip" + } + }, + "outputs": [], + "source": [ + "nested_deep_copy = copy.deepcopy(nested)" + ] + }, + { + "cell_type": "code", + "execution_count": 37, + "metadata": { + "slideshow": { + "slide_type": "skip" + } + }, + "outputs": [ + { + "data": { + "text/plain": [ + "True" + ] + }, + "execution_count": 37, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "nested == nested_deep_copy" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "slideshow": { + "slide_type": "skip" + } + }, + "source": [ + "Now, the first elements of `nested` and `nested_deep_copy` are *different* objects, and [PythonTutor ](http://pythontutor.com/visualize.html#code=import%20copy%0Anested%20%3D%20%5B%5B%5D,%2010,%2020.0,%20%22Thirty%22,%20%5B40,%2050%5D%5D%0Anested_deep_copy%20%3D%20copy.deepcopy%28nested%29&cumulative=false&curInstr=0&heapPrimitives=nevernest&mode=display&origin=opt-frontend.js&py=3&rawInputLstJSON=%5B%5D&textReferences=false) shows that there are *six* `list` objects in memory." + ] + }, + { + "cell_type": "code", + "execution_count": 38, + "metadata": { + "slideshow": { + "slide_type": "skip" + } + }, + "outputs": [ + { + "data": { + "text/plain": [ + "False" + ] + }, + "execution_count": 38, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "nested[0] is nested_deep_copy[0]" + ] + }, + { + "cell_type": "code", + "execution_count": 39, + "metadata": { + "slideshow": { + "slide_type": "skip" + } + }, + "outputs": [ + { + "data": { + "text/plain": [ + "139688113991744" + ] + }, + "execution_count": 39, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "id(nested[0])" + ] + }, + { + "cell_type": "code", + "execution_count": 40, + "metadata": { + "slideshow": { + "slide_type": "skip" + } + }, + "outputs": [ + { + "data": { + "text/plain": [ + "139688113233152" + ] + }, + "execution_count": 40, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "id(nested_deep_copy[0])" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "slideshow": { + "slide_type": "skip" + } + }, + "source": [ + "As this [StackOverflow question ](https://stackoverflow.com/questions/184710/what-is-the-difference-between-a-deep-copy-and-a-shallow-copy) shows, understanding shallow and deep copies is a common source of confusion, independent of the programming language." + ] + }, + { + "cell_type": "markdown", + "metadata": { + "slideshow": { + "slide_type": "slide" + } + }, + "source": [ + "## Mutability" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "slideshow": { + "slide_type": "skip" + } + }, + "source": [ + "In contrast to `str` objects, `list` objects are *mutable*: We may assign new elements to indices or slices and also remove elements. That changes the *references* in a `list` object. In general, if an object is *mutable*, we say that it may be changed *in place*." + ] + }, + { + "cell_type": "code", + "execution_count": 41, + "metadata": { + "slideshow": { + "slide_type": "slide" + } + }, + "outputs": [], + "source": [ + "nested[0] = 0" + ] + }, + { + "cell_type": "code", + "execution_count": 42, + "metadata": { + "slideshow": { + "slide_type": "fragment" + } + }, + "outputs": [ + { + "data": { + "text/plain": [ + "[0, 10, 20.0, 'Thirty', [40, 50]]" + ] + }, + "execution_count": 42, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "nested" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "slideshow": { + "slide_type": "skip" + } + }, + "source": [ + "When we re-assign a slice, we can even change the size of the `list` object." + ] + }, + { + "cell_type": "code", + "execution_count": 43, + "metadata": { + "slideshow": { + "slide_type": "fragment" + } + }, + "outputs": [], + "source": [ + "nested[:4] = [100, 100, 100]" + ] + }, + { + "cell_type": "code", + "execution_count": 44, + "metadata": { + "slideshow": { + "slide_type": "fragment" + } + }, + "outputs": [ + { + "data": { + "text/plain": [ + "[100, 100, 100, [40, 50]]" + ] + }, + "execution_count": 44, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "nested" + ] + }, + { + "cell_type": "code", + "execution_count": 45, + "metadata": { + "slideshow": { + "slide_type": "fragment" + } + }, + "outputs": [ + { + "data": { + "text/plain": [ + "4" + ] + }, + "execution_count": 45, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "len(nested)" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "slideshow": { + "slide_type": "skip" + } + }, + "source": [ + "The `list` object's identity does *not* change. That is the main point behind mutable objects." + ] + }, + { + "cell_type": "code", + "execution_count": 46, + "metadata": { + "slideshow": { + "slide_type": "skip" + } + }, + "outputs": [ + { + "data": { + "text/plain": [ + "139688113982848" + ] + }, + "execution_count": 46, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "id(nested) # same memory location as before" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "slideshow": { + "slide_type": "skip" + } + }, + "source": [ + "`nested_copy` is unchanged!" + ] + }, + { + "cell_type": "code", + "execution_count": 47, + "metadata": { + "slideshow": { + "slide_type": "slide" + } + }, + "outputs": [ + { + "data": { + "text/plain": [ + "[[], 10, 20.0, 'Thirty', [40, 50]]" + ] + }, + "execution_count": 47, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "nested_copy" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "slideshow": { + "slide_type": "skip" + } + }, + "source": [ + "Let's change the nested `[40, 50]` via `nested_copy` into `[1, 2, 3]` by replacing all its elements." + ] + }, + { + "cell_type": "code", + "execution_count": 48, + "metadata": { + "slideshow": { + "slide_type": "fragment" + } + }, + "outputs": [], + "source": [ + "nested_copy[-1][:] = [1, 2, 3]" + ] + }, + { + "cell_type": "code", + "execution_count": 49, + "metadata": { + "slideshow": { + "slide_type": "fragment" + } + }, + "outputs": [ + { + "data": { + "text/plain": [ + "[[], 10, 20.0, 'Thirty', [1, 2, 3]]" + ] + }, + "execution_count": 49, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "nested_copy" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "slideshow": { + "slide_type": "skip" + } + }, + "source": [ + "That has a surprising side effect on `nested`." + ] + }, + { + "cell_type": "code", + "execution_count": 50, + "metadata": { + "slideshow": { + "slide_type": "fragment" + } + }, + "outputs": [ + { + "data": { + "text/plain": [ + "[100, 100, 100, [1, 2, 3]]" + ] + }, + "execution_count": 50, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "nested" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "slideshow": { + "slide_type": "skip" + } + }, + "source": [ + "That is precisely the confusion we talked about above when we said that `nested_copy` is a *shallow* copy of `nested`. [PythonTutor ](http://pythontutor.com/visualize.html#code=nested%20%3D%20%5B%5B%5D,%2010,%2020.0,%20%22Thirty%22,%20%5B40,%2050%5D%5D%0Anested_copy%20%3D%20nested%5B%3A%5D%0Anested%5B%3A4%5D%20%3D%20%5B100,%20100,%20100%5D%0Anested_copy%5B-1%5D%5B%3A%5D%20%3D%20%5B1,%202,%203%5D&cumulative=false&curInstr=0&heapPrimitives=nevernest&mode=display&origin=opt-frontend.js&py=3&rawInputLstJSON=%5B%5D&textReferences=false) shows how both reference the *same* nested `list` object that is changed *in place* from `[40, 50]` into `[1, 2, 3]`.\n", + "\n", + "Lastly, we use the `del` statement to remove an element." + ] + }, + { + "cell_type": "code", + "execution_count": 51, + "metadata": { + "slideshow": { + "slide_type": "slide" + } + }, + "outputs": [], + "source": [ + "del nested[-1]" + ] + }, + { + "cell_type": "code", + "execution_count": 52, + "metadata": { + "slideshow": { + "slide_type": "fragment" + } + }, + "outputs": [ + { + "data": { + "text/plain": [ + "[100, 100, 100]" + ] + }, + "execution_count": 52, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "nested" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "slideshow": { + "slide_type": "skip" + } + }, + "source": [ + "The `del` statement also works for slices. Here, we remove all references `nested` holds." + ] + }, + { + "cell_type": "code", + "execution_count": 53, + "metadata": { + "slideshow": { + "slide_type": "fragment" + } + }, + "outputs": [], + "source": [ + "del nested[:]" + ] + }, + { + "cell_type": "code", + "execution_count": 54, + "metadata": { + "slideshow": { + "slide_type": "fragment" + } + }, + "outputs": [ + { + "data": { + "text/plain": [ + "[]" + ] + }, + "execution_count": 54, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "nested" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "slideshow": { + "slide_type": "skip" + } + }, + "source": [ + "Mutability for sequences is formalized by the `MutableSequence` ABC in the [collections.abc ](https://docs.python.org/3/library/collections.abc.html) module." + ] + }, + { + "cell_type": "markdown", + "metadata": { + "slideshow": { + "slide_type": "slide" + } + }, + "source": [ + "## List Methods" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "slideshow": { + "slide_type": "skip" + } + }, + "source": [ + "The `list` type is an essential data structure in any real-world Python application, and many typical `list`-related algorithms from computer science theory are already built into it at the C level (cf., the [documentation ](https://docs.python.org/3/library/stdtypes.html#mutable-sequence-types) or the [tutorial ](https://docs.python.org/3/tutorial/datastructures.html#more-on-lists) for a full overview; unfortunately, not all methods have direct links). So, understanding and applying the built-in methods of the `list` type not only speeds up the development process but also makes programs significantly faster.\n", + "\n", + "In contrast to the `str` type's methods in [Chapter 6 ](https://nbviewer.jupyter.org/github/webartifex/intro-to-python/blob/develop/06_text/00_content.ipynb#String-Methods) (e.g., [.upper() ](https://docs.python.org/3/library/stdtypes.html#str.upper) or [.lower() ](https://docs.python.org/3/library/stdtypes.html#str.lower)), the `list` type's methods that mutate an object do so *in place*. That means they *never* create *new* `list` objects and return `None` to indicate that. So, we must *never* assign the return value of `list` methods to the variable holding the list!\n", + "\n", + "Let's look at the following `names` example." + ] + }, + { + "cell_type": "code", + "execution_count": 55, + "metadata": { + "slideshow": { + "slide_type": "slide" + } + }, + "outputs": [], + "source": [ + "names = [\"Carl\", \"Peter\"]" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "slideshow": { + "slide_type": "skip" + } + }, + "source": [ + "To add an object to the end of `names`, we use the `.append()` method. The code cell shows no output indicating that `None` must be the return value." + ] + }, + { + "cell_type": "code", + "execution_count": 56, + "metadata": { + "slideshow": { + "slide_type": "fragment" + } + }, + "outputs": [], + "source": [ + "names.append(\"Eckardt\")" + ] + }, + { + "cell_type": "code", + "execution_count": 57, + "metadata": { + "slideshow": { + "slide_type": "fragment" + } + }, + "outputs": [ + { + "data": { + "text/plain": [ + "['Carl', 'Peter', 'Eckardt']" + ] + }, + "execution_count": 57, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "names" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "slideshow": { + "slide_type": "skip" + } + }, + "source": [ + "With the `.extend()` method, we may also append multiple elements provided by an iterable. Here, the iterable is a `list` object itself holding two `str` objects." + ] + }, + { + "cell_type": "code", + "execution_count": 58, + "metadata": { + "slideshow": { + "slide_type": "slide" + } + }, + "outputs": [], + "source": [ + "names.extend([\"Karl\", \"Oliver\"])" + ] + }, + { + "cell_type": "code", + "execution_count": 59, + "metadata": { + "slideshow": { + "slide_type": "fragment" + } + }, + "outputs": [ + { + "data": { + "text/plain": [ + "['Carl', 'Peter', 'Eckardt', 'Karl', 'Oliver']" + ] + }, + "execution_count": 59, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "names" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "slideshow": { + "slide_type": "skip" + } + }, + "source": [ + "Similar to `.append()`, we may add a new element at an arbitrary position with the `.insert()` method. `.insert()` takes two arguments, an *index* and the element to be inserted." + ] + }, + { + "cell_type": "code", + "execution_count": 60, + "metadata": { + "slideshow": { + "slide_type": "slide" + } + }, + "outputs": [], + "source": [ + "names.insert(1, \"Berthold\")" + ] + }, + { + "cell_type": "code", + "execution_count": 61, + "metadata": { + "slideshow": { + "slide_type": "fragment" + } + }, + "outputs": [ + { + "data": { + "text/plain": [ + "['Carl', 'Berthold', 'Peter', 'Eckardt', 'Karl', 'Oliver']" + ] + }, + "execution_count": 61, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "names" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "slideshow": { + "slide_type": "skip" + } + }, + "source": [ + "`list` objects may be sorted *in place* with the [.sort() ](https://docs.python.org/3/library/stdtypes.html#list.sort) method. That is different from the built-in [sorted() ](https://docs.python.org/3/library/functions.html#sorted) function that takes any *finite* and *iterable* object and returns a *new* `list` object with the iterable's elements sorted!" + ] + }, + { + "cell_type": "code", + "execution_count": 62, + "metadata": { + "slideshow": { + "slide_type": "slide" + } + }, + "outputs": [ + { + "data": { + "text/plain": [ + "['Berthold', 'Carl', 'Eckardt', 'Karl', 'Oliver', 'Peter']" + ] + }, + "execution_count": 62, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "sorted(names)" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "slideshow": { + "slide_type": "skip" + } + }, + "source": [ + "As the previous code cell created a *new* `list` object, `names` is still unsorted." + ] + }, + { + "cell_type": "code", + "execution_count": 63, + "metadata": { + "slideshow": { + "slide_type": "fragment" + } + }, + "outputs": [ + { + "data": { + "text/plain": [ + "['Carl', 'Berthold', 'Peter', 'Eckardt', 'Karl', 'Oliver']" + ] + }, + "execution_count": 63, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "names" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "slideshow": { + "slide_type": "skip" + } + }, + "source": [ + "Let's sort the elements in `names` instead." + ] + }, + { + "cell_type": "code", + "execution_count": 64, + "metadata": { + "slideshow": { + "slide_type": "fragment" + } + }, + "outputs": [], + "source": [ + "names.sort()" + ] + }, + { + "cell_type": "code", + "execution_count": 65, + "metadata": { + "slideshow": { + "slide_type": "fragment" + } + }, + "outputs": [ + { + "data": { + "text/plain": [ + "['Berthold', 'Carl', 'Eckardt', 'Karl', 'Oliver', 'Peter']" + ] + }, + "execution_count": 65, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "names" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "slideshow": { + "slide_type": "skip" + } + }, + "source": [ + "To sort in reverse order, we pass a keyword-only `reverse=True` argument to either the [.sort() ](https://docs.python.org/3/library/stdtypes.html#list.sort) method or the [sorted() ](https://docs.python.org/3/library/functions.html#sorted) function." + ] + }, + { + "cell_type": "code", + "execution_count": 66, + "metadata": { + "slideshow": { + "slide_type": "slide" + } + }, + "outputs": [], + "source": [ + "names.sort(reverse=True)" + ] + }, + { + "cell_type": "code", + "execution_count": 67, + "metadata": { + "slideshow": { + "slide_type": "fragment" + } + }, + "outputs": [ + { + "data": { + "text/plain": [ + "['Peter', 'Oliver', 'Karl', 'Eckardt', 'Carl', 'Berthold']" + ] + }, + "execution_count": 67, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "names" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "slideshow": { + "slide_type": "skip" + } + }, + "source": [ + "The [.sort() ](https://docs.python.org/3/library/stdtypes.html#list.sort) method and the [sorted() ](https://docs.python.org/3/library/functions.html#sorted) function sort the elements in `names` in alphabetical order, forward or backward. However, that does *not* hold in general.\n", + "\n", + "We mention above that `list` objects may contain objects of *any* type and even of *mixed* types. Because of that, the sorting is **[delegated ](https://en.wikipedia.org/wiki/Delegation_(object-oriented_programming))** to the elements in a `list` object. In a way, Python \"asks\" the elements in a `list` object to sort themselves. As `names` contains only `str` objects, they are sorted according the the comparison rules explained in [Chapter 6 ](https://nbviewer.jupyter.org/github/webartifex/intro-to-python/blob/develop/06_text/00_content.ipynb#String-Comparison).\n", + "\n", + "To customize the sorting, we pass a keyword-only `key` argument to [.sort() ](https://docs.python.org/3/library/stdtypes.html#list.sort) or [sorted() ](https://docs.python.org/3/library/functions.html#sorted), which must be a `function` object accepting *one* positional argument. Then, the elements in the `list` object are passed to that one by one, and the return values are used as the **sort keys**. The `key` argument is also a popular use case for `lambda` expressions.\n", + "\n", + "For example, to sort `names` not by alphabet but by the names' lengths, we pass in a reference to the built-in [len() ](https://docs.python.org/3/library/functions.html#len) function as `key=len`. Note that there are *no* parentheses after `len`!" + ] + }, + { + "cell_type": "code", + "execution_count": 68, + "metadata": { + "slideshow": { + "slide_type": "fragment" + } + }, + "outputs": [], + "source": [ + "names.sort(key=len)" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "slideshow": { + "slide_type": "skip" + } + }, + "source": [ + "If two names have the same length, their relative order is kept as is. That is why `\"Karl\"` comes before `\"Carl\" ` below. A [sorting algorithm ](https://en.wikipedia.org/wiki/Sorting_algorithm) with that property is called **[stable ](https://en.wikipedia.org/wiki/Sorting_algorithm#Stability)**.\n", + "\n", + "Sorting is an important topic in programming, and we refer to the official [HOWTO ](https://docs.python.org/3/howto/sorting.html) for a more comprehensive introduction." + ] + }, + { + "cell_type": "code", + "execution_count": 69, + "metadata": { + "slideshow": { + "slide_type": "fragment" + } + }, + "outputs": [ + { + "data": { + "text/plain": [ + "['Karl', 'Carl', 'Peter', 'Oliver', 'Eckardt', 'Berthold']" + ] + }, + "execution_count": 69, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "names" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "slideshow": { + "slide_type": "skip" + } + }, + "source": [ + "`.sort(reverse=True)` is different from the `.reverse()` method. Whereas the former applies some sorting rule in reverse order, the latter simply reverses the elements in a `list` object." + ] + }, + { + "cell_type": "code", + "execution_count": 70, + "metadata": { + "slideshow": { + "slide_type": "slide" + } + }, + "outputs": [], + "source": [ + "names.reverse()" + ] + }, + { + "cell_type": "code", + "execution_count": 71, + "metadata": { + "slideshow": { + "slide_type": "fragment" + } + }, + "outputs": [ + { + "data": { + "text/plain": [ + "['Berthold', 'Eckardt', 'Oliver', 'Peter', 'Carl', 'Karl']" + ] + }, + "execution_count": 71, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "names" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "slideshow": { + "slide_type": "skip" + } + }, + "source": [ + "The `.pop()` method removes the *last* element from a `list` object *and* returns it. Below we **capture** the `removed` element to show that the return value is not `None` as with all the methods introduced so far." + ] + }, + { + "cell_type": "code", + "execution_count": 72, + "metadata": { + "slideshow": { + "slide_type": "slide" + } + }, + "outputs": [], + "source": [ + "removed = names.pop()" + ] + }, + { + "cell_type": "code", + "execution_count": 73, + "metadata": { + "slideshow": { + "slide_type": "fragment" + } + }, + "outputs": [ + { + "data": { + "text/plain": [ + "'Karl'" + ] + }, + "execution_count": 73, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "removed" + ] + }, + { + "cell_type": "code", + "execution_count": 74, + "metadata": { + "slideshow": { + "slide_type": "fragment" + } + }, + "outputs": [ + { + "data": { + "text/plain": [ + "['Berthold', 'Eckardt', 'Oliver', 'Peter', 'Carl']" + ] + }, + "execution_count": 74, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "names" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "slideshow": { + "slide_type": "skip" + } + }, + "source": [ + "`.pop()` takes an optional *index* argument and removes that instead.\n", + "\n", + "So, to remove the second element, `\"Eckhardt\"`, from `names`, we write this." + ] + }, + { + "cell_type": "code", + "execution_count": 75, + "metadata": { + "slideshow": { + "slide_type": "fragment" + } + }, + "outputs": [], + "source": [ + "removed = names.pop(1)" + ] + }, + { + "cell_type": "code", + "execution_count": 76, + "metadata": { + "slideshow": { + "slide_type": "fragment" + } + }, + "outputs": [ + { + "data": { + "text/plain": [ + "'Eckardt'" + ] + }, + "execution_count": 76, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "removed" + ] + }, + { + "cell_type": "code", + "execution_count": 77, + "metadata": { + "slideshow": { + "slide_type": "fragment" + } + }, + "outputs": [ + { + "data": { + "text/plain": [ + "['Berthold', 'Oliver', 'Peter', 'Carl']" + ] + }, + "execution_count": 77, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "names" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "slideshow": { + "slide_type": "skip" + } + }, + "source": [ + "Instead of removing an element by its index, we can also remove it by its value with the `.remove()` method. Behind the scenes, Python then compares the object to be removed, `\"Peter\"` in the example, sequentially to each element with the `==` operator and removes the *first* one that evaluates equal to it. `.remove()` does *not* return the removed element." + ] + }, + { + "cell_type": "code", + "execution_count": 78, + "metadata": { + "slideshow": { + "slide_type": "slide" + } + }, + "outputs": [], + "source": [ + "names.remove(\"Peter\")" + ] + }, + { + "cell_type": "code", + "execution_count": 79, + "metadata": { + "slideshow": { + "slide_type": "fragment" + } + }, + "outputs": [ + { + "data": { + "text/plain": [ + "['Berthold', 'Oliver', 'Carl']" + ] + }, + "execution_count": 79, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "names" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "slideshow": { + "slide_type": "skip" + } + }, + "source": [ + "Also, `.remove()` raises a `ValueError` if the value is not found." + ] + }, + { + "cell_type": "code", + "execution_count": 80, + "metadata": { + "slideshow": { + "slide_type": "skip" + } + }, + "outputs": [ + { + "ename": "ValueError", + "evalue": "list.remove(x): x not in list", + "output_type": "error", + "traceback": [ + "\u001b[0;31m---------------------------------------------------------------------------\u001b[0m", + "\u001b[0;31mValueError\u001b[0m Traceback (most recent call last)", + "\u001b[0;32m\u001b[0m in \u001b[0;36m\u001b[0;34m\u001b[0m\n\u001b[0;32m----> 1\u001b[0;31m \u001b[0mnames\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0mremove\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0;34m\"Peter\"\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0m", + "\u001b[0;31mValueError\u001b[0m: list.remove(x): x not in list" + ] + } + ], + "source": [ + "names.remove(\"Peter\")" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "slideshow": { + "slide_type": "skip" + } + }, + "source": [ + "`list` objects implement an `.index()` method that returns the position of the first element that compares equal to its argument. It fails *loudly* with a `ValueError` if no element compares equal." + ] + }, + { + "cell_type": "code", + "execution_count": 81, + "metadata": { + "slideshow": { + "slide_type": "slide" + } + }, + "outputs": [ + { + "data": { + "text/plain": [ + "['Berthold', 'Oliver', 'Carl']" + ] + }, + "execution_count": 81, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "names" + ] + }, + { + "cell_type": "code", + "execution_count": 82, + "metadata": { + "slideshow": { + "slide_type": "fragment" + } + }, + "outputs": [ + { + "data": { + "text/plain": [ + "1" + ] + }, + "execution_count": 82, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "names.index(\"Oliver\")" + ] + }, + { + "cell_type": "code", + "execution_count": 83, + "metadata": { + "slideshow": { + "slide_type": "fragment" + } + }, + "outputs": [ + { + "ename": "ValueError", + "evalue": "'Karl' is not in list", + "output_type": "error", + "traceback": [ + "\u001b[0;31m---------------------------------------------------------------------------\u001b[0m", + "\u001b[0;31mValueError\u001b[0m Traceback (most recent call last)", + "\u001b[0;32m\u001b[0m in \u001b[0;36m\u001b[0;34m\u001b[0m\n\u001b[0;32m----> 1\u001b[0;31m \u001b[0mnames\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0mindex\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0;34m\"Karl\"\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0m", + "\u001b[0;31mValueError\u001b[0m: 'Karl' is not in list" + ] + } + ], + "source": [ + "names.index(\"Karl\")" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "slideshow": { + "slide_type": "skip" + } + }, + "source": [ + "The `.count()` method returns the number of elements that compare equal to its argument." + ] + }, + { + "cell_type": "code", + "execution_count": 84, + "metadata": { + "slideshow": { + "slide_type": "fragment" + } + }, + "outputs": [ + { + "data": { + "text/plain": [ + "1" + ] + }, + "execution_count": 84, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "names.count(\"Carl\")" + ] + }, + { + "cell_type": "code", + "execution_count": 85, + "metadata": { + "slideshow": { + "slide_type": "skip" + } + }, + "outputs": [ + { + "data": { + "text/plain": [ + "0" + ] + }, + "execution_count": 85, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "names.count(\"Karl\")" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "slideshow": { + "slide_type": "skip" + } + }, + "source": [ + "Two more methods, `.copy()` and `.clear()`, are *syntactic sugar* and replace working with slices.\n", + "\n", + "`.copy()` creates a *shallow* copy. So, `names.copy()` below does the same as taking a full slice with `names[:]`, and the caveats from above apply, too." + ] + }, + { + "cell_type": "code", + "execution_count": 86, + "metadata": { + "slideshow": { + "slide_type": "skip" + } + }, + "outputs": [], + "source": [ + "names_copy = names.copy()" + ] + }, + { + "cell_type": "code", + "execution_count": 87, + "metadata": { + "slideshow": { + "slide_type": "skip" + } + }, + "outputs": [ + { + "data": { + "text/plain": [ + "['Berthold', 'Oliver', 'Carl']" + ] + }, + "execution_count": 87, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "names_copy" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "slideshow": { + "slide_type": "skip" + } + }, + "source": [ + "`.clear()` removes all references from a `list` object. So, `names_copy.clear()` is the same as `del names_copy[:]`." + ] + }, + { + "cell_type": "code", + "execution_count": 88, + "metadata": { + "slideshow": { + "slide_type": "skip" + } + }, + "outputs": [], + "source": [ + "names_copy.clear()" + ] + }, + { + "cell_type": "code", + "execution_count": 89, + "metadata": { + "slideshow": { + "slide_type": "skip" + } + }, + "outputs": [ + { + "data": { + "text/plain": [ + "[]" + ] + }, + "execution_count": 89, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "names_copy" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "slideshow": { + "slide_type": "skip" + } + }, + "source": [ + "Many methods introduced in this section are mentioned in the [collections.abc ](https://docs.python.org/3/library/collections.abc.html) module's documentation as well: While the `.index()` and `.count()` methods come with any data type that is a `Sequence`, the `.append()`, `.extend()`, `.insert()`, `.reverse()`, `.pop()`, and `.remove()` methods are part of any `MutableSequence` type. The `.sort()`, `.copy()`, and `.clear()` methods are `list`-specific.\n", + "\n", + "So, being a sequence does not only imply the four *behaviors* specified above, but also means that a data type comes with certain standardized methods." + ] + }, + { + "cell_type": "markdown", + "metadata": { + "slideshow": { + "slide_type": "slide" + } + }, + "source": [ + "## List Operations" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "slideshow": { + "slide_type": "skip" + } + }, + "source": [ + "As with `str` objects, the `+` and `*` operators are overloaded for concatenation and always return a *new* `list` object. The references in this newly created `list` object reference the *same* objects as the two original `list` objects. So, the same caveat as with *shallow* copies from above applies!" + ] + }, + { + "cell_type": "code", + "execution_count": 90, + "metadata": { + "slideshow": { + "slide_type": "slide" + } + }, + "outputs": [ + { + "data": { + "text/plain": [ + "['Berthold', 'Oliver', 'Carl']" + ] + }, + "execution_count": 90, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "names" + ] + }, + { + "cell_type": "code", + "execution_count": 91, + "metadata": { + "slideshow": { + "slide_type": "fragment" + } + }, + "outputs": [ + { + "data": { + "text/plain": [ + "['Berthold', 'Oliver', 'Carl', 'Diedrich', 'Yves']" + ] + }, + "execution_count": 91, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "names + [\"Diedrich\", \"Yves\"]" + ] + }, + { + "cell_type": "code", + "execution_count": 92, + "metadata": { + "slideshow": { + "slide_type": "fragment" + } + }, + "outputs": [ + { + "data": { + "text/plain": [ + "['Berthold', 'Oliver', 'Carl', 'Berthold', 'Oliver', 'Carl']" + ] + }, + "execution_count": 92, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "2 * names" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "### Unpacking" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "slideshow": { + "slide_type": "skip" + } + }, + "source": [ + "Besides being an operator, the `*` symbol has a second syntactical use, as explained in [PEP 3132 ](https://www.python.org/dev/peps/pep-3132/) and [PEP 448 ](https://www.python.org/dev/peps/pep-0448/): It implements what is called **iterable unpacking**. It is *not* an operator syntactically but a notation that Python reads as a literal.\n", + "\n", + "In the example, Python interprets the expression as if the elements of the iterable `names` were placed between `\"Achim\"` and `\"Xavier\"` one by one. So, we do not obtain a nested but a *flat* list." + ] + }, + { + "cell_type": "code", + "execution_count": 93, + "metadata": { + "slideshow": { + "slide_type": "slide" + } + }, + "outputs": [ + { + "data": { + "text/plain": [ + "['Achim', 'Berthold', 'Oliver', 'Carl', 'Xavier']" + ] + }, + "execution_count": 93, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "[\"Achim\", *names, \"Xavier\"]" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "slideshow": { + "slide_type": "skip" + } + }, + "source": [ + "Effectively, Python reads that as if we wrote the following." + ] + }, + { + "cell_type": "code", + "execution_count": 94, + "metadata": { + "slideshow": { + "slide_type": "fragment" + } + }, + "outputs": [ + { + "data": { + "text/plain": [ + "['Achim', 'Berthold', 'Oliver', 'Carl', 'Xavier']" + ] + }, + "execution_count": 94, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "[\"Achim\", names[0], names[1], names[2], \"Xavier\"]" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "slideshow": { + "slide_type": "slide" + } + }, + "source": [ + "### List Comparison" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "slideshow": { + "slide_type": "skip" + } + }, + "source": [ + "The relational operators also work with `list` objects; yet another example of operator overloading.\n", + "\n", + "Comparison is made in a pairwise fashion until the first pair of elements does not evaluate equal or one of the `list` objects ends. The exact comparison rules depend on the elements and not the `list` objects. As with [.sort() ](https://docs.python.org/3/library/stdtypes.html#list.sort) or [sorted() ](https://docs.python.org/3/library/functions.html#sorted) above, comparison is *delegated* to the objects to be compared, and Python \"asks\" the elements in the two `list` objects to compare themselves. Usually, all elements are of the *same* type, and comparison is straightforward." + ] + }, + { + "cell_type": "code", + "execution_count": 95, + "metadata": { + "slideshow": { + "slide_type": "slide" + } + }, + "outputs": [ + { + "data": { + "text/plain": [ + "['Berthold', 'Oliver', 'Carl']" + ] + }, + "execution_count": 95, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "names" + ] + }, + { + "cell_type": "code", + "execution_count": 96, + "metadata": { + "slideshow": { + "slide_type": "fragment" + } + }, + "outputs": [ + { + "data": { + "text/plain": [ + "True" + ] + }, + "execution_count": 96, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "names == [\"Berthold\", \"Oliver\", \"Carl\"]" + ] + }, + { + "cell_type": "code", + "execution_count": 97, + "metadata": { + "slideshow": { + "slide_type": "fragment" + } + }, + "outputs": [ + { + "data": { + "text/plain": [ + "True" + ] + }, + "execution_count": 97, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "names != [\"Berthold\", \"Oliver\", \"Karl\"]" + ] + }, + { + "cell_type": "code", + "execution_count": 98, + "metadata": { + "slideshow": { + "slide_type": "fragment" + } + }, + "outputs": [ + { + "data": { + "text/plain": [ + "True" + ] + }, + "execution_count": 98, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "names < [\"Berthold\", \"Oliver\", \"Karl\"]" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "slideshow": { + "slide_type": "skip" + } + }, + "source": [ + "If two `list` objects have a different number of elements and all overlapping elements compare equal, the shorter `list` object is considered \"smaller.\"" + ] + }, + { + "cell_type": "code", + "execution_count": 99, + "metadata": { + "slideshow": { + "slide_type": "fragment" + } + }, + "outputs": [ + { + "data": { + "text/plain": [ + "True" + ] + }, + "execution_count": 99, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "[\"Berthold\", \"Oliver\"] < names < [\"Berthold\", \"Oliver\", \"Carl\", \"Xavier\"]" + ] + } + ], + "metadata": { + "kernelspec": { + "display_name": "Python 3", + "language": "python", + "name": "python3" + }, + "language_info": { + "codemirror_mode": { + "name": "ipython", + "version": 3 + }, + "file_extension": ".py", + "mimetype": "text/x-python", + "name": "python", + "nbconvert_exporter": "python", + "pygments_lexer": "ipython3", + "version": "3.8.6" + }, + "livereveal": { + "auto_select": "code", + "auto_select_fragment": true, + "scroll": true, + "theme": "serif" + }, + "toc": { + "base_numbering": 1, + "nav_menu": {}, + "number_sections": false, + "sideBar": true, + "skip_h1_title": true, + "title_cell": "Table of Contents", + "title_sidebar": "Contents", + "toc_cell": false, + "toc_position": { + "height": "calc(100% - 180px)", + "left": "10px", + "top": "150px", + "width": "384px" + }, + "toc_section_display": false, + "toc_window_display": false + } + }, + "nbformat": 4, + "nbformat_minor": 4 +} diff --git a/CONTENTS.md b/CONTENTS.md index fd55699..bc87a09 100644 --- a/CONTENTS.md +++ b/CONTENTS.md @@ -165,4 +165,11 @@ If this is not possible, - [content ](https://nbviewer.jupyter.org/github/webartifex/intro-to-python/blob/develop/07_sequences/00_content.ipynb) [](https://mybinder.org/v2/gh/webartifex/intro-to-python/develop?urlpath=lab/tree/07_sequences/00_content.ipynb) (Collections vs. Sequences; - ABCs: `Container`, `Iterable`, `Sized`, & `Reversible`) \ No newline at end of file + ABCs: `Container`, `Iterable`, `Sized`, & `Reversible`) + - [content ](https://nbviewer.jupyter.org/github/webartifex/intro-to-python/blob/develop/07_sequences/01_content.ipynb) + [](https://mybinder.org/v2/gh/webartifex/intro-to-python/develop?urlpath=lab/tree/07_sequences/01_content.ipynb) + (`list` Type; + Indexing & Slicing; + Shallow vs. Deep Copies; + List Methods & Operations; + Unpacking) \ No newline at end of file