{ "cells": [ { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "skip" } }, "source": [ "**Note**: Click on \"*Kernel*\" > \"*Restart Kernel and Clear All Outputs*\" in [JupyterLab](https://jupyterlab.readthedocs.io/en/stable/) *before* reading this notebook to reset its output. If you cannot run this file on your machine, you may want to open it [in the cloud ](https://mybinder.org/v2/gh/webartifex/intro-to-python/main?urlpath=lab/tree/07_sequences/01_content.ipynb)." ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "slide" } }, "source": [ "# Chapter 7: Sequential Data (continued)" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "skip" } }, "source": [ "In this second part of the chapter, we look closely at the built-in `list` type, which is probably the most commonly used sequence data type in practice." ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "slide" } }, "source": [ "## The `list` Type" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "skip" } }, "source": [ "As already seen multiple times, to create a `list` object, we use the *literal notation* and list all elements within brackets `[` and `]`." ] }, { "cell_type": "code", "execution_count": 1, "metadata": { "slideshow": { "slide_type": "slide" } }, "outputs": [], "source": [ "empty = []" ] }, { "cell_type": "code", "execution_count": 2, "metadata": { "slideshow": { "slide_type": "fragment" } }, "outputs": [], "source": [ "simple = [40, 50]" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "skip" } }, "source": [ "The elements do *not* need to be of the *same* type, and `list` objects may also be **nested**." ] }, { "cell_type": "code", "execution_count": 3, "metadata": { "slideshow": { "slide_type": "fragment" } }, "outputs": [], "source": [ "nested = [empty, 10, 20.0, \"Thirty\", simple]" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "skip" } }, "source": [ "[PythonTutor ](http://pythontutor.com/visualize.html#code=empty%20%3D%20%5B%5D%0Asimple%20%3D%20%5B40,%2050%5D%0Anested%20%3D%20%5Bempty,%2010,%2020.0,%20%22Thirty%22,%20simple%5D&cumulative=false&curstr=0&heapPrimitives=nevernest&mode=display&origin=opt-frontend.js&py=3&rawInputLstJSON=%5B%5D&textReferences=false) shows how `nested` holds references to the `empty` and `simple` objects. Technically, it holds three more references to the `10`, `20.0`, and `\"Thirty\"` objects as well. However, to simplify the visualization, these three objects are shown right inside the `nested` object. That may be done because they are immutable and \"flat\" data types. In general, the $0$s and $1$s inside a `list` object in memory always constitute references to other objects only." ] }, { "cell_type": "code", "execution_count": 4, "metadata": { "slideshow": { "slide_type": "fragment" } }, "outputs": [ { "data": { "text/plain": [ "[[], 10, 20.0, 'Thirty', [40, 50]]" ] }, "execution_count": 4, "metadata": {}, "output_type": "execute_result" } ], "source": [ "nested" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "skip" } }, "source": [ "Let's not forget that `nested` is an object on its own with an *identity* and *data type*." ] }, { "cell_type": "code", "execution_count": 5, "metadata": { "slideshow": { "slide_type": "skip" } }, "outputs": [ { "data": { "text/plain": [ "139688113982848" ] }, "execution_count": 5, "metadata": {}, "output_type": "execute_result" } ], "source": [ "id(nested)" ] }, { "cell_type": "code", "execution_count": 6, "metadata": { "slideshow": { "slide_type": "skip" } }, "outputs": [ { "data": { "text/plain": [ "list" ] }, "execution_count": 6, "metadata": {}, "output_type": "execute_result" } ], "source": [ "type(nested)" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "skip" } }, "source": [ "Alternatively, we use the built-in [list() ](https://docs.python.org/3/library/functions.html#func-list) constructor to create a `list` object out of any (finite) *iterable* we pass to it as the argument.\n", "\n", "For example, we can wrap the [range() ](https://docs.python.org/3/library/functions.html#func-range) built-in with [list() ](https://docs.python.org/3/library/functions.html#func-list): As described in [Chapter 4 ](https://nbviewer.jupyter.org/github/webartifex/intro-to-python/blob/main/04_iteration/02_content.ipynb#Containers-vs.-Iterables), `range` objects, like `range(1, 13)` below, are iterable and generate `int` objects \"on the fly\" (i.e., one by one). The [list() ](https://docs.python.org/3/library/functions.html#func-list) around it acts like a `for`-loop and **materializes** twelve `int` objects in memory that then become the elements of the newly created `list` object. [PythonTutor ](http://pythontutor.com/visualize.html#code=r%20%3D%20range%281,%2013%29%0Al%20%3D%20list%28range%281,%2013%29%29&cumulative=false&curstr=0&heapPrimitives=nevernest&mode=display&origin=opt-frontend.js&py=3&rawInputLstJSON=%5B%5D&textReferences=false) shows this difference visually." ] }, { "cell_type": "code", "execution_count": 7, "metadata": { "slideshow": { "slide_type": "slide" } }, "outputs": [ { "data": { "text/plain": [ "[1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12]" ] }, "execution_count": 7, "metadata": {}, "output_type": "execute_result" } ], "source": [ "list(range(1, 13))" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "skip" } }, "source": [ "Beware of passing a `range` object over a \"big\" horizon as the argument to [list() ](https://docs.python.org/3/library/functions.html#func-list) as that may lead to a `MemoryError` and the computer crashing." ] }, { "cell_type": "code", "execution_count": 8, "metadata": { "slideshow": { "slide_type": "fragment" } }, "outputs": [ { "ename": "MemoryError", "evalue": "", "output_type": "error", "traceback": [ "\u001b[0;31m---------------------------------------------------------------------------\u001b[0m", "\u001b[0;31mMemoryError\u001b[0m Traceback (most recent call last)", "Cell \u001b[0;32mIn[8], line 1\u001b[0m\n\u001b[0;32m----> 1\u001b[0m \u001b[38;5;28;43mlist\u001b[39;49m\u001b[43m(\u001b[49m\u001b[38;5;28;43mrange\u001b[39;49m\u001b[43m(\u001b[49m\u001b[38;5;241;43m999_999_999_999\u001b[39;49m\u001b[43m)\u001b[49m\u001b[43m)\u001b[49m\n", "\u001b[0;31mMemoryError\u001b[0m: " ] } ], "source": [ "list(range(999_999_999_999))" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "skip" } }, "source": [ "As another example, we create a `list` object from a `str` object, which is iterable, as well. Then, the individual characters become the elements of the new `list` object!" ] }, { "cell_type": "code", "execution_count": 9, "metadata": { "slideshow": { "slide_type": "fragment" } }, "outputs": [ { "data": { "text/plain": [ "['i', 't', 'e', 'r', 'a', 'b', 'l', 'e']" ] }, "execution_count": 9, "metadata": {}, "output_type": "execute_result" } ], "source": [ "list(\"iterable\")" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "slide" } }, "source": [ "## Sequence Behaviors" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "skip" } }, "source": [ "`list` objects are *sequences*. To reiterate that, we briefly summarize the *four* behaviors of a sequence and provide some more `list`-specific details below:" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "slide" } }, "source": [ "- `Container`:\n", " - holds references to other objects in memory (with their own *identity* and *type*)\n", " - implements membership testing via the `in` operator\n", "- `Iterable`:\n", " - supports being looped over\n", " - works with the `for` or `while` statements\n", "- `Reversible`:\n", " - the elements come in a *predictable* order that we may loop over in a forward or backward fashion\n", " - works with the [reversed() ](https://docs.python.org/3/library/functions.html#reversed) built-in\n", "- `Sized`:\n", " - the number of elements is finite *and* known in advance\n", " - works with the built-in [len() ](https://docs.python.org/3/library/functions.html#len) function" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "skip" } }, "source": [ "The \"length\" of `nested` is *five* because `empty` and `simple` count as *one* element each. In other words, `nested` holds five references to other objects, two of which are `list` objects." ] }, { "cell_type": "code", "execution_count": 10, "metadata": { "slideshow": { "slide_type": "slide" } }, "outputs": [ { "data": { "text/plain": [ "5" ] }, "execution_count": 10, "metadata": {}, "output_type": "execute_result" } ], "source": [ "len(nested)" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "skip" } }, "source": [ "With a `for`-loop, we can iterate over all elements in a *predictable* order, forward or backward. As `list` objects hold *references* to other *objects*, these have an *indentity* and may even be of *different* types; however, the latter observation is rarely, if ever, useful in practice." ] }, { "cell_type": "code", "execution_count": 11, "metadata": { "slideshow": { "slide_type": "slide" } }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "[] 139688113991744 \n", "10 94291567220768 \n", "20.0 139688114016656 \n", "Thirty 139688114205808 \n", "[40, 50] 139688113982208 \n" ] } ], "source": [ "for element in nested:\n", " print(str(element).ljust(10), str(id(element)).ljust(18), type(element))" ] }, { "cell_type": "code", "execution_count": 12, "metadata": { "slideshow": { "slide_type": "slide" } }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "[40, 50] Thirty 20.0 10 [] " ] } ], "source": [ "for element in reversed(nested):\n", " print(element, end=\" \")" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "skip" } }, "source": [ "The `in` operator checks if a given object is \"contained\" in a `list` object. It uses the `==` operator behind the scenes (i.e., *not* the `is` operator) conducting a **[linear search ](https://en.wikipedia.org/wiki/Linear_search)**: So, Python implicitly loops over *all* elements and only stops prematurely if an element evaluates equal to the searched object. A linear search may, therefore, be relatively *slow* for big `list` objects." ] }, { "cell_type": "code", "execution_count": 13, "metadata": { "slideshow": { "slide_type": "slide" } }, "outputs": [ { "data": { "text/plain": [ "True" ] }, "execution_count": 13, "metadata": {}, "output_type": "execute_result" } ], "source": [ "10 in nested" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "skip" } }, "source": [ "`20` compares equal to the `20.0` in `nested`." ] }, { "cell_type": "code", "execution_count": 14, "metadata": { "slideshow": { "slide_type": "fragment" } }, "outputs": [ { "data": { "text/plain": [ "True" ] }, "execution_count": 14, "metadata": {}, "output_type": "execute_result" } ], "source": [ "20 in nested" ] }, { "cell_type": "code", "execution_count": 15, "metadata": { "slideshow": { "slide_type": "fragment" } }, "outputs": [ { "data": { "text/plain": [ "False" ] }, "execution_count": 15, "metadata": {}, "output_type": "execute_result" } ], "source": [ "30 in nested" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "slide" } }, "source": [ "## Indexing" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "skip" } }, "source": [ "Because of the *predictable* order and the *finiteness*, each element in a sequence can be labeled with a unique *index*, an `int` object in the range $0 \\leq \\text{index} < \\lvert \\text{sequence} \\rvert$.\n", "\n", "Brackets, `[` and `]`, are the literal syntax for accessing individual elements of any sequence type. In this book, we also call them the *indexing operator* in this context." ] }, { "cell_type": "code", "execution_count": 16, "metadata": { "slideshow": { "slide_type": "slide" } }, "outputs": [ { "data": { "text/plain": [ "[]" ] }, "execution_count": 16, "metadata": {}, "output_type": "execute_result" } ], "source": [ "nested[0]" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "skip" } }, "source": [ "The last index is one less than `len(nested)`, and Python raises an `IndexError` if we look up an index that is not in the range." ] }, { "cell_type": "code", "execution_count": 17, "metadata": { "slideshow": { "slide_type": "skip" } }, "outputs": [ { "ename": "IndexError", "evalue": "list index out of range", "output_type": "error", "traceback": [ "\u001b[0;31m---------------------------------------------------------------------------\u001b[0m", "\u001b[0;31mIndexError\u001b[0m Traceback (most recent call last)", "Cell \u001b[0;32mIn[17], line 1\u001b[0m\n\u001b[0;32m----> 1\u001b[0m \u001b[43mnested\u001b[49m\u001b[43m[\u001b[49m\u001b[38;5;241;43m5\u001b[39;49m\u001b[43m]\u001b[49m\n", "\u001b[0;31mIndexError\u001b[0m: list index out of range" ] } ], "source": [ "nested[5]" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "skip" } }, "source": [ "Negative indices are used to count in reverse order from the end of a sequence, and brackets may be chained to access nested objects. So, to access the `50` inside `simple` via the `nested` object, we write `nested[-1][1]`." ] }, { "cell_type": "code", "execution_count": 18, "metadata": { "slideshow": { "slide_type": "fragment" } }, "outputs": [ { "data": { "text/plain": [ "[40, 50]" ] }, "execution_count": 18, "metadata": {}, "output_type": "execute_result" } ], "source": [ "nested[-1]" ] }, { "cell_type": "code", "execution_count": 19, "metadata": { "slideshow": { "slide_type": "fragment" } }, "outputs": [ { "data": { "text/plain": [ "50" ] }, "execution_count": 19, "metadata": {}, "output_type": "execute_result" } ], "source": [ "nested[-1][1]" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "slide" } }, "source": [ "## Slicing" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "skip" } }, "source": [ "Slicing `list` objects works analogously to slicing `str` objects: We use the literal syntax with either one or two colons `:` inside the brackets `[]` to separate the *start*, *stop*, and *step* values. Slicing creates a *new* `list` object with the elements chosen from the original one.\n", "\n", "For example, to obtain the three elements in the \"middle\" of `nested`, we slice from `1` (including) to `4` (excluding)." ] }, { "cell_type": "code", "execution_count": 20, "metadata": { "slideshow": { "slide_type": "slide" } }, "outputs": [ { "data": { "text/plain": [ "[10, 20.0, 'Thirty']" ] }, "execution_count": 20, "metadata": {}, "output_type": "execute_result" } ], "source": [ "nested[1:4]" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "skip" } }, "source": [ "To obtain \"every other\" element, we slice from beginning to end, defaulting to `0` and `len(nested)` when omitted, in steps of `2`." ] }, { "cell_type": "code", "execution_count": 21, "metadata": { "slideshow": { "slide_type": "fragment" } }, "outputs": [ { "data": { "text/plain": [ "[[], 20.0, [40, 50]]" ] }, "execution_count": 21, "metadata": {}, "output_type": "execute_result" } ], "source": [ "nested[::2]" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "skip" } }, "source": [ "The literal notation with the colons `:` is *syntactic sugar*. It saves us from using the [slice() ](https://docs.python.org/3/library/functions.html#slice) built-in to create `slice` objects. [slice() ](https://docs.python.org/3/library/functions.html#slice) takes *start*, *stop*, and *step* arguments in the same way as the familiar [range() ](https://docs.python.org/3/library/functions.html#func-range), and the `slice` objects it creates are used just as *indexes* above.\n", "\n", "In most cases, the literal notation is more convenient to use; however, with `slice` objects, we can give names to slices and reuse them across several sequences." ] }, { "cell_type": "code", "execution_count": 22, "metadata": { "slideshow": { "slide_type": "skip" } }, "outputs": [], "source": [ "middle = slice(1, 4)" ] }, { "cell_type": "code", "execution_count": 23, "metadata": { "slideshow": { "slide_type": "skip" } }, "outputs": [ { "data": { "text/plain": [ "slice" ] }, "execution_count": 23, "metadata": {}, "output_type": "execute_result" } ], "source": [ "type(middle)" ] }, { "cell_type": "code", "execution_count": 24, "metadata": { "slideshow": { "slide_type": "skip" } }, "outputs": [ { "data": { "text/plain": [ "[10, 20.0, 'Thirty']" ] }, "execution_count": 24, "metadata": {}, "output_type": "execute_result" } ], "source": [ "nested[middle]" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "skip" } }, "source": [ "`slice` objects come with three read-only attributes `start`, `stop`, and `step` on them." ] }, { "cell_type": "code", "execution_count": 25, "metadata": { "slideshow": { "slide_type": "skip" } }, "outputs": [ { "data": { "text/plain": [ "1" ] }, "execution_count": 25, "metadata": {}, "output_type": "execute_result" } ], "source": [ "middle.start" ] }, { "cell_type": "code", "execution_count": 26, "metadata": { "slideshow": { "slide_type": "skip" } }, "outputs": [ { "data": { "text/plain": [ "4" ] }, "execution_count": 26, "metadata": {}, "output_type": "execute_result" } ], "source": [ "middle.stop" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "skip" } }, "source": [ "If not passed to [slice() ](https://docs.python.org/3/library/functions.html#slice), these attributes default to `None`. That is why the cell below has no output." ] }, { "cell_type": "code", "execution_count": 27, "metadata": { "slideshow": { "slide_type": "skip" } }, "outputs": [], "source": [ "middle.step" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "skip" } }, "source": [ "A good trick to know is taking a \"full\" slice: This copies *all* elements of a `list` object into a *new* `list` object." ] }, { "cell_type": "code", "execution_count": 28, "metadata": { "slideshow": { "slide_type": "slide" } }, "outputs": [], "source": [ "nested_copy = nested[:]" ] }, { "cell_type": "code", "execution_count": 29, "metadata": { "slideshow": { "slide_type": "fragment" } }, "outputs": [ { "data": { "text/plain": [ "[[], 10, 20.0, 'Thirty', [40, 50]]" ] }, "execution_count": 29, "metadata": {}, "output_type": "execute_result" } ], "source": [ "nested_copy" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "skip" } }, "source": [ "At first glance, `nested` and `nested_copy` seem to cause no pain. For `list` objects, the comparison operator `==` goes over the elements in both operands in a pairwise fashion and checks if they all evaluate equal (cf., the \"*List Comparison*\" section below for more details).\n", "\n", "We confirm that `nested` and `nested_copy` compare equal as could be expected but also that they are *different* objects." ] }, { "cell_type": "code", "execution_count": 30, "metadata": { "slideshow": { "slide_type": "fragment" } }, "outputs": [ { "data": { "text/plain": [ "True" ] }, "execution_count": 30, "metadata": {}, "output_type": "execute_result" } ], "source": [ "nested == nested_copy" ] }, { "cell_type": "code", "execution_count": 31, "metadata": { "slideshow": { "slide_type": "fragment" } }, "outputs": [ { "data": { "text/plain": [ "False" ] }, "execution_count": 31, "metadata": {}, "output_type": "execute_result" } ], "source": [ "nested is nested_copy" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "skip" } }, "source": [ "However, as [PythonTutor ](http://pythontutor.com/visualize.html#code=nested%20%3D%20%5B%5B%5D,%2010,%2020.0,%20%22Thirty%22,%20%5B40,%2050%5D%5D%0Anested_copy%20%3D%20nested%5B%3A%5D&cumulative=false&curstr=0&heapPrimitives=nevernest&mode=display&origin=opt-frontend.js&py=3&rawInputLstJSON=%5B%5D&textReferences=false) reveals, only the *references* to the elements are copied, and not the objects in `nested` themselves! Because of that, `nested_copy` is a so-called **[shallow copy ](https://en.wikipedia.org/wiki/Object_copying#Shallow_copy)** of `nested`.\n", "\n", "We could also see this with the [id() ](https://docs.python.org/3/library/functions.html#id) function: The respective first elements in both `nested` and `nested_copy` are the *same* object, namely `empty`. So, we have three ways of accessing the *same* address in memory. Also, we say that `nested` and `nested_copy` partially share the *same* state." ] }, { "cell_type": "code", "execution_count": 32, "metadata": { "slideshow": { "slide_type": "skip" } }, "outputs": [ { "data": { "text/plain": [ "True" ] }, "execution_count": 32, "metadata": {}, "output_type": "execute_result" } ], "source": [ "nested[0] is nested_copy[0]" ] }, { "cell_type": "code", "execution_count": 33, "metadata": { "slideshow": { "slide_type": "skip" } }, "outputs": [ { "data": { "text/plain": [ "139688113991744" ] }, "execution_count": 33, "metadata": {}, "output_type": "execute_result" } ], "source": [ "id(nested[0])" ] }, { "cell_type": "code", "execution_count": 34, "metadata": { "slideshow": { "slide_type": "skip" } }, "outputs": [ { "data": { "text/plain": [ "139688113991744" ] }, "execution_count": 34, "metadata": {}, "output_type": "execute_result" } ], "source": [ "id(nested_copy[0])" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "skip" } }, "source": [ "Knowing this becomes critical if the elements in a `list` object are mutable objects (i.e., we can change them *in place*), and this is the case with `nested` and `nested_copy`, as we see in the next section on \"*Mutability*\".\n", "\n", "As both the original `nested` object and its copy reference the *same* `list` objects in memory, any changes made to them are visible to both! Because of that, working with shallow copies can easily become confusing.\n", "\n", "Instead of a shallow copy, we could also create a so-called **[deep copy ](https://en.wikipedia.org/wiki/Object_copying#Deep_copy)** of `nested`: Then, the copying process recursively follows every reference in a nested data structure and creates copies of *every* object found.\n", "\n", "To explicitly create shallow or deep copies, the [copy ](https://docs.python.org/3/library/copy.html) module in the [standard library ](https://docs.python.org/3/library/index.html) provides two functions, [copy() ](https://docs.python.org/3/library/copy.html#copy.copy) and [deepcopy() ](https://docs.python.org/3/library/copy.html#copy.deepcopy). We must always remember that slicing creates *shallow* copies only." ] }, { "cell_type": "code", "execution_count": 35, "metadata": { "slideshow": { "slide_type": "skip" } }, "outputs": [], "source": [ "import copy" ] }, { "cell_type": "code", "execution_count": 36, "metadata": { "slideshow": { "slide_type": "skip" } }, "outputs": [], "source": [ "nested_deep_copy = copy.deepcopy(nested)" ] }, { "cell_type": "code", "execution_count": 37, "metadata": { "slideshow": { "slide_type": "skip" } }, "outputs": [ { "data": { "text/plain": [ "True" ] }, "execution_count": 37, "metadata": {}, "output_type": "execute_result" } ], "source": [ "nested == nested_deep_copy" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "skip" } }, "source": [ "Now, the first elements of `nested` and `nested_deep_copy` are *different* objects, and [PythonTutor ](http://pythontutor.com/visualize.html#code=import%20copy%0Anested%20%3D%20%5B%5B%5D,%2010,%2020.0,%20%22Thirty%22,%20%5B40,%2050%5D%5D%0Anested_deep_copy%20%3D%20copy.deepcopy%28nested%29&cumulative=false&curstr=0&heapPrimitives=nevernest&mode=display&origin=opt-frontend.js&py=3&rawInputLstJSON=%5B%5D&textReferences=false) shows that there are *six* `list` objects in memory." ] }, { "cell_type": "code", "execution_count": 38, "metadata": { "slideshow": { "slide_type": "skip" } }, "outputs": [ { "data": { "text/plain": [ "False" ] }, "execution_count": 38, "metadata": {}, "output_type": "execute_result" } ], "source": [ "nested[0] is nested_deep_copy[0]" ] }, { "cell_type": "code", "execution_count": 39, "metadata": { "slideshow": { "slide_type": "skip" } }, "outputs": [ { "data": { "text/plain": [ "139688113991744" ] }, "execution_count": 39, "metadata": {}, "output_type": "execute_result" } ], "source": [ "id(nested[0])" ] }, { "cell_type": "code", "execution_count": 40, "metadata": { "slideshow": { "slide_type": "skip" } }, "outputs": [ { "data": { "text/plain": [ "139688113233152" ] }, "execution_count": 40, "metadata": {}, "output_type": "execute_result" } ], "source": [ "id(nested_deep_copy[0])" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "skip" } }, "source": [ "As this [StackOverflow question ](https://stackoverflow.com/questions/184710/what-is-the-difference-between-a-deep-copy-and-a-shallow-copy) shows, understanding shallow and deep copies is a common source of confusion, independent of the programming language." ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "slide" } }, "source": [ "## Mutability" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "skip" } }, "source": [ "In contrast to `str` objects, `list` objects are *mutable*: We may assign new elements to indices or slices and also remove elements. That changes the *references* in a `list` object. In general, if an object is *mutable*, we say that it may be changed *in place*." ] }, { "cell_type": "code", "execution_count": 41, "metadata": { "slideshow": { "slide_type": "slide" } }, "outputs": [], "source": [ "nested[0] = 0" ] }, { "cell_type": "code", "execution_count": 42, "metadata": { "slideshow": { "slide_type": "fragment" } }, "outputs": [ { "data": { "text/plain": [ "[0, 10, 20.0, 'Thirty', [40, 50]]" ] }, "execution_count": 42, "metadata": {}, "output_type": "execute_result" } ], "source": [ "nested" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "skip" } }, "source": [ "When we re-assign a slice, we can even change the size of the `list` object." ] }, { "cell_type": "code", "execution_count": 43, "metadata": { "slideshow": { "slide_type": "fragment" } }, "outputs": [], "source": [ "nested[:4] = [100, 100, 100]" ] }, { "cell_type": "code", "execution_count": 44, "metadata": { "slideshow": { "slide_type": "fragment" } }, "outputs": [ { "data": { "text/plain": [ "[100, 100, 100, [40, 50]]" ] }, "execution_count": 44, "metadata": {}, "output_type": "execute_result" } ], "source": [ "nested" ] }, { "cell_type": "code", "execution_count": 45, "metadata": { "slideshow": { "slide_type": "fragment" } }, "outputs": [ { "data": { "text/plain": [ "4" ] }, "execution_count": 45, "metadata": {}, "output_type": "execute_result" } ], "source": [ "len(nested)" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "skip" } }, "source": [ "The `list` object's identity does *not* change. That is the main point behind mutable objects." ] }, { "cell_type": "code", "execution_count": 46, "metadata": { "slideshow": { "slide_type": "skip" } }, "outputs": [ { "data": { "text/plain": [ "139688113982848" ] }, "execution_count": 46, "metadata": {}, "output_type": "execute_result" } ], "source": [ "id(nested) # same memory location as before" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "skip" } }, "source": [ "`nested_copy` is unchanged!" ] }, { "cell_type": "code", "execution_count": 47, "metadata": { "slideshow": { "slide_type": "slide" } }, "outputs": [ { "data": { "text/plain": [ "[[], 10, 20.0, 'Thirty', [40, 50]]" ] }, "execution_count": 47, "metadata": {}, "output_type": "execute_result" } ], "source": [ "nested_copy" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "skip" } }, "source": [ "Let's change the nested `[40, 50]` via `nested_copy` into `[1, 2, 3]` by replacing all its elements." ] }, { "cell_type": "code", "execution_count": 48, "metadata": { "slideshow": { "slide_type": "fragment" } }, "outputs": [], "source": [ "nested_copy[-1][:] = [1, 2, 3]" ] }, { "cell_type": "code", "execution_count": 49, "metadata": { "slideshow": { "slide_type": "fragment" } }, "outputs": [ { "data": { "text/plain": [ "[[], 10, 20.0, 'Thirty', [1, 2, 3]]" ] }, "execution_count": 49, "metadata": {}, "output_type": "execute_result" } ], "source": [ "nested_copy" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "skip" } }, "source": [ "That has a surprising side effect on `nested`." ] }, { "cell_type": "code", "execution_count": 50, "metadata": { "slideshow": { "slide_type": "fragment" } }, "outputs": [ { "data": { "text/plain": [ "[100, 100, 100, [1, 2, 3]]" ] }, "execution_count": 50, "metadata": {}, "output_type": "execute_result" } ], "source": [ "nested" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "skip" } }, "source": [ "That is precisely the confusion we talked about above when we said that `nested_copy` is a *shallow* copy of `nested`. [PythonTutor ](http://pythontutor.com/visualize.html#code=nested%20%3D%20%5B%5B%5D,%2010,%2020.0,%20%22Thirty%22,%20%5B40,%2050%5D%5D%0Anested_copy%20%3D%20nested%5B%3A%5D%0Anested%5B%3A4%5D%20%3D%20%5B100,%20100,%20100%5D%0Anested_copy%5B-1%5D%5B%3A%5D%20%3D%20%5B1,%202,%203%5D&cumulative=false&curstr=0&heapPrimitives=nevernest&mode=display&origin=opt-frontend.js&py=3&rawInputLstJSON=%5B%5D&textReferences=false) shows how both reference the *same* nested `list` object that is changed *in place* from `[40, 50]` into `[1, 2, 3]`.\n", "\n", "Lastly, we use the `del` statement to remove an element." ] }, { "cell_type": "code", "execution_count": 51, "metadata": { "slideshow": { "slide_type": "slide" } }, "outputs": [], "source": [ "del nested[-1]" ] }, { "cell_type": "code", "execution_count": 52, "metadata": { "slideshow": { "slide_type": "fragment" } }, "outputs": [ { "data": { "text/plain": [ "[100, 100, 100]" ] }, "execution_count": 52, "metadata": {}, "output_type": "execute_result" } ], "source": [ "nested" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "skip" } }, "source": [ "The `del` statement also works for slices. Here, we remove all references `nested` holds." ] }, { "cell_type": "code", "execution_count": 53, "metadata": { "slideshow": { "slide_type": "fragment" } }, "outputs": [], "source": [ "del nested[:]" ] }, { "cell_type": "code", "execution_count": 54, "metadata": { "slideshow": { "slide_type": "fragment" } }, "outputs": [ { "data": { "text/plain": [ "[]" ] }, "execution_count": 54, "metadata": {}, "output_type": "execute_result" } ], "source": [ "nested" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "skip" } }, "source": [ "Mutability for sequences is formalized by the `MutableSequence` ABC in the [collections.abc ](https://docs.python.org/3/library/collections.abc.html) module." ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "slide" } }, "source": [ "## List Methods" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "skip" } }, "source": [ "The `list` type is an essential data structure in any real-world Python application, and many typical `list`-related algorithms from computer science theory are already built into it at the C level (cf., the [documentation ](https://docs.python.org/3/library/stdtypes.html#mutable-sequence-types) or the [tutorial ](https://docs.python.org/3/tutorial/datastructures.html#more-on-lists) for a full overview; unfortunately, not all methods have direct links). So, understanding and applying the built-in methods of the `list` type not only speeds up the development process but also makes programs significantly faster.\n", "\n", "In contrast to the `str` type's methods in [Chapter 6 ](https://nbviewer.jupyter.org/github/webartifex/intro-to-python/blob/main/06_text/00_content.ipynb#String-Methods) (e.g., [.upper() ](https://docs.python.org/3/library/stdtypes.html#str.upper) or [.lower() ](https://docs.python.org/3/library/stdtypes.html#str.lower)), the `list` type's methods that mutate an object do so *in place*. That means they *never* create *new* `list` objects and return `None` to indicate that. So, we must *never* assign the return value of `list` methods to the variable holding the list!\n", "\n", "Let's look at the following `names` example." ] }, { "cell_type": "code", "execution_count": 55, "metadata": { "slideshow": { "slide_type": "slide" } }, "outputs": [], "source": [ "names = [\"Carl\", \"Peter\"]" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "skip" } }, "source": [ "To add an object to the end of `names`, we use the `.append()` method. The code cell shows no output indicating that `None` must be the return value." ] }, { "cell_type": "code", "execution_count": 56, "metadata": { "slideshow": { "slide_type": "fragment" } }, "outputs": [], "source": [ "names.append(\"Eckardt\")" ] }, { "cell_type": "code", "execution_count": 57, "metadata": { "slideshow": { "slide_type": "fragment" } }, "outputs": [ { "data": { "text/plain": [ "['Carl', 'Peter', 'Eckardt']" ] }, "execution_count": 57, "metadata": {}, "output_type": "execute_result" } ], "source": [ "names" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "skip" } }, "source": [ "With the `.extend()` method, we may also append multiple elements provided by an iterable. Here, the iterable is a `list` object itself holding two `str` objects." ] }, { "cell_type": "code", "execution_count": 58, "metadata": { "slideshow": { "slide_type": "slide" } }, "outputs": [], "source": [ "names.extend([\"Karl\", \"Oliver\"])" ] }, { "cell_type": "code", "execution_count": 59, "metadata": { "slideshow": { "slide_type": "fragment" } }, "outputs": [ { "data": { "text/plain": [ "['Carl', 'Peter', 'Eckardt', 'Karl', 'Oliver']" ] }, "execution_count": 59, "metadata": {}, "output_type": "execute_result" } ], "source": [ "names" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "skip" } }, "source": [ "Similar to `.append()`, we may add a new element at an arbitrary position with the `.insert()` method. `.insert()` takes two arguments, an *index* and the element to be inserted." ] }, { "cell_type": "code", "execution_count": 60, "metadata": { "slideshow": { "slide_type": "slide" } }, "outputs": [], "source": [ "names.insert(1, \"Berthold\")" ] }, { "cell_type": "code", "execution_count": 61, "metadata": { "slideshow": { "slide_type": "fragment" } }, "outputs": [ { "data": { "text/plain": [ "['Carl', 'Berthold', 'Peter', 'Eckardt', 'Karl', 'Oliver']" ] }, "execution_count": 61, "metadata": {}, "output_type": "execute_result" } ], "source": [ "names" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "skip" } }, "source": [ "`list` objects may be sorted *in place* with the [.sort() ](https://docs.python.org/3/library/stdtypes.html#list.sort) method. That is different from the built-in [sorted() ](https://docs.python.org/3/library/functions.html#sorted) function that takes any *finite* and *iterable* object and returns a *new* `list` object with the iterable's elements sorted!" ] }, { "cell_type": "code", "execution_count": 62, "metadata": { "slideshow": { "slide_type": "slide" } }, "outputs": [ { "data": { "text/plain": [ "['Berthold', 'Carl', 'Eckardt', 'Karl', 'Oliver', 'Peter']" ] }, "execution_count": 62, "metadata": {}, "output_type": "execute_result" } ], "source": [ "sorted(names)" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "skip" } }, "source": [ "As the previous code cell created a *new* `list` object, `names` is still unsorted." ] }, { "cell_type": "code", "execution_count": 63, "metadata": { "slideshow": { "slide_type": "fragment" } }, "outputs": [ { "data": { "text/plain": [ "['Carl', 'Berthold', 'Peter', 'Eckardt', 'Karl', 'Oliver']" ] }, "execution_count": 63, "metadata": {}, "output_type": "execute_result" } ], "source": [ "names" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "skip" } }, "source": [ "Let's sort the elements in `names` instead." ] }, { "cell_type": "code", "execution_count": 64, "metadata": { "slideshow": { "slide_type": "fragment" } }, "outputs": [], "source": [ "names.sort()" ] }, { "cell_type": "code", "execution_count": 65, "metadata": { "slideshow": { "slide_type": "fragment" } }, "outputs": [ { "data": { "text/plain": [ "['Berthold', 'Carl', 'Eckardt', 'Karl', 'Oliver', 'Peter']" ] }, "execution_count": 65, "metadata": {}, "output_type": "execute_result" } ], "source": [ "names" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "skip" } }, "source": [ "To sort in reverse order, we pass a keyword-only `reverse=True` argument to either the [.sort() ](https://docs.python.org/3/library/stdtypes.html#list.sort) method or the [sorted() ](https://docs.python.org/3/library/functions.html#sorted) function." ] }, { "cell_type": "code", "execution_count": 66, "metadata": { "slideshow": { "slide_type": "slide" } }, "outputs": [], "source": [ "names.sort(reverse=True)" ] }, { "cell_type": "code", "execution_count": 67, "metadata": { "slideshow": { "slide_type": "fragment" } }, "outputs": [ { "data": { "text/plain": [ "['Peter', 'Oliver', 'Karl', 'Eckardt', 'Carl', 'Berthold']" ] }, "execution_count": 67, "metadata": {}, "output_type": "execute_result" } ], "source": [ "names" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "skip" } }, "source": [ "The [.sort() ](https://docs.python.org/3/library/stdtypes.html#list.sort) method and the [sorted() ](https://docs.python.org/3/library/functions.html#sorted) function sort the elements in `names` in alphabetical order, forward or backward. However, that does *not* hold in general.\n", "\n", "We mention above that `list` objects may contain objects of *any* type and even of *mixed* types. Because of that, the sorting is **[delegated ](https://en.wikipedia.org/wiki/Delegation_(object-oriented_programming))** to the elements in a `list` object. In a way, Python \"asks\" the elements in a `list` object to sort themselves. As `names` contains only `str` objects, they are sorted according the the comparison rules explained in [Chapter 6 ](https://nbviewer.jupyter.org/github/webartifex/intro-to-python/blob/main/06_text/00_content.ipynb#String-Comparison).\n", "\n", "To customize the sorting, we pass a keyword-only `key` argument to [.sort() ](https://docs.python.org/3/library/stdtypes.html#list.sort) or [sorted() ](https://docs.python.org/3/library/functions.html#sorted), which must be a `function` object accepting *one* positional argument. Then, the elements in the `list` object are passed to that one by one, and the return values are used as the **sort keys**. The `key` argument is also a popular use case for `lambda` expressions.\n", "\n", "For example, to sort `names` not by alphabet but by the names' lengths, we pass in a reference to the built-in [len() ](https://docs.python.org/3/library/functions.html#len) function as `key=len`. Note that there are *no* parentheses after `len`!" ] }, { "cell_type": "code", "execution_count": 68, "metadata": { "slideshow": { "slide_type": "fragment" } }, "outputs": [], "source": [ "names.sort(key=len)" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "skip" } }, "source": [ "If two names have the same length, their relative order is kept as is. That is why `\"Karl\"` comes before `\"Carl\" ` below. A [sorting algorithm ](https://en.wikipedia.org/wiki/Sorting_algorithm) with that property is called **[stable ](https://en.wikipedia.org/wiki/Sorting_algorithm#Stability)**.\n", "\n", "Sorting is an important topic in programming, and we refer to the official [HOWTO ](https://docs.python.org/3/howto/sorting.html) for a more comprehensive introduction." ] }, { "cell_type": "code", "execution_count": 69, "metadata": { "slideshow": { "slide_type": "fragment" } }, "outputs": [ { "data": { "text/plain": [ "['Karl', 'Carl', 'Peter', 'Oliver', 'Eckardt', 'Berthold']" ] }, "execution_count": 69, "metadata": {}, "output_type": "execute_result" } ], "source": [ "names" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "skip" } }, "source": [ "`.sort(reverse=True)` is different from the `.reverse()` method. Whereas the former applies some sorting rule in reverse order, the latter simply reverses the elements in a `list` object." ] }, { "cell_type": "code", "execution_count": 70, "metadata": { "slideshow": { "slide_type": "slide" } }, "outputs": [], "source": [ "names.reverse()" ] }, { "cell_type": "code", "execution_count": 71, "metadata": { "slideshow": { "slide_type": "fragment" } }, "outputs": [ { "data": { "text/plain": [ "['Berthold', 'Eckardt', 'Oliver', 'Peter', 'Carl', 'Karl']" ] }, "execution_count": 71, "metadata": {}, "output_type": "execute_result" } ], "source": [ "names" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "skip" } }, "source": [ "The `.pop()` method removes the *last* element from a `list` object *and* returns it. Below we **capture** the `removed` element to show that the return value is not `None` as with all the methods introduced so far." ] }, { "cell_type": "code", "execution_count": 72, "metadata": { "slideshow": { "slide_type": "slide" } }, "outputs": [], "source": [ "removed = names.pop()" ] }, { "cell_type": "code", "execution_count": 73, "metadata": { "slideshow": { "slide_type": "fragment" } }, "outputs": [ { "data": { "text/plain": [ "'Karl'" ] }, "execution_count": 73, "metadata": {}, "output_type": "execute_result" } ], "source": [ "removed" ] }, { "cell_type": "code", "execution_count": 74, "metadata": { "slideshow": { "slide_type": "fragment" } }, "outputs": [ { "data": { "text/plain": [ "['Berthold', 'Eckardt', 'Oliver', 'Peter', 'Carl']" ] }, "execution_count": 74, "metadata": {}, "output_type": "execute_result" } ], "source": [ "names" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "skip" } }, "source": [ "`.pop()` takes an optional *index* argument and removes that instead.\n", "\n", "So, to remove the second element, `\"Eckhardt\"`, from `names`, we write this." ] }, { "cell_type": "code", "execution_count": 75, "metadata": { "slideshow": { "slide_type": "fragment" } }, "outputs": [], "source": [ "removed = names.pop(1)" ] }, { "cell_type": "code", "execution_count": 76, "metadata": { "slideshow": { "slide_type": "fragment" } }, "outputs": [ { "data": { "text/plain": [ "'Eckardt'" ] }, "execution_count": 76, "metadata": {}, "output_type": "execute_result" } ], "source": [ "removed" ] }, { "cell_type": "code", "execution_count": 77, "metadata": { "slideshow": { "slide_type": "fragment" } }, "outputs": [ { "data": { "text/plain": [ "['Berthold', 'Oliver', 'Peter', 'Carl']" ] }, "execution_count": 77, "metadata": {}, "output_type": "execute_result" } ], "source": [ "names" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "skip" } }, "source": [ "Instead of removing an element by its index, we can also remove it by its value with the `.remove()` method. Behind the scenes, Python then compares the object to be removed, `\"Peter\"` in the example, sequentially to each element with the `==` operator and removes the *first* one that evaluates equal to it. `.remove()` does *not* return the removed element." ] }, { "cell_type": "code", "execution_count": 78, "metadata": { "slideshow": { "slide_type": "slide" } }, "outputs": [], "source": [ "names.remove(\"Peter\")" ] }, { "cell_type": "code", "execution_count": 79, "metadata": { "slideshow": { "slide_type": "fragment" } }, "outputs": [ { "data": { "text/plain": [ "['Berthold', 'Oliver', 'Carl']" ] }, "execution_count": 79, "metadata": {}, "output_type": "execute_result" } ], "source": [ "names" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "skip" } }, "source": [ "Also, `.remove()` raises a `ValueError` if the value is not found." ] }, { "cell_type": "code", "execution_count": 80, "metadata": { "slideshow": { "slide_type": "skip" } }, "outputs": [ { "ename": "ValueError", "evalue": "list.remove(x): x not in list", "output_type": "error", "traceback": [ "\u001b[0;31m---------------------------------------------------------------------------\u001b[0m", "\u001b[0;31mValueError\u001b[0m Traceback (most recent call last)", "Cell \u001b[0;32mIn[80], line 1\u001b[0m\n\u001b[0;32m----> 1\u001b[0m \u001b[43mnames\u001b[49m\u001b[38;5;241;43m.\u001b[39;49m\u001b[43mremove\u001b[49m\u001b[43m(\u001b[49m\u001b[38;5;124;43m\"\u001b[39;49m\u001b[38;5;124;43mPeter\u001b[39;49m\u001b[38;5;124;43m\"\u001b[39;49m\u001b[43m)\u001b[49m\n", "\u001b[0;31mValueError\u001b[0m: list.remove(x): x not in list" ] } ], "source": [ "names.remove(\"Peter\")" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "skip" } }, "source": [ "`list` objects implement an `.index()` method that returns the position of the first element that compares equal to its argument. It fails *loudly* with a `ValueError` if no element compares equal." ] }, { "cell_type": "code", "execution_count": 81, "metadata": { "slideshow": { "slide_type": "slide" } }, "outputs": [ { "data": { "text/plain": [ "['Berthold', 'Oliver', 'Carl']" ] }, "execution_count": 81, "metadata": {}, "output_type": "execute_result" } ], "source": [ "names" ] }, { "cell_type": "code", "execution_count": 82, "metadata": { "slideshow": { "slide_type": "fragment" } }, "outputs": [ { "data": { "text/plain": [ "1" ] }, "execution_count": 82, "metadata": {}, "output_type": "execute_result" } ], "source": [ "names.index(\"Oliver\")" ] }, { "cell_type": "code", "execution_count": 83, "metadata": { "slideshow": { "slide_type": "fragment" } }, "outputs": [ { "ename": "ValueError", "evalue": "'Karl' is not in list", "output_type": "error", "traceback": [ "\u001b[0;31m---------------------------------------------------------------------------\u001b[0m", "\u001b[0;31mValueError\u001b[0m Traceback (most recent call last)", "Cell \u001b[0;32mIn[83], line 1\u001b[0m\n\u001b[0;32m----> 1\u001b[0m \u001b[43mnames\u001b[49m\u001b[38;5;241;43m.\u001b[39;49m\u001b[43mindex\u001b[49m\u001b[43m(\u001b[49m\u001b[38;5;124;43m\"\u001b[39;49m\u001b[38;5;124;43mKarl\u001b[39;49m\u001b[38;5;124;43m\"\u001b[39;49m\u001b[43m)\u001b[49m\n", "\u001b[0;31mValueError\u001b[0m: 'Karl' is not in list" ] } ], "source": [ "names.index(\"Karl\")" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "skip" } }, "source": [ "The `.count()` method returns the number of elements that compare equal to its argument." ] }, { "cell_type": "code", "execution_count": 84, "metadata": { "slideshow": { "slide_type": "fragment" } }, "outputs": [ { "data": { "text/plain": [ "1" ] }, "execution_count": 84, "metadata": {}, "output_type": "execute_result" } ], "source": [ "names.count(\"Carl\")" ] }, { "cell_type": "code", "execution_count": 85, "metadata": { "slideshow": { "slide_type": "skip" } }, "outputs": [ { "data": { "text/plain": [ "0" ] }, "execution_count": 85, "metadata": {}, "output_type": "execute_result" } ], "source": [ "names.count(\"Karl\")" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "skip" } }, "source": [ "Two more methods, `.copy()` and `.clear()`, are *syntactic sugar* and replace working with slices.\n", "\n", "`.copy()` creates a *shallow* copy. So, `names.copy()` below does the same as taking a full slice with `names[:]`, and the caveats from above apply, too." ] }, { "cell_type": "code", "execution_count": 86, "metadata": { "slideshow": { "slide_type": "skip" } }, "outputs": [], "source": [ "names_copy = names.copy()" ] }, { "cell_type": "code", "execution_count": 87, "metadata": { "slideshow": { "slide_type": "skip" } }, "outputs": [ { "data": { "text/plain": [ "['Berthold', 'Oliver', 'Carl']" ] }, "execution_count": 87, "metadata": {}, "output_type": "execute_result" } ], "source": [ "names_copy" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "skip" } }, "source": [ "`.clear()` removes all references from a `list` object. So, `names_copy.clear()` is the same as `del names_copy[:]`." ] }, { "cell_type": "code", "execution_count": 88, "metadata": { "slideshow": { "slide_type": "skip" } }, "outputs": [], "source": [ "names_copy.clear()" ] }, { "cell_type": "code", "execution_count": 89, "metadata": { "slideshow": { "slide_type": "skip" } }, "outputs": [ { "data": { "text/plain": [ "[]" ] }, "execution_count": 89, "metadata": {}, "output_type": "execute_result" } ], "source": [ "names_copy" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "skip" } }, "source": [ "Many methods introduced in this section are mentioned in the [collections.abc ](https://docs.python.org/3/library/collections.abc.html) module's documentation as well: While the `.index()` and `.count()` methods come with any data type that is a `Sequence`, the `.append()`, `.extend()`, `.insert()`, `.reverse()`, `.pop()`, and `.remove()` methods are part of any `MutableSequence` type. The `.sort()`, `.copy()`, and `.clear()` methods are `list`-specific.\n", "\n", "So, being a sequence does not only imply the four *behaviors* specified above, but also means that a data type comes with certain standardized methods." ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "slide" } }, "source": [ "## List Operations" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "skip" } }, "source": [ "As with `str` objects, the `+` and `*` operators are overloaded for concatenation and always return a *new* `list` object. The references in this newly created `list` object reference the *same* objects as the two original `list` objects. So, the same caveat as with *shallow* copies from above applies!" ] }, { "cell_type": "code", "execution_count": 90, "metadata": { "slideshow": { "slide_type": "slide" } }, "outputs": [ { "data": { "text/plain": [ "['Berthold', 'Oliver', 'Carl']" ] }, "execution_count": 90, "metadata": {}, "output_type": "execute_result" } ], "source": [ "names" ] }, { "cell_type": "code", "execution_count": 91, "metadata": { "slideshow": { "slide_type": "fragment" } }, "outputs": [ { "data": { "text/plain": [ "['Berthold', 'Oliver', 'Carl', 'Diedrich', 'Yves']" ] }, "execution_count": 91, "metadata": {}, "output_type": "execute_result" } ], "source": [ "names + [\"Diedrich\", \"Yves\"]" ] }, { "cell_type": "code", "execution_count": 92, "metadata": { "slideshow": { "slide_type": "fragment" } }, "outputs": [ { "data": { "text/plain": [ "['Berthold', 'Oliver', 'Carl', 'Berthold', 'Oliver', 'Carl']" ] }, "execution_count": 92, "metadata": {}, "output_type": "execute_result" } ], "source": [ "2 * names" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "slide" } }, "source": [ "### Unpacking" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "skip" } }, "source": [ "Besides being an operator, the `*` symbol has a second syntactical use, as explained in [PEP 3132 ](https://www.python.org/dev/peps/pep-3132/) and [PEP 448 ](https://www.python.org/dev/peps/pep-0448/): It implements what is called **iterable unpacking**. It is *not* an operator syntactically but a notation that Python reads as a literal.\n", "\n", "In the example, Python interprets the expression as if the elements of the iterable `names` were placed between `\"Achim\"` and `\"Xavier\"` one by one. So, we do not obtain a nested but a *flat* list." ] }, { "cell_type": "code", "execution_count": 93, "metadata": { "slideshow": { "slide_type": "slide" } }, "outputs": [ { "data": { "text/plain": [ "['Achim', 'Berthold', 'Oliver', 'Carl', 'Xavier']" ] }, "execution_count": 93, "metadata": {}, "output_type": "execute_result" } ], "source": [ "[\"Achim\", *names, \"Xavier\"]" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "skip" } }, "source": [ "Effectively, Python reads that as if we wrote the following." ] }, { "cell_type": "code", "execution_count": 94, "metadata": { "slideshow": { "slide_type": "fragment" } }, "outputs": [ { "data": { "text/plain": [ "['Achim', 'Berthold', 'Oliver', 'Carl', 'Xavier']" ] }, "execution_count": 94, "metadata": {}, "output_type": "execute_result" } ], "source": [ "[\"Achim\", names[0], names[1], names[2], \"Xavier\"]" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "slide" } }, "source": [ "### List Comparison" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "skip" } }, "source": [ "The relational operators also work with `list` objects; yet another example of operator overloading.\n", "\n", "Comparison is made in a pairwise fashion until the first pair of elements does not evaluate equal or one of the `list` objects ends. The exact comparison rules depend on the elements and not the `list` objects. As with [.sort() ](https://docs.python.org/3/library/stdtypes.html#list.sort) or [sorted() ](https://docs.python.org/3/library/functions.html#sorted) above, comparison is *delegated* to the objects to be compared, and Python \"asks\" the elements in the two `list` objects to compare themselves. Usually, all elements are of the *same* type, and comparison is straightforward." ] }, { "cell_type": "code", "execution_count": 95, "metadata": { "slideshow": { "slide_type": "slide" } }, "outputs": [ { "data": { "text/plain": [ "['Berthold', 'Oliver', 'Carl']" ] }, "execution_count": 95, "metadata": {}, "output_type": "execute_result" } ], "source": [ "names" ] }, { "cell_type": "code", "execution_count": 96, "metadata": { "slideshow": { "slide_type": "fragment" } }, "outputs": [ { "data": { "text/plain": [ "True" ] }, "execution_count": 96, "metadata": {}, "output_type": "execute_result" } ], "source": [ "names == [\"Berthold\", \"Oliver\", \"Carl\"]" ] }, { "cell_type": "code", "execution_count": 97, "metadata": { "slideshow": { "slide_type": "fragment" } }, "outputs": [ { "data": { "text/plain": [ "True" ] }, "execution_count": 97, "metadata": {}, "output_type": "execute_result" } ], "source": [ "names != [\"Berthold\", \"Oliver\", \"Karl\"]" ] }, { "cell_type": "code", "execution_count": 98, "metadata": { "slideshow": { "slide_type": "fragment" } }, "outputs": [ { "data": { "text/plain": [ "True" ] }, "execution_count": 98, "metadata": {}, "output_type": "execute_result" } ], "source": [ "names < [\"Berthold\", \"Oliver\", \"Karl\"]" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "skip" } }, "source": [ "If two `list` objects have a different number of elements and all overlapping elements compare equal, the shorter `list` object is considered \"smaller.\"" ] }, { "cell_type": "code", "execution_count": 99, "metadata": { "slideshow": { "slide_type": "fragment" } }, "outputs": [ { "data": { "text/plain": [ "True" ] }, "execution_count": 99, "metadata": {}, "output_type": "execute_result" } ], "source": [ "[\"Berthold\", \"Oliver\"] < names < [\"Berthold\", \"Oliver\", \"Carl\", \"Xavier\"]" ] } ], "metadata": { "kernelspec": { "display_name": "Python 3", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.12.2" }, "livereveal": { "auto_select": "code", "auto_select_fragment": true, "scroll": true, "theme": "serif" }, "toc": { "base_numbering": 1, "nav_menu": {}, "number_sections": false, "sideBar": true, "skip_h1_title": true, "title_cell": "Table of Contents", "title_sidebar": "Contents", "toc_cell": false, "toc_position": { "height": "calc(100% - 180px)", "left": "10px", "top": "150px", "width": "384px" }, "toc_section_display": false, "toc_window_display": false } }, "nbformat": 4, "nbformat_minor": 4 }