{ "cells": [ { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "skip" } }, "source": [ "**Note**: Click on \"*Kernel*\" > \"*Restart Kernel and Clear All Outputs*\" in [JupyterLab](https://jupyterlab.readthedocs.io/en/stable/) *before* reading this notebook to reset its output. If you cannot run this file on your machine, you may want to open it [in the cloud ](https://mybinder.org/v2/gh/webartifex/intro-to-python/main?urlpath=lab/tree/07_sequences/03_content.ipynb)." ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "slide" } }, "source": [ "# Chapter 7: Sequential Data (continued)" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "skip" } }, "source": [ "In this third part of the chapter, we first look at a major implication of the `list` type's mutability. Then, we see how its close relative, the `tuple` type, can mitigate this. Lastly, we see how Python's syntax assumes sequential data at various places: for example, when unpacking iterables during a `for`-loop or an assignment, or when working with `function` objects." ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "slide" } }, "source": [ "## Modifiers vs. Pure Functions" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "skip" } }, "source": [ "As `list` objects are mutable, the caller of a function can see the changes made to a `list` object passed to the function as an argument. That is often a surprising *side effect* and should be avoided.\n", "\n", "As an example, consider the `add_xyz()` function." 
] }, { "cell_type": "code", "execution_count": 1, "metadata": { "slideshow": { "slide_type": "slide" } }, "outputs": [], "source": [ "letters = [\"a\", \"b\", \"c\"]" ] }, { "cell_type": "code", "execution_count": 2, "metadata": { "slideshow": { "slide_type": "fragment" } }, "outputs": [], "source": [ "def add_xyz(arg):\n", " \"\"\"Append letters to a list.\"\"\"\n", " arg.extend([\"x\", \"y\", \"z\"])\n", " return arg" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "skip" } }, "source": [ "While this function is being executed, two variables, namely `letters` in the global scope and `arg` inside the function's local scope, reference the *same* `list` object in memory. Furthermore, the passed in `arg` is also the return value.\n", "\n", "So, after the function call, `letters_with_xyz` and `letters` are **aliases** as well, referencing the *same* object. We can also visualize that with [PythonTutor ](http://pythontutor.com/visualize.html#code=letters%20%3D%20%5B%22a%22,%20%22b%22,%20%22c%22%5D%0A%0Adef%20add_xyz%28arg%29%3A%0A%20%20%20%20arg.extend%28%5B%22x%22,%20%22y%22,%20%22z%22%5D%29%0A%20%20%20%20return%20arg%0A%0Aletters_with_xyz%20%3D%20add_xyz%28letters%29&cumulative=false&curstr=0&heapPrimitives=nevernest&mode=display&origin=opt-frontend.js&py=3&rawInputLstJSON=%5B%5D&textReferences=false)." 
] }, { "cell_type": "code", "execution_count": 3, "metadata": { "slideshow": { "slide_type": "fragment" } }, "outputs": [], "source": [ "letters_with_xyz = add_xyz(letters)" ] }, { "cell_type": "code", "execution_count": 4, "metadata": { "slideshow": { "slide_type": "fragment" } }, "outputs": [ { "data": { "text/plain": [ "['a', 'b', 'c', 'x', 'y', 'z']" ] }, "execution_count": 4, "metadata": {}, "output_type": "execute_result" } ], "source": [ "letters_with_xyz" ] }, { "cell_type": "code", "execution_count": 5, "metadata": { "slideshow": { "slide_type": "fragment" } }, "outputs": [ { "data": { "text/plain": [ "['a', 'b', 'c', 'x', 'y', 'z']" ] }, "execution_count": 5, "metadata": {}, "output_type": "execute_result" } ], "source": [ "letters" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "skip" } }, "source": [ "A better practice is to first create a copy of `arg` within the function that is then modified and returned. If we are sure that `arg` contains immutable elements only, we get away with a shallow copy. The downside of this approach is the higher amount of memory necessary.\n", "\n", "The revised `add_xyz()` function below is more natural to reason about as it does *not* modify the passed in `arg` internally. [PythonTutor ](http://pythontutor.com/visualize.html#code=letters%20%3D%20%5B%22a%22,%20%22b%22,%20%22c%22%5D%0A%0Adef%20add_xyz%28arg%29%3A%0A%20%20%20%20new_arg%20%3D%20arg%5B%3A%5D%0A%20%20%20%20new_arg.extend%28%5B%22x%22,%20%22y%22,%20%22z%22%5D%29%0A%20%20%20%20return%20new_arg%0A%0Aletters_with_xyz%20%3D%20add_xyz%28letters%29&cumulative=false&curstr=0&heapPrimitives=nevernest&mode=display&origin=opt-frontend.js&py=3&rawInputLstJSON=%5B%5D&textReferences=false) shows that as well. This approach is following the **[functional programming ](https://en.wikipedia.org/wiki/Functional_programming)** paradigm that is going through a \"renaissance\" currently. 
Two essential characteristics of functional programming are that a function *never* changes its inputs and *always* returns the same output given the same inputs.\n", "\n", "For a beginner, it is probably better to stick to this idea and not change any arguments the way the original `add_xyz()` above does. However, functions that modify and return the argument passed in are an important aspect of object-oriented programming, as explained in [Chapter 11 ](https://nbviewer.jupyter.org/github/webartifex/intro-to-python/blob/main/11_classes/00_content.ipynb)." ] }, { "cell_type": "code", "execution_count": 6, "metadata": { "slideshow": { "slide_type": "slide" } }, "outputs": [], "source": [ "letters = [\"a\", \"b\", \"c\"]" ] }, { "cell_type": "code", "execution_count": 7, "metadata": { "slideshow": { "slide_type": "fragment" } }, "outputs": [], "source": [ "def add_xyz(arg):\n", "    \"\"\"Create a new list from an existing one.\"\"\"\n", "    new_arg = arg[:]\n", "    new_arg.extend([\"x\", \"y\", \"z\"])\n", "    return new_arg" ] }, { "cell_type": "code", "execution_count": 8, "metadata": { "slideshow": { "slide_type": "fragment" } }, "outputs": [], "source": [ "letters_with_xyz = add_xyz(letters)" ] }, { "cell_type": "code", "execution_count": 9, "metadata": { "scrolled": true, "slideshow": { "slide_type": "fragment" } }, "outputs": [ { "data": { "text/plain": [ "['a', 'b', 'c', 'x', 'y', 'z']" ] }, "execution_count": 9, "metadata": {}, "output_type": "execute_result" } ], "source": [ "letters_with_xyz" ] }, { "cell_type": "code", "execution_count": 10, "metadata": { "slideshow": { "slide_type": "fragment" } }, "outputs": [ { "data": { "text/plain": [ "['a', 'b', 'c']" ] }, "execution_count": 10, "metadata": {}, "output_type": "execute_result" } ], "source": [ "letters" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "skip" } }, "source": [ "If we want to modify the argument passed in, it is best to return `None` and not `arg`, as does the final version of 
`add_xyz()` below. Then, the user of our function cannot accidentally create two aliases to the same object. That is also why the list methods above all return `None`. [PythonTutor ](http://pythontutor.com/visualize.html#code=letters%20%3D%20%5B%22a%22,%20%22b%22,%20%22c%22%5D%0A%0Adef%20add_xyz%28arg%29%3A%0A%20%20%20%20arg.extend%28%5B%22x%22,%20%22y%22,%20%22z%22%5D%29%0A%20%20%20%20return%0A%0Aadd_xyz%28letters%29&cumulative=false&curstr=0&heapPrimitives=nevernest&mode=display&origin=opt-frontend.js&py=3&rawInputLstJSON=%5B%5D&textReferences=false) shows how there is only *one* reference to `letters` after the function call." ] }, { "cell_type": "code", "execution_count": 11, "metadata": { "slideshow": { "slide_type": "slide" } }, "outputs": [], "source": [ "letters = [\"a\", \"b\", \"c\"]" ] }, { "cell_type": "code", "execution_count": 12, "metadata": { "slideshow": { "slide_type": "fragment" } }, "outputs": [], "source": [ "def add_xyz(arg):\n", " \"\"\"Append letters to a list.\"\"\"\n", " arg.extend([\"x\", \"y\", \"z\"])\n", " return # None" ] }, { "cell_type": "code", "execution_count": 13, "metadata": { "slideshow": { "slide_type": "fragment" } }, "outputs": [], "source": [ "add_xyz(letters)" ] }, { "cell_type": "code", "execution_count": 14, "metadata": { "slideshow": { "slide_type": "fragment" } }, "outputs": [ { "data": { "text/plain": [ "['a', 'b', 'c', 'x', 'y', 'z']" ] }, "execution_count": 14, "metadata": {}, "output_type": "execute_result" } ], "source": [ "letters" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "skip" } }, "source": [ "If we call `add_xyz()` with `letters` as the argument again, we end up with an even longer `list` object." 
] }, { "cell_type": "code", "execution_count": 15, "metadata": { "slideshow": { "slide_type": "skip" } }, "outputs": [], "source": [ "add_xyz(letters)" ] }, { "cell_type": "code", "execution_count": 16, "metadata": { "slideshow": { "slide_type": "skip" } }, "outputs": [ { "data": { "text/plain": [ "['a', 'b', 'c', 'x', 'y', 'z', 'x', 'y', 'z']" ] }, "execution_count": 16, "metadata": {}, "output_type": "execute_result" } ], "source": [ "letters" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "skip" } }, "source": [ "Functions that only work on the argument passed in are called **modifiers**. Their primary purpose is to change the **state** of the argument. On the contrary, functions that have *no* side effects on the arguments are said to be **pure**." ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "slide" } }, "source": [ "## The `tuple` Type" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "skip" } }, "source": [ "To create a `tuple` object, we can use the same literal notation as for `list` objects *without* the brackets and list all elements." ] }, { "cell_type": "code", "execution_count": 17, "metadata": { "slideshow": { "slide_type": "slide" } }, "outputs": [], "source": [ "numbers = 7, 11, 8, 5, 3, 12, 2, 6, 9, 10, 1, 4" ] }, { "cell_type": "code", "execution_count": 18, "metadata": { "slideshow": { "slide_type": "fragment" } }, "outputs": [ { "data": { "text/plain": [ "(7, 11, 8, 5, 3, 12, 2, 6, 9, 10, 1, 4)" ] }, "execution_count": 18, "metadata": {}, "output_type": "execute_result" } ], "source": [ "numbers" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "skip" } }, "source": [ "However, to be clearer, many Pythonistas write out the optional parentheses `(` and `)`." 
] }, { "cell_type": "code", "execution_count": 19, "metadata": { "slideshow": { "slide_type": "skip" } }, "outputs": [], "source": [ "numbers = (7, 11, 8, 5, 3, 12, 2, 6, 9, 10, 1, 4)" ] }, { "cell_type": "code", "execution_count": 20, "metadata": { "slideshow": { "slide_type": "skip" } }, "outputs": [ { "data": { "text/plain": [ "(7, 11, 8, 5, 3, 12, 2, 6, 9, 10, 1, 4)" ] }, "execution_count": 20, "metadata": {}, "output_type": "execute_result" } ], "source": [ "numbers" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "skip" } }, "source": [ "As before, `numbers` is an object on its own." ] }, { "cell_type": "code", "execution_count": 21, "metadata": { "slideshow": { "slide_type": "fragment" } }, "outputs": [ { "data": { "text/plain": [ "140248673535456" ] }, "execution_count": 21, "metadata": {}, "output_type": "execute_result" } ], "source": [ "id(numbers)" ] }, { "cell_type": "code", "execution_count": 22, "metadata": { "slideshow": { "slide_type": "fragment" } }, "outputs": [ { "data": { "text/plain": [ "tuple" ] }, "execution_count": 22, "metadata": {}, "output_type": "execute_result" } ], "source": [ "type(numbers)" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "skip" } }, "source": [ "While we could use empty parentheses `()` to create an empty `tuple` object ..." 
] }, { "cell_type": "code", "execution_count": 23, "metadata": { "slideshow": { "slide_type": "skip" } }, "outputs": [], "source": [ "empty_tuple = ()" ] }, { "cell_type": "code", "execution_count": 24, "metadata": { "slideshow": { "slide_type": "skip" } }, "outputs": [ { "data": { "text/plain": [ "()" ] }, "execution_count": 24, "metadata": {}, "output_type": "execute_result" } ], "source": [ "empty_tuple" ] }, { "cell_type": "code", "execution_count": 25, "metadata": { "slideshow": { "slide_type": "skip" } }, "outputs": [ { "data": { "text/plain": [ "tuple" ] }, "execution_count": 25, "metadata": {}, "output_type": "execute_result" } ], "source": [ "type(empty_tuple)" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "skip" } }, "source": [ "... we must use a *trailing comma* to create a `tuple` object holding one element. If we forget the comma, the parentheses are interpreted as the grouping operator and effectively useless!" ] }, { "cell_type": "code", "execution_count": 26, "metadata": { "slideshow": { "slide_type": "skip" } }, "outputs": [], "source": [ "one_tuple = (1,) # we could omit the parentheses but not the comma" ] }, { "cell_type": "code", "execution_count": 27, "metadata": { "slideshow": { "slide_type": "skip" } }, "outputs": [ { "data": { "text/plain": [ "(1,)" ] }, "execution_count": 27, "metadata": {}, "output_type": "execute_result" } ], "source": [ "one_tuple" ] }, { "cell_type": "code", "execution_count": 28, "metadata": { "slideshow": { "slide_type": "skip" } }, "outputs": [ { "data": { "text/plain": [ "tuple" ] }, "execution_count": 28, "metadata": {}, "output_type": "execute_result" } ], "source": [ "type(one_tuple)" ] }, { "cell_type": "code", "execution_count": 29, "metadata": { "slideshow": { "slide_type": "skip" } }, "outputs": [], "source": [ "no_tuple = (1)" ] }, { "cell_type": "code", "execution_count": 30, "metadata": { "slideshow": { "slide_type": "skip" } }, "outputs": [ { "data": { "text/plain": [ "1" ] 
}, "execution_count": 30, "metadata": {}, "output_type": "execute_result" } ], "source": [ "no_tuple" ] }, { "cell_type": "code", "execution_count": 31, "metadata": { "slideshow": { "slide_type": "skip" } }, "outputs": [ { "data": { "text/plain": [ "int" ] }, "execution_count": 31, "metadata": {}, "output_type": "execute_result" } ], "source": [ "type(no_tuple)" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "skip" } }, "source": [ "Alternatively, we may use the [tuple() ](https://docs.python.org/3/library/functions.html#func-tuple) built-in that takes any iterable as its argument and creates a new `tuple` from its elements." ] }, { "cell_type": "code", "execution_count": 32, "metadata": { "slideshow": { "slide_type": "slide" } }, "outputs": [ { "data": { "text/plain": [ "(1,)" ] }, "execution_count": 32, "metadata": {}, "output_type": "execute_result" } ], "source": [ "tuple([1])" ] }, { "cell_type": "code", "execution_count": 33, "metadata": { "slideshow": { "slide_type": "fragment" } }, "outputs": [ { "data": { "text/plain": [ "('i', 't', 'e', 'r', 'a', 'b', 'l', 'e')" ] }, "execution_count": 33, "metadata": {}, "output_type": "execute_result" } ], "source": [ "tuple(\"iterable\")" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "slide" } }, "source": [ "## Tuples are like \"Immutable Lists\"" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "skip" } }, "source": [ "Most operations involving `tuple` objects work in the same way as with `list` objects. The main difference is that `tuple` objects are *immutable*. So, if our program does not depend on mutability, we may and should use `tuple` and not `list` objects to model sequential data. That way, we avoid the pitfalls seen above.\n", "\n", "`tuple` objects are *sequences* exhibiting the familiar *four* behaviors. So, `numbers` holds a *finite* number of elements ..." 
] }, { "cell_type": "code", "execution_count": 34, "metadata": { "slideshow": { "slide_type": "slide" } }, "outputs": [ { "data": { "text/plain": [ "12" ] }, "execution_count": 34, "metadata": {}, "output_type": "execute_result" } ], "source": [ "len(numbers)" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "skip" } }, "source": [ "... that we can obtain individually by looping over it in a predictable *forward* or *reverse* order." ] }, { "cell_type": "code", "execution_count": 35, "metadata": { "slideshow": { "slide_type": "fragment" } }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "7 11 8 5 3 12 2 6 9 10 1 4 " ] } ], "source": [ "for number in numbers:\n", " print(number, end=\" \")" ] }, { "cell_type": "code", "execution_count": 36, "metadata": { "slideshow": { "slide_type": "fragment" } }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "4 1 10 9 6 2 12 3 5 8 11 7 " ] } ], "source": [ "for number in reversed(numbers):\n", " print(number, end=\" \")" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "skip" } }, "source": [ "To check if a given object is *contained* in `numbers`, we use the `in` operator and conduct a linear search." 
] }, { "cell_type": "code", "execution_count": 37, "metadata": { "slideshow": { "slide_type": "fragment" } }, "outputs": [ { "data": { "text/plain": [ "False" ] }, "execution_count": 37, "metadata": {}, "output_type": "execute_result" } ], "source": [ "0 in numbers" ] }, { "cell_type": "code", "execution_count": 38, "metadata": { "slideshow": { "slide_type": "skip" } }, "outputs": [ { "data": { "text/plain": [ "True" ] }, "execution_count": 38, "metadata": {}, "output_type": "execute_result" } ], "source": [ "1 in numbers" ] }, { "cell_type": "code", "execution_count": 39, "metadata": { "slideshow": { "slide_type": "skip" } }, "outputs": [ { "data": { "text/plain": [ "True" ] }, "execution_count": 39, "metadata": {}, "output_type": "execute_result" } ], "source": [ "1.0 in numbers # in relies on == behind the scenes" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "skip" } }, "source": [ "We may index and slice with the `[]` operator. The latter returns *new* `tuple` objects." 
] }, { "cell_type": "code", "execution_count": 40, "metadata": { "slideshow": { "slide_type": "slide" } }, "outputs": [ { "data": { "text/plain": [ "7" ] }, "execution_count": 40, "metadata": {}, "output_type": "execute_result" } ], "source": [ "numbers[0]" ] }, { "cell_type": "code", "execution_count": 41, "metadata": { "slideshow": { "slide_type": "skip" } }, "outputs": [ { "data": { "text/plain": [ "4" ] }, "execution_count": 41, "metadata": {}, "output_type": "execute_result" } ], "source": [ "numbers[-1]" ] }, { "cell_type": "code", "execution_count": 42, "metadata": { "slideshow": { "slide_type": "fragment" } }, "outputs": [ { "data": { "text/plain": [ "(2, 6, 9, 10, 1, 4)" ] }, "execution_count": 42, "metadata": {}, "output_type": "execute_result" } ], "source": [ "numbers[6:]" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "skip" } }, "source": [ "Index assignment does *not* work as tuples are *immutable* and results in a `TypeError`." ] }, { "cell_type": "code", "execution_count": 43, "metadata": { "slideshow": { "slide_type": "fragment" } }, "outputs": [ { "ename": "TypeError", "evalue": "'tuple' object does not support item assignment", "output_type": "error", "traceback": [ "\u001b[0;31m---------------------------------------------------------------------------\u001b[0m", "\u001b[0;31mTypeError\u001b[0m Traceback (most recent call last)", "Cell \u001b[0;32mIn[43], line 1\u001b[0m\n\u001b[0;32m----> 1\u001b[0m \u001b[43mnumbers\u001b[49m\u001b[43m[\u001b[49m\u001b[38;5;241;43m-\u001b[39;49m\u001b[38;5;241;43m1\u001b[39;49m\u001b[43m]\u001b[49m \u001b[38;5;241m=\u001b[39m \u001b[38;5;241m99\u001b[39m\n", "\u001b[0;31mTypeError\u001b[0m: 'tuple' object does not support item assignment" ] } ], "source": [ "numbers[-1] = 99" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "skip" } }, "source": [ "The `+` and `*` operators work with `tuple` objects as well: They always create *new* `tuple` objects." 
] }, { "cell_type": "code", "execution_count": 44, "metadata": { "slideshow": { "slide_type": "skip" } }, "outputs": [ { "data": { "text/plain": [ "(7, 11, 8, 5, 3, 12, 2, 6, 9, 10, 1, 4, 99)" ] }, "execution_count": 44, "metadata": {}, "output_type": "execute_result" } ], "source": [ "numbers + (99,)" ] }, { "cell_type": "code", "execution_count": 45, "metadata": { "slideshow": { "slide_type": "skip" } }, "outputs": [ { "data": { "text/plain": [ "(7, 11, 8, 5, 3, 12, 2, 6, 9, 10, 1, 4, 7, 11, 8, 5, 3, 12, 2, 6, 9, 10, 1, 4)" ] }, "execution_count": 45, "metadata": {}, "output_type": "execute_result" } ], "source": [ "2 * numbers" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "skip" } }, "source": [ "Being immutable, `tuple` objects only provide the `.count()` and `.index()` methods of `Sequence` types. The `.append()`, `.extend()`, `.insert()`, `.reverse()`, `.pop()`, and `.remove()` methods of `MutableSequence` types are *not* available. The same holds for the `list`-specific `.sort()`, `.copy()`, and `.clear()` methods." ] }, { "cell_type": "code", "execution_count": 46, "metadata": { "slideshow": { "slide_type": "skip" } }, "outputs": [ { "data": { "text/plain": [ "0" ] }, "execution_count": 46, "metadata": {}, "output_type": "execute_result" } ], "source": [ "numbers.count(0)" ] }, { "cell_type": "code", "execution_count": 47, "metadata": { "slideshow": { "slide_type": "skip" } }, "outputs": [ { "data": { "text/plain": [ "10" ] }, "execution_count": 47, "metadata": {}, "output_type": "execute_result" } ], "source": [ "numbers.index(1)" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "skip" } }, "source": [ "The relational operators work in the *same* way as for `list` objects." 
] }, { "cell_type": "code", "execution_count": 48, "metadata": { "slideshow": { "slide_type": "skip" } }, "outputs": [ { "data": { "text/plain": [ "(7, 11, 8, 5, 3, 12, 2, 6, 9, 10, 1, 4)" ] }, "execution_count": 48, "metadata": {}, "output_type": "execute_result" } ], "source": [ "numbers" ] }, { "cell_type": "code", "execution_count": 49, "metadata": { "slideshow": { "slide_type": "skip" } }, "outputs": [ { "data": { "text/plain": [ "True" ] }, "execution_count": 49, "metadata": {}, "output_type": "execute_result" } ], "source": [ "numbers == (7, 11, 8, 5, 3, 12, 2, 6, 9, 10, 1, 4)" ] }, { "cell_type": "code", "execution_count": 50, "metadata": { "slideshow": { "slide_type": "skip" } }, "outputs": [ { "data": { "text/plain": [ "True" ] }, "execution_count": 50, "metadata": {}, "output_type": "execute_result" } ], "source": [ "numbers != (99, 11, 8, 5, 3, 12, 2, 6, 9, 10, 1, 4)" ] }, { "cell_type": "code", "execution_count": 51, "metadata": { "slideshow": { "slide_type": "skip" } }, "outputs": [ { "data": { "text/plain": [ "True" ] }, "execution_count": 51, "metadata": {}, "output_type": "execute_result" } ], "source": [ "numbers < (99, 11, 8, 5, 3, 12, 2, 6, 9, 10, 1, 4)" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "skip" } }, "source": [ "While `tuple` objects are immutable, this only relates to the references they hold. If a `tuple` object contains references to mutable objects, the entire nested structure is *not* immutable as a whole!\n", "\n", "Consider the following stylized example `not_immutable`: It contains *three* elements, `1`, `[2, ..., 11]`, and `12`, and the elements of the nested `list` object may be changed. While it is not practical to mix data types in a `tuple` object that is used as an \"immutable list,\" we want to make the point that the mere usage of the `tuple` type does *not* guarantee a nested object to be immutable as a whole." 
] }, { "cell_type": "code", "execution_count": 52, "metadata": { "slideshow": { "slide_type": "skip" } }, "outputs": [], "source": [ "not_immutable = (1, [2, 3, 4, 5, 6, 7, 8, 9, 10, 11], 12)" ] }, { "cell_type": "code", "execution_count": 53, "metadata": { "slideshow": { "slide_type": "skip" } }, "outputs": [ { "data": { "text/plain": [ "(1, [2, 3, 4, 5, 6, 7, 8, 9, 10, 11], 12)" ] }, "execution_count": 53, "metadata": {}, "output_type": "execute_result" } ], "source": [ "not_immutable" ] }, { "cell_type": "code", "execution_count": 54, "metadata": { "slideshow": { "slide_type": "skip" } }, "outputs": [], "source": [ "not_immutable[1][:] = [99, 99, 99]" ] }, { "cell_type": "code", "execution_count": 55, "metadata": { "slideshow": { "slide_type": "skip" } }, "outputs": [ { "data": { "text/plain": [ "(1, [99, 99, 99], 12)" ] }, "execution_count": 55, "metadata": {}, "output_type": "execute_result" } ], "source": [ "not_immutable" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "slide" } }, "source": [ "## Packing & Unpacking" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "skip" } }, "source": [ "In the \"*List Operations*\" section in the [second part ](https://nbviewer.jupyter.org/github/webartifex/intro-to-python/blob/main/07_sequences/01_content.ipynb#List-Operations) of this chapter, the `*` symbol **unpacks** the elements of a `list` object into another one. This idea of *iterable unpacking* is built into Python at various places, even *without* the `*` symbol.\n", "\n", "For example, we may write variables on the left-hand side of a `=` statement in a literal `tuple` style. Then, any *finite* iterable on the right-hand side is unpacked. So, `numbers` is unpacked into *twelve* variables below." 
] }, { "cell_type": "code", "execution_count": 56, "metadata": { "slideshow": { "slide_type": "slide" } }, "outputs": [], "source": [ "n1, n2, n3, n4, n5, n6, n7, n8, n9, n10, n11, n12 = numbers" ] }, { "cell_type": "code", "execution_count": 57, "metadata": { "slideshow": { "slide_type": "fragment" } }, "outputs": [ { "data": { "text/plain": [ "7" ] }, "execution_count": 57, "metadata": {}, "output_type": "execute_result" } ], "source": [ "n1" ] }, { "cell_type": "code", "execution_count": 58, "metadata": { "slideshow": { "slide_type": "fragment" } }, "outputs": [ { "data": { "text/plain": [ "11" ] }, "execution_count": 58, "metadata": {}, "output_type": "execute_result" } ], "source": [ "n2" ] }, { "cell_type": "code", "execution_count": 59, "metadata": { "slideshow": { "slide_type": "fragment" } }, "outputs": [ { "data": { "text/plain": [ "8" ] }, "execution_count": 59, "metadata": {}, "output_type": "execute_result" } ], "source": [ "n3" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "skip" } }, "source": [ "Having to type twelve variables on the left is already tedious. Furthermore, if the iterable on the right yields a number of elements *different* from the number of variables, we get a `ValueError`." 
] }, { "cell_type": "code", "execution_count": 60, "metadata": { "slideshow": { "slide_type": "skip" } }, "outputs": [ { "ename": "ValueError", "evalue": "too many values to unpack (expected 11)", "output_type": "error", "traceback": [ "\u001b[0;31m---------------------------------------------------------------------------\u001b[0m", "\u001b[0;31mValueError\u001b[0m Traceback (most recent call last)", "Cell \u001b[0;32mIn[60], line 1\u001b[0m\n\u001b[0;32m----> 1\u001b[0m n1, n2, n3, n4, n5, n6, n7, n8, n9, n10, n11 \u001b[38;5;241m=\u001b[39m numbers\n", "\u001b[0;31mValueError\u001b[0m: too many values to unpack (expected 11)" ] } ], "source": [ "n1, n2, n3, n4, n5, n6, n7, n8, n9, n10, n11 = numbers" ] }, { "cell_type": "code", "execution_count": 61, "metadata": { "slideshow": { "slide_type": "skip" } }, "outputs": [ { "ename": "ValueError", "evalue": "not enough values to unpack (expected 13, got 12)", "output_type": "error", "traceback": [ "\u001b[0;31m---------------------------------------------------------------------------\u001b[0m", "\u001b[0;31mValueError\u001b[0m Traceback (most recent call last)", "Cell \u001b[0;32mIn[61], line 1\u001b[0m\n\u001b[0;32m----> 1\u001b[0m n1, n2, n3, n4, n5, n6, n7, n8, n9, n10, n11, n12, n13 \u001b[38;5;241m=\u001b[39m numbers\n", "\u001b[0;31mValueError\u001b[0m: not enough values to unpack (expected 13, got 12)" ] } ], "source": [ "n1, n2, n3, n4, n5, n6, n7, n8, n9, n10, n11, n12, n13 = numbers" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "skip" } }, "source": [ "So, to make iterable unpacking useful, we prepend the `*` symbol to *one* of the variables on the left: That variable then becomes a `list` object holding the elements not captured by the other variables. We say that the excess elements from the iterable are **packed** into this variable.\n", "\n", "For example, let's get the `first` and `last` element of `numbers` and collect the rest in `middle`." 
] }, { "cell_type": "code", "execution_count": 62, "metadata": { "slideshow": { "slide_type": "slide" } }, "outputs": [], "source": [ "first, *middle, last = numbers" ] }, { "cell_type": "code", "execution_count": 63, "metadata": { "slideshow": { "slide_type": "fragment" } }, "outputs": [ { "data": { "text/plain": [ "7" ] }, "execution_count": 63, "metadata": {}, "output_type": "execute_result" } ], "source": [ "first" ] }, { "cell_type": "code", "execution_count": 64, "metadata": { "slideshow": { "slide_type": "fragment" } }, "outputs": [ { "data": { "text/plain": [ "[11, 8, 5, 3, 12, 2, 6, 9, 10, 1]" ] }, "execution_count": 64, "metadata": {}, "output_type": "execute_result" } ], "source": [ "middle # always a list!" ] }, { "cell_type": "code", "execution_count": 65, "metadata": { "slideshow": { "slide_type": "fragment" } }, "outputs": [ { "data": { "text/plain": [ "4" ] }, "execution_count": 65, "metadata": {}, "output_type": "execute_result" } ], "source": [ "last" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "skip" } }, "source": [ "We already used unpacking before this section without knowing it. Whenever we write a `for`-loop over the [zip() ](https://docs.python.org/3/library/functions.html#zip) built-in, that generates a new `tuple` object in each iteration that we unpack by listing several loop variables.\n", "\n", "So, the `name, position` below acts like a left-hand side of an `=` statement and unpacks the `tuple` objects generated from \"zipping\" the `names` list and the `positions` tuple together." 
] }, { "cell_type": "code", "execution_count": 66, "metadata": {}, "outputs": [], "source": [ "names = [\"Berthold\", \"Oliver\", \"Carl\"]" ] }, { "cell_type": "code", "execution_count": 67, "metadata": { "slideshow": { "slide_type": "skip" } }, "outputs": [], "source": [ "positions = (\"goalkeeper\", \"defender\", \"midfielder\", \"striker\", \"coach\")" ] }, { "cell_type": "code", "execution_count": 68, "metadata": { "slideshow": { "slide_type": "skip" } }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Berthold is a goalkeeper\n", "Oliver is a defender\n", "Carl is a midfielder\n" ] } ], "source": [ "for name, position in zip(names, positions):\n", "    print(name, \"is a\", position)" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "skip" } }, "source": [ "Without unpacking, [zip() ](https://docs.python.org/3/library/functions.html#zip) generates a series of `tuple` objects." ] }, { "cell_type": "code", "execution_count": 69, "metadata": { "slideshow": { "slide_type": "skip" } }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "<class 'tuple'> ('Berthold', 'goalkeeper')\n", "<class 'tuple'> ('Oliver', 'defender')\n", "<class 'tuple'> ('Carl', 'midfielder')\n" ] } ], "source": [ "for pair in zip(names, positions):\n", "    print(type(pair), pair, sep=\" \")" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "skip" } }, "source": [ "Unpacking also works for nested objects. Below, we wrap [zip() ](https://docs.python.org/3/library/functions.html#zip) with the [enumerate() ](https://docs.python.org/3/library/functions.html#enumerate) built-in to have an index variable `number` inside the `for`-loop. In each iteration, a `tuple` object consisting of `number` and another `tuple` object is created. The inner one then holds the `name` and `position`." 
] }, { "cell_type": "code", "execution_count": 70, "metadata": { "slideshow": { "slide_type": "skip" } }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Berthold (jersey #1) is a goalkeeper\n", "Oliver (jersey #2) is a defender\n", "Carl (jersey #3) is a midfielder\n" ] } ], "source": [ "for number, (name, position) in enumerate(zip(names, positions), start=1):\n", "    print(f\"{name} (jersey #{number}) is a {position}\")" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "slide" } }, "source": [ "### Swapping Variables" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "skip" } }, "source": [ "A popular use case of unpacking is **swapping** two variables.\n", "\n", "Consider `a` and `b` below." ] }, { "cell_type": "code", "execution_count": 71, "metadata": { "slideshow": { "slide_type": "slide" } }, "outputs": [], "source": [ "a = 0\n", "b = 1" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "skip" } }, "source": [ "Without unpacking, we must use a temporary variable `temp` to swap `a` and `b`." ] }, { "cell_type": "code", "execution_count": 72, "metadata": { "slideshow": { "slide_type": "fragment" } }, "outputs": [], "source": [ "temp = a\n", "a = b\n", "b = temp\n", "\n", "del temp" ] }, { "cell_type": "code", "execution_count": 73, "metadata": { "slideshow": { "slide_type": "fragment" } }, "outputs": [ { "data": { "text/plain": [ "1" ] }, "execution_count": 73, "metadata": {}, "output_type": "execute_result" } ], "source": [ "a" ] }, { "cell_type": "code", "execution_count": 74, "metadata": { "slideshow": { "slide_type": "fragment" } }, "outputs": [ { "data": { "text/plain": [ "0" ] }, "execution_count": 74, "metadata": {}, "output_type": "execute_result" } ], "source": [ "b" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "skip" } }, "source": [ "With unpacking, the solution is more elegant. 
*All* expressions on the right-hand side are evaluated *before* any assignment takes place." ] }, { "cell_type": "code", "execution_count": 75, "metadata": { "slideshow": { "slide_type": "slide" } }, "outputs": [], "source": [ "a, b = 0, 1" ] }, { "cell_type": "code", "execution_count": 76, "metadata": { "slideshow": { "slide_type": "fragment" } }, "outputs": [], "source": [ "a, b = b, a" ] }, { "cell_type": "code", "execution_count": 77, "metadata": { "slideshow": { "slide_type": "fragment" } }, "outputs": [ { "data": { "text/plain": [ "(1, 0)" ] }, "execution_count": 77, "metadata": {}, "output_type": "execute_result" } ], "source": [ "a, b" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "slide" } }, "source": [ "#### Example: [Fibonacci Numbers ](https://en.wikipedia.org/wiki/Fibonacci_number) (revisited)" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "skip" } }, "source": [ "Unpacking allows us to rewrite the iterative `fibonacci()` function from [Chapter 4 ](https://nbviewer.jupyter.org/github/webartifex/intro-to-python/blob/main/04_iteration/02_content.ipynb#\"Hard-at-first-Glance\"-Example:-Fibonacci-Numbers-%28revisited%29) in a concise way." 
] }, { "cell_type": "code", "execution_count": 78, "metadata": { "slideshow": { "slide_type": "slide" } }, "outputs": [], "source": [ "def fibonacci(i):\n", "    \"\"\"Calculate the ith Fibonacci number.\n", "\n", "    Args:\n", "        i (int): index of the Fibonacci number to calculate\n", "\n", "    Returns:\n", "        ith_fibonacci (int)\n", "    \"\"\"\n", "    a, b = 0, 1\n", "\n", "    for _ in range(i - 1):\n", "        a, b = b, a + b\n", "\n", "    return b" ] }, { "cell_type": "code", "execution_count": 79, "metadata": { "slideshow": { "slide_type": "fragment" } }, "outputs": [ { "data": { "text/plain": [ "144" ] }, "execution_count": 79, "metadata": {}, "output_type": "execute_result" } ], "source": [ "fibonacci(12)" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "slide" } }, "source": [ "## Function Definitions & Calls" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "skip" } }, "source": [ "The concepts of packing and unpacking are also helpful when writing and using functions.\n", "\n", "For example, let's look at the `product()` function below. Its implementation suggests that `args` must be a sequence type. Otherwise, it would not make sense to index into it with `[0]` or take a slice with `[1:]`. In line with the function's name, the `for`-loop multiplies all elements of the `args` sequence. So, what does the `*` do in the header line, and what is the exact data type of `args`?\n", "\n", "The `*` is again *not* an operator in this context but a special syntax that makes Python *pack* all *positional* arguments passed to `product()` into a single `tuple` object called `args`."
] }, { "cell_type": "code", "execution_count": 80, "metadata": { "slideshow": { "slide_type": "slide" } }, "outputs": [], "source": [ "def product(*args):\n", "    \"\"\"Multiply all arguments.\"\"\"\n", "    result = args[0]\n", "\n", "    for arg in args[1:]:\n", "        result *= arg\n", "\n", "    return result" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "skip" } }, "source": [ "So, the header line accepts an *arbitrary* number of *positional* arguments (syntactically, even none).\n", "\n", "The product of just one number is the number itself." ] }, { "cell_type": "code", "execution_count": 81, "metadata": { "slideshow": { "slide_type": "fragment" } }, "outputs": [ { "data": { "text/plain": [ "42" ] }, "execution_count": 81, "metadata": {}, "output_type": "execute_result" } ], "source": [ "product(42)" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "skip" } }, "source": [ "Passing in several numbers works as expected." ] }, { "cell_type": "code", "execution_count": 82, "metadata": { "slideshow": { "slide_type": "fragment" } }, "outputs": [ { "data": { "text/plain": [ "100" ] }, "execution_count": 82, "metadata": {}, "output_type": "execute_result" } ], "source": [ "product(2, 5, 10)" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "skip" } }, "source": [ "However, this implementation of `product()` needs *at least* one argument passed in due to the expression `args[0]` used internally. Otherwise, we see a *runtime* error, namely an `IndexError`. We emphasize that this error is *not* raised by the header line but by the function's body."
] }, { "cell_type": "code", "execution_count": 83, "metadata": { "slideshow": { "slide_type": "skip" } }, "outputs": [ { "ename": "IndexError", "evalue": "tuple index out of range", "output_type": "error", "traceback": [ "\u001b[0;31m---------------------------------------------------------------------------\u001b[0m", "\u001b[0;31mIndexError\u001b[0m Traceback (most recent call last)", "Cell \u001b[0;32mIn[83], line 1\u001b[0m\n\u001b[0;32m----> 1\u001b[0m \u001b[43mproduct\u001b[49m\u001b[43m(\u001b[49m\u001b[43m)\u001b[49m\n", "Cell \u001b[0;32mIn[80], line 3\u001b[0m, in \u001b[0;36mproduct\u001b[0;34m(*args)\u001b[0m\n\u001b[1;32m 1\u001b[0m \u001b[38;5;28;01mdef\u001b[39;00m \u001b[38;5;21mproduct\u001b[39m(\u001b[38;5;241m*\u001b[39margs):\n\u001b[1;32m 2\u001b[0m \u001b[38;5;250m \u001b[39m\u001b[38;5;124;03m\"\"\"Multiply all arguments.\"\"\"\u001b[39;00m\n\u001b[0;32m----> 3\u001b[0m result \u001b[38;5;241m=\u001b[39m \u001b[43margs\u001b[49m\u001b[43m[\u001b[49m\u001b[38;5;241;43m0\u001b[39;49m\u001b[43m]\u001b[49m\n\u001b[1;32m 5\u001b[0m \u001b[38;5;28;01mfor\u001b[39;00m arg \u001b[38;5;129;01min\u001b[39;00m args[\u001b[38;5;241m1\u001b[39m:]:\n\u001b[1;32m 6\u001b[0m result \u001b[38;5;241m*\u001b[39m\u001b[38;5;241m=\u001b[39m arg\n", "\u001b[0;31mIndexError\u001b[0m: tuple index out of range" ] } ], "source": [ "product()" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "skip" } }, "source": [ "Another downside of this implementation is that we can easily generate *semantic* errors: For example, if we pass in an iterable object like the `one_hundred` list, *no* exception is raised. However, the return value is also not a numeric object as we expect. The reason for this is that during the function call, `args` becomes a `tuple` object holding *one* element, which is `one_hundred`, a `list` object. So, we created a nested structure by accident." 
] }, { "cell_type": "code", "execution_count": 84, "metadata": { "slideshow": { "slide_type": "slide" } }, "outputs": [], "source": [ "one_hundred = [2, 5, 10]" ] }, { "cell_type": "code", "execution_count": 85, "metadata": { "slideshow": { "slide_type": "fragment" } }, "outputs": [ { "data": { "text/plain": [ "[2, 5, 10]" ] }, "execution_count": 85, "metadata": {}, "output_type": "execute_result" } ], "source": [ "product(one_hundred) # a semantic error!" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "skip" } }, "source": [ "This error does not occur if we unpack `one_hundred` upon passing it as the argument." ] }, { "cell_type": "code", "execution_count": 86, "metadata": { "slideshow": { "slide_type": "fragment" } }, "outputs": [ { "data": { "text/plain": [ "100" ] }, "execution_count": 86, "metadata": {}, "output_type": "execute_result" } ], "source": [ "product(*one_hundred)" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "skip" } }, "source": [ "That is the equivalent of writing out the following tedious expression. Yet, that does *not* scale for iterables with many elements in them." ] }, { "cell_type": "code", "execution_count": 87, "metadata": { "slideshow": { "slide_type": "fragment" } }, "outputs": [ { "data": { "text/plain": [ "100" ] }, "execution_count": 87, "metadata": {}, "output_type": "execute_result" } ], "source": [ "product(one_hundred[0], one_hundred[1], one_hundred[2])" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "skip" } }, "source": [ "In the \"*Packing & Unpacking with Functions*\" [exercise ](https://nbviewer.jupyter.org/github/webartifex/intro-to-python/blob/main/07_sequences/04_exercises.ipynb), we look at `product()` in more detail.\n", "\n", "While we needed to unpack `one_hundred` above to avoid the semantic error, unpacking an argument in a function call may also be a convenience in general. 
For example, until now, printing the elements of `one_hundred` in one line required a `for` statement. With unpacking, we get away *without* a loop." ] }, { "cell_type": "code", "execution_count": 88, "metadata": { "slideshow": { "slide_type": "skip" } }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "[2, 5, 10]\n" ] } ], "source": [ "print(one_hundred)  # prints the list; we do not want that" ] }, { "cell_type": "code", "execution_count": 89, "metadata": { "slideshow": { "slide_type": "skip" } }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "2 5 10 " ] } ], "source": [ "for number in one_hundred:\n", "    print(number, end=\" \")" ] }, { "cell_type": "code", "execution_count": 90, "metadata": { "slideshow": { "slide_type": "skip" } }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "2 5 10\n" ] } ], "source": [ "print(*one_hundred)  # replaces the for-loop" ] } ], "metadata": { "kernelspec": { "display_name": "Python 3", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.12.2" }, "livereveal": { "auto_select": "code", "auto_select_fragment": true, "scroll": true, "theme": "serif" }, "toc": { "base_numbering": 1, "nav_menu": {}, "number_sections": false, "sideBar": true, "skip_h1_title": true, "title_cell": "Table of Contents", "title_sidebar": "Contents", "toc_cell": false, "toc_position": { "height": "calc(100% - 180px)", "left": "10px", "top": "150px", "width": "384px" }, "toc_section_display": false, "toc_window_display": false } }, "nbformat": 4, "nbformat_minor": 4 }