"**Note**: Click on \"*Kernel*\" > \"*Restart Kernel and Clear All Outputs*\" in [JupyterLab](https://jupyterlab.readthedocs.io/en/stable/) *before* reading this notebook to reset its output. If you cannot run this file on your machine, you may want to open it [in the cloud <img height=\"12\" style=\"display: inline-block\" src=\"../static/link/to_mb.png\">](https://mybinder.org/v2/gh/webartifex/intro-to-python/main?urlpath=lab/tree/04_iteration/02_content.ipynb)."
"After learning about the concept of **recursion** in the [first part <img height=\"12\" style=\"display: inline-block\" src=\"../static/link/to_nb.png\">](https://nbviewer.jupyter.org/github/webartifex/intro-to-python/blob/main/04_iteration/00_content.ipynb) of this chapter, we look at other ways of running code repeatedly, namely **looping** with the `for` and `while` statements. We start with the latter as it is more generic. Throughout this second part of the chapter, we revisit the same examples from the first part to show how recursion and looping are really two sides of the same coin."
"Whereas functions combined with `if` statements suffice to model any repetitive logic, Python comes with a compound `while` statement (cf., [reference <img height=\"12\" style=\"display: inline-block\" src=\"../static/link/to_py.png\">](https://docs.python.org/3/reference/compound_stmts.html#the-while-statement)) that often makes it easier to implement iterative ideas.\n",
"\n",
"It consists of a header line with a boolean expression followed by an indented code block. Before the first and after every execution of the code block, the boolean expression is evaluated, and if it is (still) equal to `True`, the code block runs (again). Eventually, some variable referenced in the boolean expression is changed in the code block such that the condition becomes `False`.\n",
"\n",
"If the condition is `False` before the first iteration, the entire code block is *never* executed. As the flow of control keeps \"looping\" (i.e., more formally, **iterating**) back to the beginning of the code block, this concept is also called a `while`-loop and each pass through the loop an **iteration**."
]
},
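The evaluation order described above can be made visible with a minimal sketch. The helper `count_iterations()` is a hypothetical name used only for illustration and is not part of this chapter's code. It counts how often the code block of a `while` statement runs.

```python
def count_iterations(start, stop):
    """Count how often the body of a while-loop runs (hypothetical helper)."""
    iterations = 0
    current = start
    # The condition is checked before the first and after every pass.
    while current < stop:
        current += 1  # changes a variable referenced in the condition
        iterations += 1

    return iterations


print(count_iterations(0, 3))  # the body runs three times
print(count_iterations(5, 3))  # the condition is False right away
```

In the second call, the condition is `False` before the first iteration, so the code block is skipped entirely.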
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "slide"
}
},
"source": [
"### Trivial Example: Countdown (revisited)"
]
},
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "skip"
}
},
"source": [
"Let's rewrite the `countdown()` example in an iterative style. We also build in **input validation** by allowing the function only to be called with strictly positive integers. As any positive integer hits $0$ at some point when iteratively decremented by $1$, `countdown()` is guaranteed to **terminate**. Also, the base case is now handled at the end of the function, which commonly happens with iterative solutions to problems."
]
},
{
"cell_type": "code",
"execution_count": 1,
"metadata": {
"code_folding": [],
"slideshow": {
"slide_type": "slide"
}
},
"outputs": [],
"source": [
"def countdown(n):\n",
" \"\"\"Print a countdown until the party starts.\n",
"\n",
" Args:\n",
" n (int): seconds until the party begins; must be positive\n",
"As [PythonTutor <img height=\"12\" style=\"display: inline-block\" src=\"../static/link/to_py.png\">](http://pythontutor.com/visualize.html#code=def%20countdown%28n%29%3A%0A%20%20%20%20while%20n%20!%3D%200%3A%0A%20%20%20%20%20%20%20%20print%28n%29%0A%20%20%20%20%20%20%20%20n%20-%3D%201%0A%0A%20%20%20%20print%28%22Happy%20new%20Year!%22%29%0A%0Acountdown%283%29&cumulative=false&curstr=0&heapPrimitives=nevernest&mode=display&origin=opt-frontend.js&py=3&rawInputLstJSON=%5B%5D&textReferences=false) shows, there is a subtle but essential difference in the way a `while` statement is treated in memory: In short, `while` statements can *not* run into a `RecursionError` as only *one* frame is needed to manage the names. After all, there is only *one* function call to be made. For typical day-to-day applications, this difference is, however, not so important *unless* a problem instance becomes so big that a large (i.e., $> 3.000$) number of recursive calls must be made."
"Finding the greatest common divisor of two numbers is still not so obvious when using a `while`-loop instead of a recursive formulation.\n",
"\n",
"The iterative implementation of `gcd()` below accepts any two strictly positive integers. As in any iteration through the loop, the smaller number is subtracted from the larger one, the two decremented values of `a` and `b` eventually become equal. Thus, this algorithm is also guaranteed to terminate. If one of the two numbers were negative or $0$ in the first place, `gcd()` would run forever, and not even Python could detect this. Try this out by removing the input validation and running the function with negative arguments!"
]
},
{
"cell_type": "code",
"execution_count": 3,
"metadata": {
"code_folding": [],
"slideshow": {
"slide_type": "slide"
}
},
"outputs": [],
"source": [
"def gcd(a, b):\n",
" \"\"\"Calculate the greatest common divisor of two numbers.\n",
"\n",
" Args:\n",
" a (int): first number; must be positive\n",
" b (int): second number; must be positive\n",
"\n",
" Returns:\n",
" gcd (int)\n",
" \"\"\"\n",
" while a != b:\n",
" if a > b:\n",
" a -= b\n",
" else:\n",
" b -= a\n",
"\n",
" return a"
]
},
{
"cell_type": "code",
"execution_count": 4,
"metadata": {
"slideshow": {
"slide_type": "slide"
}
},
"outputs": [
{
"data": {
"text/plain": [
"4"
]
},
"execution_count": 4,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"gcd(12, 4)"
]
},
{
"cell_type": "code",
"execution_count": 5,
"metadata": {
"slideshow": {
"slide_type": "fragment"
}
},
"outputs": [
{
"data": {
"text/plain": [
"1"
]
},
"execution_count": 5,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"gcd(7, 7919)"
]
},
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "slide"
}
},
"source": [
"#### Efficiency of Algorithms (continued)"
]
},
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "skip"
}
},
"source": [
"We also see that this implementation is a lot *less* efficient than its recursive counterpart which solves `gcd()` for the same two numbers `112233445566778899` and `987654321` within microseconds."
]
},
{
"cell_type": "code",
"execution_count": 6,
"metadata": {
"slideshow": {
"slide_type": "slide"
}
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"5.32 s ± 0 ns per loop (mean ± std. dev. of 1 run, 1 loop each)\n"
]
}
],
"source": [
"%%timeit -n 1 -r 1\n",
"gcd(112233445566778899, 987654321)"
]
},
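For comparison, here is a minimal sketch of a faster iterative variant. It assumes we swap the repeated subtraction for the remainder operator `%`, which is the classic Euclidean algorithm and not the subtraction-based approach shown above. The name `gcd_modulo()` is hypothetical and not part of this chapter's code.

```python
def gcd_modulo(a, b):
    """Calculate the greatest common divisor with the remainder operator.

    Hypothetical variant for illustration; expects positive integers.
    """
    while b != 0:
        # a % b does in one step what repeated subtraction does in many.
        a, b = b, a % b

    return a


print(gcd_modulo(12, 4))
print(gcd_modulo(7, 7919))
print(gcd_modulo(112233445566778899, 987654321))
```

Because every iteration shrinks the numbers by a remainder instead of a single subtraction, the loop needs only a handful of passes even for the two large numbers.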
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "slide"
}
},
"source": [
"## Infinite Loops"
]
},
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "skip"
}
},
"source": [
"As with recursion, we must ensure that the iteration ends. For the above `countdown()` and `gcd()` examples, we could \"prove\" (i.e., at least argue in favor) that some pre-defined **termination criterion** is reached eventually. However, this cannot be done in all cases, as the following example shows."
"- If $n$ is even, the next $n$ is half the old $n$.\n",
"- If $n$ is odd, multiply the old $n$ by $3$ and add $1$ to obtain the next $n$.\n",
"- Repeat these steps until you reach $1$.\n",
"\n",
"**Do we always reach the final $1$?**"
]
},
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "skip"
}
},
"source": [
"The function below implements this game. Does it always reach $1$? No one has proven it so far! We include some input validation as before because `collatz()` would for sure not terminate if we called it with a negative number. Further, the Collatz sequence also works for real numbers, but then we would have to study fractals (cf., [this <img height=\"12\" style=\"display: inline-block\" src=\"../static/link/to_wiki.png\">](https://en.wikipedia.org/wiki/Collatz_conjecture#Iterating_on_real_or_complex_numbers)). So we restrict our example to integers only."
]
},
{
"cell_type": "code",
"execution_count": 7,
"metadata": {
"code_folding": [],
"slideshow": {
"slide_type": "slide"
}
},
"outputs": [],
"source": [
"def collatz(n):\n",
" \"\"\"Print a Collatz sequence in descending order.\n",
"\n",
" Given a positive integer n, modify it according to these rules:\n",
" - if n is even, the next n is half the previous one\n",
" - if n is odd, the next n is 3 times the previous one plus 1\n",
" - if n is 1, stop the iteration\n",
"\n",
" Args:\n",
" n (int): a positive number to start the Collatz sequence at\n",
" \"\"\"\n",
" while n != 1:\n",
" print(n, end=\" \")\n",
" if n % 2 == 0:\n",
" n //= 2 # //= to preserve the int type\n",
" else:\n",
" n = 3 * n + 1\n",
"\n",
" print(1)"
]
},
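To get a feeling for how erratically these sequences behave, the following sketch counts the numbers in a sequence instead of printing them. The function name `collatz_length()` is hypothetical and only serves as an illustration. Like `collatz()`, it assumes a positive integer as its argument.

```python
def collatz_length(n):
    """Count the numbers in the Collatz sequence starting at n."""
    length = 1  # the starting number itself
    while n != 1:
        if n % 2 == 0:
            n //= 2
        else:
            n = 3 * n + 1
        length += 1

    return length


for start in (6, 7, 8):
    print(start, collatz_length(start))
```

Starting at `7` yields a longer sequence than starting at the larger `8`, which is a power of $2$ and therefore only gets halved.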
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "skip"
}
},
"source": [
"Collatz sequences do not necessarily become longer with a larger initial `n`."
"Recursion and the `while` statement are two sides of the same coin. Disregarding that in the case of recursion Python internally faces some additional burden for managing the stack of frames in memory, both approaches lead to the *same* computational steps in memory. More importantly, we can formulate any recursive implementation in an iterative way and vice versa despite one of the two ways often \"feeling\" a lot more natural given a particular problem.\n",
"\n",
"So how does the compound `for` statement (cf., [reference <img height=\"12\" style=\"display: inline-block\" src=\"../static/link/to_py.png\">](https://docs.python.org/3/reference/compound_stmts.html#the-for-statement)) in this book's very first example fit into this picture? It is a *redundant* language construct to provide a *shorter* and more *convenient* syntax for common applications of the `while` statement. In programming, such additions to a language are called **syntactic sugar**. A cup of tea tastes better with sugar, but we may drink tea without sugar too.\n",
"\n",
"Consider `elements` below. Without the `for` statement, we must manage a temporary **index variable**, `index`, to loop over all the elements and also obtain the individual elements with the `[]` operator in each iteration of the loop."
]
},
{
"cell_type": "code",
"execution_count": 12,
"metadata": {
"slideshow": {
"slide_type": "slide"
}
},
"outputs": [],
"source": [
"elements = [0, 1, 2, 3, 4]"
]
},
{
"cell_type": "code",
"execution_count": 13,
"metadata": {
"slideshow": {
"slide_type": "fragment"
}
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"0 1 2 3 4 "
]
}
],
"source": [
"index = 0\n",
"\n",
"while index < len(elements):\n",
" element = elements[index]\n",
" print(element, end=\" \")\n",
" index += 1\n",
"\n",
"del index"
]
},
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "skip"
}
},
"source": [
"The `for` statement, on the contrary, makes the actual business logic more apparent by stripping all the **[boilerplate code <img height=\"12\" style=\"display: inline-block\" src=\"../static/link/to_wiki.png\">](https://en.wikipedia.org/wiki/Boilerplate_code)** away. The variable that is automatically set by Python in each iteration of the loop (i.e., `element` in the example) is called the **target variable**."
"For sequences of integers, the [range() <img height=\"12\" style=\"display: inline-block\" src=\"../static/link/to_py.png\">](https://docs.python.org/3/library/functions.html#func-range) built-in makes the `for` statement even more convenient: It creates a `list`-like object of type `range` that generates integers \"on the fly,\" and we look closely at the underlying effects in memory in [Chapter 8 <img height=\"12\" style=\"display: inline-block\" src=\"../static/link/to_nb.png\">](https://nbviewer.jupyter.org/github/webartifex/intro-to-python/blob/main/08_mfr/00_content.ipynb#Mapping)."
"[range() <img height=\"12\" style=\"display: inline-block\" src=\"../static/link/to_py.png\">](https://docs.python.org/3/library/functions.html#func-range) takes optional `start` and `step` arguments that we use to customize the sequence of integers even more."
]
},
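As a minimal sketch of the points above, the `while`-loop with the explicit index variable can be written with a `for` statement instead, and [range()](https://docs.python.org/3/library/functions.html#func-range) can stand in for a `list` of integers. The variable `elements` is re-created here so that the snippet runs on its own.

```python
elements = [0, 1, 2, 3, 4]

# The for statement sets the target variable in every iteration.
for element in elements:
    print(element, end=" ")
print()

# With only a stop value, range() starts at 0 and excludes the stop value.
print(list(range(5)) == elements)

# With start, stop, and step, we obtain every second integer from 1 to 9.
print(list(range(1, 10, 2)))
```

Converting a `range` object with `list()` is done here only to show which integers it generates. Inside a `for` statement, the conversion is not needed.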
{
"cell_type": "code",
"execution_count": 17,
"metadata": {
"slideshow": {
"slide_type": "slide"
}
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"1 3 5 7 9 "
]
}
],
"source": [
"for element in [1, 3, 5, 7, 9]:\n",
" print(element, end=\" \")"
]
},
{
"cell_type": "code",
"execution_count": 18,
"metadata": {
"slideshow": {
"slide_type": "fragment"
}
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"1 3 5 7 9 "
]
}
],
"source": [
"for element in range(1, 10, 2):\n",
" print(element, end=\" \")"
]
},
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "slide"
}
},
"source": [
"### Containers vs. Iterables"
]
},
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "skip"
}
},
"source": [
"The essential difference between the above `list` objects, `[0, 1, 2, 3, 4]` and `[1, 3, 5, 7, 9]`, and the `range` objects, `range(5)` and `range(1, 10, 2)`, is that in the former case *six* objects are created in memory *before* the `for` statement starts running, *one* `list` holding references to *five* `int` objects, whereas in the latter case only *one* `range` object is created that **generates** `int` objects one at a time *while* the `for`-loop runs.\n",
"\n",
"However, we can loop over both of them. So a natural question to ask is why Python treats objects of *different* types in the *same* way when used with a `for` statement.\n",
"\n",
"So far, the overarching storyline in this book goes like this: In Python, *everything* is an object. Besides its *identity* and *value*, every object is characterized by \"belonging\" to *one* data type that determines how the object behaves and what we may do with it.\n",
"\n",
"Now, just as we classify objects by data type, we also classify these data types (e.g., `int`, `float`, `str`, or `list`) into **abstract concepts**.\n",
"We did this already in [Chapter 1 <img height=\"12\" style=\"display: inline-block\" src=\"../static/link/to_nb.png\">](https://nbviewer.jupyter.org/github/webartifex/intro-to-python/blob/main/01_elements/03_content.ipynb#Who-am-I?-And-how-many?) when we described a `list` object as \"some sort of container that holds [...] references to other objects\". So, abstractly speaking, **containers** are any objects that are \"composed\" of other objects and also \"manage\" how these objects are organized. `list` objects, for example, have the property that they model an order associated with their elements. There exist, however, other container types, many of which do *not* come with an order. So, containers primarily \"contain\" other objects and have *nothing* to do with looping.\n",
"On the contrary, the abstract concept of **iterables** is all about looping: Any object that we can loop over is, by definition, an iterable. So, `range` objects, for example, are iterables, even though they hold no references to other objects. Moreover, looping does *not* have to occur in a *predictable* order, although this is the case for both `list` and `range` objects.\n",
"Typically, containers are iterables, and iterables are containers. Yet, only because these two concepts coincide often, we must not think of them as the same. In [Chapter 7 <img height=\"12\" style=\"display: inline-block\" src=\"../static/link/to_nb.png\">](https://nbviewer.jupyter.org/github/webartifex/intro-to-python/blob/main/07_sequences/00_content.ipynb#Collections-vs.-Sequences), we formalize these two concepts and introduce many more. Finally, [Chapter 11 <img height=\"12\" style=\"display: inline-block\" src=\"../static/link/to_nb.png\">](https://nbviewer.jupyter.org/github/webartifex/intro-to-python/blob/main/11_classes/00_content.ipynb) gives an explanation how abstract concepts are implemented and play together.\n",
"The characteristic operator associated with container types is the `in` operator: It checks if a given object evaluates equal to at least one of the objects in the container. Colloquially, it checks if an object is \"contained\" in the container. Formally, this operation is called **membership testing**."
]
},
{
"cell_type": "code",
"execution_count": 20,
"metadata": {
"slideshow": {
"slide_type": "fragment"
}
},
"outputs": [
{
"data": {
"text/plain": [
"True"
]
},
"execution_count": 20,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"\"Achim\" in first_names"
]
},
{
"cell_type": "code",
"execution_count": 21,
"metadata": {
"slideshow": {
"slide_type": "fragment"
}
},
"outputs": [
{
"data": {
"text/plain": [
"False"
]
},
"execution_count": 21,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"\"Alexander\" in first_names"
]
},
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "skip"
}
},
"source": [
"The cell below shows the *exact* workings of the `in` operator: Although `3.0` is *not* contained in `elements`, it evaluates equal to the `3` that is, which is why the following expression evaluates to `True`. So, while we could colloquially say that `elements` \"contains\" `3.0`, it actually does not."
]
},
{
"cell_type": "code",
"execution_count": 22,
"metadata": {
"slideshow": {
"slide_type": "skip"
}
},
"outputs": [
{
"data": {
"text/plain": [
"[0, 1, 2, 3, 4]"
]
},
"execution_count": 22,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"elements"
]
},
{
"cell_type": "code",
"execution_count": 23,
"metadata": {
"slideshow": {
"slide_type": "skip"
}
},
"outputs": [
{
"data": {
"text/plain": [
"True"
]
},
"execution_count": 23,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"3.0 in elements"
]
},
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "skip"
}
},
"source": [
"Similarly, the characteristic operation of an iterable type is that it supports being looped over, for example, with the `for` statement."
]
},
{
"cell_type": "code",
"execution_count": 24,
"metadata": {
"slideshow": {
"slide_type": "slide"
}
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"Achim Berthold Carl Diedrich Eckardt "
]
}
],
"source": [
"for name in first_names:\n",
" print(name, end=\" \")"
]
},
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "skip"
}
},
"source": [
"If we must have an index variable in the loop's body, we use the [enumerate() <img height=\"12\" style=\"display: inline-block\" src=\"../static/link/to_py.png\">](https://docs.python.org/3/library/functions.html#enumerate) built-in that takes an *iterable* as its argument and then generates a \"stream\" of \"pairs\" of an index variable, `i` below, and an object provided by the iterable, `name`, separated by a `,`. There is *no* need to ever revert to the `while` statement with an explicitly managed index variable to loop over an iterable object."
"for i, name in enumerate(first_names, start=1):\n",
" print(i, \">\", name, end=\" \")"
]
},
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "skip"
}
},
"source": [
"The [zip() <img height=\"12\" style=\"display: inline-block\" src=\"../static/link/to_py.png\">](https://docs.python.org/3/library/functions.html#zip) built-in allows us to combine the elements of two or more iterables in a *pairwise* fashion: It conceptually works like a zipper for a jacket."
"for first_name, last_name in zip(first_names, last_names):\n",
" print(first_name, last_name, end=\" \")"
]
},
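One detail worth knowing is what happens when the iterables differ in length. The following sketch uses two hypothetical lists, `letters` and `numbers`, that are not part of this chapter's examples. It also shows how `zip()` and `enumerate()` may be combined.

```python
letters = ["a", "b", "c"]  # hypothetical example data
numbers = [1, 2, 3, 4]

# zip() stops as soon as the shortest iterable is exhausted.
pairs = list(zip(letters, numbers))
print(pairs)

# When combined with enumerate(), the pair is unpacked in parentheses.
for i, (letter, number) in enumerate(zip(letters, numbers), start=1):
    print(i, letter, number)
```

The `4` in `numbers` has no partner in `letters` and is, therefore, silently dropped.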
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "slide"
}
},
"source": [
"### \"Hard at first Glance\" Example: [Fibonacci Numbers <img height=\"12\" style=\"display: inline-block\" src=\"../static/link/to_wiki.png\">](https://en.wikipedia.org/wiki/Fibonacci_number) (revisited)"
]
},
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "skip"
}
},
"source": [
"In contrast to its recursive counterpart, the iterative `fibonacci()` function below is somewhat harder to read. For example, it is not so obvious as to how many iterations through the `for`-loop we need to make when implementing it. There is an increased risk of making an *off-by-one* error. Moreover, we need to track a `temp` variable along.\n",
"\n",
"However, one advantage of calculating Fibonacci numbers in a **forward** fashion with a `for` statement is that we could list the entire sequence in ascending order as we calculate the desired number. To show this, we added `print()` statements in `fibonacci()` below.\n",
"\n",
"We do *not* need to store the index variable in the `for`-loop's header line: That is what the underscore variable `_` indicates; we \"throw it away.\""
]
},
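The forward calculation can be sketched as follows. This is a minimal illustration under two assumptions: the 0th Fibonacci number is taken to be $0$ and the 1st to be $1$, and a tuple assignment replaces the `temp` variable. The name `fibonacci_forward()` is hypothetical, and the sketch is not necessarily identical to the `fibonacci()` function of this chapter.

```python
def fibonacci_forward(i):
    """Calculate the ith Fibonacci number with a for-loop.

    Hypothetical variant for illustration; expects a non-negative integer.
    """
    a, b = 0, 1  # the 0th and 1st Fibonacci numbers
    for _ in range(i):  # the index itself is not needed
        a, b = b, a + b  # shift the pair one step forward

    return a


print([fibonacci_forward(i) for i in range(10)])
```

Getting the argument to `range()` right is exactly where an off-by-one error could creep in: With `range(i)`, the name `a` refers to the `i`th number after the loop.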
{
"cell_type": "code",
"execution_count": 29,
"metadata": {
"code_folding": [],
"slideshow": {
"slide_type": "slide"
}
},
"outputs": [],
"source": [
"def fibonacci(i):\n",
" \"\"\"Calculate the ith Fibonacci number.\n",
"\n",
" Args:\n",
" i (int): index of the Fibonacci number to calculate\n",
"The iterative `factorial()` implementation is comparable to its recursive counterpart when it comes to readability. One advantage of calculating the factorial in a forward fashion is that we could track the intermediate `product` as it grows."
]
},
{
"cell_type": "code",
"execution_count": 32,
"metadata": {
"slideshow": {
"slide_type": "slide"
}
},
"outputs": [],
"source": [
"def factorial(n):\n",
" \"\"\"Calculate the factorial of a number.\n",
"\n",
" Args:\n",
" n (int): number to calculate the factorial for, must be positive\n",
"\n",
" Returns:\n",
" factorial (int)\n",
" \"\"\"\n",
" product = 1 # because 0! = 1\n",
" for i in range(1, n + 1):\n",
" product *= i\n",
" print(product, end=\" \") # added for didactical purposes\n",