"**Note**: Click on \"*Kernel*\" > \"*Restart Kernel and Clear All Outputs*\" in [JupyterLab](https://jupyterlab.readthedocs.io/en/stable/) *before* reading this notebook to reset its output. If you cannot run this file on your machine, you may want to open it [in the cloud <img height=\"12\" style=\"display: inline-block\" src=\"../static/link/to_mb.png\">](https://mybinder.org/v2/gh/webartifex/intro-to-python/develop?urlpath=lab/tree/04_iteration/03_content.ipynb)."
]
},
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "slide"
}
},
"source": [
"# Chapter 4: Recursion & Looping (continued)"
]
},
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "skip"
}
},
"source": [
"While what we learned about the `for` and `while` statements in the [second part <img height=\"12\" style=\"display: inline-block\" src=\"../static/link/to_nb.png\">](https://nbviewer.jupyter.org/github/webartifex/intro-to-python/blob/develop/04_iteration/02_content.ipynb) of this chapter suffices to translate any iterative algorithm into code, both come with some syntactic sugar to make life easier for the developer. This last part of the chapter shows how we can further customize the looping logic and introduces as \"trick\" for situations where we cannot come up with a stopping criterion in a `while`-loop."
]
},
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "slide"
}
},
"source": [
"## Stopping Loops prematurely"
]
},
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "skip"
}
},
"source": [
"This section introduces additional syntax to customize `for` and `while` statements in our code even further. They are mostly syntactic sugar in that they do not change how a program runs but make its code more readable. We illustrate them for the `for` statement only. However, everything presented in this section also works for the `while` statement."
]
},
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "slide"
}
},
"source": [
"### Example: Is the square of a number in `[7, 11, 8, 5, 3, 12, 2, 6, 9, 10, 1, 4]` greater than `100`?"
]
},
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "skip"
}
},
"source": [
"Let's say we have a list of `numbers` and want to check if the square of at least one of its elements is greater than `100`. So, conceptually, we are asking the question if a list of numbers as a whole satisfies a certain condition."
"A first naive implementation could look like this: We loop over *every* element in `numbers` and set an **indicator variable** `is_above`, initialized as `False`, to `True` once we encounter an element satisfying the condition.\n",
"\n",
"This implementation is *inefficient* as even if the *first* element in `numbers` has a square greater than `100`, we loop until the last element: This could take a long time for a big list.\n",
"\n",
"Moreover, we must initialize `is_above` *before* the `for`-loop and write an `if`-`else`-logic *after* it to check for the result. The actual business logic is *not* conveyed in a clear way."
]
},
{
"cell_type": "code",
"execution_count": 2,
"metadata": {
"slideshow": {
"slide_type": "fragment"
}
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"7 11 8 5 3 12 2 6 9 10 1 4 => at least one number satisfies the condition\n"
]
}
],
"source": [
"is_above = False\n",
"\n",
"for number in numbers:\n",
" print(number, end=\" \") # added for didactical purposes\n",
" if number ** 2 > 100:\n",
" is_above = True\n",
"\n",
"if is_above:\n",
" print(\"=> at least one number satisfies the condition\")\n",
"else:\n",
" print(\"=> no number satisfies the condition\")"
]
},
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "slide"
}
},
"source": [
"### The `break` Statement"
]
},
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "skip"
}
},
"source": [
"Python provides the `break` statement (cf., [reference <img height=\"12\" style=\"display: inline-block\" src=\"../static/link/to_py.png\">](https://docs.python.org/3/reference/simple_stmts.html#the-break-statement)) that lets us stop a loop prematurely at any iteration. It is yet another means of controlling the flow of execution, and we say that we \"break out of a loop.\""
]
},
{
"cell_type": "code",
"execution_count": 3,
"metadata": {
"slideshow": {
"slide_type": "slide"
}
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"7 11 => at least one number satisfies the condition\n"
]
}
],
"source": [
"is_above = False\n",
"\n",
"for number in numbers:\n",
" print(number, end=\" \") # added for didactical purposes\n",
" if number ** 2 > 100:\n",
" is_above = True\n",
" break\n",
"\n",
"if is_above:\n",
" print(\"=> at least one number satisfies the condition\")\n",
"else:\n",
" print(\"=> no number satisfies the condition\")"
]
},
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "skip"
}
},
"source": [
"This is a computational improvement. However, the code still consists of *three* sections: Some initialization *before* the `for`-loop, the loop itself, and some finalizing logic. We prefer to convey the program's idea in *one* compound statement instead."
]
},
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "slide"
}
},
"source": [
"### The `else`-clause"
]
},
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "skip"
}
},
"source": [
"To express the logic in a prettier way, we add an `else`-clause at the end of the `for`-loop (cf., [reference <img height=\"12\" style=\"display: inline-block\" src=\"../static/link/to_py.png\">](https://docs.python.org/3/reference/compound_stmts.html#the-for-statement)). The `else`-clause is executed *only if* the `for`-loop is *not* stopped with a `break` statement *prematurely* (i.e., *before* reaching the *last* iteration in the loop). The word \"else\" implies a somewhat unintuitive meaning and may have better been named a `then`-clause. In most use cases, however, the `else`-clause logically goes together with some `if` statement in the loop's body.\n",
"\n",
"Overall, the code's expressive power increases. Not many programming languages support an optional `else`-branching for the `for` and `while` statements, which turns out to be very useful in practice."
]
},
{
"cell_type": "code",
"execution_count": 4,
"metadata": {
"slideshow": {
"slide_type": "slide"
}
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"7 11 => at least one number satisfies the condition\n"
]
}
],
"source": [
"for number in numbers:\n",
" print(number, end=\" \") # added for didactical purposes\n",
" if number ** 2 > 100:\n",
" is_above = True\n",
" break\n",
"else:\n",
" is_above = False\n",
"\n",
"if is_above:\n",
" print(\"=> at least one number satisfies the condition\")\n",
"else:\n",
" print(\"=> no number satisfies the condition\")"
]
},
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "skip"
}
},
"source": [
"Lastly, we incorporate the finalizing `if`-`else` logic into the `for`-loop, avoiding the `is_above` variable altogether."
]
},
{
"cell_type": "code",
"execution_count": 5,
"metadata": {
"slideshow": {
"slide_type": "slide"
}
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"7 11 => at least one number satisfies the condition\n"
]
}
],
"source": [
"for number in numbers:\n",
" print(number, end=\" \") # added for didactical purposes\n",
" if number ** 2 > 100:\n",
" print(\"=> at least one number satisfies the condition\")\n",
" break\n",
"else:\n",
" print(\"=> no number satisfies the condition\")"
]
},
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "skip"
}
},
"source": [
"Of course, if we choose the number an element's square has to pass to be larger, for example, to `200`, we have to loop over all `numbers`. There is *no way* to optimize this **[linear search <img height=\"12\" style=\"display: inline-block\" src=\"../static/link/to_wiki.png\">](https://en.wikipedia.org/wiki/Linear_search)** further."
]
},
{
"cell_type": "code",
"execution_count": 6,
"metadata": {
"slideshow": {
"slide_type": "skip"
}
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"7 11 8 5 3 12 2 6 9 10 1 4 => no number satisfies the condition\n"
]
}
],
"source": [
"for number in numbers:\n",
" print(number, end=\" \") # added for didactical purposes\n",
" if number ** 2 > 200:\n",
" print(\"=> at least one number satisfies the condition\")\n",
" break\n",
"else:\n",
" print(\"=> no number satisfies the condition\")"
]
},
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "slide"
}
},
"source": [
"## A first Glance at the **Map-Filter-Reduce** Paradigm"
]
},
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "skip"
}
},
"source": [
"Often, we process some iterable with numeric data, for example, a list of `numbers` as in this book's introductory example in [Chapter 1 <img height=\"12\" style=\"display: inline-block\" src=\"../static/link/to_nb.png\">](https://nbviewer.jupyter.org/github/webartifex/intro-to-python/blob/develop/01_elements/00_content.ipynb#Example:-Averaging-all-even-Numbers-in-a-List) or, more realistically, data from a CSV file with many rows and columns.\n",
"\n",
"Processing numeric data usually comes down to operations that may be grouped into one of the following three categories:\n",
"\n",
"- **mapping**: transform a number according to some functional relationship $y = f(x)$\n",
"- **filtering**: throw away individual numbers (e.g., statistical outliers in a sample)\n",
"- **reducing**: collect individual numbers into summary statistics\n",
"\n",
"We study this **map-filter-reduce** paradigm extensively in [Chapter 8 <img height=\"12\" style=\"display: inline-block\" src=\"../static/link/to_nb.png\">](https://nbviewer.jupyter.org/github/webartifex/intro-to-python/blob/develop/08_mfr/00_content.ipynb) after introducing more advanced data types that are needed to work with \"big\" data.\n",
"\n",
"Here, we focus on *filtering out* some numbers in a `for`-loop."
]
},
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "slide"
}
},
"source": [
"### Example: A simple Filter"
]
},
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "slide"
}
},
"source": [
"Calculate the sum of all even numbers in `[7, 11, 8, 5, 3, 12, 2, 6, 9, 10, 1, 4]` after squaring them and adding `1` to the squares:\n",
"\n",
"- \"*all*\" => **loop** over an iterable\n",
"- \"*even*\" => **filter** out the odd numbers\n",
"- \"*square and add $1$*\" => apply the **map** $y = f(x) = x^2 + 1$\n",
"- \"*sum*\" => **reduce** the remaining and mapped numbers to their sum"
"The above code is easy to read as it involves only two levels of indentation.\n",
"\n",
"In general, code gets harder to comprehend the more **horizontal space** it occupies. It is commonly considered good practice to grow a program **vertically** rather than horizontally. Code compliant with [PEP 8 <img height=\"12\" style=\"display: inline-block\" src=\"../static/link/to_py.png\">](https://www.python.org/dev/peps/pep-0008/#maximum-line-length) requires us to use *at most* 79 characters in a line!\n",
"\n",
"Consider the next example, whose implementation in code already starts to look unbalanced."
]
},
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "slide"
}
},
"source": [
"### Example: Several Filters"
]
},
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "slide"
}
},
"source": [
"Calculate the sum of every third and even number in `[7, 11, 8, 5, 3, 12, 2, 6, 9, 10, 1, 4]` after squaring them and adding `1` to the squares:\n",
"\n",
"- \"*every*\" => **loop** over an iterable\n",
"- \"*third*\" => **filter** out all numbers except every third\n",
"- \"*even*\" => **filter** out the odd numbers\n",
"- \"*square and add $1$*\" => apply the **map** $y = f(x) = x^2 + 1$\n",
"- \"*sum*\" => **reduce** the remaining and mapped numbers to their sum"
]
},
{
"cell_type": "code",
"execution_count": 9,
"metadata": {
"slideshow": {
"slide_type": "slide"
}
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"8 -> 65 12 -> 145 4 -> 17 "
]
},
{
"data": {
"text/plain": [
"227"
]
},
"execution_count": 9,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"total = 0\n",
"\n",
"for i, number in enumerate(numbers, start=1):\n",
" if i % 3 == 0: # only keep every third number\n",
" if number % 2 == 0: # only keep even numbers\n",
"With already three levels of indentation, less horizontal space is available for the actual code block. Of course, one could flatten the two `if` statements with the logical `and` operator, as shown in [Chapter 3 <img height=\"12\" style=\"display: inline-block\" src=\"../static/link/to_nb.png\">](https://nbviewer.jupyter.org/github/webartifex/intro-to-python/blob/develop/03_conditionals/00_content.ipynb#The-if-Statement). Then, however, we trade off horizontal space against a more \"complex\" `if` logic, and this is *not* a real improvement."
]
},
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "skip"
}
},
"source": [
"### The `continue` Statement"
]
},
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "skip"
}
},
"source": [
"A Pythonista would instead make use of the `continue` statement (cf., [reference <img height=\"12\" style=\"display: inline-block\" src=\"../static/link/to_py.png\">](https://docs.python.org/3/reference/simple_stmts.html#the-continue-statement)) that causes a loop to jump into the next iteration skipping the rest of the code block.\n",
"\n",
"The revised code fragment below occupies more vertical space and less horizontal space: A *good* trade-off.\n",
"\n",
"One caveat is that we need to negate the conditions in the `if` statements. Conceptually, we are now filtering \"out\" and not \"in.\""
]
},
{
"cell_type": "code",
"execution_count": 10,
"metadata": {
"slideshow": {
"slide_type": "slide"
}
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"8 -> 65 12 -> 145 4 -> 17 "
]
},
{
"data": {
"text/plain": [
"227"
]
},
"execution_count": 10,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"total = 0\n",
"\n",
"for i, number in enumerate(numbers, start=1):\n",
" if i % 3 != 0: # only keep every third number\n",
" continue\n",
" elif number % 2 != 0: # only keep even numbers\n",
"This is yet another illustration of why programming is an art. The two preceding code cells do the *same* with *identical* time complexity. However, the latter is arguably easier to read for a human, even more so when the business logic grows beyond two filters."
]
},
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "slide"
}
},
"source": [
"## Indefinite Loops"
]
},
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "skip"
}
},
"source": [
"Sometimes we find ourselves in situations where we *cannot* know ahead of time how often or until which point in time a code block is to be executed."
]
},
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "slide"
}
},
"source": [
"### Example: Guessing a Coin Toss"
]
},
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "skip"
}
},
"source": [
"Let's consider a game where we randomly choose a variable to be either \"Heads\" or \"Tails\" and the user of our program has to guess it.\n",
"\n",
"Python provides the built-in [input() <img height=\"12\" style=\"display: inline-block\" src=\"../static/link/to_py.png\">](https://docs.python.org/3/library/functions.html#input) function that prints a message to the user, called the **prompt**, and reads in what was typed in response as a `str` object. We use it to process a user's \"unreliable\" input to our program (i.e., a user might type in some invalid response). Further, we use the [random() <img height=\"12\" style=\"display: inline-block\" src=\"../static/link/to_py.png\">](https://docs.python.org/3/library/random.html#random.random) function in the [random <img height=\"12\" style=\"display: inline-block\" src=\"../static/link/to_py.png\">](https://docs.python.org/3/library/random.html) module to model the coin toss.\n",
"\n",
"A popular pattern to approach such **indefinite loops** is to go with a `while True` statement, which on its own would cause Python to enter into an infinite loop. Then, once a particular event occurs, we `break` out of the loop.\n",
"\n",
"Let's look at a first and naive implementation."
]
},
{
"cell_type": "code",
"execution_count": 11,
"metadata": {
"slideshow": {
"slide_type": "slide"
}
},
"outputs": [],
"source": [
"import random"
]
},
{
"cell_type": "code",
"execution_count": 12,
"metadata": {
"slideshow": {
"slide_type": "skip"
}
},
"outputs": [],
"source": [
"random.seed(42)"
]
},
{
"cell_type": "code",
"execution_count": 13,
"metadata": {
"code_folding": [],
"slideshow": {
"slide_type": "slide"
}
},
"outputs": [
{
"name": "stdin",
"output_type": "stream",
"text": [
"Guess if the coin comes up as heads or tails: heads\n"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Ooops, it was tails\n"
]
},
{
"name": "stdin",
"output_type": "stream",
"text": [
"Guess if the coin comes up as heads or tails: heads\n"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Yes, it was heads\n"
]
}
],
"source": [
"while True:\n",
" guess = input(\"Guess if the coin comes up as heads or tails: \")\n",
"\n",
" if random.random() < 0.5:\n",
" if guess == \"heads\":\n",
" print(\"Yes, it was heads\")\n",
" break\n",
" else:\n",
" print(\"Ooops, it was heads\")\n",
" else:\n",
" if guess == \"tails\":\n",
" print(\"Yes, it was tails\")\n",
" break\n",
" else:\n",
" print(\"Ooops, it was tails\")"
]
},
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "skip"
}
},
"source": [
"This version exhibits two *severe* issues where we should improve on:\n",
"\n",
"1. If a user enters *anything* other than `\"heads\"` or `\"tails\"`, for example, `\"Heads\"` or `\"Tails\"`, the program keeps running *without* the user knowing about the mistake!\n",
"2. The code *intermingles* the coin tossing with the processing of the user's input: Mixing *unrelated* business logic in the *same* code block makes a program harder to read and, more importantly, maintain in the long run."
]
},
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "slide"
}
},
"source": [
"### Example: Guessing a Coin Toss (revisited)"
]
},
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "skip"
}
},
"source": [
"Let's refactor the code and make it *modular*.\n",
"\n",
"First, we divide the business logic into two functions `get_guess()` and `toss_coin()` that are controlled from within a `while`-loop.\n",
"`get_guess()` not only reads in the user's input but also implements a simple input validation pattern in that the [.strip() <img height=\"12\" style=\"display: inline-block\" src=\"../static/link/to_py.png\">](https://docs.python.org/3/library/stdtypes.html?highlight=__contains__#str.strip) and [.lower() <img height=\"12\" style=\"display: inline-block\" src=\"../static/link/to_py.png\">](https://docs.python.org/3/library/stdtypes.html?highlight=__contains__#str.lower) methods remove preceding and trailing whitespace and lower case the input ensuring that the user may spell the input in any possible way (e.g., all upper or lower case). Also, `get_guess()` checks if the user entered one of the two valid options. If so, it returns either `\"heads\"` or `\"tails\"`; if not, it returns `None`."
" guess (str / NoneType): either \"heads\" or \"tails\"\n",
" if the input can be parsed and None otherwise\n",
" \"\"\"\n",
" guess = input(\"Guess if the coin comes up as heads or tails: \")\n",
" # handle frequent cases of \"misspelled\" user input\n",
" guess = guess.strip().lower()\n",
"\n",
" if guess in [\"heads\", \"tails\"]:\n",
" return guess\n",
" return None"
]
},
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "skip"
}
},
"source": [
"`toss_coin()` models a fair coin toss when called with default arguments."
]
},
{
"cell_type": "code",
"execution_count": 15,
"metadata": {
"slideshow": {
"slide_type": "slide"
}
},
"outputs": [],
"source": [
"def toss_coin(p_heads=0.5):\n",
" \"\"\"Simulate the tossing of a coin.\n",
"\n",
" Args:\n",
" p_heads (optional, float): probability that the coin comes up \"heads\";\n",
" defaults to 0.5 resembling a fair coin\n",
"\n",
" Returns:\n",
" side_on_top (str): \"heads\" or \"tails\"\n",
" \"\"\"\n",
" if random.random() < p_heads:\n",
" return \"heads\"\n",
" return \"tails\""
]
},
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "skip"
}
},
"source": [
"Second, we rewrite the `if`-`else`-logic to handle the case where `get_guess()` returns `None` explicitly: Whenever the user enters something invalid, a warning is shown, and another try is granted. We use the `is` operator and not the `==` operator as `None` is a singleton object.\n",
"\n",
"The `while`-loop takes on the role of **glue code** that manages how other parts of the program interact with each other."
]
},
{
"cell_type": "code",
"execution_count": 16,
"metadata": {
"slideshow": {
"slide_type": "skip"
}
},
"outputs": [],
"source": [
"random.seed(42)"
]
},
{
"cell_type": "code",
"execution_count": 17,
"metadata": {
"slideshow": {
"slide_type": "slide"
}
},
"outputs": [
{
"name": "stdin",
"output_type": "stream",
"text": [
"Guess if the coin comes up as heads or tails: invalid\n"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Make sure to enter your guess correctly!\n"
]
},
{
"name": "stdin",
"output_type": "stream",
"text": [
"Guess if the coin comes up as heads or tails: Heads\n"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Yes, it was heads\n"
]
}
],
"source": [
"while True:\n",
" guess = get_guess()\n",
" result = toss_coin()\n",
"\n",
" if guess is None:\n",
" print(\"Make sure to enter your guess correctly!\")\n",
" elif guess == result:\n",
" print(\"Yes, it was\", result)\n",
" break\n",
" else:\n",
" print(\"Ooops, it was\", result)"
]
},
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "skip"
}
},
"source": [
"Now, the program's business logic is expressed in a clearer way. More importantly, we can now change it more easily. For example, we could make the `toss_coin()` function base the tossing on a probability distribution other than the uniform (i.e., replace the [random.random() <img height=\"12\" style=\"display: inline-block\" src=\"../static/link/to_py.png\">](https://docs.python.org/3/library/random.html#random.random) function with another one). In general, modular architecture leads to improved software maintenance."