"**Note**: Click on \"*Kernel*\" > \"*Restart Kernel and Clear All Outputs*\" in [JupyterLab](https://jupyterlab.readthedocs.io/en/stable/) *before* reading this notebook to reset its output. If you cannot run this file on your machine, you may want to open it [in the cloud <img height=\"12\" style=\"display: inline-block\" src=\"../static/link/to_mb.png\">](https://mybinder.org/v2/gh/webartifex/intro-to-python/main?urlpath=lab/tree/09_mappings/02_content.ipynb)."
"After introducing the `dict` type in the [first part <img height=\"12\" style=\"display: inline-block\" src=\"../static/link/to_nb.png\">](https://nbviewer.jupyter.org/github/webartifex/intro-to-python/blob/main/09_mappings/00_content.ipynb) of this chapter, we first look at an extension of the packing and unpacking syntax that involves `dict` objects. Then, we see how mappings can help us write computationally more efficient implementations to recursive solutions of problems as introduced in [Chapter 4 <img height=\"12\" style=\"display: inline-block\" src=\"../static/link/to_nb.png\">](https://nbviewer.jupyter.org/github/webartifex/intro-to-python/blob/main/04_iteration/00_content.ipynb#Recursion). In a way, this second part of the chapter \"finishes\" Chapter 4."
"Just as a single `*` symbol is used for packing and unpacking iterables in [Chapter 7 <img height=\"12\" style=\"display: inline-block\" src=\"../static/link/to_nb.png\">](https://nbviewer.jupyter.org/github/webartifex/intro-to-python/blob/main/07_sequences/03_content.ipynb#Packing-&-Unpacking), a double `**` symbol implements packing and unpacking for mappings.\n",
"Let's say we have `to_words` and `more_words` as below and want to merge the items together into a *new* `dict` object."
]
},
{
"cell_type": "code",
"execution_count": 1,
"metadata": {
"slideshow": {
"slide_type": "slide"
}
},
"outputs": [],
"source": [
"to_words = {\n",
" 0: \"zero\",\n",
" 1: \"one\",\n",
" 2: \"two\",\n",
"}"
]
},
{
"cell_type": "code",
"execution_count": 2,
"metadata": {
"slideshow": {
"slide_type": "fragment"
}
},
"outputs": [],
"source": [
"more_words = {\n",
" 2: \"TWO\", # to illustrate a point\n",
" 3: \"three\",\n",
" 4: \"four\",\n",
"}"
]
},
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "skip"
}
},
"source": [
"By *unpacking* the items with `**`, the newly created `dict` object is first filled with the items from `to_words` and then from `more_words`. The item with the key `2` from `more_words` overwrites its counterpart from `to_words` as it is mentioned last."
"Both, `*` and `**` may be used within the header line of a function definition, for example, as in `print_args1()` below. Here, *positional* arguments not captured by positional parameters are *packed* into the `tuple` object `args`, and *keyword* arguments not captured by keyword parameters are *packed* into the `dict` object `kwargs`.\n",
"\n",
"For `print_args1()`, all arguments are optional, and ..."
]
},
{
"cell_type": "code",
"execution_count": 4,
"metadata": {
"slideshow": {
"slide_type": "slide"
}
},
"outputs": [],
"source": [
"def print_args1(*args, **kwargs):\n",
" \"\"\"Print out all arguments passed in.\"\"\"\n",
" for index, arg in enumerate(args):\n",
" print(\"position\", index, arg)\n",
"\n",
" for key, value in kwargs.items():\n",
" print(\"keyword\", key, value)"
]
},
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "skip"
}
},
"source": [
"... we may pass whatever we want to it, or nothing at all."
"The next example, `print_args2()`, requires the caller to pass one positional argument, captured in the `positional` parameter, and one keyword argument, captured in `keyword`. Further, an optional keyword argument `default` may be passed in. Any other positional or keyword arguments are packed into either `args` or `kwargs`."
"The *recursive* implementation of the [Fibonacci numbers <img height=\"12\" style=\"display: inline-block\" src=\"../static/link/to_wiki.png\">](https://en.wikipedia.org/wiki/Fibonacci_number) in [Chapter 4 <img height=\"12\" style=\"display: inline-block\" src=\"../static/link/to_nb.png\">](https://nbviewer.jupyter.org/github/webartifex/intro-to-python/blob/main/04_iteration/00_content.ipynb#\"Easy-at-first-Glance\"-Example:-Fibonacci-Numbers) takes long to compute for large Fibonacci numbers. For easier comparison, we show the old `fibonacci()` version here again."
"Timing the code cells below with the `%%timeit` magic shows how doubling the input (i.e., `12` becomes `24`) more than doubles how long it takes `fibonacci()` to calculate the solution. This is actually an understatement as we see the time go up by roughly a factor of $1000$ (i.e., from nano-seconds to milli-seconds). That is an example of **exponential growth**."
"12.1 ms ± 149 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)\n"
]
}
],
"source": [
"%%timeit -n 100\n",
"fibonacci(24)"
]
},
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "skip"
}
},
"source": [
"The computation graph below visualizes what the problem is and also suggests a solution: In the recursive implementation, the same function calls are made over and over again. For example, in the visualization the call `fibonacci(3)`, shown as $F(3)$, is made *twice* when the actual goal is to calculate `fibonacci(5)`, shown as $F(5)$. This problem \"grows\" if the initial argument (i.e., `5` in the example) is chosen to be larger as we see with the many `fibonacci(2)`, `fibonacci(1)` and `fibonacci(0)` calls.\n",
"\n",
"Instead of calculating the return value of the `fibonacci()` function for the *same* argument over and over again, it makes sense to **cache** (i.e., \"store\") the result and reuse it. This concept is called **[memoization <img height=\"12\" style=\"display: inline-block\" src=\"../static/link/to_wiki.png\">](https://en.wikipedia.org/wiki/Memoization)**."
"### \"Easy at second Glance\" Example: [Fibonacci Numbers <img height=\"12\" style=\"display: inline-block\" src=\"../static/link/to_wiki.png\">](https://en.wikipedia.org/wiki/Fibonacci_number) (revisited)"
]
},
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "skip"
}
},
"source": [
"Below is a revision of the recursive `fibonacci()` implementation that uses a **globally** defined `dict` object, called `memo`, to store intermediate results and look them up.\n",
"\n",
"To be precise, the the revised `fibonacci()` first checks if the `i`th Fibonacci number has already been calculated before. If yes, it is in the `memo`. That number is then returned immediately *without* any more calculations. As `dict` objects are *optimized* for constant-time key look-ups, this takes essentially \"no\" time! With a `list` object, for example, the `in` operator would trigger a linear search, which takes longer the more elements are in the list. If the `i`th Fibonacci number has not been calculated before, there is no corresponding item in the `memo` and a recursive function call must be made. The result obtained by recursion is then inserted into the `memo`."
]
},
{
"cell_type": "code",
"execution_count": 25,
"metadata": {
"slideshow": {
"slide_type": "slide"
}
},
"outputs": [],
"source": [
"memo = {\n",
" 0: 0,\n",
" 1: 1,\n",
"}"
]
},
{
"cell_type": "code",
"execution_count": 26,
"metadata": {
"code_folding": [],
"slideshow": {
"slide_type": "slide"
}
},
"outputs": [],
"source": [
"def fibonacci(i, *, debug=False):\n",
" \"\"\"Calculate the ith Fibonacci number.\n",
"\n",
" Args:\n",
" i (int): index of the Fibonacci number to calculate\n",
" debug (bool): show non-cached calls; defaults to False\n",
"When we follow the flow of execution closely, we realize that the intermediate results represented by the left-most path in the graph above are calculated first. `fibonacci(1)`, the left-most leaf node $F(1)$, is the first base case reached, followed immediately by `fibonacci(0)`. From that moment onwards, the flow of execution moves back up the left-most path while adding together the two corresponding child nodes. Effectively, this mirrors the *iterative* implementation in that the order of all computational steps are *identical* (cf., the \"*Hard at first Glance*\" example in [Chapter 4 <img height=\"12\" style=\"display: inline-block\" src=\"../static/link/to_nb.png\">](https://nbviewer.jupyter.org/github/webartifex/intro-to-python/blob/main/04_iteration/02_content.ipynb#\"Hard-at-first-Glance\"-Example:-Fibonacci-Numbers--(revisited))).\n",
"We added a keyword-only argument `debug` that allows the caller to print out a message every time a `i` was *not* in the `memo`."
]
},
{
"cell_type": "code",
"execution_count": 27,
"metadata": {
"slideshow": {
"slide_type": "slide"
}
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"fibonacci(12) is calculated\n",
"fibonacci(11) is calculated\n",
"fibonacci(10) is calculated\n",
"fibonacci(9) is calculated\n",
"fibonacci(8) is calculated\n",
"fibonacci(7) is calculated\n",
"fibonacci(6) is calculated\n",
"fibonacci(5) is calculated\n",
"fibonacci(4) is calculated\n",
"fibonacci(3) is calculated\n",
"fibonacci(2) is calculated\n"
]
},
{
"data": {
"text/plain": [
"144"
]
},
"execution_count": 27,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"fibonacci(12, debug=True)"
]
},
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "skip"
}
},
"source": [
"Now, calling `fibonacci()` has the *side effect* of growing the `memo` in the *global scope*. So, subsequent calls to `fibonacci()` need not calculate any Fibonacci number with an index `i` smaller than the maximum `i` used so far. Because of that, this `fibonacci()` is *not* a *pure* function."
]
},
{
"cell_type": "code",
"execution_count": 28,
"metadata": {
"slideshow": {
"slide_type": "skip"
}
},
"outputs": [
{
"data": {
"text/plain": [
"144"
]
},
"execution_count": 28,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"fibonacci(12, debug=True) # no more recursive calls needed"
]
},
{
"cell_type": "code",
"execution_count": 29,
"metadata": {
"slideshow": {
"slide_type": "fragment"
}
},
"outputs": [
{
"data": {
"text/plain": [
"{0: 0,\n",
" 1: 1,\n",
" 2: 1,\n",
" 3: 2,\n",
" 4: 3,\n",
" 5: 5,\n",
" 6: 8,\n",
" 7: 13,\n",
" 8: 21,\n",
" 9: 34,\n",
" 10: 55,\n",
" 11: 89,\n",
" 12: 144}"
]
},
"execution_count": 29,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"memo"
]
},
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "slide"
}
},
"source": [
"#### Efficiency of Algorithms (continued)"
]
},
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "skip"
}
},
"source": [
"With memoization, the recursive `fibonacci()` implementation is as fast as its iterative counterpart, even for large numbers.\n",
"\n",
"The `%%timeit` magic, by default, runs a code cell seven times. Whereas in the first run, *new* Fibonacci numbers (i.e., intermediate results) are added to the `memo`, `fibonacci()` has no work to do in the subsequent six runs. `%%timeit` realizes this and tells us that \"an intermediate result is being cached.\""
]
},
{
"cell_type": "code",
"execution_count": 30,
"metadata": {
"slideshow": {
"slide_type": "slide"
}
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"The slowest run took 252.65 times longer than the fastest. This could mean that an intermediate result is being cached.\n",
"6.68 µs ± 15.8 µs per loop (mean ± std. dev. of 7 runs, 1 loop each)\n"
]
}
],
"source": [
"%%timeit -n 1\n",
"fibonacci(99)"
]
},
{
"cell_type": "code",
"execution_count": 31,
"metadata": {
"slideshow": {
"slide_type": "fragment"
}
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"The slowest run took 3603.20 times longer than the fastest. This could mean that an intermediate result is being cached.\n",
"85.1 µs ± 208 µs per loop (mean ± std. dev. of 7 runs, 1 loop each)\n"
]
}
],
"source": [
"%%timeit -n 1\n",
"fibonacci(999)"
]
},
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "skip"
}
},
"source": [
"The iterative implementation still has an advantage as the `RecursionError` shows for larger `i`."
"Cell \u001b[0;32mIn[32], line 1\u001b[0m\n\u001b[0;32m----> 1\u001b[0m \u001b[43mget_ipython\u001b[49m\u001b[43m(\u001b[49m\u001b[43m)\u001b[49m\u001b[38;5;241;43m.\u001b[39;49m\u001b[43mrun_cell_magic\u001b[49m\u001b[43m(\u001b[49m\u001b[38;5;124;43m'\u001b[39;49m\u001b[38;5;124;43mtimeit\u001b[39;49m\u001b[38;5;124;43m'\u001b[39;49m\u001b[43m,\u001b[49m\u001b[43m \u001b[49m\u001b[38;5;124;43m'\u001b[39;49m\u001b[38;5;124;43m-n 1\u001b[39;49m\u001b[38;5;124;43m'\u001b[39;49m\u001b[43m,\u001b[49m\u001b[43m \u001b[49m\u001b[38;5;124;43m'\u001b[39;49m\u001b[38;5;124;43mfibonacci(9999)\u001b[39;49m\u001b[38;5;130;43;01m\\n\u001b[39;49;00m\u001b[38;5;124;43m'\u001b[39;49m\u001b[43m)\u001b[49m\n",
"File \u001b[0;32m~/Repositories/intro-to-python/.venv/lib/python3.12/site-packages/IPython/core/interactiveshell.py:2541\u001b[0m, in \u001b[0;36mInteractiveShell.run_cell_magic\u001b[0;34m(self, magic_name, line, cell)\u001b[0m\n\u001b[1;32m 2539\u001b[0m \u001b[38;5;28;01mwith\u001b[39;00m \u001b[38;5;28mself\u001b[39m\u001b[38;5;241m.\u001b[39mbuiltin_trap:\n\u001b[1;32m 2540\u001b[0m args \u001b[38;5;241m=\u001b[39m (magic_arg_s, cell)\n\u001b[0;32m-> 2541\u001b[0m result \u001b[38;5;241m=\u001b[39m \u001b[43mfn\u001b[49m\u001b[43m(\u001b[49m\u001b[38;5;241;43m*\u001b[39;49m\u001b[43margs\u001b[49m\u001b[43m,\u001b[49m\u001b[43m \u001b[49m\u001b[38;5;241;43m*\u001b[39;49m\u001b[38;5;241;43m*\u001b[39;49m\u001b[43mkwargs\u001b[49m\u001b[43m)\u001b[49m\n\u001b[1;32m 2543\u001b[0m \u001b[38;5;66;03m# The code below prevents the output from being displayed\u001b[39;00m\n\u001b[1;32m 2544\u001b[0m \u001b[38;5;66;03m# when using magics with decorator @output_can_be_silenced\u001b[39;00m\n\u001b[1;32m 2545\u001b[0m \u001b[38;5;66;03m# when the last Python token in the expression is a ';'.\u001b[39;00m\n\u001b[1;32m 2546\u001b[0m \u001b[38;5;28;01mif\u001b[39;00m \u001b[38;5;28mgetattr\u001b[39m(fn, magic\u001b[38;5;241m.\u001b[39mMAGIC_OUTPUT_CAN_BE_SILENCED, \u001b[38;5;28;01mFalse\u001b[39;00m):\n",
"\u001b[0;31mRecursionError\u001b[0m: maximum recursion depth exceeded"
]
}
],
"source": [
"%%timeit -n 1\n",
"fibonacci(9999)"
]
},
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "skip"
}
},
"source": [
"This exception occurs as Python must keep track of *every* function call *until* it has returned, and with large enough `i`, the recursion tree above grows too big. By default, Python has a limit of up to 3000 *simultaneous* function calls. So, theoretically this exception is not a bug in the narrow sense but the result of a \"security\" measure that is supposed to keep a computer from crashing. However, practically most high-level languages like Python incur such an overhead cost: It results from the fact that someone (i.e., Python) needs to manage each function call's *local scope*. With the `for`-loop in the iterative version, we do this managing as the programmer.\n",
"\n",
"We could \"hack\" a bit with Python's default configuration using the [sys <img height=\"12\" style=\"display: inline-block\" src=\"../static/link/to_py.png\">](https://docs.python.org/3/library/sys.html) module in the [standard library <img height=\"12\" style=\"display: inline-block\" src=\"../static/link/to_py.png\">](https://docs.python.org/3/library/index.html) and make it work. As we are good citizens, we reset everything to the defaults after our hack is completed."
]
},
{
"cell_type": "code",
"execution_count": 33,
"metadata": {
"slideshow": {
"slide_type": "skip"
}
},
"outputs": [],
"source": [
"import sys"
]
},
{
"cell_type": "code",
"execution_count": 34,
"metadata": {
"slideshow": {
"slide_type": "skip"
}
},
"outputs": [],
"source": [
"old_recursion_limit = sys.getrecursionlimit()"
]
},
{
"cell_type": "code",
"execution_count": 35,
"metadata": {
"slideshow": {
"slide_type": "skip"
}
},
"outputs": [
{
"data": {
"text/plain": [
"3000"
]
},
"execution_count": 35,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"old_recursion_limit"
]
},
{
"cell_type": "code",
"execution_count": 36,
"metadata": {
"slideshow": {
"slide_type": "skip"
}
},
"outputs": [],
"source": [
"sys.setrecursionlimit(99999)"
]
},
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "skip"
}
},
"source": [
"Computational speed is *not* the problem here."
]
},
{
"cell_type": "code",
"execution_count": 37,
"metadata": {
"slideshow": {
"slide_type": "skip"
}
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"The slowest run took 50532.39 times longer than the fastest. This could mean that an intermediate result is being cached.\n",
"1.21 ms ± 2.97 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)\n"