diff --git a/00_python_in_a_nutshell.ipynb b/00_python_in_a_nutshell.ipynb
deleted file mode 100644
index e2172dd..0000000
--- a/00_python_in_a_nutshell.ipynb
+++ /dev/null
@@ -1,1965 +0,0 @@
-{
- "cells": [
- {
- "cell_type": "markdown",
- "metadata": {},
- "source": [
- "# Chapter 0: Python in a Nutshell"
- ]
- },
- {
- "cell_type": "markdown",
- "metadata": {},
- "source": [
- "Python itself is a so-called **general purpose** programming language. That means it does *not* know about any **scientific algorithms** \"out of the box.\"\n",
- "\n",
- "The purpose of this notebook is to summarize what is worth knowing about Python and programming on a \"high level\" and to lay the foundation for working with so-called **third-party libraries**, some of which we see in subsequent chapters."
- ]
- },
- {
- "cell_type": "markdown",
- "metadata": {},
- "source": [
- "## Using Python as a Calculator"
- ]
- },
- {
- "cell_type": "markdown",
- "metadata": {},
- "source": [
- "Any computer can be viewed as some sort of \"fancy calculator,\" and Python is no exception to that. The following code snippet, for example, does exactly what we expect it to, namely *addition*."
- ]
- },
- {
- "cell_type": "code",
- "execution_count": 1,
- "metadata": {},
- "outputs": [
- {
- "data": {
- "text/plain": [
- "3"
- ]
- },
- "execution_count": 1,
- "metadata": {},
- "output_type": "execute_result"
- }
- ],
- "source": [
- "1 + 2"
- ]
- },
- {
- "cell_type": "markdown",
- "metadata": {},
- "source": [
- "In terms of **syntax** (i.e., \"grammatical rules\"), digits are interpreted as plain numbers (i.e., so-called **numerical literals**) and the `+` symbol constitutes a so-called **operator** that is built into Python.\n",
- "\n",
- "Other common operators are `-` for *subtraction*, `*` for *multiplication*, and `**` for *exponentiation*. In terms of arithmetic, Python allows the **chaining** of operations and adheres to conventions from math, namely the [PEMDAS rule](https://en.wikipedia.org/wiki/Order_of_operations#Mnemonics): For example, exponentiation is carried out before multiplication."
- ]
- },
- {
- "cell_type": "code",
- "execution_count": 2,
- "metadata": {},
- "outputs": [
- {
- "data": {
- "text/plain": [
- "45"
- ]
- },
- "execution_count": 2,
- "metadata": {},
- "output_type": "execute_result"
- }
- ],
- "source": [
- "87 - 42"
- ]
- },
- {
- "cell_type": "code",
- "execution_count": 3,
- "metadata": {},
- "outputs": [
- {
- "data": {
- "text/plain": [
- "15"
- ]
- },
- "execution_count": 3,
- "metadata": {},
- "output_type": "execute_result"
- }
- ],
- "source": [
- "3 * 5"
- ]
- },
- {
- "cell_type": "code",
- "execution_count": 4,
- "metadata": {},
- "outputs": [
- {
- "data": {
- "text/plain": [
- "8"
- ]
- },
- "execution_count": 4,
- "metadata": {},
- "output_type": "execute_result"
- }
- ],
- "source": [
- "2 ** 3"
- ]
- },
- {
- "cell_type": "code",
- "execution_count": 5,
- "metadata": {},
- "outputs": [
- {
- "data": {
- "text/plain": [
- "16"
- ]
- },
- "execution_count": 5,
- "metadata": {},
- "output_type": "execute_result"
- }
- ],
- "source": [
- "2 * 2 ** 3"
- ]
- },
- {
- "cell_type": "markdown",
- "metadata": {},
- "source": [
- "To change the **order of precedence**, parentheses may be used for grouping. Syntactically, they are so-called **delimiters** that mark the beginning and the end of a **(sub-)expression** (i.e., a group of symbols that are **evaluated** together)."
- ]
- },
- {
- "cell_type": "code",
- "execution_count": 6,
- "metadata": {},
- "outputs": [
- {
- "data": {
- "text/plain": [
- "64"
- ]
- },
- "execution_count": 6,
- "metadata": {},
- "output_type": "execute_result"
- }
- ],
- "source": [
- "(2 * 2) ** 3"
- ]
- },
- {
- "cell_type": "markdown",
- "metadata": {},
- "source": [
- "We must beware that some operators do *not* do what we expect. In Python, the `^` symbol is the *bitwise XOR* operator. So, the following code snippet is *not* an example of exponentiation."
- ]
- },
- {
- "cell_type": "code",
- "execution_count": 7,
- "metadata": {},
- "outputs": [
- {
- "data": {
- "text/plain": [
- "1"
- ]
- },
- "execution_count": 7,
- "metadata": {},
- "output_type": "execute_result"
- }
- ],
- "source": [
- "2 ^ 3"
- ]
- },
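- {
- "cell_type": "markdown",
- "metadata": {},
- "source": [
- "To see where the `1` comes from, here is a small sketch that is not part of the original example: `^` compares the binary digits of its two operands and puts a `1` wherever they differ. The built-in [bin()](https://docs.python.org/3/library/functions.html#bin) function shows a number's binary digits, and the [print()](https://docs.python.org/3/library/functions.html#print) function, introduced further below, shows several values at once."
- ]
- },
- {
- "cell_type": "code",
- "execution_count": null,
- "metadata": {},
- "outputs": [],
- "source": [
- "# 2 is 0b10 and 3 is 0b11 in binary; they differ only in the last digit.\n",
- "print(bin(2), bin(3), bin(2 ^ 3))\n",
- "\n",
- "# Exponentiation is spelled with two asterisks instead.\n",
- "print(2 ** 3)"
- ]
- },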
- {
- "cell_type": "markdown",
- "metadata": {},
- "source": [
- "*Division* is also not as straightforward as we may think!\n",
- "\n",
- "While the `/` operator does *ordinary division*, we must note the subtlety of the `.0` in the result."
- ]
- },
- {
- "cell_type": "code",
- "execution_count": 8,
- "metadata": {},
- "outputs": [
- {
- "data": {
- "text/plain": [
- "4.0"
- ]
- },
- "execution_count": 8,
- "metadata": {},
- "output_type": "execute_result"
- }
- ],
- "source": [
- "8 / 2"
- ]
- },
- {
- "cell_type": "markdown",
- "metadata": {},
- "source": [
- "Whereas both `4` and `4.0` have the *same* **semantic meaning** to us humans, they are two *different* \"things\" for a computer!\n",
- "\n",
- "Instead of using a single `/`, we may divide with a double `//` just as well."
- ]
- },
- {
- "cell_type": "code",
- "execution_count": 9,
- "metadata": {},
- "outputs": [
- {
- "data": {
- "text/plain": [
- "4"
- ]
- },
- "execution_count": 9,
- "metadata": {},
- "output_type": "execute_result"
- }
- ],
- "source": [
- "8 // 2"
- ]
- },
- {
- "cell_type": "markdown",
- "metadata": {},
- "source": [
- "However, then we must be certain that the result is not a number with decimals other than `.0`. As we can guess from the result below, the `//` operator does *integer division* (i.e., \"whole number\" division, also known as *floor division*): The result is always rounded down."
- ]
- },
- {
- "cell_type": "code",
- "execution_count": 10,
- "metadata": {},
- "outputs": [
- {
- "data": {
- "text/plain": [
- "3"
- ]
- },
- "execution_count": 10,
- "metadata": {},
- "output_type": "execute_result"
- }
- ],
- "source": [
- "7 // 2"
- ]
- },
- {
- "cell_type": "markdown",
- "metadata": {},
- "source": [
- "In contrast, the `%` operator implements the so-called *modulo division* (i.e., \"remainder\" division). Here, a result of `0` indicates that a number is divisible by another one whereas any result other than `0` shows the opposite."
- ]
- },
- {
- "cell_type": "code",
- "execution_count": 11,
- "metadata": {},
- "outputs": [
- {
- "data": {
- "text/plain": [
- "1"
- ]
- },
- "execution_count": 11,
- "metadata": {},
- "output_type": "execute_result"
- }
- ],
- "source": [
- "7 % 2"
- ]
- },
- {
- "cell_type": "code",
- "execution_count": 12,
- "metadata": {},
- "outputs": [
- {
- "data": {
- "text/plain": [
- "0"
- ]
- },
- "execution_count": 12,
- "metadata": {},
- "output_type": "execute_result"
- }
- ],
- "source": [
- "8 % 2"
- ]
- },
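- {
- "cell_type": "markdown",
- "metadata": {},
- "source": [
- "The `//` and `%` operators fit together: For whole numbers `x` and `y` (with `y` not being `0`), the expression `(x // y) * y + (x % y)` gives back `x`. The following sketch, an addition to the examples above, checks that for `7` and `2` and also for a negative number, where `//` rounds down (i.e., toward negative infinity) rather than toward zero. It uses the [print()](https://docs.python.org/3/library/functions.html#print) function, introduced further below, to show more than one result."
- ]
- },
- {
- "cell_type": "code",
- "execution_count": null,
- "metadata": {},
- "outputs": [],
- "source": [
- "# Quotient and remainder together rebuild the original number.\n",
- "print((7 // 2) * 2 + (7 % 2))  # 3 * 2 + 1\n",
- "\n",
- "# With a negative dividend, // still rounds down and % stays\n",
- "# non-negative because the divisor is positive.\n",
- "print(-7 // 2, -7 % 2)\n",
- "print((-7 // 2) * 2 + (-7 % 2))"
- ]
- },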
- {
- "cell_type": "markdown",
- "metadata": {},
- "source": [
- "What makes Python such an intuitive and thus beginner-friendly language is the fact that it is a so-called **[interpreted language](https://en.wikipedia.org/wiki/Interpreter_%28computing%29)**. In layman's terms, this means that we can go back up and *re-execute* any of the code cells in *any order*: That allows us to build up code *incrementally*. So-called **[compiled languages](https://en.wikipedia.org/wiki/Compiler)**, on the other hand, would require us to run a program in its entirety even if only one small part has been changed.\n",
- "\n",
- "Instead of running individual code cells \"by hand\" and taking the result as it is, Python offers us the usage of **variables** to store \"values.\" A variable is created with the single `=` symbol, the so-called **assignment statement**."
- ]
- },
- {
- "cell_type": "code",
- "execution_count": 13,
- "metadata": {},
- "outputs": [],
- "source": [
- "a = 1"
- ]
- },
- {
- "cell_type": "code",
- "execution_count": 14,
- "metadata": {},
- "outputs": [],
- "source": [
- "b = 2"
- ]
- },
- {
- "cell_type": "markdown",
- "metadata": {},
- "source": [
- "After assignment, we can simply ask Python about the values of `a` and `b`."
- ]
- },
- {
- "cell_type": "code",
- "execution_count": 15,
- "metadata": {},
- "outputs": [
- {
- "data": {
- "text/plain": [
- "1"
- ]
- },
- "execution_count": 15,
- "metadata": {},
- "output_type": "execute_result"
- }
- ],
- "source": [
- "a"
- ]
- },
- {
- "cell_type": "code",
- "execution_count": 16,
- "metadata": {},
- "outputs": [
- {
- "data": {
- "text/plain": [
- "2"
- ]
- },
- "execution_count": 16,
- "metadata": {},
- "output_type": "execute_result"
- }
- ],
- "source": [
- "b"
- ]
- },
- {
- "cell_type": "markdown",
- "metadata": {},
- "source": [
- "Similarly, we can use a variable in place of, for example, a numerical literal within an expression."
- ]
- },
- {
- "cell_type": "code",
- "execution_count": 17,
- "metadata": {},
- "outputs": [
- {
- "data": {
- "text/plain": [
- "3"
- ]
- },
- "execution_count": 17,
- "metadata": {},
- "output_type": "execute_result"
- }
- ],
- "source": [
- "a + b"
- ]
- },
- {
- "cell_type": "markdown",
- "metadata": {},
- "source": [
- "Also, we may combine several lines of code into a single code cell, adding as many empty lines as we wish to group the code. Then, all of the lines are executed from top to bottom in linear order whenever we execute the cell as a whole."
- ]
- },
- {
- "cell_type": "code",
- "execution_count": 18,
- "metadata": {},
- "outputs": [
- {
- "data": {
- "text/plain": [
- "3"
- ]
- },
- "execution_count": 18,
- "metadata": {},
- "output_type": "execute_result"
- }
- ],
- "source": [
- "a = 1\n",
- "b = 2\n",
- "\n",
- "a + b"
- ]
- },
- {
- "cell_type": "markdown",
- "metadata": {},
- "source": [
- "Something that fools many beginners is the fact that the `=` statement is *not* to be confused with the concept of an *equation* from math! An `=` statement is *always* to be interpreted from right to left.\n",
- "\n",
- "The following code snippet, for example, takes the \"old\" value of `a`, adds the value of `b` to it, and then stores the resulting `3` as the \"new\" value of `a`. After all, a variable is called a variable as its value is indeed variable!"
- ]
- },
- {
- "cell_type": "code",
- "execution_count": 19,
- "metadata": {},
- "outputs": [],
- "source": [
- "a = a + b"
- ]
- },
- {
- "cell_type": "code",
- "execution_count": 20,
- "metadata": {},
- "outputs": [
- {
- "data": {
- "text/plain": [
- "3"
- ]
- },
- "execution_count": 20,
- "metadata": {},
- "output_type": "execute_result"
- }
- ],
- "source": [
- "a"
- ]
- },
- {
- "cell_type": "markdown",
- "metadata": {},
- "source": [
- "In general, the result of some expression involving variables is often stored in yet another variable for further processing. This is how more realistic programs are built up."
- ]
- },
- {
- "cell_type": "code",
- "execution_count": 21,
- "metadata": {},
- "outputs": [
- {
- "data": {
- "text/plain": [
- "3"
- ]
- },
- "execution_count": 21,
- "metadata": {},
- "output_type": "execute_result"
- }
- ],
- "source": [
- "a = 1\n",
- "b = 2\n",
- "\n",
- "c = a + b\n",
- "\n",
- "c"
- ]
- },
- {
- "cell_type": "markdown",
- "metadata": {},
- "source": [
- "As most real-life projects involve *non-scalar* data, we take a preliminary look at how Python models `list`-like data next. Intuitively, a `list` can be thought of as a **container** holding many \"things.\"\n",
- "\n",
- "The syntax to create a `list` uses brackets, `[` and `]`, another example of delimiters, with the individual **elements** of the `list` listed in between them, separated by commas.\n",
- "\n",
- "For example, the next code snippet creates a `list` named `numbers` with the numbers `1`, `2`, `3`, and `4` in it."
- ]
- },
- {
- "cell_type": "code",
- "execution_count": 22,
- "metadata": {},
- "outputs": [
- {
- "data": {
- "text/plain": [
- "[1, 2, 3, 4]"
- ]
- },
- "execution_count": 22,
- "metadata": {},
- "output_type": "execute_result"
- }
- ],
- "source": [
- "numbers = [a, b, c, 4]\n",
- "\n",
- "numbers"
- ]
- },
- {
- "cell_type": "markdown",
- "metadata": {},
- "source": [
- "Whenever we use any kind of delimiter, we may break the lines in between them as we wish and add other so-called **whitespace** characters like spaces to format how the code looks. So, the following two code cells do *exactly* the same as the previous one. Even the `,` after the `4` in the second cell is ignored."
- ]
- },
- {
- "cell_type": "code",
- "execution_count": 23,
- "metadata": {},
- "outputs": [
- {
- "data": {
- "text/plain": [
- "[1, 2, 3, 4]"
- ]
- },
- "execution_count": 23,
- "metadata": {},
- "output_type": "execute_result"
- }
- ],
- "source": [
- "numbers = [\n",
- "    a, b, c, 4\n",
- "]\n",
- "\n",
- "numbers"
- ]
- },
- {
- "cell_type": "code",
- "execution_count": 24,
- "metadata": {},
- "outputs": [
- {
- "data": {
- "text/plain": [
- "[1, 2, 3, 4]"
- ]
- },
- "execution_count": 24,
- "metadata": {},
- "output_type": "execute_result"
- }
- ],
- "source": [
- "numbers = [\n",
- "    a,\n",
- "    b,\n",
- "    c,\n",
- "    4,\n",
- "]\n",
- "\n",
- "numbers"
- ]
- },
- {
- "cell_type": "markdown",
- "metadata": {},
- "source": [
- "A nice thing to know is that JupyterLab comes with **tab completion** built in. That means we do not have to type out the name `numbers` as a whole. Try it out by simply typing `num` and then hit the tab key on your keyboard. JupyterLab should complete the variable into `numbers`."
- ]
- },
- {
- "cell_type": "code",
- "execution_count": null,
- "metadata": {},
- "outputs": [],
- "source": [
- "num"
- ]
- },
- {
- "cell_type": "markdown",
- "metadata": {},
- "source": [
- "A natural operation to do with `list`s is to **access** their elements. That is achieved with another operator that also uses a bracket notation. Each element is associated with an **index**, which is why we say that we \"index into a `list`.\" As with many other programming languages, Python is 0-based, which simply means that whenever we count something, we start to count at `0`.\n",
- "\n",
- "For example, to obtain the first element in `numbers`, we write the following."
- ]
- },
- {
- "cell_type": "code",
- "execution_count": 25,
- "metadata": {},
- "outputs": [
- {
- "data": {
- "text/plain": [
- "1"
- ]
- },
- "execution_count": 25,
- "metadata": {},
- "output_type": "execute_result"
- }
- ],
- "source": [
- "numbers[0]"
- ]
- },
- {
- "cell_type": "markdown",
- "metadata": {},
- "source": [
- "Note that the indexing operation implicitly assumes an **order** among the elements, which is quite intuitive as we specified the numbers in order above.\n",
- "\n",
- "Another implicit assumption behind `list`s is that the number of elements is *finite*. Because of that, we may use negative indices starting at `-1` to obtain an element in right-to-left order.\n",
- "\n",
- "So, to obtain the last element in `numbers`, we write the following."
- ]
- },
- {
- "cell_type": "code",
- "execution_count": 26,
- "metadata": {},
- "outputs": [
- {
- "data": {
- "text/plain": [
- "4"
- ]
- },
- "execution_count": 26,
- "metadata": {},
- "output_type": "execute_result"
- }
- ],
- "source": [
- "numbers[-1]"
- ]
- },
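- {
- "cell_type": "markdown",
- "metadata": {},
- "source": [
- "As a small sketch, added here for illustration, the next cell shows how the two kinds of indices relate: Every element can be reached by counting from the left or from the right. It relies on two built-in functions that are covered further below, namely [len()](https://docs.python.org/3/library/functions.html#len) for the number of elements and [print()](https://docs.python.org/3/library/functions.html#print) to show several values at once."
- ]
- },
- {
- "cell_type": "code",
- "execution_count": null,
- "metadata": {},
- "outputs": [],
- "source": [
- "numbers = [1, 2, 3, 4]\n",
- "\n",
- "# Both indices refer to the second element.\n",
- "print(numbers[1], numbers[-3])\n",
- "\n",
- "# The last element has the index len(numbers) - 1, which is 3 here.\n",
- "print(numbers[len(numbers) - 1], numbers[-1])"
- ]
- },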
- {
- "cell_type": "markdown",
- "metadata": {},
- "source": [
- "## Expressing Logic"
- ]
- },
- {
- "cell_type": "markdown",
- "metadata": {},
- "source": [
- "The main point of using `list`s in Python is to write code that does something repeatedly, once for each element in the `list`.\n",
- "\n",
- "The syntactical construct to achieve that is the `for`-loop, which consists of two parts:\n",
- "- a **header** line specifying what is looped over, and\n",
- "- a **body** consisting of the block of code that is repeated for each element.\n",
- "\n",
- "In the example below, `for number in numbers:` constitutes the header. The expression after the `in` references the \"thing\" that is looped over (here: a `list` of `numbers`) and the name between `for` and `in` becomes a variable that is assigned a new value in each **iteration** of the loop. A best practice is to use a meaningful name, which is why we choose the singular `number`. The `:` at the end is the characteristic symbol of a header line in general and requires the next line (and possibly many more lines) to be **indented**.\n",
- "\n",
- "The indented line constitutes the `for`-loop's body. In the example, we simply take each of the numbers in `numbers`, one at a time, and add it to a `total` that is initialized at `0`. In other words, we calculate the sum of all the elements in `numbers`.\n",
- "\n",
- "Many beginners struggle with the term \"loop.\" To visualize the looping behavior of this code, we use the online tool [PythonTutor ](http://pythontutor.com/visualize.html#code=numbers%20%3D%20%5B1,%202,%203,%204%5D%0A%0Atotal%20%3D%200%0A%0Afor%20number%20in%20numbers%3A%0A%20%20%20%20total%20%3D%20total%20%2B%20number%0A%0Atotal&cumulative=false&curInstr=0&heapPrimitives=nevernest&mode=display&origin=opt-frontend.js&py=3&rawInputLstJSON=%5B%5D&textReferences=false). That tool is helpful for two reasons:\n",
- "1. It allows us to execute code in \"slow motion\" (i.e., by clicking the \"next\" button on the left side, only the next atomic step of the code snippet is executed).\n",
- "2. It shows what happens inside the computer's memory on the right-hand side (cf., the \"*Thinking like a Computer*\" section further below)."
- ]
- },
- {
- "cell_type": "code",
- "execution_count": 27,
- "metadata": {},
- "outputs": [
- {
- "data": {
- "text/plain": [
- "10"
- ]
- },
- "execution_count": 27,
- "metadata": {},
- "output_type": "execute_result"
- }
- ],
- "source": [
- "total = 0\n",
- "\n",
- "for number in numbers:\n",
- "    total = total + number\n",
- "\n",
- "total"
- ]
- },
- {
- "cell_type": "markdown",
- "metadata": {},
- "source": [
- "Python is pretty agnostic about how far the `for`-loop's body is indented. So, both of the next code cells are equivalent to the one above. Yet, a popular convention in the Python world is to always indent code with 4 spaces per indentation level."
- ]
- },
- {
- "cell_type": "code",
- "execution_count": 28,
- "metadata": {},
- "outputs": [
- {
- "data": {
- "text/plain": [
- "10"
- ]
- },
- "execution_count": 28,
- "metadata": {},
- "output_type": "execute_result"
- }
- ],
- "source": [
- "total = 0\n",
- "\n",
- "for number in numbers:\n",
- "  total = total + number\n",
- "\n",
- "total"
- ]
- },
- {
- "cell_type": "code",
- "execution_count": 29,
- "metadata": {},
- "outputs": [
- {
- "data": {
- "text/plain": [
- "10"
- ]
- },
- "execution_count": 29,
- "metadata": {},
- "output_type": "execute_result"
- }
- ],
- "source": [
- "total = 0\n",
- "\n",
- "for number in numbers:\n",
- "        total = total + number\n",
- "\n",
- "total"
- ]
- },
- {
- "cell_type": "markdown",
- "metadata": {},
- "source": [
- "### Conditional Execution"
- ]
- },
- {
- "cell_type": "markdown",
- "metadata": {},
- "source": [
- "As a variation, let's add up only the even numbers. To achieve that, we exploit the fact that even numbers are all numbers that are divisible by `2` and use the `%` operator from above and a new one, namely the `==` operator for *equality comparison*, to express that idea."
- ]
- },
- {
- "cell_type": "code",
- "execution_count": 30,
- "metadata": {},
- "outputs": [
- {
- "data": {
- "text/plain": [
- "1"
- ]
- },
- "execution_count": 30,
- "metadata": {},
- "output_type": "execute_result"
- }
- ],
- "source": [
- "7 % 2"
- ]
- },
- {
- "cell_type": "code",
- "execution_count": 31,
- "metadata": {},
- "outputs": [
- {
- "data": {
- "text/plain": [
- "0"
- ]
- },
- "execution_count": 31,
- "metadata": {},
- "output_type": "execute_result"
- }
- ],
- "source": [
- "8 % 2"
- ]
- },
- {
- "cell_type": "markdown",
- "metadata": {},
- "source": [
- "Whenever *arithmetic* operators like `%` are combined in an expression with *relational* operators like `==`, the arithmetic is done first and the comparison last. So, the next two cells first obtain the remainder after dividing `7` and `8` by `2` and then compare that to `0`. The result is a so-called **boolean**, either `True` or `False`, which is a computer's way of saying \"yes\" or \"no.\""
- ]
- },
- {
- "cell_type": "code",
- "execution_count": 32,
- "metadata": {},
- "outputs": [
- {
- "data": {
- "text/plain": [
- "False"
- ]
- },
- "execution_count": 32,
- "metadata": {},
- "output_type": "execute_result"
- }
- ],
- "source": [
- "7 % 2 == 0"
- ]
- },
- {
- "cell_type": "code",
- "execution_count": 33,
- "metadata": {},
- "outputs": [
- {
- "data": {
- "text/plain": [
- "True"
- ]
- },
- "execution_count": 33,
- "metadata": {},
- "output_type": "execute_result"
- }
- ],
- "source": [
- "8 % 2 == 0"
- ]
- },
- {
- "cell_type": "markdown",
- "metadata": {},
- "source": [
- "We use such expressions as the **condition** in an `if` statement that constitutes a second layer within our `for`-loop implementation. An `if` statement itself consists of yet another header line with a body. That body's code is only executed if the condition is `True`.\n",
- "\n",
- "As an example, the next code snippet loops over all the elements in `numbers` and, for each individual `number`, checks if it is even. Only if that is the case, the `number` is added to the `total`. Otherwise, nothing is done with the `number`. The example also shows how we can add so-called **comments** at the end of a line: Anything that comes after the `#` symbol is disregarded by Python. We use such comments to put little notes to ourselves within the code."
- ]
- },
- {
- "cell_type": "code",
- "execution_count": 34,
- "metadata": {},
- "outputs": [
- {
- "data": {
- "text/plain": [
- "6"
- ]
- },
- "execution_count": 34,
- "metadata": {},
- "output_type": "execute_result"
- }
- ],
- "source": [
- "total = 0\n",
- "\n",
- "for number in numbers:\n",
- "    if number % 2 == 0:  # if the number is even\n",
- "        total = total + number\n",
- "\n",
- "total"
- ]
- },
- {
- "cell_type": "markdown",
- "metadata": {},
- "source": [
- "`if` statements may have more than one header line: For example, the code in the `else`-clause's body is only executed if the condition in the `if`-clause is `False`. In the code cell below, we calculate the sum of all even numbers and subtract the sum of all odd numbers. The result is `(2 + 4) - (1 + 3)` or `-1 + 2 - 3 + 4` resembling the order of the numbers in the `for`-loop."
- ]
- },
- {
- "cell_type": "code",
- "execution_count": 35,
- "metadata": {},
- "outputs": [
- {
- "data": {
- "text/plain": [
- "2"
- ]
- },
- "execution_count": 35,
- "metadata": {},
- "output_type": "execute_result"
- }
- ],
- "source": [
- "total = 0\n",
- "\n",
- "for number in numbers:\n",
- "    if number % 2 == 0:  # if the number is even\n",
- "        total = total + number\n",
- "    else:  # if the number is odd\n",
- "        total = total - number\n",
- "\n",
- "total"
- ]
- },
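- {
- "cell_type": "markdown",
- "metadata": {},
- "source": [
- "To double-check the result above, here is a sketch that keeps two separate running totals, one for the even and one for the odd numbers, and subtracts one from the other only at the end. That mirrors the `(2 + 4) - (1 + 3)` way of writing the calculation. The names `evens_total` and `odds_total` are made up for this illustration."
- ]
- },
- {
- "cell_type": "code",
- "execution_count": null,
- "metadata": {},
- "outputs": [],
- "source": [
- "numbers = [1, 2, 3, 4]\n",
- "\n",
- "evens_total = 0\n",
- "odds_total = 0\n",
- "\n",
- "for number in numbers:\n",
- "    if number % 2 == 0:  # if the number is even\n",
- "        evens_total = evens_total + number\n",
- "    else:  # if the number is odd\n",
- "        odds_total = odds_total + number\n",
- "\n",
- "evens_total - odds_total"
- ]
- },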
- {
- "cell_type": "markdown",
- "metadata": {},
- "source": [
- "## Modularizing Code"
- ]
- },
- {
- "cell_type": "markdown",
- "metadata": {},
- "source": [
- "One big idea in software engineering is to **modularize** code. The purposes of that are manifold. Two very important motivations are to\n",
- "- make a code segment **re-usable**, and to\n",
- "- give a meaningful name to that code segment.\n",
- "\n",
- "The latter gets more important as the codebase in a project grows so big that we can only look at a tiny fraction of it at one point in time.\n",
- "\n",
- "The syntactical construct that enables us to achieve that is that of a **function definition**. Just like in math, we can \"define\" a function to be some set of parametrized instructions that provide some (deterministic) **output** given some *concrete* **input**.\n",
- "\n",
- "A function is defined with the `def` statement: After the `def` part comes the name of the function followed by the **parameter list** within parentheses. The first couple of lines in the function's body should be a so-called **docstring** that describes what the function does in plain English. Then comes the code that is to be made repeatable. In the example below, we simply copied and pasted the code to calculate the sum of all even numbers in a `list` into the example function `sum_evens()`. Note that we exchanged the variable name `total` with `result` here to illustrate a point further below. In order for the function to provide back the output to \"the outside world,\" we use the `return` statement (Hint: To see its effect, simply re-run the couple of code cells below with and without the `return result` line)."
- ]
- },
- {
- "cell_type": "code",
- "execution_count": 36,
- "metadata": {},
- "outputs": [],
- "source": [
- "def sum_evens(numbers):\n",
- "    \"\"\"Sum up all the even numbers in a list.\n",
- "\n",
- "    Args:\n",
- "        numbers (list of ints): numbers to be summed up\n",
- "\n",
- "    Returns:\n",
- "        result (int): sum of the even numbers\n",
- "    \"\"\"\n",
- "    result = 0\n",
- "\n",
- "    for number in numbers:\n",
- "        if number % 2 == 0:  # if the number is even\n",
- "            result = result + number\n",
- "\n",
- "    return result"
- ]
- },
- {
- "cell_type": "markdown",
- "metadata": {},
- "source": [
- "After defining a function, we can **call** (i.e., \"execute\") it with the `()` operator. So, just as with the `[]` above, the `()` may have a different meaning in a given context.\n",
- "\n",
- "Let's execute the function with `numbers` as the input. We see the same `6` below the cell as we do above where we run the code without a function. Without the `return` statement in the function's body, we would not see any output here.\n",
- "\n",
- "To see what happens in detail, take a look at [PythonTutor ](http://pythontutor.com/visualize.html#code=numbers%20%3D%20%5B1,%202,%203,%204%5D%0A%0Adef%20sum_evens%28numbers%29%3A%0A%20%20%20%20%22%22%22Sum%20up%20all%20the%20even%20numbers%20in%20a%20list.%22%22%22%0A%20%20%20%20result%20%3D%200%0A%0A%20%20%20%20for%20number%20in%20numbers%3A%0A%20%20%20%20%20%20%20%20if%20number%20%25%202%20%3D%3D%200%3A%0A%20%20%20%20%20%20%20%20%20%20%20%20result%20%3D%20result%20%2B%20number%0A%0A%20%20%20%20return%20result%0A%0Atotal%20%3D%20sum_evens%28numbers%29&cumulative=false&curInstr=0&heapPrimitives=nevernest&mode=display&origin=opt-frontend.js&py=3&rawInputLstJSON=%5B%5D&textReferences=false) again. You should notice how there are two variables by the name `numbers` in memory. Python manages the memory with a concept called **namespaces** or **scopes**, which are just fancy terms for saying that Python can tell variables from different contexts apart."
- ]
- },
- {
- "cell_type": "code",
- "execution_count": 37,
- "metadata": {},
- "outputs": [
- {
- "data": {
- "text/plain": [
- "6"
- ]
- },
- "execution_count": 37,
- "metadata": {},
- "output_type": "execute_result"
- }
- ],
- "source": [
- "sum_evens(numbers)"
- ]
- },
- {
- "cell_type": "markdown",
- "metadata": {},
- "source": [
- "To re-use the *same* instructions with *different* input, we call the function a second time and give it a brand-new `list` of numbers as its input."
- ]
- },
- {
- "cell_type": "code",
- "execution_count": 38,
- "metadata": {},
- "outputs": [
- {
- "data": {
- "text/plain": [
- "30"
- ]
- },
- "execution_count": 38,
- "metadata": {},
- "output_type": "execute_result"
- }
- ],
- "source": [
- "sum_evens([1, 2, 3, 4, 5, 6, 7, 8, 9, 10])"
- ]
- },
- {
- "cell_type": "markdown",
- "metadata": {},
- "source": [
- "Note how the variable `result` only exists \"inside\" the `sum_evens()` function. Hence, we see the `NameError` here."
- ]
- },
- {
- "cell_type": "code",
- "execution_count": 39,
- "metadata": {},
- "outputs": [
- {
- "ename": "NameError",
- "evalue": "name 'result' is not defined",
- "output_type": "error",
- "traceback": [
- "\u001b[0;31m---------------------------------------------------------------------------\u001b[0m",
- "\u001b[0;31mNameError\u001b[0m Traceback (most recent call last)",
- "\u001b[0;32m\u001b[0m in \u001b[0;36m\u001b[0;34m\u001b[0m\n\u001b[0;32m----> 1\u001b[0;31m \u001b[0mresult\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0m",
- "\u001b[0;31mNameError\u001b[0m: name 'result' is not defined"
- ]
- }
- ],
- "source": [
- "result"
- ]
- },
- {
- "cell_type": "markdown",
- "metadata": {},
- "source": [
- "The concept of re-usable functions is so important in programming that Python comes with many [built-in functions](https://docs.python.org/3/library/functions.html). Two popular examples are the [sum()](https://docs.python.org/3/library/functions.html#sum) and [len()](https://docs.python.org/3/library/functions.html#len) functions that calculate the sum of the elements and the number of elements in a `list`, respectively."
- ]
- },
- {
- "cell_type": "code",
- "execution_count": 40,
- "metadata": {},
- "outputs": [
- {
- "data": {
- "text/plain": [
- "10"
- ]
- },
- "execution_count": 40,
- "metadata": {},
- "output_type": "execute_result"
- }
- ],
- "source": [
- "sum(numbers)"
- ]
- },
- {
- "cell_type": "code",
- "execution_count": 41,
- "metadata": {},
- "outputs": [
- {
- "data": {
- "text/plain": [
- "4"
- ]
- },
- "execution_count": 41,
- "metadata": {},
- "output_type": "execute_result"
- }
- ],
- "source": [
- "len(numbers)"
- ]
- },
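- {
- "cell_type": "markdown",
- "metadata": {},
- "source": [
- "The value a function returns can be used within an expression like any other value. As a small sketch that goes beyond the two cells above, the average of the elements in `numbers` is their sum divided by their count. Because `/` does ordinary division, the result comes with a decimal point."
- ]
- },
- {
- "cell_type": "code",
- "execution_count": null,
- "metadata": {},
- "outputs": [],
- "source": [
- "numbers = [1, 2, 3, 4]\n",
- "\n",
- "# Sum of the elements divided by the number of elements.\n",
- "sum(numbers) / len(numbers)"
- ]
- },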
- {
- "cell_type": "markdown",
- "metadata": {},
- "source": [
- "Another function that comes in handy at times is the [print()](https://docs.python.org/3/library/functions.html#print) function that simply \"prints\" out its input to the screen. Below is the popular \"Hello World\" example that is shown in almost any introduction text on any programming language. The double quotes `\"` are yet another delimiter that specifies anything in between them as textual data (cf., the docstring above is just a special case thereof)."
- ]
- },
- {
- "cell_type": "code",
- "execution_count": 42,
- "metadata": {},
- "outputs": [
- {
- "name": "stdout",
- "output_type": "stream",
- "text": [
- "Hello World\n"
- ]
- }
- ],
- "source": [
- "print(\"Hello World\")"
- ]
- },
- {
- "cell_type": "markdown",
- "metadata": {},
- "source": [
- "Single quotes `'` are basically just synonyms for double quotes `\"`."
- ]
- },
- {
- "cell_type": "code",
- "execution_count": 43,
- "metadata": {},
- "outputs": [
- {
- "name": "stdout",
- "output_type": "stream",
- "text": [
- "Hello World\n"
- ]
- }
- ],
- "source": [
- "print('Hello World')"
- ]
- },
- {
- "cell_type": "markdown",
- "metadata": {},
- "source": [
- "The [print() ](https://docs.python.org/3/library/functions.html#print) function is often helpful to **debug** a code snippet (i.e., trying to figure out what it does, step by step)."
- ]
- },
- {
- "cell_type": "code",
- "execution_count": 44,
- "metadata": {},
- "outputs": [
- {
- "name": "stdout",
- "output_type": "stream",
- "text": [
- "The square of 1 is 1\n",
- "The square of 2 is 4\n",
- "The square of 3 is 9\n",
- "The square of 4 is 16\n"
- ]
- }
- ],
- "source": [
- "for number in numbers:\n",
- "    square = number ** 2\n",
- "    print(\"The square of\", number, \"is\", square)"
- ]
- },
- {
- "cell_type": "markdown",
- "metadata": {},
- "source": [
- "### Extending Core Python"
- ]
- },
- {
- "cell_type": "markdown",
- "metadata": {},
- "source": [
- "In the Python community, we even say that \"Python comes with batteries included,\" meaning that a plain Python installation (like the one you are probably using to execute this notebook) offers all kinds of functionalities for a multitude of application domains. Thus, the name **general purpose** language.\n",
- "\n",
- "To \"enable\" most of these, however, we need to first **import** them from the so-called [standard library ](https://docs.python.org/3/library/index.html). Let's do a quick example here and look at the [random ](https://docs.python.org/3/library/random.html) module that provides functionalities to simulate and work with random numbers."
- ]
- },
- {
- "cell_type": "code",
- "execution_count": 45,
- "metadata": {},
- "outputs": [],
- "source": [
- "import random"
- ]
- },
- {
- "cell_type": "markdown",
- "metadata": {},
- "source": [
- "To access a function inside the [random](https://docs.python.org/3/library/random.html) module, for example, the [random()](https://docs.python.org/3/library/random.html#random.random) function, we use the `.` operator, formally called the attribute access operator. The [random()](https://docs.python.org/3/library/random.html#random.random) function simply returns a random decimal number that is at least `0` and less than `1`."
- ]
- },
- {
- "cell_type": "code",
- "execution_count": 46,
- "metadata": {},
- "outputs": [
- {
- "data": {
- "text/plain": [
- "0.38523914298287465"
- ]
- },
- "execution_count": 46,
- "metadata": {},
- "output_type": "execute_result"
- }
- ],
- "source": [
- "random.random()"
- ]
- },
- {
- "cell_type": "markdown",
- "metadata": {},
- "source": [
- "It could be used, for example, to model a fair coin toss by comparing the number it returns to `0.5` with the `<` operator: In 50% of the cases we see `True` and in the other 50% `False`."
- ]
- },
- {
- "cell_type": "code",
- "execution_count": 47,
- "metadata": {},
- "outputs": [
- {
- "data": {
- "text/plain": [
- "False"
- ]
- },
- "execution_count": 47,
- "metadata": {},
- "output_type": "execute_result"
- }
- ],
- "source": [
- "random.random() < 0.5"
- ]
- },
- {
- "cell_type": "markdown",
- "metadata": {},
- "source": [
- "A second example would be the [choice()](https://docs.python.org/3/library/random.html#random.choice) function, which draws one random element from a `list`. As the `list` itself is not changed, repeated calls draw with replacement. We could use it to model a fair die."
- ]
- },
- {
- "cell_type": "code",
- "execution_count": 48,
- "metadata": {},
- "outputs": [
- {
- "data": {
- "text/plain": [
- "2"
- ]
- },
- "execution_count": 48,
- "metadata": {},
- "output_type": "execute_result"
- }
- ],
- "source": [
- "random.choice([1, 2, 3, 4, 5, 6])"
- ]
- },
- {
- "cell_type": "markdown",
- "metadata": {},
- "source": [
- "In the next chapter, we see how we can extend Python even further by installing and importing **third-party packages**."
- ]
- },
- {
- "cell_type": "markdown",
- "metadata": {},
- "source": [
- "## Thinking like a Computer"
- ]
- },
- {
- "cell_type": "markdown",
- "metadata": {},
- "source": [
- "An important skill for any data scientist is to learn to \"think\" like a computer does. So far, we have seen that Python is a pretty \"intuitive\" language: Many concepts can already be understood after seeing them once or just a couple of times. Many of the aspects that make other languages harder to learn are somehow \"magically\" automated by Python in the background, most notably the management of memory.\n",
- "\n",
- "This section introduces a couple of more \"advanced\" concepts that presumably are *not* so intuitive to beginners."
- ]
- },
- {
- "cell_type": "markdown",
- "metadata": {},
- "source": [
- "### \"Simple\" Data Types"
- ]
- },
- {
- "cell_type": "markdown",
- "metadata": {},
- "source": [
- "First, let's review the concept of **object-orientation**, which is the paradigm by which Python manages memory.\n",
- "\n",
- "Take the following three examples. Whereas `a` and `b` have the same **value** (i.e., **semantic meaning**) to us humans, we see in this section that there are a couple of caveats to look out for."
- ]
- },
- {
- "cell_type": "code",
- "execution_count": 49,
- "metadata": {},
- "outputs": [],
- "source": [
- "a = 42\n",
- "b = 42.0\n",
- "c = 42.87"
- ]
- },
- {
- "cell_type": "markdown",
- "metadata": {},
- "source": [
- "An important idea to understand is that each of the right-hand sides leads to a *new* **object** being created in the computer's memory *first*. An object can be thought of as a \"box\" in memory holding $1$s and $0$s (i.e., physical energy flows inside the computer).\n",
- "\n",
- "Objects can and do exist without being **referenced** by a variable. Also, an object may even have several variables referencing it, just as a human may have different names in different contexts (e.g., a formal name in the passport, a name by which one is known to friends, and maybe a different name by which one is called by one's spouse).\n",
- "\n",
- "In the example, while both `a` and `b` have the *same* value, they are two *distinct* objects. The `is` operator checks if the objects referenced by two variables are indeed the *same* one, or, in other words, have the same **identity**."
- ]
- },
- {
- "cell_type": "code",
- "execution_count": 50,
- "metadata": {},
- "outputs": [
- {
- "data": {
- "text/plain": [
- "True"
- ]
- },
- "execution_count": 50,
- "metadata": {},
- "output_type": "execute_result"
- }
- ],
- "source": [
- "a == b"
- ]
- },
- {
- "cell_type": "code",
- "execution_count": 51,
- "metadata": {},
- "outputs": [
- {
- "data": {
- "text/plain": [
- "False"
- ]
- },
- "execution_count": 51,
- "metadata": {},
- "output_type": "execute_result"
- }
- ],
- "source": [
- "a is b"
- ]
- },
- {
- "cell_type": "markdown",
- "metadata": {},
- "source": [
- "Every object always has some **data type**, which determines how the object behaves and what we can do with it. The types of `a` and `b` are `int` and `float`, respectively."
- ]
- },
- {
- "cell_type": "code",
- "execution_count": 52,
- "metadata": {},
- "outputs": [
- {
- "data": {
- "text/plain": [
- "int"
- ]
- },
- "execution_count": 52,
- "metadata": {},
- "output_type": "execute_result"
- }
- ],
- "source": [
- "type(a)"
- ]
- },
- {
- "cell_type": "code",
- "execution_count": 53,
- "metadata": {},
- "outputs": [
- {
- "data": {
- "text/plain": [
- "float"
- ]
- },
- "execution_count": 53,
- "metadata": {},
- "output_type": "execute_result"
- }
- ],
- "source": [
- "type(b)"
- ]
- },
- {
- "cell_type": "markdown",
- "metadata": {},
- "source": [
- "While it seems cumbersome to analyze numbers at this level of detail, the following code cell shows how `float`ing-point numbers, one gold standard of numbers in all of computer science and engineering, behave counter-intuitively. Yet, *nothing* is wrong here."
- ]
- },
- {
- "cell_type": "code",
- "execution_count": 54,
- "metadata": {},
- "outputs": [
- {
- "data": {
- "text/plain": [
- "False"
- ]
- },
- "execution_count": 54,
- "metadata": {},
- "output_type": "execute_result"
- }
- ],
- "source": [
- "0.1 + 0.2 == 0.3"
- ]
- },
- {
- "cell_type": "markdown",
- "metadata": {},
- "source": [
- "The data type of an object also determines which **methods** we can invoke on it. A method is just a function that is \"attached\" to an object and can be accessed with the `.` operator seen above. A method necessarily needs the object it is attached to as an input, which is why it is attached to an object to begin with.\n",
- "\n",
- "For example, `float` objects come with an `.is_integer()` method that tells us whether the number has *no* non-`0` decimals (i.e., whether it represents a whole number)."
- ]
- },
- {
- "cell_type": "code",
- "execution_count": 55,
- "metadata": {},
- "outputs": [
- {
- "data": {
- "text/plain": [
- "True"
- ]
- },
- "execution_count": 55,
- "metadata": {},
- "output_type": "execute_result"
- }
- ],
- "source": [
- "b.is_integer()"
- ]
- },
- {
- "cell_type": "code",
- "execution_count": 56,
- "metadata": {},
- "outputs": [
- {
- "data": {
- "text/plain": [
- "False"
- ]
- },
- "execution_count": 56,
- "metadata": {},
- "output_type": "execute_result"
- }
- ],
- "source": [
- "c.is_integer()"
- ]
- },
- {
- "cell_type": "markdown",
- "metadata": {},
- "source": [
- "`int` objects, by contrast, have no notion of decimals, which is why they do *not* have an `.is_integer()` method in the Python version used here (the method was only added to `int` in Python 3.12). That is what the `AttributeError` tells us."
- ]
- },
- {
- "cell_type": "code",
- "execution_count": 57,
- "metadata": {},
- "outputs": [
- {
- "ename": "AttributeError",
- "evalue": "'int' object has no attribute 'is_integer'",
- "output_type": "error",
- "traceback": [
- "\u001b[0;31m---------------------------------------------------------------------------\u001b[0m",
- "\u001b[0;31mAttributeError\u001b[0m Traceback (most recent call last)",
- "\u001b[0;32m\u001b[0m in \u001b[0;36m\u001b[0;34m\u001b[0m\n\u001b[0;32m----> 1\u001b[0;31m \u001b[0ma\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0mis_integer\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0m",
- "\u001b[0;31mAttributeError\u001b[0m: 'int' object has no attribute 'is_integer'"
- ]
- }
- ],
- "source": [
- "a.is_integer()"
- ]
- },
- {
- "cell_type": "markdown",
- "metadata": {},
- "source": [
- "What we could do here is take `a` and pass it to the [float()](https://docs.python.org/3/library/functions.html#float) built-in, a so-called **constructor**, which takes the value of its input and creates a *new* object of the desired `float` type. Yet, we know the answer to `aa.is_integer()` already, even without executing the code cell, as `a` has no non-`0` decimals to begin with."
- ]
- },
- {
- "cell_type": "code",
- "execution_count": 58,
- "metadata": {},
- "outputs": [],
- "source": [
- "aa = float(a)"
- ]
- },
- {
- "cell_type": "code",
- "execution_count": 59,
- "metadata": {},
- "outputs": [
- {
- "data": {
- "text/plain": [
- "True"
- ]
- },
- "execution_count": 59,
- "metadata": {},
- "output_type": "execute_result"
- }
- ],
- "source": [
- "aa.is_integer()"
- ]
- },
- {
- "cell_type": "markdown",
- "metadata": {},
- "source": [
- "Let's create another object `d` to see further examples of methods."
- ]
- },
- {
- "cell_type": "code",
- "execution_count": 60,
- "metadata": {},
- "outputs": [],
- "source": [
- "d = \"Python rocks\""
- ]
- },
- {
- "cell_type": "markdown",
- "metadata": {},
- "source": [
- "The type of `d` is `str`, which is short for \"**string**\" and is defined in computer science as a sequence of characters."
- ]
- },
- {
- "cell_type": "code",
- "execution_count": 61,
- "metadata": {},
- "outputs": [
- {
- "data": {
- "text/plain": [
- "str"
- ]
- },
- "execution_count": 61,
- "metadata": {},
- "output_type": "execute_result"
- }
- ],
- "source": [
- "type(d)"
- ]
- },
- {
- "cell_type": "markdown",
- "metadata": {},
- "source": [
- "`str` objects support various methods that \"make sense\" in the context of textual data, for example, the `.lower()` and `.upper()` methods."
- ]
- },
- {
- "cell_type": "code",
- "execution_count": 62,
- "metadata": {},
- "outputs": [
- {
- "data": {
- "text/plain": [
- "'python rocks'"
- ]
- },
- "execution_count": 62,
- "metadata": {},
- "output_type": "execute_result"
- }
- ],
- "source": [
- "d.lower()"
- ]
- },
- {
- "cell_type": "code",
- "execution_count": 63,
- "metadata": {},
- "outputs": [
- {
- "data": {
- "text/plain": [
- "'PYTHON ROCKS'"
- ]
- },
- "execution_count": 63,
- "metadata": {},
- "output_type": "execute_result"
- }
- ],
- "source": [
- "d.upper()"
- ]
- },
- {
- "cell_type": "markdown",
- "metadata": {},
- "source": [
- "### \"Complex\" Data Types"
- ]
- },
- {
- "cell_type": "markdown",
- "metadata": {},
- "source": [
- "The examples in the previous section are considered \"simple\" as they only model *scalar* values (i.e., an individual object per example). However, we have already seen an example of a more \"complex\" object, namely the `list` called `numbers` above."
- ]
- },
- {
- "cell_type": "code",
- "execution_count": 64,
- "metadata": {},
- "outputs": [
- {
- "data": {
- "text/plain": [
- "list"
- ]
- },
- "execution_count": 64,
- "metadata": {},
- "output_type": "execute_result"
- }
- ],
- "source": [
- "type(numbers)"
- ]
- },
- {
- "cell_type": "code",
- "execution_count": 65,
- "metadata": {},
- "outputs": [
- {
- "data": {
- "text/plain": [
- "[1, 2, 3, 4]"
- ]
- },
- "execution_count": 65,
- "metadata": {},
- "output_type": "execute_result"
- }
- ],
- "source": [
- "numbers"
- ]
- },
- {
- "cell_type": "markdown",
- "metadata": {},
- "source": [
- "`list` objects also come with their own methods, for example, the `.append()` method that adds another element at the end of a `list`."
- ]
- },
- {
- "cell_type": "code",
- "execution_count": 66,
- "metadata": {},
- "outputs": [],
- "source": [
- "numbers.append(5)"
- ]
- },
- {
- "cell_type": "markdown",
- "metadata": {},
- "source": [
- "Note how the `.append()` method does not lead to any output below the code cell. That is an indication that `numbers` is \"changed in place.\" The formal term for this property is **mutability**. A good working definition is: Any object whose value can be changed *after* its creation is a **mutable** object. Objects *without* this property are called **immutable**.\n",
- "\n",
- "An example of the latter is the `tuple` data type. `tuple`s are simply `list`s with the additional property that they cannot be changed. Everything else is the same as for `list`s. `tuple`s are created with parentheses replacing the brackets."
- ]
- },
- {
- "cell_type": "code",
- "execution_count": 67,
- "metadata": {},
- "outputs": [],
- "source": [
- "more_numbers = (7, 8, 9)"
- ]
- },
- {
- "cell_type": "markdown",
- "metadata": {},
- "source": [
- "`more_numbers` does not know about the `.append()` method."
- ]
- },
- {
- "cell_type": "code",
- "execution_count": 68,
- "metadata": {},
- "outputs": [
- {
- "ename": "AttributeError",
- "evalue": "'tuple' object has no attribute 'append'",
- "output_type": "error",
- "traceback": [
- "\u001b[0;31m---------------------------------------------------------------------------\u001b[0m",
- "\u001b[0;31mAttributeError\u001b[0m Traceback (most recent call last)",
- "\u001b[0;32m\u001b[0m in \u001b[0;36m\u001b[0;34m\u001b[0m\n\u001b[0;32m----> 1\u001b[0;31m \u001b[0mmore_numbers\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0mappend\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0;36m10\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0m",
- "\u001b[0;31mAttributeError\u001b[0m: 'tuple' object has no attribute 'append'"
- ]
- }
- ],
- "source": [
- "more_numbers.append(10)"
- ]
- },
- {
- "cell_type": "markdown",
- "metadata": {},
- "source": [
- "Whereas both `list` and `tuple` objects preserve the **order** of their elements, the `set` data type does not. Additionally, any object may be an element of a `set` at most once. The syntax to create `set`s uses curly braces, `{` and `}`. By giving up order, `set` objects offer significantly increased processing speed in various situations (e.g., when checking if an object is an element)."
- ]
- },
- {
- "cell_type": "code",
- "execution_count": 69,
- "metadata": {},
- "outputs": [],
- "source": [
- "other_numbers = {3, 3, 3, 2, 2, 1}"
- ]
- },
- {
- "cell_type": "code",
- "execution_count": 70,
- "metadata": {},
- "outputs": [
- {
- "data": {
- "text/plain": [
- "{1, 2, 3}"
- ]
- },
- "execution_count": 70,
- "metadata": {},
- "output_type": "execute_result"
- }
- ],
- "source": [
- "other_numbers"
- ]
- },
- {
- "cell_type": "markdown",
- "metadata": {},
- "source": [
- "One last example of a \"complex\" data type is the `dict`ionary type, which models a mapping relationship among the objects it contains. The syntax to create `dict`s also involves curly braces with the addition of using a `:` to specify the mapping relationships.\n",
- "\n",
- "For example, to map `int`egers to `str`ings modeling the English words corresponding to the numbers, we could write the following. The objects to the left of the `:` take the role of the **keys** while the ones to the right take the role of the **values**."
- ]
- },
- {
- "cell_type": "code",
- "execution_count": 71,
- "metadata": {},
- "outputs": [],
- "source": [
- "to_words = {\n",
- "    0: \"zero\",\n",
- "    1: \"one\",\n",
- "    2: \"two\",\n",
- "}"
- ]
- },
- {
- "cell_type": "markdown",
- "metadata": {},
- "source": [
- "The main purpose of `dict`s is to look up the value mapped to by some key. We can use the indexing notation to achieve that."
- ]
- },
- {
- "cell_type": "code",
- "execution_count": 72,
- "metadata": {},
- "outputs": [
- {
- "data": {
- "text/plain": [
- "'zero'"
- ]
- },
- "execution_count": 72,
- "metadata": {},
- "output_type": "execute_result"
- }
- ],
- "source": [
- "to_words[0]"
- ]
- },
- {
- "cell_type": "markdown",
- "metadata": {},
- "source": [
- "`dict`s are among the most optimized data types in the Python world and a major building block in codebases solving real-life problems."
- ]
- },
- {
- "cell_type": "markdown",
- "metadata": {},
- "source": [
- "A big factor in getting good at any programming language is to learn what data types to use in which situations. There is no \"best\" data type; choosing among a couple of data types always comes down to trade-offs."
- ]
- }
- ],
- "metadata": {
- "kernelspec": {
- "display_name": "Python 3",
- "language": "python",
- "name": "python3"
- },
- "language_info": {
- "codemirror_mode": {
- "name": "ipython",
- "version": 3
- },
- "file_extension": ".py",
- "mimetype": "text/x-python",
- "name": "python",
- "nbconvert_exporter": "python",
- "pygments_lexer": "ipython3",
- "version": "3.8.9"
- },
- "toc": {
- "base_numbering": 1,
- "nav_menu": {},
- "number_sections": false,
- "sideBar": true,
- "skip_h1_title": false,
- "title_cell": "Table of Contents",
- "title_sidebar": "Contents",
- "toc_cell": false,
- "toc_position": {},
- "toc_section_display": true,
- "toc_window_display": false
- }
- },
- "nbformat": 4,
- "nbformat_minor": 4
-}
diff --git a/00_python_in_a_nutshell/00_content_arithmetic.ipynb b/00_python_in_a_nutshell/00_content_arithmetic.ipynb
new file mode 100644
index 0000000..3a27936
--- /dev/null
+++ b/00_python_in_a_nutshell/00_content_arithmetic.ipynb
@@ -0,0 +1,571 @@
+{
+ "cells": [
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "**Note**: Click on \"*Kernel*\" > \"*Restart Kernel and Clear All Outputs*\" in [JupyterLab](https://jupyterlab.readthedocs.io/en/stable/) *before* reading this notebook to reset its output. If you cannot run this file on your machine, you may want to open it [in the cloud](https://mybinder.org/v2/gh/webartifex/intro-to-data-science/main?urlpath=lab/tree/00_python_in_a_nutshell/00_content_arithmetic.ipynb)."
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "# Chapter 0: Python in a Nutshell (Part 1)"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "Python itself is a so-called **general purpose** programming language. That means it does *not* know about any **scientific algorithms** \"out of the box.\"\n",
+ "\n",
+ "The purpose of this notebook is to summarize anything that is worthwhile knowing about Python and programming on a \"high level\" and lay the foundation for working with so-called **third-party libraries**, some of which we see in subsequent chapters."
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "## Basic Arithmetic"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "Any computer can be viewed as some sort of a \"fancy calculator\" and Python is no exception. The following code snippet, for example, does exactly what we would expect, namely *addition*."
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 1,
+ "metadata": {},
+ "outputs": [
+ {
+ "data": {
+ "text/plain": [
+ "3"
+ ]
+ },
+ "execution_count": 1,
+ "metadata": {},
+ "output_type": "execute_result"
+ }
+ ],
+ "source": [
+ "1 + 2"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "In terms of **syntax** (i.e., \"grammatical rules\"), digits are interpreted as plain numbers (i.e., a so-called **numerical literal**) and the `+` symbol constitutes a so-called **operator** that is built into Python.\n",
+ "\n",
+ "Other common operators are `-` for *subtraction*, `*` for *multiplication*, and `**` for *exponentiation*. In terms of arithmetic, Python allows the **chaining** of operations and adheres to conventions from math, namely the [PEMDAS rule](https://en.wikipedia.org/wiki/Order_of_operations#Mnemonics)."
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 2,
+ "metadata": {},
+ "outputs": [
+ {
+ "data": {
+ "text/plain": [
+ "45"
+ ]
+ },
+ "execution_count": 2,
+ "metadata": {},
+ "output_type": "execute_result"
+ }
+ ],
+ "source": [
+ "87 - 42"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 3,
+ "metadata": {},
+ "outputs": [
+ {
+ "data": {
+ "text/plain": [
+ "15"
+ ]
+ },
+ "execution_count": 3,
+ "metadata": {},
+ "output_type": "execute_result"
+ }
+ ],
+ "source": [
+ "3 * 5"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 4,
+ "metadata": {},
+ "outputs": [
+ {
+ "data": {
+ "text/plain": [
+ "8"
+ ]
+ },
+ "execution_count": 4,
+ "metadata": {},
+ "output_type": "execute_result"
+ }
+ ],
+ "source": [
+ "2 ** 3"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 5,
+ "metadata": {},
+ "outputs": [
+ {
+ "data": {
+ "text/plain": [
+ "16"
+ ]
+ },
+ "execution_count": 5,
+ "metadata": {},
+ "output_type": "execute_result"
+ }
+ ],
+ "source": [
+ "2 * 2 ** 3"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "To change the **order of precedence**, parentheses may be used for grouping. Syntactically, they are so-called **delimiters** that mark the beginning and the end of a **(sub-)expression** (i.e., a group of symbols that are **evaluated** together)."
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 6,
+ "metadata": {},
+ "outputs": [
+ {
+ "data": {
+ "text/plain": [
+ "64"
+ ]
+ },
+ "execution_count": 6,
+ "metadata": {},
+ "output_type": "execute_result"
+ }
+ ],
+ "source": [
+ "(2 * 2) ** 3"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "We must beware that some operators do *not* do what we expect. For example, `^` is *not* exponentiation in Python but the bitwise XOR operator, which is why the following code snippet evaluates to `1`."
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 7,
+ "metadata": {},
+ "outputs": [
+ {
+ "data": {
+ "text/plain": [
+ "1"
+ ]
+ },
+ "execution_count": 7,
+ "metadata": {},
+ "output_type": "execute_result"
+ }
+ ],
+ "source": [
+ "2 ^ 3"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "*Division* is also not as straightforward as we may think!\n",
+ "\n",
+ "While the `/` operator does *ordinary division*, we must note the subtlety of the `.0` in the result."
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 8,
+ "metadata": {},
+ "outputs": [
+ {
+ "data": {
+ "text/plain": [
+ "4.0"
+ ]
+ },
+ "execution_count": 8,
+ "metadata": {},
+ "output_type": "execute_result"
+ }
+ ],
+ "source": [
+ "8 / 2"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "Whereas both `4` and `4.0` have the *same* **semantic meaning** to us humans, they are two *different* \"things\" for a computer!\n",
+ "\n",
+ "Instead of using a single `/`, we may divide with a double `//` just as well."
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 9,
+ "metadata": {},
+ "outputs": [
+ {
+ "data": {
+ "text/plain": [
+ "4"
+ ]
+ },
+ "execution_count": 9,
+ "metadata": {},
+ "output_type": "execute_result"
+ }
+ ],
+ "source": [
+ "8 // 2"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "However, then we must be certain that the result is not a number with decimals other than `.0`. As we can guess from the result below, the `//` operator does *integer division* (i.e., \"whole number\" division)."
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 10,
+ "metadata": {},
+ "outputs": [
+ {
+ "data": {
+ "text/plain": [
+ "3"
+ ]
+ },
+ "execution_count": 10,
+ "metadata": {},
+ "output_type": "execute_result"
+ }
+ ],
+ "source": [
+ "7 // 2"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "By contrast, the `%` operator implements the so-called *modulo division* (i.e., it returns the \"remainder\" of a division). Here, a result of `0` indicates that a number is divisible by another one whereas any result other than `0` shows the opposite."
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 11,
+ "metadata": {},
+ "outputs": [
+ {
+ "data": {
+ "text/plain": [
+ "1"
+ ]
+ },
+ "execution_count": 11,
+ "metadata": {},
+ "output_type": "execute_result"
+ }
+ ],
+ "source": [
+ "7 % 2"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 12,
+ "metadata": {},
+ "outputs": [
+ {
+ "data": {
+ "text/plain": [
+ "0"
+ ]
+ },
+ "execution_count": 12,
+ "metadata": {},
+ "output_type": "execute_result"
+ }
+ ],
+ "source": [
+ "8 % 2"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "What makes Python such an intuitive and thus beginner-friendly language is the fact that it is a so-called **[interpreted language](https://en.wikipedia.org/wiki/Interpreter_%28computing%29)**. In layman's terms, this means that we can go back up and *re-execute* any of the code cells in *any order*: That allows us to build up code *incrementally*. So-called **[compiled languages](https://en.wikipedia.org/wiki/Compiler)**, on the other hand, would require us to run a program in its entirety even if only one small part has been changed.\n",
+ "\n",
+ "Instead of running individual code cells \"by hand\" and taking the result as it is, Python offers us the usage of **variables** to store \"values.\" A variable is created with the single `=` symbol, the so-called **assignment statement**."
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 13,
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "a = 1"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 14,
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "b = 2"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "After assignment, we can simply ask Python about the values of `a` and `b`."
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 15,
+ "metadata": {},
+ "outputs": [
+ {
+ "data": {
+ "text/plain": [
+ "1"
+ ]
+ },
+ "execution_count": 15,
+ "metadata": {},
+ "output_type": "execute_result"
+ }
+ ],
+ "source": [
+ "a"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 16,
+ "metadata": {},
+ "outputs": [
+ {
+ "data": {
+ "text/plain": [
+ "2"
+ ]
+ },
+ "execution_count": 16,
+ "metadata": {},
+ "output_type": "execute_result"
+ }
+ ],
+ "source": [
+ "b"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "Similarly, we can use a variable in place of, for example, a numerical literal within an expression."
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 17,
+ "metadata": {},
+ "outputs": [
+ {
+ "data": {
+ "text/plain": [
+ "3"
+ ]
+ },
+ "execution_count": 17,
+ "metadata": {},
+ "output_type": "execute_result"
+ }
+ ],
+ "source": [
+ "a + b"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "Also, we may combine several lines of code into a single code cell, adding as many empty lines as we wish to group the code. Then, all of the lines are executed from top to bottom in linear order whenever we execute the cell as a whole."
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 18,
+ "metadata": {},
+ "outputs": [
+ {
+ "data": {
+ "text/plain": [
+ "3"
+ ]
+ },
+ "execution_count": 18,
+ "metadata": {},
+ "output_type": "execute_result"
+ }
+ ],
+ "source": [
+ "a = 1\n",
+ "b = 2\n",
+ "\n",
+ "a + b"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "Something that fools many beginners is the fact that the `=` statement is *not* to be confused with the concept of an *equation* from math! An `=` statement is *always* to be interpreted from right to left.\n",
+ "\n",
+ "The following code snippet, for example, takes the \"old\" value of `a`, adds the value of `b` to it, and then stores the resulting `3` as the \"new\" value of `a`. After all, a variable is called a variable as its value is indeed variable!"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 19,
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "a = a + b"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 20,
+ "metadata": {},
+ "outputs": [
+ {
+ "data": {
+ "text/plain": [
+ "3"
+ ]
+ },
+ "execution_count": 20,
+ "metadata": {},
+ "output_type": "execute_result"
+ }
+ ],
+ "source": [
+ "a"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "In general, the result of some expression involving variables is often stored in yet another variable for further processing. This is how more realistic programs are built up."
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 21,
+ "metadata": {},
+ "outputs": [
+ {
+ "data": {
+ "text/plain": [
+ "3"
+ ]
+ },
+ "execution_count": 21,
+ "metadata": {},
+ "output_type": "execute_result"
+ }
+ ],
+ "source": [
+ "a = 1\n",
+ "b = 2\n",
+ "\n",
+ "c = a + b\n",
+ "\n",
+ "c"
+ ]
+ }
+ ],
+ "metadata": {
+ "kernelspec": {
+ "display_name": "Python 3 (ipykernel)",
+ "language": "python",
+ "name": "python3"
+ },
+ "language_info": {
+ "codemirror_mode": {
+ "name": "ipython",
+ "version": 3
+ },
+ "file_extension": ".py",
+ "mimetype": "text/x-python",
+ "name": "python",
+ "nbconvert_exporter": "python",
+ "pygments_lexer": "ipython3",
+ "version": "3.8.12"
+ },
+ "toc": {
+ "base_numbering": 1,
+ "nav_menu": {},
+ "number_sections": false,
+ "sideBar": true,
+ "skip_h1_title": false,
+ "title_cell": "Table of Contents",
+ "title_sidebar": "Contents",
+ "toc_cell": false,
+ "toc_position": {},
+ "toc_section_display": true,
+ "toc_window_display": false
+ }
+ },
+ "nbformat": 4,
+ "nbformat_minor": 4
+}
diff --git a/00_python_in_a_nutshell/01_exercises_calculator.ipynb b/00_python_in_a_nutshell/01_exercises_calculator.ipynb
new file mode 100644
index 0000000..758aa47
--- /dev/null
+++ b/00_python_in_a_nutshell/01_exercises_calculator.ipynb
@@ -0,0 +1,184 @@
+{
+ "cells": [
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "**Note**: Click on \"*Kernel*\" > \"*Restart Kernel and Run All*\" in [JupyterLab](https://jupyterlab.readthedocs.io/en/stable/) *after* finishing the exercises to ensure that your solution runs top to bottom *without* any errors. If you cannot run this file on your machine, you may want to open it [in the cloud](https://mybinder.org/v2/gh/webartifex/intro-to-data-science/main?urlpath=lab/tree/00_python_in_a_nutshell/01_exercises_calculator.ipynb)."
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "# Chapter 0: Python in a Nutshell (Coding Exercises)"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "The exercises below assume that you have read the preceding content sections.\n",
+ "\n",
+ "The `...`'s in the code cells indicate where you need to fill in code snippets. The number of `...`'s within a code cell gives you a rough idea of how many lines of code are needed to solve the task. You should not need to create any additional code cells for your final solution. However, you may want to use temporary code cells to try out some ideas."
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "## Python as a Calculator"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "The [volume of a sphere](https://en.wikipedia.org/wiki/Sphere) is defined as $\\frac{4}{3} * \\pi * r^3$.\n",
+ "\n",
+ "**Q1**: Calculate it for `r = 2.88` and approximate $\\pi$ with `pi = 3.14`!"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "pi = 3.14\n",
+ "r = 2.88"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "..."
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "While Python may be used as a calculator, it behaves a bit differently compared to calculator apps that phones or computers come with and that we are accustomed to.\n",
+ "\n",
+ "A major difference is that Python \"forgets\" intermediate results that are not assigned to variables. On the contrary, the calculators we work with outside of programming always keep the last result and allow us to use it as the first input for the next calculation.\n",
+ "\n",
+ "One way to keep on working with intermediate results in Python is to write the entire calculation as just *one* big expression that is composed of many sub-expressions representing the individual steps in our overall calculation.\n",
+ "\n",
+ "**Q2.1**: Given `a` and `b` like below, subtract the smaller `a` from the larger `b`, divide the difference by `9`, and raise the result to the power of `2`! Use operators that preserve the `int` type of the final result! The entire calculation *must* be placed within *one* code cell.\n",
+ "\n",
+ "Hint: You may need to group sub-expressions with parentheses `(` and `)`."
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "a = 42\n",
+ "b = 87"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "..."
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "The code cell below contains nothing but a single underscore `_`. In both a Python command-line prompt and Jupyter notebooks, the variable `_` is automatically updated and always references the object that the *last* executed expression evaluated to.\n",
+ "\n",
+ "**Q2.2**: Execute the code cell below! It should evaluate to the *same* result as the previous code cell (i.e., your answer to **Q2.1** assuming you go through this notebook in order)."
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "_"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "**Q2.3**: Implement the same overall calculation as in your answer to **Q2.1** in several independent steps (i.e., code cells)! Use only *one* operator per code cell!\n",
+ "\n",
+ "Hint: You should need *two* more code cells after the `b - a` one immediately below. If you *need* to use parentheses, you must be doing something wrong."
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "b - a"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "_ ..."
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "_ ..."
+ ]
+ }
+ ],
+ "metadata": {
+ "kernelspec": {
+ "display_name": "Python 3 (ipykernel)",
+ "language": "python",
+ "name": "python3"
+ },
+ "language_info": {
+ "codemirror_mode": {
+ "name": "ipython",
+ "version": 3
+ },
+ "file_extension": ".py",
+ "mimetype": "text/x-python",
+ "name": "python",
+ "nbconvert_exporter": "python",
+ "pygments_lexer": "ipython3",
+ "version": "3.8.12"
+ },
+ "toc": {
+ "base_numbering": 1,
+ "nav_menu": {},
+ "number_sections": false,
+ "sideBar": true,
+ "skip_h1_title": true,
+ "title_cell": "Table of Contents",
+ "title_sidebar": "Contents",
+ "toc_cell": false,
+ "toc_position": {},
+ "toc_section_display": false,
+ "toc_window_display": false
+ }
+ },
+ "nbformat": 4,
+ "nbformat_minor": 4
+}
diff --git a/00_python_in_a_nutshell/02_content_logic.ipynb b/00_python_in_a_nutshell/02_content_logic.ipynb
new file mode 100644
index 0000000..43a5c5f
--- /dev/null
+++ b/00_python_in_a_nutshell/02_content_logic.ipynb
@@ -0,0 +1,858 @@
+{
+ "cells": [
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "**Note**: Click on \"*Kernel*\" > \"*Restart Kernel and Clear All Outputs*\" in [JupyterLab](https://jupyterlab.readthedocs.io/en/stable/) *before* reading this notebook to reset its output. If you cannot run this file on your machine, you may want to open it [in the cloud ](https://mybinder.org/v2/gh/webartifex/intro-to-data-science/main?urlpath=lab/tree/00_python_in_a_nutshell/02_content_logic.ipynb)."
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "# Chapter 0: Python in a Nutshell (Part 2)"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "In the previous section, we only looked at **scalars** (i.e., a variable referencing one number at a time). However, that is not the only kind of data a computer can hold in its memory. In the section below, we look at how computers process many numbers in a generic fashion."
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "## Non-Scalar Data"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "As most real-life projects involve *non-scalar* data, we take a preliminary look at how Python models `list`-like data next. Intuitively, a `list` can be thought of as a **container** holding many \"things.\"\n",
+ "\n",
+ "The syntax to create a `list` uses brackets, `[` and `]`, another example of delimiters, with the individual **elements** of the `list` listed in between them and separated by commas.\n",
+ "\n",
+ "For example, the next code snippet creates a `list` named `numbers` with the numbers `1`, `2`, `3`, and `4` in it."
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 1,
+ "metadata": {},
+ "outputs": [
+ {
+ "data": {
+ "text/plain": [
+ "[1, 2, 3, 4]"
+ ]
+ },
+ "execution_count": 1,
+ "metadata": {},
+ "output_type": "execute_result"
+ }
+ ],
+ "source": [
+ "numbers = [1, 2, 3, 4]\n",
+ "\n",
+ "numbers"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "Whenever we use any kind of delimiter, we may break the lines in between them as we wish and add other so-called **whitespace** characters like spaces to format how the code looks. So, the following two code cells do *exactly* the same as the previous one; even the `,` after the `4` in the second cell is ignored."
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 2,
+ "metadata": {},
+ "outputs": [
+ {
+ "data": {
+ "text/plain": [
+ "[1, 2, 3, 4]"
+ ]
+ },
+ "execution_count": 2,
+ "metadata": {},
+ "output_type": "execute_result"
+ }
+ ],
+ "source": [
+ "numbers = [\n",
+ " 1, 2, 3, 4\n",
+ "]\n",
+ "\n",
+ "numbers"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 3,
+ "metadata": {},
+ "outputs": [
+ {
+ "data": {
+ "text/plain": [
+ "[1, 2, 3, 4]"
+ ]
+ },
+ "execution_count": 3,
+ "metadata": {},
+ "output_type": "execute_result"
+ }
+ ],
+ "source": [
+ "numbers = [\n",
+ " 1,\n",
+ " 2,\n",
+ " 3,\n",
+ " 4,\n",
+ "]\n",
+ "\n",
+ "numbers"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "A nice thing to know is that JupyterLab comes with **tab completion** built in. That means we do not have to type out the name `numbers` as a whole. Try it out by simply typing `num` and then hitting the tab key on your keyboard. JupyterLab should complete the variable into `numbers`."
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "num"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "A natural operation to do with `list`s is to **access** their elements. That is achieved with another operator that also uses a bracket notation. Each element is associated with an **index**, which is why we say that we \"index into a `list`.\" As with many other programming languages, Python is 0-based, which simply means that whenever we count something, we start to count at `0`.\n",
+ "\n",
+ "For example, to obtain the first element in `numbers`, we write the following."
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 4,
+ "metadata": {},
+ "outputs": [
+ {
+ "data": {
+ "text/plain": [
+ "1"
+ ]
+ },
+ "execution_count": 4,
+ "metadata": {},
+ "output_type": "execute_result"
+ }
+ ],
+ "source": [
+ "numbers[0]"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "Note that the indexing operation implicitly assumes an **order** among the elements, which is quite intuitive as we specified the numbers in order above.\n",
+ "\n",
+ "Another implicit assumption behind `list`s is that the number of elements is *finite*. Because of that, we may use negative indices starting at `-1` to obtain an element in right-to-left order.\n",
+ "\n",
+ "So, to obtain the last element in `numbers`, we write the following."
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 5,
+ "metadata": {},
+ "outputs": [
+ {
+ "data": {
+ "text/plain": [
+ "4"
+ ]
+ },
+ "execution_count": 5,
+ "metadata": {},
+ "output_type": "execute_result"
+ }
+ ],
+ "source": [
+ "numbers[-1]"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "`list` objects are **mutable**: We may change *parts* of them *after* they are created. That behavior is *not* a given for many other **types** of objects.\n",
+ "\n",
+ "For example, to exchange the first and the last element in `numbers`, we assign new objects to the corresponding indexes."
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 6,
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "numbers[0] = 4"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 7,
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "numbers[3] = 1"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 8,
+ "metadata": {},
+ "outputs": [
+ {
+ "data": {
+ "text/plain": [
+ "[4, 2, 3, 1]"
+ ]
+ },
+ "execution_count": 8,
+ "metadata": {},
+ "output_type": "execute_result"
+ }
+ ],
+ "source": [
+ "numbers"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "To \"flip\" the values of two variables or indexes, we may also use the following notation."
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 9,
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "numbers[0], numbers[3] = numbers[3], numbers[0]"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 10,
+ "metadata": {},
+ "outputs": [
+ {
+ "data": {
+ "text/plain": [
+ "[1, 2, 3, 4]"
+ ]
+ },
+ "execution_count": 10,
+ "metadata": {},
+ "output_type": "execute_result"
+ }
+ ],
+ "source": [
+ "numbers"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "## Expressing Business Logic"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "The main point of using `list`s in Python is to write code that does \"something\" for each element in the `list`, which may hold large amounts of data. Expressing the logic of a problem from the real world in code, the \"something\" part, is subsumed by the term [business logic ](https://en.wikipedia.org/wiki/Business_logic), which has *nothing* to do with businesses that make money.\n",
+ "\n",
+ "There are two aspects to business logic:\n",
+ "1. Execute some lines of code many times, and\n",
+ "2. execute some lines of code only if a certain **condition** applies.\n",
+ "\n",
+ "Both of these aspects come in many variants and may be combined in basically any arbitrary fashion."
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "### Iterative Execution"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "**Iteration** is the generic idea of executing code repeatedly. Most programming languages provide dedicated constructs to achieve that. In Python, the easiest such construct is the so-called `for`-loop."
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "#### The `for` Loop"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "A `for`-loop consists of two parts:\n",
+ "- a **header** line specifying what is looped over, and\n",
+ "- a **body** consisting of the **block of code** that is repeated for each element.\n",
+ "\n",
+ "In the example below, `for number in numbers:` constitutes the header. The expression after the `in` references the \"thing\" that is looped over (here: a `list` of `numbers`) and the name between `for` and `in` becomes a variable that is assigned a new value in each **iteration** of the loop. A best practice is to use a meaningful name, which is why we choose the singular `number`. The `:` at the end is the characteristic symbol of a header line in general and requires the next line (and possibly many more lines) to be **indented**.\n",
+ "\n",
+ "The indented line constitutes the `for`-loop's body. In the example, we simply take each of the numbers in `numbers`, one at a time, and add it to a `total` that is initialized at `0`. In other words, we calculate the sum of all the elements in `numbers`.\n",
+ "\n",
+ "Many beginners struggle with the term \"loop.\" To visualize the looping behavior of this code, we use the online tool [PythonTutor ](http://pythontutor.com/visualize.html#code=numbers%20%3D%20%5B1,%202,%203,%204%5D%0A%0Atotal%20%3D%200%0A%0Afor%20number%20in%20numbers%3A%0A%20%20%20%20total%20%3D%20total%20%2B%20number%0A%0Atotal&cumulative=false&curInstr=0&heapPrimitives=nevernest&mode=display&origin=opt-frontend.js&py=3&rawInputLstJSON=%5B%5D&textReferences=false). That tool is helpful for two reasons:\n",
+ "1. It allows us to execute code in \"slow motion\" (i.e., by clicking the \"next\" button on the left side, only the next atomic step of the code snippet is executed).\n",
+ "2. It shows what happens inside the computer's memory on the right-hand side (cf., the \"*Thinking like a Computer*\" section further below)."
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 11,
+ "metadata": {},
+ "outputs": [
+ {
+ "data": {
+ "text/plain": [
+ "10"
+ ]
+ },
+ "execution_count": 11,
+ "metadata": {},
+ "output_type": "execute_result"
+ }
+ ],
+ "source": [
+ "total = 0\n",
+ "\n",
+ "for number in numbers:\n",
+ "    total = total + number\n",
+ "\n",
+ "total"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "Python is pretty agnostic about how far the `for`-loop's body is indented. So, both of the next code cells are equivalent to the one above. Yet, a popular convention in the Python world is to always indent code with 4 spaces per indentation level."
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 12,
+ "metadata": {},
+ "outputs": [
+ {
+ "data": {
+ "text/plain": [
+ "10"
+ ]
+ },
+ "execution_count": 12,
+ "metadata": {},
+ "output_type": "execute_result"
+ }
+ ],
+ "source": [
+ "total = 0\n",
+ "\n",
+ "for number in numbers:\n",
+ "  total = total + number\n",
+ "\n",
+ "total"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 13,
+ "metadata": {},
+ "outputs": [
+ {
+ "data": {
+ "text/plain": [
+ "10"
+ ]
+ },
+ "execution_count": 13,
+ "metadata": {},
+ "output_type": "execute_result"
+ }
+ ],
+ "source": [
+ "total = 0\n",
+ "\n",
+ "for number in numbers:\n",
+ "        total = total + number\n",
+ "\n",
+ "total"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "### Conditional Execution"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "As a variation, let's add up only the even numbers. To achieve that, we exploit the fact that even numbers are all numbers that are divisible by `2` and use the `%` operator from before and a new one, namely the `==` operator for *equality comparison*, to express that idea."
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 14,
+ "metadata": {},
+ "outputs": [
+ {
+ "data": {
+ "text/plain": [
+ "1"
+ ]
+ },
+ "execution_count": 14,
+ "metadata": {},
+ "output_type": "execute_result"
+ }
+ ],
+ "source": [
+ "7 % 2"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 15,
+ "metadata": {},
+ "outputs": [
+ {
+ "data": {
+ "text/plain": [
+ "0"
+ ]
+ },
+ "execution_count": 15,
+ "metadata": {},
+ "output_type": "execute_result"
+ }
+ ],
+ "source": [
+ "8 % 2"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "Whenever *arithmetic* operators like `%` are combined with *relational* operators like `==`, the arithmetic ones are evaluated first. So, in the two cells below, we first obtain the remainder after dividing `7` and `8` by `2` and then compare that to `0`. The result is a so-called **boolean**, either `True` or `False`, which is a computer's way of saying \"yes\" or \"no.\""
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 16,
+ "metadata": {},
+ "outputs": [
+ {
+ "data": {
+ "text/plain": [
+ "False"
+ ]
+ },
+ "execution_count": 16,
+ "metadata": {},
+ "output_type": "execute_result"
+ }
+ ],
+ "source": [
+ "7 % 2 == 0"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 17,
+ "metadata": {},
+ "outputs": [
+ {
+ "data": {
+ "text/plain": [
+ "True"
+ ]
+ },
+ "execution_count": 17,
+ "metadata": {},
+ "output_type": "execute_result"
+ }
+ ],
+ "source": [
+ "8 % 2 == 0"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "Other relational operators are `!=` to test inequality and `<`, `<=`, `>`, and `>=` to check whether the left or right side is smaller or larger."
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "#### The `if` Statement"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "We use this kind of expression as the **condition** in an `if` statement that constitutes a second layer within our `for`-loop implementation. An `if` statement itself consists of yet another header line with a body. That body's code is only executed if the condition is `True`.\n",
+ "\n",
+ "As an example, the next code snippet loops over all the elements in `numbers` and, for each individual `number`, checks if it is even. Only if that is the case is the `number` added to the `total`. Otherwise, nothing is done with the `number`. The example also shows how we can add so-called **comments** at the end of a line: Anything that comes after the `#` symbol is disregarded by Python. We use such comments to put little notes to ourselves within the code."
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 18,
+ "metadata": {},
+ "outputs": [
+ {
+ "data": {
+ "text/plain": [
+ "6"
+ ]
+ },
+ "execution_count": 18,
+ "metadata": {},
+ "output_type": "execute_result"
+ }
+ ],
+ "source": [
+ "total = 0\n",
+ "\n",
+ "for number in numbers:\n",
+ "    if number % 2 == 0:  # if the number is even\n",
+ "        total = total + number\n",
+ "\n",
+ "total"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "#### The `else` Clause"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "`if` statements may have more than one header line: For example, the code in the `else`-clause's body is only executed if the condition in the `if`-clause is `False`. In the code cell below, we calculate the sum of all even numbers and subtract the sum of all odd numbers. The result is `(2 + 4) - (1 + 3)`, or `-1 + 2 - 3 + 4` resembling the order of the numbers in the `for`-loop."
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 19,
+ "metadata": {},
+ "outputs": [
+ {
+ "data": {
+ "text/plain": [
+ "2"
+ ]
+ },
+ "execution_count": 19,
+ "metadata": {},
+ "output_type": "execute_result"
+ }
+ ],
+ "source": [
+ "total = 0\n",
+ "\n",
+ "for number in numbers:\n",
+ "    if number % 2 == 0:  # if the number is even\n",
+ "        total = total + number\n",
+ "    else:  # if the number is odd\n",
+ "        total = total - number\n",
+ "\n",
+ "total"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "A **function** (cf., the \"*Built-in Functions*\" section further below) that comes in handy with `for`-loops is [print() ](https://docs.python.org/3/library/functions.html#print), which simply \"prints\" out (i.e., \"shows on the screen\") whatever **input** we give it.\n",
+ "\n",
+ "In the next example, we loop over the numbers from `1` to `10` and print out either half a `number` or three times a `number` plus 1, depending on whether the `number` is even or odd."
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 20,
+ "metadata": {},
+ "outputs": [
+ {
+ "name": "stdout",
+ "output_type": "stream",
+ "text": [
+ "4\n",
+ "1\n",
+ "10\n",
+ "2\n",
+ "16\n",
+ "3\n",
+ "22\n",
+ "4\n",
+ "28\n",
+ "5\n"
+ ]
+ }
+ ],
+ "source": [
+ "for number in [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]:\n",
+ "    if number % 2 == 0:\n",
+ "        print(number // 2)\n",
+ "    else:\n",
+ "        print(3 * number + 1)"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "To save ourselves writing out all the numbers, we may also use the [range() ](https://docs.python.org/3/library/functions.html#func-range) built-in, which, in the example, takes two inputs separated by a comma: a `start` number that is included and a `stop` number that is *not* included."
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 21,
+ "metadata": {},
+ "outputs": [
+ {
+ "name": "stdout",
+ "output_type": "stream",
+ "text": [
+ "4\n",
+ "1\n",
+ "10\n",
+ "2\n",
+ "16\n",
+ "3\n",
+ "22\n",
+ "4\n",
+ "28\n",
+ "5\n"
+ ]
+ }
+ ],
+ "source": [
+ "for number in range(1, 11):\n",
+ "    if number % 2 == 0:\n",
+ "        print(number // 2)\n",
+ "    else:\n",
+ "        print(3 * number + 1)"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "#### The `elif` Clause"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "If we need to check for *several* **alternatives** (i.e., different conditions), we may add an arbitrary number of `elif`-clauses to an `if` statement.\n",
+ "\n",
+ "In the next example, we print out messages indicating whether a `number` is divisible by `2`, by `3`, or by neither of the two.\n",
+ "\n",
+ "Note that [print() ](https://docs.python.org/3/library/functions.html#print) may take several inputs as well. The `\"...\"` notation is Python's way of modeling **textual data**."
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 22,
+ "metadata": {},
+ "outputs": [
+ {
+ "name": "stdout",
+ "output_type": "stream",
+ "text": [
+ "1 is divisible by neither 2 nor 3\n",
+ "2 is divisible by 2\n",
+ "3 is divisible by 3\n",
+ "4 is divisible by 2\n",
+ "5 is divisible by neither 2 nor 3\n",
+ "6 is divisible by 2\n",
+ "7 is divisible by neither 2 nor 3\n",
+ "8 is divisible by 2\n",
+ "9 is divisible by 3\n",
+ "10 is divisible by 2\n"
+ ]
+ }
+ ],
+ "source": [
+ "for number in range(1, 11):\n",
+ "    if number % 2 == 0:\n",
+ "        print(number, \"is divisible by 2\")\n",
+ "    elif number % 3 == 0:\n",
+ "        print(number, \"is divisible by 3\")\n",
+ "    else:\n",
+ "        print(number, \"is divisible by neither 2 nor 3\")"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "It is noteworthy that only the *first* block of code whose condition is `True` is executed!\n",
+ "\n",
+ "So, we must be careful not to make any logical errors: In the example below, we *never* reach the alternative where the `number` is divisible by `4` because whenever a `number` is divisible by `4`, it is always divisible by `2` as well."
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 23,
+ "metadata": {},
+ "outputs": [
+ {
+ "name": "stdout",
+ "output_type": "stream",
+ "text": [
+ "1 is divisible by neither 2, 3, nor 4\n",
+ "2 is divisible by 2\n",
+ "3 is divisible by 3\n",
+ "4 is divisible by 2\n",
+ "5 is divisible by neither 2, 3, nor 4\n",
+ "6 is divisible by 2\n",
+ "7 is divisible by neither 2, 3, nor 4\n",
+ "8 is divisible by 2\n",
+ "9 is divisible by 3\n",
+ "10 is divisible by 2\n"
+ ]
+ }
+ ],
+ "source": [
+ "for number in range(1, 11):\n",
+ "    if number % 2 == 0:\n",
+ "        print(number, \"is divisible by 2\")\n",
+ "    elif number % 3 == 0:\n",
+ "        print(number, \"is divisible by 3\")\n",
+ "    elif number % 4 == 0:\n",
+ "        print(number, \"is divisible by 4\")\n",
+ "    else:\n",
+ "        print(number, \"is divisible by neither 2, 3, nor 4\")"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
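+ "By re-arranging the order of the `if`- and `elif`-clauses so that the most specific condition is checked first, we obtain the intended output: Each message names the largest of `2`, `3`, and `4` by which the `number` is divisible."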
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 24,
+ "metadata": {},
+ "outputs": [
+ {
+ "name": "stdout",
+ "output_type": "stream",
+ "text": [
+ "1 is divisible by neither 2, 3, nor 4\n",
+ "2 is divisible by 2\n",
+ "3 is divisible by 3\n",
+ "4 is divisible by 4\n",
+ "5 is divisible by neither 2, 3, nor 4\n",
+ "6 is divisible by 3\n",
+ "7 is divisible by neither 2, 3, nor 4\n",
+ "8 is divisible by 4\n",
+ "9 is divisible by 3\n",
+ "10 is divisible by 2\n"
+ ]
+ }
+ ],
+ "source": [
+ "for number in range(1, 11):\n",
+ "    if number % 4 == 0:\n",
+ "        print(number, \"is divisible by 4\")\n",
+ "    elif number % 3 == 0:\n",
+ "        print(number, \"is divisible by 3\")\n",
+ "    elif number % 2 == 0:\n",
+ "        print(number, \"is divisible by 2\")\n",
+ "    else:\n",
+ "        print(number, \"is divisible by neither 2, 3, nor 4\")"
+ ]
+ }
+ ],
+ "metadata": {
+ "kernelspec": {
+ "display_name": "Python 3 (ipykernel)",
+ "language": "python",
+ "name": "python3"
+ },
+ "language_info": {
+ "codemirror_mode": {
+ "name": "ipython",
+ "version": 3
+ },
+ "file_extension": ".py",
+ "mimetype": "text/x-python",
+ "name": "python",
+ "nbconvert_exporter": "python",
+ "pygments_lexer": "ipython3",
+ "version": "3.8.12"
+ },
+ "toc": {
+ "base_numbering": 1,
+ "nav_menu": {},
+ "number_sections": false,
+ "sideBar": true,
+ "skip_h1_title": false,
+ "title_cell": "Table of Contents",
+ "title_sidebar": "Contents",
+ "toc_cell": false,
+ "toc_position": {},
+ "toc_section_display": true,
+ "toc_window_display": false
+ }
+ },
+ "nbformat": 4,
+ "nbformat_minor": 4
+}
diff --git a/00_python_in_a_nutshell/03_exercises_loops.ipynb b/00_python_in_a_nutshell/03_exercises_loops.ipynb
new file mode 100644
index 0000000..58fc7ad
--- /dev/null
+++ b/00_python_in_a_nutshell/03_exercises_loops.ipynb
@@ -0,0 +1,213 @@
+{
+ "cells": [
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "**Note**: Click on \"*Kernel*\" > \"*Restart Kernel and Run All*\" in [JupyterLab](https://jupyterlab.readthedocs.io/en/stable/) *after* finishing the exercises to ensure that your solution runs top to bottom *without* any errors. If you cannot run this file on your machine, you may want to open it [in the cloud ](https://mybinder.org/v2/gh/webartifex/intro-to-data-science/main?urlpath=lab/tree/00_python_in_a_nutshell/03_exercises_loops.ipynb)."
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "# Chapter 0: Python in a Nutshell (Coding Exercises)"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "The exercises below assume that you have read the preceding content sections.\n",
+ "\n",
+ "The `...`'s in the code cells indicate where you need to fill in code snippets. The number of `...`'s within a code cell gives you a rough idea of how many lines of code are needed to solve the task. You should not need to create any additional code cells for your final solution. However, you may want to use temporary code cells to try out some ideas."
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "## Simple Loops"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "`for`-loops are extremely versatile in Python. That is different from many other programming languages.\n",
+ "\n",
+ "Let's create a `list` holding the numbers from `1` to `12` in an unordered fashion, like `numbers` below, loop over the numbers on a one-by-one basis, and implement some simple **filter** logic."
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "numbers = [7, 11, 8, 5, 3, 12, 2, 6, 9, 10, 1, 4]"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "**Q1**: Fill in the *condition* in the `if` statement such that only numbers divisible by `3` are printed!"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "for number in numbers:\n",
+ "    if ...:\n",
+ "        print(...)"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "An easy way to loop over a `list` in a sorted manner is to wrap it with the built-in [sorted() ](https://docs.python.org/3/library/functions.html#sorted) function.\n",
+ "\n",
+ "**Q2**: Fill in the condition of the `if` statement such that only odd numbers are printed out!"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "for number in sorted(numbers):\n",
+ "    if ...:\n",
+ "        print(...)"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "Whenever we want to loop over numbers representing a [series ](https://en.wikipedia.org/wiki/Series_%28mathematics%29) in the mathematical sense (i.e., a rule to calculate the next number from its predecessor), we may be able to use the [range() ](https://docs.python.org/3/library/functions.html#func-range) built-in.\n",
+ "\n",
+ "For example, to loop over the whole numbers from `0` to `9` (both inclusive) in order, we could write them out in a `list` like in the following task.\n",
+ "\n",
+ "**Q3**: Fill in the call to the [print() ](https://docs.python.org/3/library/functions.html#print) function such that the squares of the `numbers` are printed out!"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "for number in [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]:\n",
+ " print(...)"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "**Q4**: Read the documentation on the [range() ](https://docs.python.org/3/library/functions.html#func-range) built-in! It may be used with either one, two, or three inputs. What do `start`, `stop`, and `step` mean? Fill in the calls to [range() ](https://docs.python.org/3/library/functions.html#func-range) and [print() ](https://docs.python.org/3/library/functions.html#print) to mimic the output of **Q3**!"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "for number in range(...):\n",
+ " print(...)"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "**Q5**: Fill in the calls to [range() ](https://docs.python.org/3/library/functions.html#func-range) and [print() ](https://docs.python.org/3/library/functions.html#print) to print out *all* numbers from `1` to `10` (both inclusive)!"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "for number in range(...):\n",
+ " print(...)"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "**Q6**: Fill in the calls to [range() ](https://docs.python.org/3/library/functions.html#func-range) and [print() ](https://docs.python.org/3/library/functions.html#print) to print out the *even* numbers from `1` to `10` (both inclusive)! Do *not* use an `if` statement to accomplish this!"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "for number in range(...):\n",
+ " print(...)"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "**Q7**: Fill in the calls to [range() ](https://docs.python.org/3/library/functions.html#func-range) and [print() ](https://docs.python.org/3/library/functions.html#print) to print out the *odd* numbers from `10` to `1` (both inclusive) going backwards!"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "for number in range(...):\n",
+ " print(...)"
+ ]
+ }
+ ],
+ "metadata": {
+ "kernelspec": {
+ "display_name": "Python 3 (ipykernel)",
+ "language": "python",
+ "name": "python3"
+ },
+ "language_info": {
+ "codemirror_mode": {
+ "name": "ipython",
+ "version": 3
+ },
+ "file_extension": ".py",
+ "mimetype": "text/x-python",
+ "name": "python",
+ "nbconvert_exporter": "python",
+ "pygments_lexer": "ipython3",
+ "version": "3.8.12"
+ },
+ "toc": {
+ "base_numbering": 1,
+ "nav_menu": {},
+ "number_sections": false,
+ "sideBar": true,
+ "skip_h1_title": true,
+ "title_cell": "Table of Contents",
+ "title_sidebar": "Contents",
+ "toc_cell": false,
+ "toc_position": {},
+ "toc_section_display": false,
+ "toc_window_display": false
+ }
+ },
+ "nbformat": 4,
+ "nbformat_minor": 4
+}
diff --git a/00_python_in_a_nutshell/04_exercises_fizz_buzz.ipynb b/00_python_in_a_nutshell/04_exercises_fizz_buzz.ipynb
new file mode 100644
index 0000000..57df84f
--- /dev/null
+++ b/00_python_in_a_nutshell/04_exercises_fizz_buzz.ipynb
@@ -0,0 +1,147 @@
+{
+ "cells": [
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "**Note**: Click on \"*Kernel*\" > \"*Restart Kernel and Run All*\" in [JupyterLab](https://jupyterlab.readthedocs.io/en/stable/) *after* finishing the exercises to ensure that your solution runs top to bottom *without* any errors. If you cannot run this file on your machine, you may want to open it [in the cloud ](https://mybinder.org/v2/gh/webartifex/intro-to-data-science/main?urlpath=lab/tree/00_python_in_a_nutshell/04_exercises_fizz_buzz.ipynb)."
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "# Chapter 0: Python in a Nutshell (Coding Exercises)"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "The exercises below assume that you have read the preceding content sections.\n",
+ "\n",
+ "The `...`'s in the code cells indicate where you need to fill in code snippets. The number of `...`'s within a code cell gives you a rough idea of how many lines of code are needed to solve the task. You should not need to create any additional code cells for your final solution. However, you may want to use temporary code cells to try out some ideas."
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "## Fizz Buzz"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "The kids' game [Fizz Buzz ](https://en.wikipedia.org/wiki/Fizz_buzz) is said to be often used in job interviews for entry-level positions. However, opinions vary as to how good a test it is (cf., [source ](https://news.ycombinator.com/item?id=16446774)).\n",
+ "\n",
+ "In its simplest form, a group of people starts counting upwards in an alternating fashion. Whenever a number is divisible by $3$, the person must say \"Fizz\" instead of the number. The same holds for numbers divisible by $5$ when the person must say \"Buzz.\" If a number is divisible by both numbers, one must say \"FizzBuzz.\" Probably, this game would also make a good drinking game with the \"right\" beverages."
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "**Q1**: Complete the code cell below implementing the Fizz Buzz game with some `if`-`elif`-`else` logic according to the rules above for the numbers from `1` to `33`! The cell should simply print out either the numbers or the words \"Fizz,\" \"Buzz,\" or \"FizzBuzz.\"\n",
+ "\n",
+ "Hints:\n",
+ "- The *order* of the **conditions** is important\n",
+ "- By what single number can all numbers divisible by both `3` and `5` be divided?"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "for number in range(...):\n",
+ "    ...\n",
+ "    ...\n",
+ "    ...\n",
+ "    ...\n",
+ "    ...\n",
+ "    ...\n",
+ "    ...\n",
+ "    ..."
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "**Q2.1**: Re-write your solution from **Q1** such that the [print() ](https://docs.python.org/3/library/functions.html#print) function is executed only once!\n",
+ "\n",
+ "Hints:\n",
+ "- You can also store **text data** in variables\n",
+ "- It may be helpful to *overwrite* the `out` variable"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "for out in range(...):\n",
+ "    ...\n",
+ "    ...\n",
+ "    ...\n",
+ "    ...\n",
+ "    ...\n",
+ "    ...\n",
+ "\n",
+ "    print(out)"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "**Q2.2**: What may be advantages of writing code like in the solution to **Q2.1**?"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ " < your answer >"
+ ]
+ }
+ ],
+ "metadata": {
+ "kernelspec": {
+ "display_name": "Python 3 (ipykernel)",
+ "language": "python",
+ "name": "python3"
+ },
+ "language_info": {
+ "codemirror_mode": {
+ "name": "ipython",
+ "version": 3
+ },
+ "file_extension": ".py",
+ "mimetype": "text/x-python",
+ "name": "python",
+ "nbconvert_exporter": "python",
+ "pygments_lexer": "ipython3",
+ "version": "3.8.12"
+ },
+ "toc": {
+ "base_numbering": 1,
+ "nav_menu": {},
+ "number_sections": false,
+ "sideBar": true,
+ "skip_h1_title": true,
+ "title_cell": "Table of Contents",
+ "title_sidebar": "Contents",
+ "toc_cell": false,
+ "toc_position": {},
+ "toc_section_display": false,
+ "toc_window_display": false
+ }
+ },
+ "nbformat": 4,
+ "nbformat_minor": 4
+}
diff --git a/00_python_in_a_nutshell/05_content_functions.ipynb b/00_python_in_a_nutshell/05_content_functions.ipynb
new file mode 100644
index 0000000..1b56e0e
--- /dev/null
+++ b/00_python_in_a_nutshell/05_content_functions.ipynb
@@ -0,0 +1,540 @@
+{
+ "cells": [
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "**Note**: Click on \"*Kernel*\" > \"*Restart Kernel and Clear All Outputs*\" in [JupyterLab](https://jupyterlab.readthedocs.io/en/stable/) *before* reading this notebook to reset its output. If you cannot run this file on your machine, you may want to open it [in the cloud ](https://mybinder.org/v2/gh/webartifex/intro-to-data-science/main?urlpath=lab/tree/00_python_in_a_nutshell/05_content_functions.ipynb)."
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "# Chapter 0: Python in a Nutshell (Part 3)"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "One big idea in software engineering is to **modularize** code. The purpose of that is manifold. Two very important motivations are:\n",
+ "- make a code block **re-usable** and\n",
+ "- give it a meaningful name.\n",
+ "\n",
+ "The latter gets more important as the codebase in a project grows so big that we can only look at a tiny fraction of it at one point in time."
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "## User-defined Functions"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "One syntactical construct to achieve modularization is that of a **function definition**. Just like in math, we can \"define\" a function as a set of parametrized instructions that provide some **output** given some **input**.\n",
+ "\n",
+ "A function is defined with the `def` statement: After the `def` part comes the name of the function followed by the **parameter list** within parentheses. The first couple of lines in the function's body should be a so-called **docstring** that describes what the function does in plain English. Then comes the code that is to be made repeatable. In the example below, we simply copied and pasted the code to calculate the sum of all even numbers in a `list` into the example function `add_evens()`. Note that we exchanged the variable name `total` with `result` here to illustrate a point further below. In order for the function to hand its output back to \"the outside world,\" we use the `return` statement (Hint: to see its effect, simply re-run the couple of code cells below with and without the `return result` line)."
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 1,
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "def add_evens(numbers):\n",
+ "    \"\"\"Sum up all the even numbers in a list.\n",
+ "\n",
+ "    Args:\n",
+ "        numbers (list of int's): numbers to be summed up\n",
+ "\n",
+ "    Returns:\n",
+ "        total (int)\n",
+ "    \"\"\"\n",
+ "    result = 0\n",
+ "\n",
+ "    for number in numbers:\n",
+ "        if number % 2 == 0:  # if the number is even\n",
+ "            result = result + number\n",
+ "\n",
+ "    return result"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "After defining a function, we can **call** (i.e., \"execute\") it with the `()` operator.\n",
+ "\n",
+ "Let's execute the function with `numbers` as the input. We see the same `6` below the cell as we do above where we run the code without a function. Without the `return` statement in the function's body, we would not see any output here.\n",
+ "\n",
+ "To see what happens in detail, take a look at [PythonTutor ](https://pythontutor.com/visualize.html#code=numbers%20%3D%20%5B1,%202,%203,%204%5D%0A%0Adef%20add_evens%28numbers%29%3A%0A%20%20%20%20%22%22%22Sum%20up%20all%20the%20even%20numbers%20in%20a%20list.%22%22%22%0A%20%20%20%20result%20%3D%200%0A%0A%20%20%20%20for%20number%20in%20numbers%3A%0A%20%20%20%20%20%20%20%20if%20number%20%25%202%20%3D%3D%200%3A%0A%20%20%20%20%20%20%20%20%20%20%20%20result%20%3D%20result%20%2B%20number%0A%0A%20%20%20%20return%20result%0A%0Atotal%20%3D%20add_evens%28numbers%29&cumulative=false&curInstr=0&heapPrimitives=nevernest&mode=display&origin=opt-frontend.js&py=3&rawInputLstJSON=%5B%5D&textReferences=false) again. You should notice how there are two variables by the name `numbers` in memory. Python manages the memory with a concept called **namespaces** or **scopes**, which are just fancy terms for saying that Python can tell variables from different contexts apart."
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 2,
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "numbers = [1, 2, 3, 4]"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 3,
+ "metadata": {},
+ "outputs": [
+ {
+ "data": {
+ "text/plain": [
+ "6"
+ ]
+ },
+ "execution_count": 3,
+ "metadata": {},
+ "output_type": "execute_result"
+ }
+ ],
+ "source": [
+ "add_evens(numbers)"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "To re-use the *same* instructions with *different* input, we call the function a second time and give it a brand-new `list` of numbers as its input."
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 4,
+ "metadata": {},
+ "outputs": [
+ {
+ "data": {
+ "text/plain": [
+ "30"
+ ]
+ },
+ "execution_count": 4,
+ "metadata": {},
+ "output_type": "execute_result"
+ }
+ ],
+ "source": [
+ "add_evens([1, 2, 3, 4, 5, 6, 7, 8, 9, 10])"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "Note how the variable `result` only exists \"inside\" the `add_evens()` function. Hence, we see the `NameError` here."
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 5,
+ "metadata": {},
+ "outputs": [
+ {
+ "ename": "NameError",
+ "evalue": "name 'result' is not defined",
+ "output_type": "error",
+ "traceback": [
+ "\u001b[0;31m---------------------------------------------------------------------------\u001b[0m",
+ "\u001b[0;31mNameError\u001b[0m Traceback (most recent call last)",
+ "\u001b[0;32m/tmp/user/1000/ipykernel_305654/1049141082.py\u001b[0m in \u001b[0;36m\u001b[0;34m\u001b[0m\n\u001b[0;32m----> 1\u001b[0;31m \u001b[0mresult\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0m",
+ "\u001b[0;31mNameError\u001b[0m: name 'result' is not defined"
+ ]
+ }
+ ],
+ "source": [
+ "result"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "## Built-in Functions"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "The concept of re-usable functions is so important in programming that Python comes with many [built-in functions ](https://docs.python.org/3/library/functions.html).\n",
+ "\n",
+ "Two popular examples are the [sum() ](https://docs.python.org/3/library/functions.html#sum) and [len() ](https://docs.python.org/3/library/functions.html#len) functions that calculate the sum of the elements in a `list` and the number of its elements, respectively."
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 6,
+ "metadata": {},
+ "outputs": [
+ {
+ "data": {
+ "text/plain": [
+ "10"
+ ]
+ },
+ "execution_count": 6,
+ "metadata": {},
+ "output_type": "execute_result"
+ }
+ ],
+ "source": [
+ "sum(numbers)"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 7,
+ "metadata": {},
+ "outputs": [
+ {
+ "data": {
+ "text/plain": [
+ "4"
+ ]
+ },
+ "execution_count": 7,
+ "metadata": {},
+ "output_type": "execute_result"
+ }
+ ],
+ "source": [
+ "len(numbers)"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "When working with numbers, the [round() ](https://docs.python.org/3/library/functions.html#round) function rounds `float`ing-point numbers (i.e., real numbers in the mathematical sense) into `int`egers. `float`s are numbers containing a `.` somewhere in their digits."
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 8,
+ "metadata": {},
+ "outputs": [
+ {
+ "data": {
+ "text/plain": [
+ "7"
+ ]
+ },
+ "execution_count": 8,
+ "metadata": {},
+ "output_type": "execute_result"
+ }
+ ],
+ "source": [
+ "round(7.1)"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 9,
+ "metadata": {},
+ "outputs": [
+ {
+ "data": {
+ "text/plain": [
+ "8"
+ ]
+ },
+ "execution_count": 9,
+ "metadata": {},
+ "output_type": "execute_result"
+ }
+ ],
+ "source": [
+ "round(7.9)"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "The [round() ](https://docs.python.org/3/library/functions.html#round) function takes a second input called `ndigits` that allows us to customize the rounding even further. Then, [round() ](https://docs.python.org/3/library/functions.html#round) returns a `float`ing-point number!"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 10,
+ "metadata": {},
+ "outputs": [
+ {
+ "data": {
+ "text/plain": [
+ "7.1"
+ ]
+ },
+ "execution_count": 10,
+ "metadata": {},
+ "output_type": "execute_result"
+ }
+ ],
+ "source": [
+ "round(7.123, 1)"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 11,
+ "metadata": {},
+ "outputs": [
+ {
+ "data": {
+ "text/plain": [
+ "7.12"
+ ]
+ },
+ "execution_count": 11,
+ "metadata": {},
+ "output_type": "execute_result"
+ }
+ ],
+ "source": [
+ "round(7.123, 2)"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "As we saw before, the [print() ](https://docs.python.org/3/library/functions.html#print) function simply \"prints\" out its input to the screen. Below is the popular \"Hello World\" example that is shown in almost any introductory text on any programming language. The double quotes `\"` are a delimiter that specifies anything in between them as **textual data**. The docstring above in the triple-double quotes notation is just a special case allowing the text to span several lines.\n",
+ "\n",
+ "The quotes themselves are *not* part of the value. So, they are *not* printed out."
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 12,
+ "metadata": {},
+ "outputs": [
+ {
+ "name": "stdout",
+ "output_type": "stream",
+ "text": [
+ "Hello World\n"
+ ]
+ }
+ ],
+ "source": [
+ "print(\"Hello World\")"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "Single quotes `'` are synonyms for double quotes `\"`."
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 13,
+ "metadata": {},
+ "outputs": [
+ {
+ "name": "stdout",
+ "output_type": "stream",
+ "text": [
+ "Hello World\n"
+ ]
+ }
+ ],
+ "source": [
+ "print('Hello World')"
+ ]
+ },
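+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "The triple-double quotes notation used for the docstring above is not limited to docstrings. As a minimal sketch, the cell below stores a text spanning two lines in a variable and prints it; the variable name `text` and the wording of the text are made up for this illustration only."
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "# Hypothetical example: triple-double quotes let the text span several lines\n",
+ "text = \"\"\"Hello\n",
+ "World\"\"\"\n",
+ "\n",
+ "print(text)"
+ ]
+ },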
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "The [print() ](https://docs.python.org/3/library/functions.html#print) function is often helpful to **debug** a code snippet (i.e., trying to figure out what it does, step by step)."
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 14,
+ "metadata": {},
+ "outputs": [
+ {
+ "name": "stdout",
+ "output_type": "stream",
+ "text": [
+ "The square of 1 is 1\n",
+ "The square of 2 is 4\n",
+ "The square of 3 is 9\n",
+ "The square of 4 is 16\n"
+ ]
+ }
+ ],
+ "source": [
+ "for number in numbers:\n",
+ "    square = number ** 2\n",
+ "    print(\"The square of\", number, \"is\", square)"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "## Extending Core Python with the Standard Library"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "In the Python community, we even say that \"Python comes with batteries included,\" meaning that a plain Python installation (like the one you are probably using to execute this notebook) offers all kinds of functionalities for a multitude of application domains. Thus, the name **general purpose** language.\n",
+ "\n",
+ "To \"enable\" most of these, however, we need to first **import** them from the so-called [standard library ](https://docs.python.org/3/library/index.html). Let's do a quick example here and look at the [random ](https://docs.python.org/3/library/random.html) module that provides functionalities to simulate and work with random numbers."
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 15,
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "import random"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "To access a function inside the [random ](https://docs.python.org/3/library/random.html) module, for example, the [random() ](https://docs.python.org/3/library/random.html#random.random) function, we use the `.` operator, formally called the attribute access operator. The [random() ](https://docs.python.org/3/library/random.html#random.random) function simply returns a random decimal number between `0` and `1`."
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 16,
+ "metadata": {},
+ "outputs": [
+ {
+ "data": {
+ "text/plain": [
+ "0.44374384200665107"
+ ]
+ },
+ "execution_count": 16,
+ "metadata": {},
+ "output_type": "execute_result"
+ }
+ ],
+ "source": [
+ "random.random()"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "It could be used, for example, to model a fair coin toss by comparing the number it returns to `0.5` with the `<` operator: In 50% of the cases we see `True` and in the other 50% `False`."
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 17,
+ "metadata": {},
+ "outputs": [
+ {
+ "data": {
+ "text/plain": [
+ "False"
+ ]
+ },
+ "execution_count": 17,
+ "metadata": {},
+ "output_type": "execute_result"
+ }
+ ],
+ "source": [
+ "random.random() < 0.5"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "A second example would be the [choice() ](https://docs.python.org/3/library/random.html#random.choice) function, which draws a random element from a `list` with replacement. We could use it to model a fair die."
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 18,
+ "metadata": {},
+ "outputs": [
+ {
+ "data": {
+ "text/plain": [
+ "3"
+ ]
+ },
+ "execution_count": 18,
+ "metadata": {},
+ "output_type": "execute_result"
+ }
+ ],
+ "source": [
+ "random.choice([1, 2, 3, 4, 5, 6])"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "In the \"*Extending Core Python with Third-party Packages*\" section in the next chapter, we see how Python can be extended even further by installing and importing **third-party packages**."
+ ]
+ }
+ ],
+ "metadata": {
+ "kernelspec": {
+ "display_name": "Python 3 (ipykernel)",
+ "language": "python",
+ "name": "python3"
+ },
+ "language_info": {
+ "codemirror_mode": {
+ "name": "ipython",
+ "version": 3
+ },
+ "file_extension": ".py",
+ "mimetype": "text/x-python",
+ "name": "python",
+ "nbconvert_exporter": "python",
+ "pygments_lexer": "ipython3",
+ "version": "3.8.12"
+ },
+ "toc": {
+ "base_numbering": 1,
+ "nav_menu": {},
+ "number_sections": false,
+ "sideBar": true,
+ "skip_h1_title": false,
+ "title_cell": "Table of Contents",
+ "title_sidebar": "Contents",
+ "toc_cell": false,
+ "toc_position": {},
+ "toc_section_display": true,
+ "toc_window_display": false
+ }
+ },
+ "nbformat": 4,
+ "nbformat_minor": 4
+}
diff --git a/00_python_in_a_nutshell/06_exercises_volume.ipynb b/00_python_in_a_nutshell/06_exercises_volume.ipynb
new file mode 100644
index 0000000..ded4b96
--- /dev/null
+++ b/00_python_in_a_nutshell/06_exercises_volume.ipynb
@@ -0,0 +1,292 @@
+{
+ "cells": [
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "**Note**: Click on \"*Kernel*\" > \"*Restart Kernel and Run All*\" in [JupyterLab](https://jupyterlab.readthedocs.io/en/stable/) *after* finishing the exercises to ensure that your solution runs top to bottom *without* any errors. If you cannot run this file on your machine, you may want to open it [in the cloud ](https://mybinder.org/v2/gh/webartifex/intro-to-data-science/main?urlpath=lab/tree/00_python_in_a_nutshell/06_exercises_volume.ipynb)."
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "# Chapter 0: Python in a Nutshell (Coding Exercises)"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "The exercises below assume that you have read the preceding content sections.\n",
+ "\n",
+ "The `...`'s in the code cells indicate where you need to fill in code snippets. The number of `...`'s within a code cell gives you a rough idea of how many lines of code are needed to solve the task. You should not need to create any additional code cells for your final solution. However, you may want to use temporary code cells to try out some ideas."
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "## Volume of a Sphere"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "The [volume of a sphere ](https://en.wikipedia.org/wiki/Sphere) is defined as $\\frac{4}{3} * \\pi * r^3$.\n",
+ "\n",
+ "In **Q2**, you will write a `function` implementing this formula, and in **Q3** and **Q5**, you will execute this `function` with a couple of example inputs.\n",
+ "\n",
+ "**Q1**: First, execute the next two code cells that import the `math` module from the [standard library ](https://docs.python.org/3/library/index.html) providing an approximation for $\\pi$!"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "import math"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "math.pi"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "**Q2**: Implement the business logic in the `sphere_volume()` function below according to the specifications in the **docstring**!\n",
+ "\n",
+ "Hints:\n",
+ "- `sphere_volume()` takes a mandatory `radius` input and an optional `ndigits` input (defaulting to `5`)\n",
+ "- Because `math.pi` is constant, it may be used within `sphere_volume()` *without* being an official input\n",
+ "- The volume is returned as a so-called `float`ing-point number due to the rounding with the built-in [round() ](https://docs.python.org/3/library/functions.html#round) function\n",
+ "- You may either write your solution as one big expression (where the `...` are) or introduce an intermediate step holding the result before rounding (then, one more line of code is needed above the `return ...` one)"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "def sphere_volume(radius, ndigits=5):\n",
+ "    \"\"\"Calculate the volume of a sphere.\n",
+ "\n",
+ "    Args:\n",
+ "        radius (int or float): radius of the sphere\n",
+ "        ndigits (optional, int): number of digits\n",
+ "            when rounding the resulting volume\n",
+ "\n",
+ "    Returns:\n",
+ "        volume (float)\n",
+ "    \"\"\"\n",
+ "    return ..."
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "**Q3**: Execute the function with `radius = 100.0` and `1`, `5`, `10`, `15`, and `20` as `ndigits`, respectively!"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "radius = 100.0"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "sphere_volume(...)"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "sphere_volume(...)"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "sphere_volume(...)"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "sphere_volume(...)"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "sphere_volume(...)"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "**Q4**: What observation do you make?"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ " < your answer >"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "**Q5**: Using the [range() ](https://docs.python.org/3/library/functions.html#func-range) built-in, write a `for`-loop and calculate the volume of a sphere with `radius = 42.0` for all `ndigits` from `1` through `20`!\n",
+ "\n",
+ "Hint: You need to use the built-in [print() ](https://docs.python.org/3/library/functions.html#print) function to make the return values visible"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "radius = 42.0"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "for ... in ...:\n",
+ "    ..."
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "**Q6**: What lesson do you learn about `float`ing-point numbers?"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ " < your answer >"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "With the [round() ](https://docs.python.org/3/library/functions.html#round) function, we can see another technicality of the `float`ing-point standard: `float`s are *inherently* imprecise!\n",
+ "\n",
+ "**Q7**: Execute the following code cells to see a \"weird\" output! What could be the reasoning behind rounding this way?"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "round(1.5)"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "round(2.5)"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "round(3.5)"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "round(4.5)"
+ ]
+ }
+ ],
+ "metadata": {
+ "kernelspec": {
+ "display_name": "Python 3 (ipykernel)",
+ "language": "python",
+ "name": "python3"
+ },
+ "language_info": {
+ "codemirror_mode": {
+ "name": "ipython",
+ "version": 3
+ },
+ "file_extension": ".py",
+ "mimetype": "text/x-python",
+ "name": "python",
+ "nbconvert_exporter": "python",
+ "pygments_lexer": "ipython3",
+ "version": "3.8.12"
+ },
+ "toc": {
+ "base_numbering": 1,
+ "nav_menu": {},
+ "number_sections": false,
+ "sideBar": true,
+ "skip_h1_title": true,
+ "title_cell": "Table of Contents",
+ "title_sidebar": "Contents",
+ "toc_cell": false,
+ "toc_position": {},
+ "toc_section_display": false,
+ "toc_window_display": false
+ }
+ },
+ "nbformat": 4,
+ "nbformat_minor": 4
+}
diff --git a/00_python_in_a_nutshell/07_content_data_types.ipynb b/00_python_in_a_nutshell/07_content_data_types.ipynb
new file mode 100644
index 0000000..a40af42
--- /dev/null
+++ b/00_python_in_a_nutshell/07_content_data_types.ipynb
@@ -0,0 +1,708 @@
+{
+ "cells": [
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "**Note**: Click on \"*Kernel*\" > \"*Restart Kernel and Clear All Outputs*\" in [JupyterLab](https://jupyterlab.readthedocs.io/en/stable/) *before* reading this notebook to reset its output. If you cannot run this file on your machine, you may want to open it [in the cloud ](https://mybinder.org/v2/gh/webartifex/intro-to-data-science/main?urlpath=lab/tree/00_python_in_a_nutshell/07_content_data_types.ipynb)."
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "# Chapter 0: Python in a Nutshell (Part 4)"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "An important skill for any data scientist is to learn to \"think\" like a computer does. So far, we have seen that Python is a pretty \"intuitive\" language: Many concepts can already be understood after seeing them once or just a couple of times. Many of the aspects that make other languages harder to learn are somehow \"magically\" automated by Python in the background, most notably the management of the memory.\n",
+ "\n",
+ "This section introduces a couple of more \"advanced\" concepts that presumably are *not* so intuitive to beginners."
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "## \"Simple\" Data Types"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "First, let's review the concept of **object-orientation**, which is the paradigm by which Python manages the memory.\n",
+ "\n",
+ "Take the following three examples. Whereas `a` and `b` have the same **value** (i.e., **semantic meaning**) to us humans, we see in this section that there are a couple of caveats to look out for."
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 1,
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "a = 42\n",
+ "b = 42.0\n",
+ "c = 42.87"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "An important idea to understand is that each of the right-hand sides leads to a *new* **object** being created in the computer's memory *first*. An object can be thought of as a \"box\" in memory holding $1$s and $0$s (i.e., physical energy flows inside the computer).\n",
+ "\n",
+ "Objects can and do exist without being **referenced** by a variable. Also, an object may even have several variables referencing it, just as a human may have different names in different contexts (e.g., a formal name in the passport, a name by which one is known to friends, and maybe a different name by which one is called by one's spouse).\n",
+ "\n",
+ "In the example, while both `a` and `b` have the *same* value, they are two *distinct* objects. The `is` operator checks if the objects referenced by two variables are indeed the *same* one, or, in other words, have the same **identity**."
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 2,
+ "metadata": {},
+ "outputs": [
+ {
+ "data": {
+ "text/plain": [
+ "True"
+ ]
+ },
+ "execution_count": 2,
+ "metadata": {},
+ "output_type": "execute_result"
+ }
+ ],
+ "source": [
+ "a == b"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 3,
+ "metadata": {},
+ "outputs": [
+ {
+ "data": {
+ "text/plain": [
+ "False"
+ ]
+ },
+ "execution_count": 3,
+ "metadata": {},
+ "output_type": "execute_result"
+ }
+ ],
+ "source": [
+ "a is b"
+ ]
+ },
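+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "The reverse case, several variables referencing the *same* object, can be sketched as follows. The names `x` and `y` are made up for this illustration only; the built-in [id() ](https://docs.python.org/3/library/functions.html#id) function returns an object's identity as an `int`eger."
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "x = 42.0  # a new `float` object is created and referenced by `x`\n",
+ "y = x  # no new object is created; `y` references the same object as `x`\n",
+ "\n",
+ "print(x == y)  # compares the values\n",
+ "print(x is y)  # compares the identities\n",
+ "print(id(x) == id(y))  # same comparison via the identities as integers"
+ ]
+ },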
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "Every object always has some **data type**, which determines how the object behaves and what we can do with it. The types of `a` and `b` are `int` and `float`, respectively."
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 4,
+ "metadata": {},
+ "outputs": [
+ {
+ "data": {
+ "text/plain": [
+ "int"
+ ]
+ },
+ "execution_count": 4,
+ "metadata": {},
+ "output_type": "execute_result"
+ }
+ ],
+ "source": [
+ "type(a)"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 5,
+ "metadata": {},
+ "outputs": [
+ {
+ "data": {
+ "text/plain": [
+ "float"
+ ]
+ },
+ "execution_count": 5,
+ "metadata": {},
+ "output_type": "execute_result"
+ }
+ ],
+ "source": [
+ "type(b)"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "While it seems cumbersome to analyze numbers at this level of detail, the following code cell shows how `float`ing-point numbers, one gold standard of numbers in all of computer science and engineering, behave counter-intuitively. Yet, *nothing* is wrong here."
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 6,
+ "metadata": {},
+ "outputs": [
+ {
+ "data": {
+ "text/plain": [
+ "False"
+ ]
+ },
+ "execution_count": 6,
+ "metadata": {},
+ "output_type": "execute_result"
+ }
+ ],
+ "source": [
+ "0.1 + 0.2 == 0.3"
+ ]
+ },
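+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "A common way to deal with this imprecision is to compare `float`s within a small tolerance instead of testing for exact equality. The cell below is a minimal sketch of that idea, assuming that such a tolerance is acceptable for the task at hand: It first prints more digits than Python shows by default with the built-in [format() ](https://docs.python.org/3/library/functions.html#format) function and then uses [isclose() ](https://docs.python.org/3/library/math.html#math.isclose) from the [math ](https://docs.python.org/3/library/math.html) module in the standard library. The variable name `total` is made up for this illustration only."
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "import math\n",
+ "\n",
+ "total = 0.1 + 0.2\n",
+ "\n",
+ "# Print 20 decimals to reveal the tiny representation errors\n",
+ "print(format(total, \".20f\"))\n",
+ "print(format(0.3, \".20f\"))\n",
+ "\n",
+ "# Compare within a relative tolerance instead of testing for exact equality\n",
+ "print(math.isclose(total, 0.3))"
+ ]
+ },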
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "The data type of an object also determines which **methods** we can invoke on it. A method is just a function that is \"attached\" to an object and can be accessed with the `.` operator seen above. A method necessarily needs the object it is attached to as an input, which is why it is attached to an object to begin with.\n",
+ "\n",
+ "For example, `float` objects come with an `.is_integer()` method that tells us whether the number is a whole number (i.e., has *no* non-`0` decimals)."
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 7,
+ "metadata": {},
+ "outputs": [
+ {
+ "data": {
+ "text/plain": [
+ "True"
+ ]
+ },
+ "execution_count": 7,
+ "metadata": {},
+ "output_type": "execute_result"
+ }
+ ],
+ "source": [
+ "b.is_integer()"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 8,
+ "metadata": {},
+ "outputs": [
+ {
+ "data": {
+ "text/plain": [
+ "False"
+ ]
+ },
+ "execution_count": 8,
+ "metadata": {},
+ "output_type": "execute_result"
+ }
+ ],
+ "source": [
+ "c.is_integer()"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "`int` objects, on the contrary, have no notion of decimals, which is why, in the Python 3.8 used here, they do *not* have an `.is_integer()` method (newer versions, starting with 3.12, added one). That is what the `AttributeError` tells us."
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 9,
+ "metadata": {},
+ "outputs": [
+ {
+ "ename": "AttributeError",
+ "evalue": "'int' object has no attribute 'is_integer'",
+ "output_type": "error",
+ "traceback": [
+ "\u001b[0;31m---------------------------------------------------------------------------\u001b[0m",
+ "\u001b[0;31mAttributeError\u001b[0m Traceback (most recent call last)",
+ "\u001b[0;32m/tmp/user/1000/ipykernel_306555/2418692311.py\u001b[0m in \u001b[0;36m\u001b[0;34m\u001b[0m\n\u001b[0;32m----> 1\u001b[0;31m \u001b[0ma\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0mis_integer\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0m",
+ "\u001b[0;31mAttributeError\u001b[0m: 'int' object has no attribute 'is_integer'"
+ ]
+ }
+ ],
+ "source": [
+ "a.is_integer()"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "What we could do here is take `a` and pass it to the [float()](https://docs.python.org/3/library/functions.html#float) built-in, a so-called **constructor**, which takes the value of its input and creates a *new* object of the desired `float` type. Yet, we know the answer to `aa.is_integer()` already, even without executing the code cell, as `a` has no non-`0` decimals to begin with."
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 10,
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "aa = float(a)"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 11,
+ "metadata": {},
+ "outputs": [
+ {
+ "data": {
+ "text/plain": [
+ "True"
+ ]
+ },
+ "execution_count": 11,
+ "metadata": {},
+ "output_type": "execute_result"
+ }
+ ],
+ "source": [
+ "aa.is_integer()"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "Let's create another object `d` to see further examples of methods."
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 12,
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "d = \"Python rocks\""
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "The type of `d` is `str`, which is short for \"**string**\" and is defined in computer science as a sequence of characters."
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 13,
+ "metadata": {},
+ "outputs": [
+ {
+ "data": {
+ "text/plain": [
+ "str"
+ ]
+ },
+ "execution_count": 13,
+ "metadata": {},
+ "output_type": "execute_result"
+ }
+ ],
+ "source": [
+ "type(d)"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "`str` objects support various methods that \"make sense\" in the context of textual data, for example, the `.lower()` and `.upper()` methods."
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 14,
+ "metadata": {},
+ "outputs": [
+ {
+ "data": {
+ "text/plain": [
+ "'python rocks'"
+ ]
+ },
+ "execution_count": 14,
+ "metadata": {},
+ "output_type": "execute_result"
+ }
+ ],
+ "source": [
+ "d.lower()"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 15,
+ "metadata": {},
+ "outputs": [
+ {
+ "data": {
+ "text/plain": [
+ "'PYTHON ROCKS'"
+ ]
+ },
+ "execution_count": 15,
+ "metadata": {},
+ "output_type": "execute_result"
+ }
+ ],
+ "source": [
+ "d.upper()"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "### \"Complex\" Data Types"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "The examples in the previous section are considered \"simple\" as they only model *scalar* values (i.e., an individual object per example). However, we have already seen an example of a more \"complex\" object, namely the `list` called `numbers` from before."
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 16,
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "numbers = [1, 2, 3, 4]"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 17,
+ "metadata": {},
+ "outputs": [
+ {
+ "data": {
+ "text/plain": [
+ "list"
+ ]
+ },
+ "execution_count": 17,
+ "metadata": {},
+ "output_type": "execute_result"
+ }
+ ],
+ "source": [
+ "type(numbers)"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 18,
+ "metadata": {},
+ "outputs": [
+ {
+ "data": {
+ "text/plain": [
+ "[1, 2, 3, 4]"
+ ]
+ },
+ "execution_count": 18,
+ "metadata": {},
+ "output_type": "execute_result"
+ }
+ ],
+ "source": [
+ "numbers"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "`list` objects also come with their own methods, for example, the `.append()` method that adds another element at the end of a `list`."
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 19,
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "numbers.append(5)"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "Note how the `.append()` method does not lead to any output below the code cell. That is an indication that `numbers` is \"changed in place.\" The formal term for this property is **mutability**. A good working definition is: Any object whose value can be changed *after* its creation is a **mutable** object. Objects *without* this property are called **immutable**.\n",
+ "\n",
+ "An example of the latter is the `tuple` data type. `tuple`s are simply `list`s with the additional property that they cannot be changed. Otherwise, they behave much like `list`s. `tuple`s are created with parentheses replacing the brackets."
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 20,
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "more_numbers = (7, 8, 9)"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "`more_numbers` does not know about the `.append()` method."
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 21,
+ "metadata": {},
+ "outputs": [
+ {
+ "ename": "AttributeError",
+ "evalue": "'tuple' object has no attribute 'append'",
+ "output_type": "error",
+ "traceback": [
+ "\u001b[0;31m---------------------------------------------------------------------------\u001b[0m",
+ "\u001b[0;31mAttributeError\u001b[0m Traceback (most recent call last)",
+ "\u001b[0;32m/tmp/user/1000/ipykernel_306555/2667408552.py\u001b[0m in \u001b[0;36m\u001b[0;34m\u001b[0m\n\u001b[0;32m----> 1\u001b[0;31m \u001b[0mmore_numbers\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0mappend\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0;36m10\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0m",
+ "\u001b[0;31mAttributeError\u001b[0m: 'tuple' object has no attribute 'append'"
+ ]
+ }
+ ],
+ "source": [
+ "more_numbers.append(10)"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "Whereas both `list` and `tuple` objects preserve the **order** of their elements, the `set` data type does not. Additionally, any object may only be an element of a `set` at most once. The syntax to create `set`s is a pair of curly braces, `{` and `}`. By giving up order, `set` objects offer significantly increased processing speed in various situations, for example, when checking whether an object is contained in them."
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 22,
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "other_numbers = {3, 2, 1, 3, 3, 2}"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 23,
+ "metadata": {},
+ "outputs": [
+ {
+ "data": {
+ "text/plain": [
+ "{1, 2, 3}"
+ ]
+ },
+ "execution_count": 23,
+ "metadata": {},
+ "output_type": "execute_result"
+ }
+ ],
+ "source": [
+ "other_numbers"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "One last example of a \"complex\" data type is the `dict`ionary type, which models a mapping relationship among the objects it contains. The syntax to create `dict`s also involves curly braces with the addition of using a `:` to specify the mapping relationships.\n",
+ "\n",
+ "For example, to map `int`egers to `str`ings modeling the English words corresponding to the numbers, we could write the following. The objects to the left of the `:` take the role of the **keys** while the ones to the right take the role of the **values**."
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 24,
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "to_words = {\n",
+ " 0: \"zero\",\n",
+ " 1: \"one\",\n",
+ " 2: \"two\",\n",
+ "}"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "The main purpose of `dict`s is to **look up** the value mapped to by some key. We can use the indexing notation to achieve that."
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 25,
+ "metadata": {},
+ "outputs": [
+ {
+ "data": {
+ "text/plain": [
+ "'zero'"
+ ]
+ },
+ "execution_count": 25,
+ "metadata": {},
+ "output_type": "execute_result"
+ }
+ ],
+ "source": [
+ "to_words[0]"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "Going in the opposite direction (i.e., looking up a key by its value) is *not* possible with the indexing notation, as the `KeyError` below shows."
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 26,
+ "metadata": {},
+ "outputs": [
+ {
+ "ename": "KeyError",
+ "evalue": "'zero'",
+ "output_type": "error",
+ "traceback": [
+ "\u001b[0;31m---------------------------------------------------------------------------\u001b[0m",
+ "\u001b[0;31mKeyError\u001b[0m Traceback (most recent call last)",
+ "\u001b[0;32m/tmp/user/1000/ipykernel_306555/3320204082.py\u001b[0m in \u001b[0;36m\u001b[0;34m\u001b[0m\n\u001b[0;32m----> 1\u001b[0;31m \u001b[0mto_words\u001b[0m\u001b[0;34m[\u001b[0m\u001b[0;34m\"zero\"\u001b[0m\u001b[0;34m]\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0m",
+ "\u001b[0;31mKeyError\u001b[0m: 'zero'"
+ ]
+ }
+ ],
+ "source": [
+ "to_words[\"zero\"]"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "Instead, we would have to create a `dict` mapping the words to numbers, like the one below."
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 27,
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "to_numbers = {\n",
+ " \"zero\": 0,\n",
+ " \"one\": 1,\n",
+ " \"two\": 2,\n",
+ "}"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 28,
+ "metadata": {},
+ "outputs": [
+ {
+ "data": {
+ "text/plain": [
+ "0"
+ ]
+ },
+ "execution_count": 28,
+ "metadata": {},
+ "output_type": "execute_result"
+ }
+ ],
+ "source": [
+ "to_numbers[\"zero\"]"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "`dict`s are among the most optimized data types in the Python world and a major building block in codebases solving real-life problems."
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "A big factor in getting good at any programming language is to learn what data types to use in which situations. There is no \"best\" data type; choosing among a couple of data types always comes down to trade-offs."
+ ]
+ }
+ ],
+ "metadata": {
+ "kernelspec": {
+ "display_name": "Python 3 (ipykernel)",
+ "language": "python",
+ "name": "python3"
+ },
+ "language_info": {
+ "codemirror_mode": {
+ "name": "ipython",
+ "version": 3
+ },
+ "file_extension": ".py",
+ "mimetype": "text/x-python",
+ "name": "python",
+ "nbconvert_exporter": "python",
+ "pygments_lexer": "ipython3",
+ "version": "3.8.12"
+ },
+ "toc": {
+ "base_numbering": 1,
+ "nav_menu": {},
+ "number_sections": false,
+ "sideBar": true,
+ "skip_h1_title": false,
+ "title_cell": "Table of Contents",
+ "title_sidebar": "Contents",
+ "toc_cell": false,
+ "toc_position": {},
+ "toc_section_display": true,
+ "toc_window_display": false
+ }
+ },
+ "nbformat": 4,
+ "nbformat_minor": 4
+}
diff --git a/01_scientific_stack.ipynb b/01_scientific_stack/00_content.ipynb
similarity index 99%
rename from 01_scientific_stack.ipynb
rename to 01_scientific_stack/00_content.ipynb
index 694376b..3498dbd 100644
--- a/01_scientific_stack.ipynb
+++ b/01_scientific_stack/00_content.ipynb
@@ -13,8 +13,20 @@
"source": [
"Python itself does not come with any scientific algorithms. However, over time, many third-party libraries emerged that are useful to build machine learning applications. In this context, \"third-party\" means that the libraries are *not* part of Python's standard library.\n",
"\n",
- "Among the popular ones are [numpy](https://numpy.org/) (numerical computations, linear algebra), [pandas](https://pandas.pydata.org/) (data processing), [matplotlib](https://matplotlib.org/) (visualisations), and [scikit-learn](https://scikit-learn.org/stable/index.html) (machine learning algorithms).\n",
- "\n",
+ "Among the popular ones are [numpy](https://numpy.org/) (numerical computations, linear algebra), [pandas](https://pandas.pydata.org/) (data processing), [matplotlib](https://matplotlib.org/) (visualisations), and [scikit-learn](https://scikit-learn.org/stable/index.html) (machine learning algorithms)."
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "### Extending Core Python with Third-party Packages"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
"Before we can import these libraries, we must ensure that they installed on our computers. If you installed Python via the Anaconda Distribution that should already be the case. Otherwise, we can use Python's **package manager** `pip` to install them manually.\n",
"\n",
"`pip` is a so-called command-line interface (CLI), meaning it is a program that is run within a terminal window. JupyterLab allows us to run such a CLI tool from within a notebook by starting a code cell with a single `%` symbol. Here, this does not mean Python's modulo operator but is just an instruction to JupyterLab that the following code is *not* Python."
@@ -627,7 +639,7 @@
],
"metadata": {
"kernelspec": {
- "display_name": "Python 3",
+ "display_name": "Python 3 (ipykernel)",
"language": "python",
"name": "python3"
},
@@ -641,7 +653,7 @@
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
- "version": "3.8.9"
+ "version": "3.8.12"
},
"toc": {
"base_numbering": 1,
diff --git a/02_a_first_example.ipynb b/02_classification/00_content.ipynb
similarity index 99%
rename from 02_a_first_example.ipynb
rename to 02_classification/00_content.ipynb
index 5339d00..90d4f54 100644
--- a/02_a_first_example.ipynb
+++ b/02_classification/00_content.ipynb
@@ -1150,7 +1150,7 @@
],
"metadata": {
"kernelspec": {
- "display_name": "Python 3",
+ "display_name": "Python 3 (ipykernel)",
"language": "python",
"name": "python3"
},
@@ -1164,7 +1164,7 @@
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
- "version": "3.8.9"
+ "version": "3.8.12"
},
"toc": {
"base_numbering": 1,
diff --git a/static/3_types_of_machine_learning.png b/02_classification/static/3_types_of_machine_learning.png
similarity index 100%
rename from static/3_types_of_machine_learning.png
rename to 02_classification/static/3_types_of_machine_learning.png
diff --git a/static/classification_vs_regression.png b/02_classification/static/classification_vs_regression.png
similarity index 100%
rename from static/classification_vs_regression.png
rename to 02_classification/static/classification_vs_regression.png
diff --git a/static/examples.png b/02_classification/static/examples.png
similarity index 100%
rename from static/examples.png
rename to 02_classification/static/examples.png
diff --git a/static/generalization.png b/02_classification/static/generalization.png
similarity index 100%
rename from static/generalization.png
rename to 02_classification/static/generalization.png
diff --git a/static/iris.png b/02_classification/static/iris.png
similarity index 100%
rename from static/iris.png
rename to 02_classification/static/iris.png
diff --git a/static/iris_data.png b/02_classification/static/iris_data.png
similarity index 100%
rename from static/iris_data.png
rename to 02_classification/static/iris_data.png
diff --git a/static/knn.png b/02_classification/static/knn.png
similarity index 100%
rename from static/knn.png
rename to 02_classification/static/knn.png
diff --git a/static/python_ml_book.png b/02_classification/static/python_ml_book.png
similarity index 100%
rename from static/python_ml_book.png
rename to 02_classification/static/python_ml_book.png
diff --git a/static/r_ml_book.png b/02_classification/static/r_ml_book.png
similarity index 100%
rename from static/r_ml_book.png
rename to 02_classification/static/r_ml_book.png
diff --git a/static/spam.png b/02_classification/static/spam.png
similarity index 100%
rename from static/spam.png
rename to 02_classification/static/spam.png
diff --git a/static/what_is_machine_learning.png b/02_classification/static/what_is_machine_learning.png
similarity index 100%
rename from static/what_is_machine_learning.png
rename to 02_classification/static/what_is_machine_learning.png
diff --git a/README.md b/README.md
index fdd6a87..a6dd083 100644
--- a/README.md
+++ b/README.md
@@ -9,9 +9,17 @@ To learn about Python and programming in detail,
### Table of Contents
-- *Chapter 0*: [Python in a Nutshell](00_python_in_a_nutshell.ipynb)
-- *Chapter 1*: [Python's Scientific Stack](01_scientific_stack.ipynb)
-- *Chapter 2*: [A first Example: Classifying Flowers](02_a_first_example.ipynb)
+- *Chapter 0*: **Python in a Nutshell**
+ - *Content*: [Basic Arithmetic](00_python_in_a_nutshell/00_content_arithmetic.ipynb)
+ - *Exercises*: [Python as a Calculator](00_python_in_a_nutshell/01_exercises_calculator.ipynb)
+ - *Content*: [Business Logic](00_python_in_a_nutshell/02_content_logic.ipynb)
+ - *Exercises*: [Simple Loops](00_python_in_a_nutshell/03_exercises_loops.ipynb)
+ - *Exercises*: [Fizz Buzz](00_python_in_a_nutshell/04_exercises_fizz_buzz.ipynb)
+ - *Content*: [Functions](00_python_in_a_nutshell/05_content_functions.ipynb)
+ - *Exercises*: [Volume of a Sphere](00_python_in_a_nutshell/06_exercises_volume.ipynb)
+ - *Content*: [Data Types](00_python_in_a_nutshell/07_content_data_types.ipynb)
+- *Chapter 1*: [Python's Scientific Stack](01_scientific_stack/00_content.ipynb)
+- *Chapter 2*: [A first Example: Classifying Flowers](02_classification/00_content.ipynb)
- *Chapter 3*: [Case Study: House Prices in Ames, Iowa ](https://github.com/webartifex/ames-housing)
diff --git a/static/link/to_hn.png b/static/link/to_hn.png
new file mode 100644
index 0000000..b8a4a66
Binary files /dev/null and b/static/link/to_hn.png differ
diff --git a/static/link/to_mb.png b/static/link/to_mb.png
new file mode 100644
index 0000000..ae37d50
Binary files /dev/null and b/static/link/to_mb.png differ