diff --git a/01_elements_00_lecture.ipynb b/01_elements_00_lecture.ipynb index f7583ba..89beede 100644 --- a/01_elements_00_lecture.ipynb +++ b/01_elements_00_lecture.ipynb @@ -1308,7 +1308,7 @@ } }, "source": [ - "The `c` object is a so-called **string** type (i.e., `str`), which is Python's way of representing \"text.\" Strings also come with peculiar behaviors, for example, to convert a text to lower, upper, or title case." + "The `c` object is a so-called **string** type (i.e., `str`), which is Python's way of representing \"text.\" Strings also come with peculiar behaviors, for example, to make a text lower or upper case." ] }, { @@ -1383,30 +1383,6 @@ "c.upper()" ] }, - { - "cell_type": "code", - "execution_count": 41, - "metadata": { - "slideshow": { - "slide_type": "skip" - } - }, - "outputs": [ - { - "data": { - "text/plain": [ - "'Python Rocks'" - ] - }, - "execution_count": 41, - "metadata": {}, - "output_type": "execute_result" - } - ], - "source": [ - "c.title()" - ] - }, { "cell_type": "markdown", "metadata": { diff --git a/03_conditionals_02_exercises.ipynb b/03_conditionals_02_exercises.ipynb index 650b063..ac4acd1 100644 --- a/03_conditionals_02_exercises.ipynb +++ b/03_conditionals_02_exercises.ipynb @@ -19,7 +19,7 @@ "cell_type": "markdown", "metadata": {}, "source": [ - "Read [Chapter 3](https://nbviewer.jupyter.org/github/webartifex/intro-to-python/blob/master/03_conditionals_00_lecture.ipynb) of the book. Then, work through the exercises below." + "Read [Chapter 3](https://nbviewer.jupyter.org/github/webartifex/intro-to-python/blob/master/03_conditionals_00_lecture.ipynb) of the book. Then, work through the exercises below. The `...` indicate where you need to fill in your answers. You should not need to create any additional code cells." 
] }, { @@ -136,7 +136,7 @@ "cell_type": "markdown", "metadata": {}, "source": [ - "**Q1.4**: Looking at the `if`-`else`-logic in the function, why do you think the four example line items in **Q9.2** were chosen as they were?" + "**Q1.4**: Looking at the `if`-`else`-logic in the function, why do you think the four example line items in **Q1.2** were chosen as they were?" ] }, { diff --git a/04_iteration_00_lecture.ipynb b/04_iteration_00_lecture.ipynb index fb89130..61858c5 100644 --- a/04_iteration_00_lecture.ipynb +++ b/04_iteration_00_lecture.ipynb @@ -52,7 +52,7 @@ "cell_type": "markdown", "metadata": { "slideshow": { - "slide_type": "-" + "slide_type": "slide" } }, "source": [ diff --git a/05_numbers_00_lecture.ipynb b/05_numbers_00_lecture.ipynb index 531de04..771a0ad 100644 --- a/05_numbers_00_lecture.ipynb +++ b/05_numbers_00_lecture.ipynb @@ -29,7 +29,7 @@ "\n", "- [Chapter 1](https://nbviewer.jupyter.org/github/webartifex/intro-to-python/blob/master/01_elements_00_lecture.ipynb#%28Data%29-Type-%2F-%22Behavior%22) reveals that numbers may come in *different* data types (i.e., `int` vs. 
`float` so far),\n", "- [Chapter 3](https://nbviewer.jupyter.org/github/webartifex/intro-to-python/blob/master/03_conditionals_00_lecture.ipynb#Boolean-Expressions) raises questions regarding the **limited precision** of `float` numbers (e.g., `42 == 42.000000000000001` evaluates to `True`), and\n", - "- [Chapter 4](https://nbviewer.jupyter.org/github/webartifex/intro-to-python/blob/master/04_iteration_00_lecture.ipynb#Infinite-Recursion) shows that sometimes a `float` \"walks\" and \"quacks\" like an `int`, whereas the reverse is true in other cases.\n", + "- [Chapter 4](https://nbviewer.jupyter.org/github/webartifex/intro-to-python/blob/master/04_iteration_00_lecture.ipynb#Infinite-Recursion) shows that sometimes a `float` \"walks\" and \"quacks\" like an `int`, whereas the reverse is true.\n", "\n", "This chapter introduces all the [built-in numeric types](https://docs.python.org/3/library/stdtypes.html#numeric-types-int-float-complex): `int`, `float`, and `complex`. To mitigate the limited precision of floating-point numbers, we also look at two replacements for the `float` type in the [standard library](https://docs.python.org/3/library/index.html), namely the `Decimal` type in the [decimals](https://docs.python.org/3/library/decimal.html#decimal.Decimal) and the `Fraction` type in the [fractions](https://docs.python.org/3/library/fractions.html#fractions.Fraction) module." ] @@ -53,7 +53,9 @@ } }, "source": [ - "The simplest numeric type is the `int` type: It behaves like an [integer in ordinary math](https://en.wikipedia.org/wiki/Integer) (i.e., the set $\\mathbb{Z}$) and supports operators in the way we saw in the section on arithmetic operators in [Chapter 1](https://nbviewer.jupyter.org/github/webartifex/intro-to-python/blob/master/01_elements_00_lecture.ipynb#%28Arithmetic%29-Operators)." 
+ "The simplest numeric type is the `int` type: It behaves like an [integer in ordinary math](https://en.wikipedia.org/wiki/Integer) (i.e., the set $\\mathbb{Z}$) and supports operators in the way we saw in the section on arithmetic operators in [Chapter 1](https://nbviewer.jupyter.org/github/webartifex/intro-to-python/blob/master/01_elements_00_lecture.ipynb#%28Arithmetic%29-Operators).\n", + "\n", + "One way to create an `int` object is by simply writing its value as a literal with the digits `0` to `9`." ] }, { @@ -66,7 +68,7 @@ }, "outputs": [], "source": [ - "i = 789" + "a = 42" ] }, { @@ -77,7 +79,7 @@ } }, "source": [ - "Just like any other object, `789` has an identity, a type, and a value." + "Just like any other object, `42` has an identity, a type, and a value." ] }, { @@ -92,7 +94,7 @@ { "data": { "text/plain": [ - "140166838695792" + "94784085682240" ] }, "execution_count": 2, @@ -101,7 +103,7 @@ } ], "source": [ - "id(i)" + "id(a)" ] }, { @@ -109,7 +111,7 @@ "execution_count": 3, "metadata": { "slideshow": { - "slide_type": "-" + "slide_type": "fragment" } }, "outputs": [ @@ -125,7 +127,7 @@ } ], "source": [ - "type(i)" + "type(a)" ] }, { @@ -133,14 +135,14 @@ "execution_count": 4, "metadata": { "slideshow": { - "slide_type": "-" + "slide_type": "fragment" } }, "outputs": [ { "data": { "text/plain": [ - "789" + "42" ] }, "execution_count": 4, @@ -149,7 +151,7 @@ } ], "source": [ - "i" + "a" ] }, { @@ -160,7 +162,7 @@ } }, "source": [ - "A nice feature is using underscores `_` as (thousands) separator in numeric literals. For example, `1_000_000` evaluates to `1000000` in memory." + "A nice feature in Python 3.6 and later is using underscores `_` as (thousands) separators in numeric literals. For example, `1_000_000` evaluates to `1000000` in memory; the `_` is ignored by the interpreter." 
] }, { @@ -195,7 +197,7 @@ } }, "source": [ - "Whereas mathematicians argue what the term $0^0$ means (cf., this [article](https://en.wikipedia.org/wiki/Zero_to_the_power_of_zero)), programmers are pragmatic about this and define $0^0 = 1$." + "We may place a single `_` between any two digits." ] }, { @@ -210,7 +212,7 @@ { "data": { "text/plain": [ - "1" + "123456789" ] }, "execution_count": 6, @@ -218,6 +220,252 @@ "output_type": "execute_result" } ], + "source": [ + "1_2_3_4_5_6_7_8_9" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "slideshow": { + "slide_type": "skip" + } + }, + "source": [ + "It is syntactically invalid to write out leading `0`s in integer literals such as `042`. The reason for that will become apparent in the next section." + ] + }, + { + "cell_type": "code", + "execution_count": 7, + "metadata": { + "slideshow": { + "slide_type": "fragment" + } + }, + "outputs": [ + { + "ename": "SyntaxError", + "evalue": "invalid token (, line 1)", + "output_type": "error", + "traceback": [ + "\u001b[0;36m File \u001b[0;32m\"\"\u001b[0;36m, line \u001b[0;32m1\u001b[0m\n\u001b[0;31m 042\u001b[0m\n\u001b[0m ^\u001b[0m\n\u001b[0;31mSyntaxError\u001b[0m\u001b[0;31m:\u001b[0m invalid token\n" + ] + } + ], + "source": [ + "042" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "slideshow": { + "slide_type": "skip" + } + }, + "source": [ + "Another way to create `int` objects is with the [int()](https://docs.python.org/3/library/functions.html#int) built-in that casts `float` or properly formatted `str` objects as integers. So, decimals are truncated (i.e., \"cut off\")." 
+ ] + }, + { + "cell_type": "code", + "execution_count": 8, + "metadata": { + "slideshow": { + "slide_type": "slide" + } + }, + "outputs": [ + { + "data": { + "text/plain": [ + "42" + ] + }, + "execution_count": 8, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "int(42.11)" + ] + }, + { + "cell_type": "code", + "execution_count": 9, + "metadata": { + "slideshow": { + "slide_type": "fragment" + } + }, + "outputs": [ + { + "data": { + "text/plain": [ + "42" + ] + }, + "execution_count": 9, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "int(42.87)" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "slideshow": { + "slide_type": "skip" + } + }, + "source": [ + "Whereas the floor division operator `//` effectively rounds towards negative infinity (cf., the \"*(Arithmetic) Operators*\" section in [Chapter 1](https://nbviewer.jupyter.org/github/webartifex/intro-to-python/blob/master/01_elements_00_lecture.ipynb#%28Arithmetic%29-Operators)), the [int()](https://docs.python.org/3/library/functions.html#int) built-in effectively rounds towards `0`." + ] + }, + { + "cell_type": "code", + "execution_count": 10, + "metadata": { + "slideshow": { + "slide_type": "fragment" + } + }, + "outputs": [ + { + "data": { + "text/plain": [ + "-42" + ] + }, + "execution_count": 10, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "int(-42.87)" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "slideshow": { + "slide_type": "skip" + } + }, + "source": [ + "When casting `str` objects as `int`, the [int()](https://docs.python.org/3/library/functions.html#int) built-in is less forgiving. We must not include any decimals as shows by the `ValueError`. Yet, leading and trailing whitespace is gracefully ignored." 
+ ] + }, + { + "cell_type": "code", + "execution_count": 11, + "metadata": { + "slideshow": { + "slide_type": "slide" + } + }, + "outputs": [ + { + "data": { + "text/plain": [ + "42" + ] + }, + "execution_count": 11, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "int(\"42\")" + ] + }, + { + "cell_type": "code", + "execution_count": 12, + "metadata": { + "slideshow": { + "slide_type": "fragment" + } + }, + "outputs": [ + { + "ename": "ValueError", + "evalue": "invalid literal for int() with base 10: '42.0'", + "output_type": "error", + "traceback": [ + "\u001b[0;31m---------------------------------------------------------------------------\u001b[0m", + "\u001b[0;31mValueError\u001b[0m Traceback (most recent call last)", + "\u001b[0;32m\u001b[0m in \u001b[0;36m\u001b[0;34m\u001b[0m\n\u001b[0;32m----> 1\u001b[0;31m \u001b[0mint\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0;34m\"42.0\"\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0m", + "\u001b[0;31mValueError\u001b[0m: invalid literal for int() with base 10: '42.0'" + ] + } + ], + "source": [ + "int(\"42.0\")" + ] + }, + { + "cell_type": "code", + "execution_count": 13, + "metadata": { + "slideshow": { + "slide_type": "fragment" + } + }, + "outputs": [ + { + "data": { + "text/plain": [ + "42" + ] + }, + "execution_count": 13, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "int(\" 42 \")" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "slideshow": { + "slide_type": "skip" + } + }, + "source": [ + "The `int` type follows all rules we know from math, apart from one exception: Whereas mathematicians to this day argue what the term $0^0$ means (cf., this [article](https://en.wikipedia.org/wiki/Zero_to_the_power_of_zero)), programmers are pragmatic about this and simply define $0^0 = 1$." 
+ ] + }, + { + "cell_type": "code", + "execution_count": 14, + "metadata": { + "slideshow": { + "slide_type": "skip" + } + }, + "outputs": [ + { + "data": { + "text/plain": [ + "1" + ] + }, + "execution_count": 14, + "metadata": {}, + "output_type": "execute_result" + } + ], "source": [ "0 ** 0" ] @@ -270,17 +518,17 @@ "\n", "To encode the integer $3$, for example, we need to find a combination of $0$s and $1$s such that the sum of digits marked with a $1$ is equal to the number we want to encode. In the example, we set all bits to $0$ except for the first ($i=0$) and second ($i=1$) as $2^0 + 2^1 = 1 + 2 = 3$. So the binary representation of $3$ is $00~00~00~11$. To borrow some terminology from linear algebra, the $3$ is a linear combination of the digits where the coefficients are either $0$ or $1$: $3 = 0*128 + 0*64 + 0*32 + 0*16 + 0*8 + 0*4 + 1*2 + 1*1$. It is *guaranteed* that there is exactly *one* such combination for each number between $0$ and $255$.\n", "\n", - "As each bit in the binary representation is one of two values, we say that this representation has a base of $2$. Often, the base is indicated with a subscript to avoid confusion. For example, we write $3_{10} = 00000011_2$ or $3_{10} = 11_2$ for short omitting leading $0$s. A subscript of $10$ implies a decimal number as we know it from grade school.\n", + "As each bit in the binary representation is one of two values, we say that this representation has a base of $2$. Often, the base is indicated with a subscript to avoid confusion. For example, we write $3_{10} = 00000011_2$ or $3_{10} = 11_2$ for short omitting leading $0$s. 
A subscript of $10$ implies a decimal number as we know it from elementary school.\n", "\n", "We use the built-in [bin()](https://docs.python.org/3/library/functions.html#bin) function to obtain an `int` object's binary representation: It returns a `str` object starting with `\"0b\"` indicating the binary format and as many $0$s and $1$s as are necessary to encode the integer omitting leading $0$s." ] }, { "cell_type": "code", - "execution_count": 7, + "execution_count": 15, "metadata": { "slideshow": { - "slide_type": "fragment" + "slide_type": "slide" } }, "outputs": [ @@ -290,7 +538,7 @@ "'0b11'" ] }, - "execution_count": 7, + "execution_count": 15, "metadata": {}, "output_type": "execute_result" } @@ -312,7 +560,7 @@ }, { "cell_type": "code", - "execution_count": 8, + "execution_count": 16, "metadata": { "slideshow": { "slide_type": "fragment" @@ -325,7 +573,7 @@ "3" ] }, - "execution_count": 8, + "execution_count": 16, "metadata": {}, "output_type": "execute_result" } @@ -347,7 +595,7 @@ }, { "cell_type": "code", - "execution_count": 9, + "execution_count": 17, "metadata": { "slideshow": { "slide_type": "fragment" @@ -360,7 +608,7 @@ "3" ] }, - "execution_count": 9, + "execution_count": 17, "metadata": {}, "output_type": "execute_result" } @@ -377,31 +625,31 @@ } }, "source": [ - "Another example is the integer `177` that is the sum of $128 + 32 + 16 + 1$: Thus, its binary representation is the sequence of bits $10~11~00~01$, or to use our new notation, $177_{10} = 10110001_2$." + "It is convenient to use the underscore `_` to separate the `\"0b\"` prefix from the bits." 
] }, { "cell_type": "code", - "execution_count": 10, + "execution_count": 18, "metadata": { "slideshow": { - "slide_type": "slide" + "slide_type": "fragment" } }, "outputs": [ { "data": { "text/plain": [ - "'0b10110001'" + "3" ] }, - "execution_count": 10, + "execution_count": 18, "metadata": {}, "output_type": "execute_result" } ], "source": [ - "bin(177)" + "0b_11" ] }, { @@ -412,31 +660,66 @@ } }, "source": [ - "Analogous to typing `177` into a code cell, we may write `0b10110001`, or `0b_1011_0001` to make use of the underscores, and create an `int` object with the value `177`." + "Another example is the integer `123` that is the sum of $64 + 32 + 16 + 8 + 2 + 1$: Thus, its binary representation is the sequence of bits $01~11~10~11$, or to use our new notation, $123_{10} = 1111011_2$." ] }, { "cell_type": "code", - "execution_count": 11, + "execution_count": 19, "metadata": { "slideshow": { - "slide_type": "-" + "slide_type": "skip" } }, "outputs": [ { "data": { "text/plain": [ - "177" + "'0b1111011'" ] }, - "execution_count": 11, + "execution_count": 19, "metadata": {}, "output_type": "execute_result" } ], "source": [ - "0b_1011_0001" + "bin(123)" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "slideshow": { + "slide_type": "skip" + } + }, + "source": [ + "Analogous to typing `123` into a code cell, we may write `0b1111011`, or `0b_111_1011` to make use of the underscores, and create an `int` object with the value `123`." 
+ ] + }, + { + "cell_type": "code", + "execution_count": 20, + "metadata": { + "slideshow": { + "slide_type": "skip" + } + }, + "outputs": [ + { + "data": { + "text/plain": [ + "123" + ] + }, + "execution_count": 20, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "0b_111_1011" ] }, { @@ -452,7 +735,7 @@ }, { "cell_type": "code", - "execution_count": 12, + "execution_count": 21, "metadata": { "slideshow": { "slide_type": "slide" @@ -465,7 +748,7 @@ "'0b0'" ] }, - "execution_count": 12, + "execution_count": 21, "metadata": {}, "output_type": "execute_result" } @@ -476,10 +759,10 @@ }, { "cell_type": "code", - "execution_count": 13, + "execution_count": 22, "metadata": { "slideshow": { - "slide_type": "-" + "slide_type": "skip" } }, "outputs": [ @@ -489,7 +772,7 @@ "'0b1'" ] }, - "execution_count": 13, + "execution_count": 22, "metadata": {}, "output_type": "execute_result" } @@ -500,10 +783,10 @@ }, { "cell_type": "code", - "execution_count": 14, + "execution_count": 23, "metadata": { "slideshow": { - "slide_type": "-" + "slide_type": "skip" } }, "outputs": [ @@ -513,7 +796,7 @@ "'0b10'" ] }, - "execution_count": 14, + "execution_count": 23, "metadata": {}, "output_type": "execute_result" } @@ -524,10 +807,10 @@ }, { "cell_type": "code", - "execution_count": 15, + "execution_count": 24, "metadata": { "slideshow": { - "slide_type": "-" + "slide_type": "fragment" } }, "outputs": [ @@ -537,7 +820,7 @@ "'0b11111111'" ] }, - "execution_count": 15, + "execution_count": 24, "metadata": {}, "output_type": "execute_result" } @@ -556,12 +839,12 @@ "source": [ "Groups of eight bits are also called a **byte**. As a byte can only represent non-negative integers up to $255$, the table above is extended conceptually with greater digits to the left to model integers beyond $255$. 
The memory management needed to implement this is built into Python, and we do not need to worry about it.\n", "\n", - "For example, the `789` from above is encoded with ten bits and $789_{10} = 1100010101_2$." + "For example, `789` is encoded with ten bits and $789_{10} = 1100010101_2$." ] }, { "cell_type": "code", - "execution_count": 16, + "execution_count": 25, "metadata": { "slideshow": { "slide_type": "fragment" @@ -574,13 +857,13 @@ "'0b1100010101'" ] }, - "execution_count": 16, + "execution_count": 25, "metadata": {}, "output_type": "execute_result" } ], "source": [ - "bin(i) # = bin(789)" + "bin(789)" ] }, { @@ -598,7 +881,254 @@ "| Digit |$10^3$|$10^2$|$10^1$|$10^0$|\n", "| $=$ |$1000$| $100$| $10$ | $1$ |\n", "\n", - "Now, an integer is a linear combination of the digits where the coefficients are one of *ten* values, and the base is now $10$. For example, the number $123$ can be expressed as $0*1000 + 1*100 + 2*10 + 3*1$. So, the binary representation follows the same logic as the decimal system taught in grade school. The decimal system is intuitive to us humans, mostly as we learn to count with our *ten* fingers. The $0$s and $1$s in a computer's memory are therefore no rocket science; they only feel unintuitive for a beginner." + "Now, an integer is a linear combination of the digits where the coefficients are one of *ten* values, and the base is now $10$. For example, the number $123$ can be expressed as $0*1000 + 1*100 + 2*10 + 3*1$. So, the binary representation follows the same logic as the decimal system taught in elementary school. The decimal system is intuitive to us humans, mostly as we learn to count with our *ten* fingers. The $0$s and $1$s in a computer's memory are therefore no rocket science; they only feel unintuitive for a beginner." 
+ ] + }, + { + "cell_type": "markdown", + "metadata": { + "slideshow": { + "slide_type": "slide" + } + }, + "source": [ + "#### Arithmetic with Bits" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "slideshow": { + "slide_type": "skip" + } + }, + "source": [ + "Adding two numbers in their binary representations is straightforward and works just like we all learned addition in elementary school. Going from right to left, we add the individual digits, and ..." + ] + }, + { + "cell_type": "code", + "execution_count": 26, + "metadata": { + "slideshow": { + "slide_type": "slide" + } + }, + "outputs": [ + { + "data": { + "text/plain": [ + "3" + ] + }, + "execution_count": 26, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "1 + 2" + ] + }, + { + "cell_type": "code", + "execution_count": 27, + "metadata": { + "slideshow": { + "slide_type": "fragment" + } + }, + "outputs": [ + { + "data": { + "text/plain": [ + "'0b1 + 0b10 = 0b11'" + ] + }, + "execution_count": 27, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "bin(1) + \" + \" + bin(2) + \" = \" + bin(3)" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "slideshow": { + "slide_type": "skip" + } + }, + "source": [ + "... if any two digits add up to $2$, the resulting digit is $0$ and a $1$ carries over." 
+ ] + }, + { + "cell_type": "code", + "execution_count": 28, + "metadata": { + "slideshow": { + "slide_type": "fragment" + } + }, + "outputs": [ + { + "data": { + "text/plain": [ + "4" + ] + }, + "execution_count": 28, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "1 + 3" + ] + }, + { + "cell_type": "code", + "execution_count": 29, + "metadata": { + "slideshow": { + "slide_type": "fragment" + } + }, + "outputs": [ + { + "data": { + "text/plain": [ + "'0b1 + 0b11 = 0b100'" + ] + }, + "execution_count": 29, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "bin(1) + \" + \" + bin(3) + \" = \" + bin(4)" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "slideshow": { + "slide_type": "skip" + } + }, + "source": [ + "Multiplication is also quite easy. All we need to do is to multiply the left operand by all digits of the right operand separately and then add up the individual products, just like in elementary school." + ] + }, + { + "cell_type": "code", + "execution_count": 30, + "metadata": { + "slideshow": { + "slide_type": "slide" + } + }, + "outputs": [ + { + "data": { + "text/plain": [ + "12" + ] + }, + "execution_count": 30, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "4 * 3" + ] + }, + { + "cell_type": "code", + "execution_count": 31, + "metadata": { + "slideshow": { + "slide_type": "fragment" + } + }, + "outputs": [ + { + "data": { + "text/plain": [ + "'0b100 * 0b11 = 0b1100'" + ] + }, + "execution_count": 31, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "bin(4) + \" * \" + bin(3) + \" = \" + bin(12)" + ] + }, + { + "cell_type": "code", + "execution_count": 32, + "metadata": { + "slideshow": { + "slide_type": "fragment" + } + }, + "outputs": [ + { + "data": { + "text/plain": [ + "'0b100 * 0b1 = 0b100'" + ] + }, + "execution_count": 32, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "bin(4) + \" * \" + 
bin(1) + \" = \" + bin(4) # multiply with first digit" + ] + }, + { + "cell_type": "code", + "execution_count": 33, + "metadata": { + "slideshow": { + "slide_type": "fragment" + } + }, + "outputs": [ + { + "data": { + "text/plain": [ + "'0b100 * 0b10 = 0b1000'" + ] + }, + "execution_count": 33, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "bin(4) + \" * \" + bin(2) + \" = \" + bin(8) # multiply with second digit" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "slideshow": { + "slide_type": "skip" + } + }, + "source": [ + "The \"*Further Resources*\" section at the end of this chapter provides video tutorials on addition and multiplication in binary. Subtraction and division are a bit more involved but essentially also easy to understand." ] }, { @@ -676,7 +1206,7 @@ }, { "cell_type": "code", - "execution_count": 17, + "execution_count": 34, "metadata": { "slideshow": { "slide_type": "slide" @@ -689,7 +1219,7 @@ "'0x0'" ] }, - "execution_count": 17, + "execution_count": 34, "metadata": {}, "output_type": "execute_result" } @@ -700,10 +1230,10 @@ }, { "cell_type": "code", - "execution_count": 18, + "execution_count": 35, "metadata": { "slideshow": { - "slide_type": "-" + "slide_type": "fragment" } }, "outputs": [ @@ -713,7 +1243,7 @@ "'0x1'" ] }, - "execution_count": 18, + "execution_count": 35, "metadata": {}, "output_type": "execute_result" } @@ -735,10 +1265,10 @@ }, { "cell_type": "code", - "execution_count": 19, + "execution_count": 36, "metadata": { "slideshow": { - "slide_type": "fragment" + "slide_type": "skip" } }, "outputs": [ @@ -748,7 +1278,7 @@ "'0x3'" ] }, - "execution_count": 19, + "execution_count": 36, "metadata": {}, "output_type": "execute_result" } @@ -770,7 +1300,7 @@ }, { "cell_type": "code", - "execution_count": 20, + "execution_count": 37, "metadata": { "slideshow": { "slide_type": "fragment" @@ -783,7 +1313,7 @@ "'0xa'" ] }, - "execution_count": 20, + "execution_count": 37, "metadata": {}, 
"output_type": "execute_result" } @@ -794,10 +1324,10 @@ }, { "cell_type": "code", - "execution_count": 21, + "execution_count": 38, "metadata": { "slideshow": { - "slide_type": "-" + "slide_type": "fragment" } }, "outputs": [ @@ -807,7 +1337,7 @@ "'0xf'" ] }, - "execution_count": 21, + "execution_count": 38, "metadata": {}, "output_type": "execute_result" } @@ -824,12 +1354,12 @@ } }, "source": [ - "The binary representation of `177`, `\"0b10110001\"`, can be viewed as *two* groups of four bits, $1011$ and $0001$, that are encoded as $\\text{b}$ and $1$ in hexadecimal (cf., table above)." + "The binary representation of `123`, `0b_111_1011`, can be viewed as *two* groups of four bits, $0111$ and $1011$, that are encoded as $7$ and $\\text{b}$ in hexadecimal (cf., table above)." ] }, { "cell_type": "code", - "execution_count": 22, + "execution_count": 39, "metadata": { "slideshow": { "slide_type": "slide" @@ -839,40 +1369,40 @@ { "data": { "text/plain": [ - "'0b10110001'" + "'0b1111011'" ] }, - "execution_count": 22, + "execution_count": 39, "metadata": {}, "output_type": "execute_result" } ], "source": [ - "bin(177)" + "bin(123)" ] }, { "cell_type": "code", - "execution_count": 23, + "execution_count": 40, "metadata": { "slideshow": { - "slide_type": "-" + "slide_type": "fragment" } }, "outputs": [ { "data": { "text/plain": [ - "'0xb1'" + "'0x7b'" ] }, - "execution_count": 23, + "execution_count": 40, "metadata": {}, "output_type": "execute_result" } ], "source": [ - "hex(177)" + "hex(123)" ] }, { @@ -883,12 +1413,12 @@ } }, "source": [ - "To obtain a *new* `int` object with the value `177`, we call the [int()](https://docs.python.org/3/library/functions.html#int) built-in with a properly formatted `str` object and `base=16` as arguments." + "To obtain a *new* `int` object with the value `123`, we call the [int()](https://docs.python.org/3/library/functions.html#int) built-in with a properly formatted `str` object and `base=16` as arguments." 
] }, { "cell_type": "code", - "execution_count": 24, + "execution_count": 41, "metadata": { "slideshow": { "slide_type": "fragment" @@ -898,16 +1428,16 @@ { "data": { "text/plain": [ - "177" + "123" ] }, - "execution_count": 24, + "execution_count": 41, "metadata": {}, "output_type": "execute_result" } ], "source": [ - "int(\"0xb1\", base=16)" + "int(\"0x7b\", base=16)" ] }, { @@ -923,7 +1453,7 @@ }, { "cell_type": "code", - "execution_count": 25, + "execution_count": 42, "metadata": { "slideshow": { "slide_type": "fragment" @@ -933,16 +1463,16 @@ { "data": { "text/plain": [ - "177" + "123" ] }, - "execution_count": 25, + "execution_count": 42, "metadata": {}, "output_type": "execute_result" } ], "source": [ - "0xb1" + "0x_7b" ] }, { @@ -958,7 +1488,31 @@ }, { "cell_type": "code", - "execution_count": 26, + "execution_count": 43, + "metadata": { + "slideshow": { + "slide_type": "skip" + } + }, + "outputs": [ + { + "data": { + "text/plain": [ + "'0x0'" + ] + }, + "execution_count": 43, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "hex(0)" + ] + }, + { + "cell_type": "code", + "execution_count": 44, "metadata": { "slideshow": { "slide_type": "skip" @@ -971,7 +1525,7 @@ "'0xff'" ] }, - "execution_count": 26, + "execution_count": 44, "metadata": {}, "output_type": "execute_result" } @@ -993,7 +1547,7 @@ }, { "cell_type": "code", - "execution_count": 27, + "execution_count": 45, "metadata": { "slideshow": { "slide_type": "skip" @@ -1006,13 +1560,13 @@ "'0x315'" ] }, - "execution_count": 27, + "execution_count": 45, "metadata": {}, "output_type": "execute_result" } ], "source": [ - "hex(i) # = hex(789)" + "hex(789)" ] }, { @@ -1045,14 +1599,14 @@ } }, "source": [ - "While there are conventions that model negative integers with $0$s and $1$s in memory (cf., [Two's Complement](https://en.wikipedia.org/wiki/Two%27s_complement)), Python manages that for us, and we do not look into the theory here for brevity. 
We have learned all we need to know about how integers are modeled in a computer.\n", + "While there are conventions that model negative integers with $0$s and $1$s in memory (cf., [Two's Complement](https://en.wikipedia.org/wiki/Two%27s_complement)), Python manages that for us, and we do not look into the theory here for brevity. We have learned all that a practitioner needs to know about how integers are modeled in a computer. The \"*Further Resources*\" section at the end of this chapter provides a video tutorial on how the [Two's Complement](https://en.wikipedia.org/wiki/Two%27s_complement) idea works.\n", "\n", - "The binary and hexadecimal representations of negative integers are identical to their positive counterparts except that they start with a minus sign `-`." + "The binary and hexadecimal representations of negative integers are identical to their positive counterparts except that they start with a minus sign `-`. However, as the video tutorial at the end of the chapter reveals, that is *not* how the bits are organized in memory." 
] }, { "cell_type": "code", - "execution_count": 28, + "execution_count": 46, "metadata": { "slideshow": { "slide_type": "skip" @@ -1065,7 +1619,7 @@ "'-0b11'" ] }, - "execution_count": 28, + "execution_count": 46, "metadata": {}, "output_type": "execute_result" } @@ -1076,7 +1630,7 @@ }, { "cell_type": "code", - "execution_count": 29, + "execution_count": 47, "metadata": { "slideshow": { "slide_type": "skip" @@ -1089,7 +1643,7 @@ "'-0x3'" ] }, - "execution_count": 29, + "execution_count": 47, "metadata": {}, "output_type": "execute_result" } @@ -1100,7 +1654,7 @@ }, { "cell_type": "code", - "execution_count": 30, + "execution_count": 48, "metadata": { "slideshow": { "slide_type": "skip" @@ -1113,7 +1667,7 @@ "'-0b11111111'" ] }, - "execution_count": 30, + "execution_count": 48, "metadata": {}, "output_type": "execute_result" } @@ -1124,7 +1678,7 @@ }, { "cell_type": "code", - "execution_count": 31, + "execution_count": 49, "metadata": { "slideshow": { "slide_type": "skip" @@ -1137,7 +1691,7 @@ "'-0xff'" ] }, - "execution_count": 31, + "execution_count": 49, "metadata": {}, "output_type": "execute_result" } @@ -1170,7 +1724,7 @@ }, { "cell_type": "code", - "execution_count": 32, + "execution_count": 50, "metadata": { "slideshow": { "slide_type": "slide" @@ -1183,7 +1737,7 @@ "1" ] }, - "execution_count": 32, + "execution_count": 50, "metadata": {}, "output_type": "execute_result" } @@ -1194,10 +1748,10 @@ }, { "cell_type": "code", - "execution_count": 33, + "execution_count": 51, "metadata": { "slideshow": { - "slide_type": "-" + "slide_type": "fragment" } }, "outputs": [ @@ -1207,7 +1761,7 @@ "42" ] }, - "execution_count": 33, + "execution_count": 51, "metadata": {}, "output_type": "execute_result" } @@ -1218,10 +1772,10 @@ }, { "cell_type": "code", - "execution_count": 34, + "execution_count": 52, "metadata": { "slideshow": { - "slide_type": "-" + "slide_type": "fragment" } }, "outputs": [ @@ -1231,13 +1785,13 @@ "0.0" ] }, - "execution_count": 34, + 
"execution_count": 52, "metadata": {}, "output_type": "execute_result" } ], "source": [ - "42.0 * False" + "42.87 * False" ] }, { @@ -1253,7 +1807,7 @@ }, { "cell_type": "code", - "execution_count": 35, + "execution_count": 53, "metadata": { "slideshow": { "slide_type": "slide" @@ -1266,7 +1820,7 @@ "1" ] }, - "execution_count": 35, + "execution_count": 53, "metadata": {}, "output_type": "execute_result" } @@ -1277,10 +1831,10 @@ }, { "cell_type": "code", - "execution_count": 36, + "execution_count": 54, "metadata": { "slideshow": { - "slide_type": "-" + "slide_type": "fragment" } }, "outputs": [ @@ -1290,7 +1844,7 @@ "0" ] }, - "execution_count": 36, + "execution_count": 54, "metadata": {}, "output_type": "execute_result" } @@ -1312,7 +1866,7 @@ }, { "cell_type": "code", - "execution_count": 37, + "execution_count": 55, "metadata": { "slideshow": { "slide_type": "slide" @@ -1325,7 +1879,7 @@ "'0b1'" ] }, - "execution_count": 37, + "execution_count": 55, "metadata": {}, "output_type": "execute_result" } @@ -1336,10 +1890,10 @@ }, { "cell_type": "code", - "execution_count": 38, + "execution_count": 56, "metadata": { "slideshow": { - "slide_type": "-" + "slide_type": "fragment" } }, "outputs": [ @@ -1349,7 +1903,7 @@ "'0b0'" ] }, - "execution_count": 38, + "execution_count": 56, "metadata": {}, "output_type": "execute_result" } @@ -1371,10 +1925,10 @@ }, { "cell_type": "code", - "execution_count": 39, + "execution_count": 57, "metadata": { "slideshow": { - "slide_type": "slide" + "slide_type": "skip" } }, "outputs": [ @@ -1384,7 +1938,7 @@ "'0x1'" ] }, - "execution_count": 39, + "execution_count": 57, "metadata": {}, "output_type": "execute_result" } @@ -1395,10 +1949,10 @@ }, { "cell_type": "code", - "execution_count": 40, + "execution_count": 58, "metadata": { "slideshow": { - "slide_type": "-" + "slide_type": "skip" } }, "outputs": [ @@ -1408,7 +1962,7 @@ "'0x0'" ] }, - "execution_count": 40, + "execution_count": 58, "metadata": {}, "output_type": "execute_result" 
} @@ -1425,12 +1979,12 @@ } }, "source": [ - "As a reminder, the `None` literal is a type on its own, namely the `NoneType`, and different from `False`. It *cannot* be cast as an integer as the `TypeError` indicates." + "As a reminder, the `None` object is a type on its own, namely the `NoneType`, and different from `False`. It *cannot* be cast as an integer as the `TypeError` indicates." ] }, { "cell_type": "code", - "execution_count": 41, + "execution_count": 59, "metadata": { "slideshow": { "slide_type": "slide" @@ -1444,7 +1998,7 @@ "traceback": [ "\u001b[0;31m---------------------------------------------------------------------------\u001b[0m", "\u001b[0;31mTypeError\u001b[0m Traceback (most recent call last)", - "\u001b[0;32m\u001b[0m in \u001b[0;36m\u001b[0;34m\u001b[0m\n\u001b[0;32m----> 1\u001b[0;31m \u001b[0mint\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0;32mNone\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0m", + "\u001b[0;32m\u001b[0m in \u001b[0;36m\u001b[0;34m\u001b[0m\n\u001b[0;32m----> 1\u001b[0;31m \u001b[0mint\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0;32mNone\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0m", "\u001b[0;31mTypeError\u001b[0m: int() argument must be a string, a bytes-like object or a number, not 'NoneType'" ] } @@ -1476,12 +2030,12 @@ "\n", "We keep this overview rather short as such \"low-level\" operations are not needed by the data science practitioner regularly. Yet, it is worthwhile to have heard about them as they form the basis of all of arithmetic in computers.\n", "\n", - "The first operator is the **bitwise AND** operator `&`: It looks at the bits of its two operands, `11` and `13` in the example, in a pairwise fashion and *iff* both operands have a $1$ in the same position, will the resulting integer have a $1$ in this position as well. 
The binary representations of `11` and `13` have $1$s in their respective first and fourth bits, which is why `bin(11 & 13)` evaluates to `\"Ob1001\"`." + "The first operator is the **bitwise AND** operator `&`: It looks at the bits of its two operands, `11` and `13` in the example, in a pairwise fashion and if *both* operands have a $1$ in the *same* position, the resulting integer will have a $1$ in this position as well. Otherwise, the resulting integer will have a $0$ in this position. The binary representations of `11` and `13` both have $1$s in their respective first and fourth bits, which is why `bin(11 & 13)` evaluates to `0b_1001` or `9`." ] }, { "cell_type": "code", - "execution_count": 47, + "execution_count": 60, "metadata": { "slideshow": { "slide_type": "slide" @@ -1491,24 +2045,48 @@ { "data": { "text/plain": [ - "'0b1011 & 0b1101'" + "9" ] }, - "execution_count": 47, + "execution_count": 60, "metadata": {}, "output_type": "execute_result" } ], "source": [ - "bin(11) + \" & \" + bin(13) # to show the operands' binary representations" + "11 & 13" ] }, { "cell_type": "code", - "execution_count": 48, + "execution_count": 61, "metadata": { "slideshow": { - "slide_type": "-" + "slide_type": "fragment" + } + }, + "outputs": [ + { + "data": { + "text/plain": [ + "'0b1011 & 0b1101'" + ] + }, + "execution_count": 61, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "bin(11) + \" & \" + bin(13) # to show the operands' bits" + ] + }, + { + "cell_type": "code", + "execution_count": 62, + "metadata": { + "slideshow": { + "slide_type": "fragment" } }, "outputs": [ @@ -1518,7 +2096,7 @@ "'0b1001'" ] }, - "execution_count": 48, + "execution_count": 62, "metadata": {}, "output_type": "execute_result" } @@ -1535,12 +2113,12 @@ } }, "source": [ - "`0b1001` is the binary representation of `9`." + "`0b_1001` is the binary representation of `9`." 
] }, { "cell_type": "code", - "execution_count": 49, + "execution_count": 63, "metadata": { "slideshow": { "slide_type": "fragment" @@ -1553,37 +2131,13 @@ "9" ] }, - "execution_count": 49, + "execution_count": 63, "metadata": {}, "output_type": "execute_result" } ], "source": [ - "0b1001" - ] - }, - { - "cell_type": "code", - "execution_count": 50, - "metadata": { - "slideshow": { - "slide_type": "-" - } - }, - "outputs": [ - { - "data": { - "text/plain": [ - "9" - ] - }, - "execution_count": 50, - "metadata": {}, - "output_type": "execute_result" - } - ], - "source": [ - "11 & 13" + "0b_1001" ] }, { @@ -1594,12 +2148,12 @@ } }, "source": [ - "The **bitwise OR** operator `|` evaluates to an `int` object whose bits are set to $1$ if the corresponding bits of either *one* or *both* operands are $1$. So in the example `9 | 13` only the second bit is $0$ for both operands, which is why the expression evaluates to `\"0b1101\"`." + "The **bitwise OR** operator `|` evaluates to an `int` object whose bits are set to $1$ if the corresponding bits of either *one* or *both* operands are $1$. So in the example `9 | 13` only the second bit is $0$ for both operands, which is why the expression evaluates to `0b_1101` or `13`." 
] }, { "cell_type": "code", - "execution_count": 51, + "execution_count": 64, "metadata": { "slideshow": { "slide_type": "slide" @@ -1609,24 +2163,48 @@ { "data": { "text/plain": [ - "'0b1001 | 0b1101'" + "13" ] }, - "execution_count": 51, + "execution_count": 64, "metadata": {}, "output_type": "execute_result" } ], "source": [ - "bin(9) + \" | \" + bin(13) # to show the operands' binary representations" + "9 | 13" ] }, { "cell_type": "code", - "execution_count": 52, + "execution_count": 65, "metadata": { "slideshow": { - "slide_type": "-" + "slide_type": "fragment" + } + }, + "outputs": [ + { + "data": { + "text/plain": [ + "'0b1001 | 0b1101'" + ] + }, + "execution_count": 65, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "bin(9) + \" | \" + bin(13) # to show the operands' bits" + ] + }, + { + "cell_type": "code", + "execution_count": 66, + "metadata": { + "slideshow": { + "slide_type": "fragment" } }, "outputs": [ @@ -1636,7 +2214,7 @@ "'0b1101'" ] }, - "execution_count": 52, + "execution_count": 66, "metadata": {}, "output_type": "execute_result" } @@ -1653,12 +2231,12 @@ } }, "source": [ - "`0b1101` evaluates to the `int` object `13`." + "`0b_1101` evaluates to an `int` object with the value `13`." 
] }, { "cell_type": "code", - "execution_count": 53, + "execution_count": 67, "metadata": { "slideshow": { "slide_type": "fragment" @@ -1671,37 +2249,13 @@ "13" ] }, - "execution_count": 53, + "execution_count": 67, "metadata": {}, "output_type": "execute_result" } ], "source": [ - "0b1101" - ] - }, - { - "cell_type": "code", - "execution_count": 54, - "metadata": { - "slideshow": { - "slide_type": "-" - } - }, - "outputs": [ - { - "data": { - "text/plain": [ - "13" - ] - }, - "execution_count": 54, - "metadata": {}, - "output_type": "execute_result" - } - ], - "source": [ - "9 | 13" + "0b_1101" ] }, { @@ -1712,12 +2266,12 @@ } }, "source": [ - "The **bitwise XOR** operator `^` is a special case of the `|` operator in that it evaluates to an `int` object whose bits are set to $1$ if the corresponding bit of *exactly one* of the two operands is $1$. Colloquially, the \"X\" stands for \"exclusive.\" The `^` operator must *not* be confused with the exponentiation operator `**`! In the example, `9 ^ 13`, only the third bit differs between the two operands, which is why it evaluates to `\"0b100\"` omitting the fourth bit." + "The **bitwise XOR** operator `^` is a special case of the `|` operator in that it evaluates to an `int` object whose bits are set to $1$ if the corresponding bit of *exactly one* of the two operands is $1$. Colloquially, the \"X\" stands for \"exclusive.\" The `^` operator must *not* be confused with the exponentiation operator `**`! In the example, `9 ^ 13`, only the third bit differs between the two operands, which is why it evaluates to `0b_100` omitting the leading $0$." 
] }, { "cell_type": "code", - "execution_count": 55, + "execution_count": 68, "metadata": { "slideshow": { "slide_type": "slide" @@ -1727,24 +2281,48 @@ { "data": { "text/plain": [ - "'0b1001 ^ 0b1101'" + "4" ] }, - "execution_count": 55, + "execution_count": 68, "metadata": {}, "output_type": "execute_result" } ], "source": [ - "bin(9) + \" ^ \" + bin(13) # to show the operands' binary representations" + "9 ^ 13" ] }, { "cell_type": "code", - "execution_count": 56, + "execution_count": 69, "metadata": { "slideshow": { - "slide_type": "-" + "slide_type": "fragment" + } + }, + "outputs": [ + { + "data": { + "text/plain": [ + "'0b1001 ^ 0b1101'" + ] + }, + "execution_count": 69, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "bin(9) + \" ^ \" + bin(13) # to show the operands' bits" + ] + }, + { + "cell_type": "code", + "execution_count": 70, + "metadata": { + "slideshow": { + "slide_type": "fragment" } }, "outputs": [ @@ -1754,7 +2332,7 @@ "'0b100'" ] }, - "execution_count": 56, + "execution_count": 70, "metadata": {}, "output_type": "execute_result" } @@ -1771,12 +2349,12 @@ } }, "source": [ - "`0b100` evaluates to the `int` object `4`." + "`0b_100` evaluates to an `int` object with the value `4`." 
] }, { "cell_type": "code", - "execution_count": 57, + "execution_count": 71, "metadata": { "slideshow": { "slide_type": "fragment" @@ -1789,37 +2367,13 @@ "4" ] }, - "execution_count": 57, + "execution_count": 71, "metadata": {}, "output_type": "execute_result" } ], "source": [ - "0b100" - ] - }, - { - "cell_type": "code", - "execution_count": 58, - "metadata": { - "slideshow": { - "slide_type": "-" - } - }, - "outputs": [ - { - "data": { - "text/plain": [ - "4" - ] - }, - "execution_count": 58, - "metadata": {}, - "output_type": "execute_result" - } - ], - "source": [ - "9 ^ 13" + "0b_100" ] }, { @@ -1830,14 +2384,14 @@ } }, "source": [ - "The **bitwise NOT** operator `~`, sometimes also called **inversion** operator, is said to \"flip\" the $0$s into $1$s and the $1$s into $0$s. However, it is based on the aforementioned [Two's Complement](https://en.wikipedia.org/wiki/Two%27s_complement) convention and `~x = -(x + 1)` by definition (cf., the [reference](https://docs.python.org/3/reference/expressions.html#unary-arithmetic-and-bitwise-operations)). The full logic behind this is considered out of scope in this book.\n", + "The **bitwise NOT** operator `~`, sometimes also called **inversion** operator, is said to \"flip\" the $0$s into $1$s and the $1$s into $0$s. However, it is based on the aforementioned [Two's Complement](https://en.wikipedia.org/wiki/Two%27s_complement) convention and `~x = -(x + 1)` by definition (cf., the [reference](https://docs.python.org/3/reference/expressions.html#unary-arithmetic-and-bitwise-operations)). The full logic behind this, while actually quite simple, is considered out of scope in this book.\n", "\n", - "We can at least verify the definition by comparing the binary representations of `8` and `-9`: They are indeed the same." + "We can at least verify the definition by comparing the binary representations of `7` and `-8`: They are indeed the same." 
] }, { "cell_type": "code", - "execution_count": 59, + "execution_count": 72, "metadata": { "slideshow": { "slide_type": "slide" @@ -1847,40 +2401,88 @@ { "data": { "text/plain": [ - "'-0b1001'" + "-8" ] }, - "execution_count": 59, + "execution_count": 72, "metadata": {}, "output_type": "execute_result" } ], "source": [ - "bin(~8)" + "~7" ] }, { "cell_type": "code", - "execution_count": 60, + "execution_count": 73, "metadata": { "slideshow": { - "slide_type": "-" + "slide_type": "fragment" } }, "outputs": [ { "data": { "text/plain": [ - "'-0b1001'" + "True" ] }, - "execution_count": 60, + "execution_count": 73, "metadata": {}, "output_type": "execute_result" } ], "source": [ - "bin(-(8 + 1))" + "~7 == -(7 + 1) # = Two's Complement" + ] + }, + { + "cell_type": "code", + "execution_count": 74, + "metadata": { + "slideshow": { + "slide_type": "fragment" + } + }, + "outputs": [ + { + "data": { + "text/plain": [ + "'-0b1000'" + ] + }, + "execution_count": 74, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "bin(~7)" + ] + }, + { + "cell_type": "code", + "execution_count": 75, + "metadata": { + "slideshow": { + "slide_type": "fragment" + } + }, + "outputs": [ + { + "data": { + "text/plain": [ + "'-0b1000'" + ] + }, + "execution_count": 75, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "bin(-(7 + 1))" ] }, { @@ -1896,10 +2498,10 @@ }, { "cell_type": "code", - "execution_count": 61, + "execution_count": 76, "metadata": { "slideshow": { - "slide_type": "fragment" + "slide_type": "skip" } }, "outputs": [ @@ -1909,13 +2511,13 @@ "-1" ] }, - "execution_count": 61, + "execution_count": 76, "metadata": {}, "output_type": "execute_result" } ], "source": [ - "~8 + 8" + "~7 + 7" ] }, { @@ -1926,14 +2528,14 @@ } }, "source": [ - "Lastly, the **bitwise left and right shift** operators, `<<` and `>>`, shift all the bits either to the left or to the right. 
This corresponds to multiplying or dividing an integer by powers of $2$.\n", + "Lastly, the **bitwise left and right shift** operators, `<<` and `>>`, shift all the bits either to the left or to the right. This corresponds to multiplying or dividing an integer by powers of `2`.\n", "\n", "When shifting left, $0$s are filled in." ] }, { "cell_type": "code", - "execution_count": 62, + "execution_count": 77, "metadata": { "slideshow": { "slide_type": "slide" @@ -1943,64 +2545,88 @@ { "data": { "text/plain": [ - "'0b1001'" + "28" ] }, - "execution_count": 62, + "execution_count": 77, "metadata": {}, "output_type": "execute_result" } ], "source": [ - "bin(9)" + "7 << 2" ] }, { "cell_type": "code", - "execution_count": 63, + "execution_count": 78, "metadata": { "slideshow": { - "slide_type": "-" + "slide_type": "fragment" } }, "outputs": [ { "data": { "text/plain": [ - "'0b100100'" + "'0b111'" ] }, - "execution_count": 63, + "execution_count": 78, "metadata": {}, "output_type": "execute_result" } ], "source": [ - "bin(9 << 2)" + "bin(7)" ] }, { "cell_type": "code", - "execution_count": 64, + "execution_count": 79, "metadata": { "slideshow": { - "slide_type": "-" + "slide_type": "fragment" } }, "outputs": [ { "data": { "text/plain": [ - "36" + "'0b11100'" ] }, - "execution_count": 64, + "execution_count": 79, "metadata": {}, "output_type": "execute_result" } ], "source": [ - "9 << 2" + "bin(7 << 2)" + ] + }, + { + "cell_type": "code", + "execution_count": 80, + "metadata": { + "slideshow": { + "slide_type": "fragment" + } + }, + "outputs": [ + { + "data": { + "text/plain": [ + "28" + ] + }, + "execution_count": 80, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "0b_1_1100" ] }, { @@ -2016,7 +2642,31 @@ }, { "cell_type": "code", - "execution_count": 65, + "execution_count": 81, + "metadata": { + "slideshow": { + "slide_type": "slide" + } + }, + "outputs": [ + { + "data": { + "text/plain": [ + "3" + ] + }, + "execution_count": 81, + "metadata": 
{}, + "output_type": "execute_result" + } + ], + "source": [ + "7 >> 1" + ] + }, + { + "cell_type": "code", + "execution_count": 82, "metadata": { "slideshow": { "slide_type": "fragment" @@ -2026,40 +2676,64 @@ { "data": { "text/plain": [ - "'0b10'" + "'0b111'" ] }, - "execution_count": 65, + "execution_count": 82, "metadata": {}, "output_type": "execute_result" } ], "source": [ - "bin(9 >> 2)" + "bin(7)" ] }, { "cell_type": "code", - "execution_count": 66, + "execution_count": 83, "metadata": { "slideshow": { - "slide_type": "-" + "slide_type": "fragment" } }, "outputs": [ { "data": { "text/plain": [ - "2" + "'0b11'" ] }, - "execution_count": 66, + "execution_count": 83, "metadata": {}, "output_type": "execute_result" } ], "source": [ - "9 >> 2" + "bin(7 >> 1)" + ] + }, + { + "cell_type": "code", + "execution_count": 84, + "metadata": { + "slideshow": { + "slide_type": "fragment" + } + }, + "outputs": [ + { + "data": { + "text/plain": [ + "3" + ] + }, + "execution_count": 84, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "0b_11" ] }, { @@ -2092,7 +2766,7 @@ }, { "cell_type": "code", - "execution_count": 67, + "execution_count": 85, "metadata": { "slideshow": { "slide_type": "slide" @@ -2100,12 +2774,12 @@ }, "outputs": [], "source": [ - "f = 1.23" + "b = 42.0" ] }, { "cell_type": "code", - "execution_count": 68, + "execution_count": 86, "metadata": { "slideshow": { "slide_type": "fragment" @@ -2115,24 +2789,24 @@ { "data": { "text/plain": [ - "140166567715664" + "139663468493296" ] }, - "execution_count": 68, + "execution_count": 86, "metadata": {}, "output_type": "execute_result" } ], "source": [ - "id(f)" + "id(b)" ] }, { "cell_type": "code", - "execution_count": 69, + "execution_count": 87, "metadata": { "slideshow": { - "slide_type": "-" + "slide_type": "fragment" } }, "outputs": [ @@ -2142,37 +2816,37 @@ "float" ] }, - "execution_count": 69, + "execution_count": 87, "metadata": {}, "output_type": "execute_result" } ], "source": 
[ - "type(f)" + "type(b)" ] }, { "cell_type": "code", - "execution_count": 70, + "execution_count": 88, "metadata": { "slideshow": { - "slide_type": "-" + "slide_type": "fragment" } }, "outputs": [ { "data": { "text/plain": [ - "1.23" + "42.0" ] }, - "execution_count": 70, + "execution_count": 88, "metadata": {}, "output_type": "execute_result" } ], "source": [ - "f" + "b" ] }, { @@ -2188,7 +2862,7 @@ }, { "cell_type": "code", - "execution_count": 71, + "execution_count": 89, "metadata": { "slideshow": { "slide_type": "skip" @@ -2201,7 +2875,7 @@ "0.123456789" ] }, - "execution_count": 71, + "execution_count": 89, "metadata": {}, "output_type": "execute_result" } @@ -2223,7 +2897,7 @@ }, { "cell_type": "code", - "execution_count": 72, + "execution_count": 90, "metadata": { "slideshow": { "slide_type": "slide" @@ -2233,45 +2907,80 @@ { "data": { "text/plain": [ - "1.0" + "42.0" ] }, - "execution_count": 72, + "execution_count": 90, "metadata": {}, "output_type": "execute_result" } ], "source": [ - "1. # 1 without a dot creates an int object" + "42." ] }, { "cell_type": "code", - "execution_count": 73, + "execution_count": 91, "metadata": { "slideshow": { - "slide_type": "-" + "slide_type": "fragment" } }, "outputs": [ { "data": { "text/plain": [ - "1.0" + "42.0" ] }, - "execution_count": 73, + "execution_count": 91, "metadata": {}, "output_type": "execute_result" } ], "source": [ - "float(1)" + "float(42)" ] }, { "cell_type": "code", - "execution_count": 74, + "execution_count": 92, + "metadata": { + "slideshow": { + "slide_type": "fragment" + } + }, + "outputs": [ + { + "data": { + "text/plain": [ + "42.0" + ] + }, + "execution_count": 92, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "float(\"42\")" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "slideshow": { + "slide_type": "skip" + } + }, + "source": [ + "Leading and trailing whitespace is ignored ..." 
+ ] + }, + { + "cell_type": "code", + "execution_count": 93, "metadata": { "slideshow": { "slide_type": "skip" @@ -2281,16 +2990,52 @@ { "data": { "text/plain": [ - "1.0" + "42.87" ] }, - "execution_count": 74, + "execution_count": 93, "metadata": {}, "output_type": "execute_result" } ], "source": [ - "float(\"1.000\")" + "float(\" 42.87 \")" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "slideshow": { + "slide_type": "skip" + } + }, + "source": [ + "... but not whitespace in between." + ] + }, + { + "cell_type": "code", + "execution_count": 94, + "metadata": { + "slideshow": { + "slide_type": "skip" + } + }, + "outputs": [ + { + "ename": "ValueError", + "evalue": "could not convert string to float: '42. 87'", + "output_type": "error", + "traceback": [ + "\u001b[0;31m---------------------------------------------------------------------------\u001b[0m", + "\u001b[0;31mValueError\u001b[0m Traceback (most recent call last)", + "\u001b[0;32m\u001b[0m in \u001b[0;36m\u001b[0;34m\u001b[0m\n\u001b[0;32m----> 1\u001b[0;31m \u001b[0mfloat\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0;34m\"42. 87\"\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0m", + "\u001b[0;31mValueError\u001b[0m: could not convert string to float: '42. 87'" + ] + } + ], + "source": [ + "float(\"42. 
87\")" ] }, { @@ -2306,7 +3051,7 @@ }, { "cell_type": "code", - "execution_count": 75, + "execution_count": 95, "metadata": { "slideshow": { "slide_type": "slide" @@ -2319,7 +3064,7 @@ "0.3333333333333333" ] }, - "execution_count": 75, + "execution_count": 95, "metadata": {}, "output_type": "execute_result" } @@ -2341,7 +3086,7 @@ }, { "cell_type": "code", - "execution_count": 76, + "execution_count": 96, "metadata": { "slideshow": { "slide_type": "fragment" @@ -2354,7 +3099,7 @@ "42.0" ] }, - "execution_count": 76, + "execution_count": 96, "metadata": {}, "output_type": "execute_result" } @@ -2365,10 +3110,10 @@ }, { "cell_type": "code", - "execution_count": 77, + "execution_count": 97, "metadata": { "slideshow": { - "slide_type": "-" + "slide_type": "fragment" } }, "outputs": [ @@ -2378,7 +3123,7 @@ "42.0" ] }, - "execution_count": 77, + "execution_count": 97, "metadata": {}, "output_type": "execute_result" } @@ -2411,7 +3156,7 @@ }, { "cell_type": "code", - "execution_count": 78, + "execution_count": 98, "metadata": { "slideshow": { "slide_type": "slide" @@ -2424,7 +3169,7 @@ "1.23" ] }, - "execution_count": 78, + "execution_count": 98, "metadata": {}, "output_type": "execute_result" } @@ -2446,7 +3191,7 @@ }, { "cell_type": "code", - "execution_count": 79, + "execution_count": 99, "metadata": { "slideshow": { "slide_type": "skip" @@ -2455,10 +3200,10 @@ "outputs": [ { "ename": "SyntaxError", - "evalue": "invalid syntax (, line 1)", + "evalue": "invalid syntax (, line 1)", "output_type": "error", "traceback": [ - "\u001b[0;36m File \u001b[0;32m\"\"\u001b[0;36m, line \u001b[0;32m1\u001b[0m\n\u001b[0;31m 1.23 e0\u001b[0m\n\u001b[0m ^\u001b[0m\n\u001b[0;31mSyntaxError\u001b[0m\u001b[0;31m:\u001b[0m invalid syntax\n" + "\u001b[0;36m File \u001b[0;32m\"\"\u001b[0;36m, line \u001b[0;32m1\u001b[0m\n\u001b[0;31m 1.23 e0\u001b[0m\n\u001b[0m ^\u001b[0m\n\u001b[0;31mSyntaxError\u001b[0m\u001b[0;31m:\u001b[0m invalid syntax\n" ] } ], @@ -2468,7 +3213,7 @@ }, { "cell_type": 
"code", - "execution_count": 80, + "execution_count": 100, "metadata": { "slideshow": { "slide_type": "skip" @@ -2477,10 +3222,10 @@ "outputs": [ { "ename": "SyntaxError", - "evalue": "invalid syntax (, line 1)", + "evalue": "invalid syntax (, line 1)", "output_type": "error", "traceback": [ - "\u001b[0;36m File \u001b[0;32m\"\"\u001b[0;36m, line \u001b[0;32m1\u001b[0m\n\u001b[0;31m 1.23e 0\u001b[0m\n\u001b[0m ^\u001b[0m\n\u001b[0;31mSyntaxError\u001b[0m\u001b[0;31m:\u001b[0m invalid syntax\n" + "\u001b[0;36m File \u001b[0;32m\"\"\u001b[0;36m, line \u001b[0;32m1\u001b[0m\n\u001b[0;31m 1.23e 0\u001b[0m\n\u001b[0m ^\u001b[0m\n\u001b[0;31mSyntaxError\u001b[0m\u001b[0;31m:\u001b[0m invalid syntax\n" ] } ], @@ -2490,7 +3235,7 @@ }, { "cell_type": "code", - "execution_count": 81, + "execution_count": 101, "metadata": { "slideshow": { "slide_type": "skip" @@ -2499,10 +3244,10 @@ "outputs": [ { "ename": "SyntaxError", - "evalue": "invalid syntax (, line 1)", + "evalue": "invalid syntax (, line 1)", "output_type": "error", "traceback": [ - "\u001b[0;36m File \u001b[0;32m\"\"\u001b[0;36m, line \u001b[0;32m1\u001b[0m\n\u001b[0;31m 1.23e0.0\u001b[0m\n\u001b[0m ^\u001b[0m\n\u001b[0;31mSyntaxError\u001b[0m\u001b[0;31m:\u001b[0m invalid syntax\n" + "\u001b[0;36m File \u001b[0;32m\"\"\u001b[0;36m, line \u001b[0;32m1\u001b[0m\n\u001b[0;31m 1.23e0.0\u001b[0m\n\u001b[0m ^\u001b[0m\n\u001b[0;31mSyntaxError\u001b[0m\u001b[0;31m:\u001b[0m invalid syntax\n" ] } ], @@ -2518,32 +3263,32 @@ } }, "source": [ - "If we leave out the number to the left, Python raises a `NameError` as it unsuccessfully tries to look up a variable named `e1`." + "If we leave out the number to the left, Python raises a `NameError` as it unsuccessfully tries to look up a variable named `e0`." 
] }, { "cell_type": "code", - "execution_count": 82, + "execution_count": 102, "metadata": { "slideshow": { - "slide_type": "fragment" + "slide_type": "skip" } }, "outputs": [ { "ename": "NameError", - "evalue": "name 'e1' is not defined", + "evalue": "name 'e0' is not defined", "output_type": "error", "traceback": [ "\u001b[0;31m---------------------------------------------------------------------------\u001b[0m", "\u001b[0;31mNameError\u001b[0m Traceback (most recent call last)", - "\u001b[0;32m\u001b[0m in \u001b[0;36m\u001b[0;34m\u001b[0m\n\u001b[0;32m----> 1\u001b[0;31m \u001b[0me1\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0m", - "\u001b[0;31mNameError\u001b[0m: name 'e1' is not defined" + "\u001b[0;32m\u001b[0m in \u001b[0;36m\u001b[0;34m\u001b[0m\n\u001b[0;32m----> 1\u001b[0;31m \u001b[0me0\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0m", + "\u001b[0;31mNameError\u001b[0m: name 'e0' is not defined" ] } ], "source": [ - "e1" + "e0" ] }, { @@ -2554,12 +3299,12 @@ } }, "source": [ - "So, to write $10^1$ in Python, we need to think of it as $1*10^1$ and write `1e1`." + "So, to write $10^0$ in Python, we need to think of it as $1*10^0$ and write `1e0`." ] }, { "cell_type": "code", - "execution_count": 83, + "execution_count": 103, "metadata": { "slideshow": { "slide_type": "fragment" @@ -2569,16 +3314,86 @@ { "data": { "text/plain": [ - "10.0" + "1.0" ] }, - "execution_count": 83, + "execution_count": 103, "metadata": {}, "output_type": "execute_result" } ], "source": [ - "1e1" + "1e0" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "slideshow": { + "slide_type": "skip" + } + }, + "source": [ + "To express thousands of something (i.e., $10^3$), we write `1e3`." 
+ ] + }, + { + "cell_type": "code", + "execution_count": 104, + "metadata": { + "slideshow": { + "slide_type": "fragment" + } + }, + "outputs": [ + { + "data": { + "text/plain": [ + "1000.0" + ] + }, + "execution_count": 104, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "1e3 # = thousands" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "slideshow": { + "slide_type": "skip" + } + }, + "source": [ + "Similarly, to express, for example, milliseconds (i.e., $10^{-3} s$), we write `1e-3`." + ] + }, + { + "cell_type": "code", + "execution_count": 105, + "metadata": { + "slideshow": { + "slide_type": "fragment" + } + }, + "outputs": [ + { + "data": { + "text/plain": [ + "0.001" + ] + }, + "execution_count": 105, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "1e-3 # = milli" ] }, { @@ -2605,7 +3420,7 @@ }, { "cell_type": "code", - "execution_count": 84, + "execution_count": 106, "metadata": { "slideshow": { "slide_type": "slide" @@ -2618,18 +3433,18 @@ "nan" ] }, - "execution_count": 84, + "execution_count": 106, "metadata": {}, "output_type": "execute_result" } ], "source": [ - "float(\"nan\") # also works as float(\"NaN\")" + "float(\"nan\") # also float(\"NaN\")" ] }, { "cell_type": "code", - "execution_count": 85, + "execution_count": 107, "metadata": { "slideshow": { "slide_type": "skip" @@ -2642,21 +3457,21 @@ "inf" ] }, - "execution_count": 85, + "execution_count": 107, "metadata": {}, "output_type": "execute_result" } ], "source": [ - "float(\"+inf\") # also works as float(\"+infinity\")" + "float(\"+inf\") # also float(\"+infinity\") or float(\"infinity\")" ] }, { "cell_type": "code", - "execution_count": 86, + "execution_count": 108, "metadata": { "slideshow": { - "slide_type": "-" + "slide_type": "fragment" } }, "outputs": [ @@ -2666,21 +3481,21 @@ "inf" ] }, - "execution_count": 86, + "execution_count": 108, "metadata": {}, "output_type": "execute_result" } ], "source": [ - 
"float(\"inf\") # by omitting the plus sign we mean positive infinity" + "float(\"inf\") # same as float(\"+inf\")" ] }, { "cell_type": "code", - "execution_count": 87, + "execution_count": 109, "metadata": { "slideshow": { - "slide_type": "-" + "slide_type": "fragment" } }, "outputs": [ @@ -2690,7 +3505,7 @@ "-inf" ] }, - "execution_count": 87, + "execution_count": 109, "metadata": {}, "output_type": "execute_result" } @@ -2712,7 +3527,7 @@ }, { "cell_type": "code", - "execution_count": 88, + "execution_count": 110, "metadata": { "slideshow": { "slide_type": "slide" @@ -2725,7 +3540,7 @@ "False" ] }, - "execution_count": 88, + "execution_count": 110, "metadata": {}, "output_type": "execute_result" } @@ -2742,17 +3557,52 @@ } }, "source": [ - "On the contrary, as two values go to infinity, there is no such concept as difference and *everything* compares equal." + "Another caveat is that any arithmetic involving a `nan` object results in `nan`. In other words, the addition below **fails silently** as no error is raised. As this also happens in accordance with the [IEEE 754](https://en.wikipedia.org/wiki/IEEE_754) standard, we *need* to be aware of that and check any data we work with for any `nan` occurrences *before* doing any calculations." ] }, { "cell_type": "code", - "execution_count": 89, + "execution_count": 111, "metadata": { "slideshow": { "slide_type": "fragment" } }, + "outputs": [ + { + "data": { + "text/plain": [ + "nan" + ] + }, + "execution_count": 111, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "42 + float(\"nan\")" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "slideshow": { + "slide_type": "skip" + } + }, + "source": [ + "On the contrary, as two values go to infinity, there is no such concept as difference and *everything* compares equal." 
+ ] + }, + { + "cell_type": "code", + "execution_count": 112, + "metadata": { + "slideshow": { + "slide_type": "slide" + } + }, "outputs": [ { "data": { @@ -2760,7 +3610,7 @@ "True" ] }, - "execution_count": 89, + "execution_count": 112, "metadata": {}, "output_type": "execute_result" } @@ -2782,7 +3632,7 @@ }, { "cell_type": "code", - "execution_count": 90, + "execution_count": 113, "metadata": { "slideshow": { "slide_type": "skip" @@ -2795,7 +3645,7 @@ "inf" ] }, - "execution_count": 90, + "execution_count": 113, "metadata": {}, "output_type": "execute_result" } @@ -2806,7 +3656,7 @@ }, { "cell_type": "code", - "execution_count": 91, + "execution_count": 114, "metadata": { "slideshow": { "slide_type": "fragment" @@ -2819,7 +3669,7 @@ "True" ] }, - "execution_count": 91, + "execution_count": 114, "metadata": {}, "output_type": "execute_result" } @@ -2841,7 +3691,7 @@ }, { "cell_type": "code", - "execution_count": 92, + "execution_count": 115, "metadata": { "slideshow": { "slide_type": "skip" @@ -2854,7 +3704,7 @@ "inf" ] }, - "execution_count": 92, + "execution_count": 115, "metadata": {}, "output_type": "execute_result" } @@ -2865,7 +3715,7 @@ }, { "cell_type": "code", - "execution_count": 93, + "execution_count": 116, "metadata": { "slideshow": { "slide_type": "skip" @@ -2878,7 +3728,7 @@ "True" ] }, - "execution_count": 93, + "execution_count": 116, "metadata": {}, "output_type": "execute_result" } @@ -2900,7 +3750,7 @@ }, { "cell_type": "code", - "execution_count": 94, + "execution_count": 117, "metadata": { "slideshow": { "slide_type": "skip" @@ -2913,7 +3763,7 @@ "inf" ] }, - "execution_count": 94, + "execution_count": 117, "metadata": {}, "output_type": "execute_result" } @@ -2924,7 +3774,7 @@ }, { "cell_type": "code", - "execution_count": 95, + "execution_count": 118, "metadata": { "slideshow": { "slide_type": "skip" @@ -2937,7 +3787,7 @@ "True" ] }, - "execution_count": 95, + "execution_count": 118, "metadata": {}, "output_type": "execute_result" } @@ 
-2959,7 +3809,7 @@ }, { "cell_type": "code", - "execution_count": 96, + "execution_count": 119, "metadata": { "slideshow": { "slide_type": "skip" @@ -2972,7 +3822,7 @@ "inf" ] }, - "execution_count": 96, + "execution_count": 119, "metadata": {}, "output_type": "execute_result" } @@ -2983,7 +3833,7 @@ }, { "cell_type": "code", - "execution_count": 97, + "execution_count": 120, "metadata": { "slideshow": { "slide_type": "fragment" @@ -2996,7 +3846,7 @@ "True" ] }, - "execution_count": 97, + "execution_count": 120, "metadata": {}, "output_type": "execute_result" } @@ -3013,12 +3863,12 @@ } }, "source": [ - "As a caveat, adding infinities of different signs is an *undefined operation* in math and results in a `nan` object. So, if we (accidentally or unknowingly) do this on a real dataset, we do *not* see any error messages, and our program may continue to run with non-meaningful results! This is an example of a piece of code **failing silently**." + "As a caveat, adding infinities of different signs is an *undefined operation* in math and results in a `nan` object. So, if we (accidentally or unknowingly) do this on a real dataset, we do *not* see any error messages, and our program may continue to run with non-meaningful results! This is another example of a piece of code **failing silently**." 
] }, { "cell_type": "code", - "execution_count": 98, + "execution_count": 121, "metadata": { "slideshow": { "slide_type": "slide" @@ -3031,7 +3881,7 @@ "nan" ] }, - "execution_count": 98, + "execution_count": 121, "metadata": {}, "output_type": "execute_result" } @@ -3042,10 +3892,10 @@ }, { "cell_type": "code", - "execution_count": 99, + "execution_count": 122, "metadata": { "slideshow": { - "slide_type": "-" + "slide_type": "fragment" } }, "outputs": [ @@ -3055,7 +3905,7 @@ "nan" ] }, - "execution_count": 99, + "execution_count": 122, "metadata": {}, "output_type": "execute_result" } @@ -3085,71 +3935,12 @@ "source": [ "`float` objects are *inherently* imprecise, and there is *nothing* we can do about it! In particular, arithmetic operations with two `float` objects may result in \"weird\" rounding \"errors\" that are strictly deterministic and occur in accordance with the [IEEE 754](https://en.wikipedia.org/wiki/IEEE_754) standard.\n", "\n", - "For example, let's add `1e0` to `1e15` and `1e16`, respectively. In the latter case, the `1e0` somehow gets \"lost.\"" + "For example, let's add `1` to `1e15` and `1e16`, respectively. 
In the latter case, the `1` somehow gets \"lost.\"" ] }, { "cell_type": "code", - "execution_count": 100, - "metadata": { - "slideshow": { - "slide_type": "skip" - } - }, - "outputs": [ - { - "data": { - "text/plain": [ - "1000000000000001.0" - ] - }, - "execution_count": 100, - "metadata": {}, - "output_type": "execute_result" - } - ], - "source": [ - "1e15 + 1e0" - ] - }, - { - "cell_type": "code", - "execution_count": 101, - "metadata": { - "slideshow": { - "slide_type": "skip" - } - }, - "outputs": [ - { - "data": { - "text/plain": [ - "1e+16" - ] - }, - "execution_count": 101, - "metadata": {}, - "output_type": "execute_result" - } - ], - "source": [ - "1e16 + 1e0" - ] - }, - { - "cell_type": "markdown", - "metadata": { - "slideshow": { - "slide_type": "skip" - } - }, - "source": [ - "Of course, we may also add the `int` object `1` to the `float` objects in the literal `e` notation with the same outcome." - ] - }, - { - "cell_type": "code", - "execution_count": 102, + "execution_count": 123, "metadata": { "slideshow": { "slide_type": "slide" @@ -3162,7 +3953,7 @@ "1000000000000001.0" ] }, - "execution_count": 102, + "execution_count": 123, "metadata": {}, "output_type": "execute_result" } @@ -3173,10 +3964,10 @@ }, { "cell_type": "code", - "execution_count": 103, + "execution_count": 124, "metadata": { "slideshow": { - "slide_type": "-" + "slide_type": "fragment" } }, "outputs": [ @@ -3186,7 +3977,7 @@ "1e+16" ] }, - "execution_count": 103, + "execution_count": 124, "metadata": {}, "output_type": "execute_result" } @@ -3208,7 +3999,7 @@ }, { "cell_type": "code", - "execution_count": 104, + "execution_count": 125, "metadata": { "slideshow": { "slide_type": "slide" @@ -3221,10 +4012,10 @@ }, { "cell_type": "code", - "execution_count": 105, + "execution_count": 126, "metadata": { "slideshow": { - "slide_type": "-" + "slide_type": "fragment" } }, "outputs": [ @@ -3234,7 +4025,7 @@ "2.0000000000000004" ] }, - "execution_count": 105, + "execution_count": 126, 
"metadata": {}, "output_type": "execute_result" } @@ -3245,7 +4036,7 @@ }, { "cell_type": "code", - "execution_count": 106, + "execution_count": 127, "metadata": { "slideshow": { "slide_type": "fragment" @@ -3258,7 +4049,7 @@ "0.30000000000000004" ] }, - "execution_count": 106, + "execution_count": 127, "metadata": {}, "output_type": "execute_result" } @@ -3280,7 +4071,7 @@ }, { "cell_type": "code", - "execution_count": 107, + "execution_count": 128, "metadata": { "slideshow": { "slide_type": "fragment" @@ -3293,7 +4084,7 @@ "False" ] }, - "execution_count": 107, + "execution_count": 128, "metadata": {}, "output_type": "execute_result" } @@ -3304,10 +4095,10 @@ }, { "cell_type": "code", - "execution_count": 108, + "execution_count": 129, "metadata": { "slideshow": { - "slide_type": "-" + "slide_type": "fragment" } }, "outputs": [ @@ -3317,7 +4108,7 @@ "False" ] }, - "execution_count": 108, + "execution_count": 129, "metadata": {}, "output_type": "execute_result" } @@ -3339,7 +4130,7 @@ }, { "cell_type": "code", - "execution_count": 109, + "execution_count": 130, "metadata": { "slideshow": { "slide_type": "slide" @@ -3352,7 +4143,7 @@ }, { "cell_type": "code", - "execution_count": 110, + "execution_count": 131, "metadata": { "slideshow": { "slide_type": "fragment" @@ -3365,7 +4156,7 @@ "True" ] }, - "execution_count": 110, + "execution_count": 131, "metadata": {}, "output_type": "execute_result" } @@ -3376,10 +4167,10 @@ }, { "cell_type": "code", - "execution_count": 111, + "execution_count": 132, "metadata": { "slideshow": { - "slide_type": "-" + "slide_type": "fragment" } }, "outputs": [ @@ -3389,7 +4180,7 @@ "True" ] }, - "execution_count": 111, + "execution_count": 132, "metadata": {}, "output_type": "execute_result" } @@ -3406,14 +4197,14 @@ } }, "source": [ - "The built-in [format()](https://docs.python.org/3/library/functions.html#format) function allows us to show the **significant digits** of a `float` number as they exist in memory to arbitrary precision. 
To exemplify it, let's view a couple of `float` objects with `50` digits. This analysis reveals that almost no `float` number is precise! After $14$ or $15$ digits \"weird\" things happen. As we see further below, the \"random\" digits ending the `float` numbers do *not* \"physically\" exist in memory.\n", + "The built-in [format()](https://docs.python.org/3/library/functions.html#format) function allows us to show the **significant digits** of a `float` number as they exist in memory to arbitrary precision. To exemplify it, let's view a couple of `float` objects with `50` digits. This analysis reveals that almost no `float` number is precise! After 14 or 15 digits \"weird\" things happen. As we see further below, the \"random\" digits ending the `float` numbers do *not* \"physically\" exist in memory! Rather, they are \"calculated\" by the [format()](https://docs.python.org/3/library/functions.html#format) function that is forced to show `50` digits.\n", "\n", - "The [format()](https://docs.python.org/3/library/functions.html#format) function is different from the [format()](https://docs.python.org/3/library/stdtypes.html#str.format) method on `str` objects introduced in the next chapter and both work with the so-called [format specification mini-language](https://docs.python.org/3/library/string.html#format-specification-mini-language): `\".50f\"` is the instruction to show `50` digits of a `float` number. But let's not worry too much about these details for now." 
+ "The [format()](https://docs.python.org/3/library/functions.html#format) function is different from the [format()](https://docs.python.org/3/library/stdtypes.html#str.format) method on `str` objects introduced in the next chapter (cf., [Chapter 6](https://nbviewer.jupyter.org/github/webartifex/intro-to-python/blob/master/06_text_00_lecture.ipynb#format%28%29-Method)): Yet, both work with the so-called [format specification mini-language](https://docs.python.org/3/library/string.html#format-specification-mini-language): `\".50f\"` is the instruction to show `50` digits of a `float` number." ] }, { "cell_type": "code", - "execution_count": 112, + "execution_count": 133, "metadata": { "slideshow": { "slide_type": "slide" @@ -3426,7 +4217,7 @@ "'0.10000000000000000555111512312578270211815834045410'" ] }, - "execution_count": 112, + "execution_count": 133, "metadata": {}, "output_type": "execute_result" } @@ -3437,10 +4228,10 @@ }, { "cell_type": "code", - "execution_count": 113, + "execution_count": 134, "metadata": { "slideshow": { - "slide_type": "-" + "slide_type": "fragment" } }, "outputs": [ @@ -3450,7 +4241,7 @@ "'0.20000000000000001110223024625156540423631668090820'" ] }, - "execution_count": 113, + "execution_count": 134, "metadata": {}, "output_type": "execute_result" } @@ -3461,10 +4252,10 @@ }, { "cell_type": "code", - "execution_count": 114, + "execution_count": 135, "metadata": { "slideshow": { - "slide_type": "-" + "slide_type": "fragment" } }, "outputs": [ @@ -3474,7 +4265,7 @@ "'0.29999999999999998889776975374843459576368331909180'" ] }, - "execution_count": 114, + "execution_count": 135, "metadata": {}, "output_type": "execute_result" } @@ -3485,7 +4276,7 @@ }, { "cell_type": "code", - "execution_count": 115, + "execution_count": 136, "metadata": { "slideshow": { "slide_type": "slide" @@ -3498,7 +4289,7 @@ "'0.33333333333333331482961625624739099293947219848633'" ] }, - "execution_count": 115, + "execution_count": 136, "metadata": {}, "output_type": 
"execute_result" } @@ -3524,7 +4315,7 @@ }, { "cell_type": "code", - "execution_count": 116, + "execution_count": 137, "metadata": { "slideshow": { "slide_type": "fragment" @@ -3537,10 +4328,10 @@ }, { "cell_type": "code", - "execution_count": 117, + "execution_count": 138, "metadata": { "slideshow": { - "slide_type": "-" + "slide_type": "fragment" } }, "outputs": [ @@ -3550,7 +4341,7 @@ "0.33333" ] }, - "execution_count": 117, + "execution_count": 138, "metadata": {}, "output_type": "execute_result" } @@ -3561,10 +4352,10 @@ }, { "cell_type": "code", - "execution_count": 118, + "execution_count": 139, "metadata": { "slideshow": { - "slide_type": "-" + "slide_type": "fragment" } }, "outputs": [ @@ -3574,7 +4365,7 @@ "'0.33333000000000001517008740847813896834850311279297'" ] }, - "execution_count": 118, + "execution_count": 139, "metadata": {}, "output_type": "execute_result" } @@ -3596,7 +4387,7 @@ }, { "cell_type": "code", - "execution_count": 119, + "execution_count": 140, "metadata": { "slideshow": { "slide_type": "slide" @@ -3609,7 +4400,7 @@ "'0.12500000000000000000000000000000000000000000000000'" ] }, - "execution_count": 119, + "execution_count": 140, "metadata": {}, "output_type": "execute_result" } @@ -3620,10 +4411,10 @@ }, { "cell_type": "code", - "execution_count": 120, + "execution_count": 141, "metadata": { "slideshow": { - "slide_type": "-" + "slide_type": "fragment" } }, "outputs": [ @@ -3633,7 +4424,7 @@ "'0.25000000000000000000000000000000000000000000000000'" ] }, - "execution_count": 120, + "execution_count": 141, "metadata": {}, "output_type": "execute_result" } @@ -3644,10 +4435,10 @@ }, { "cell_type": "code", - "execution_count": 121, + "execution_count": 142, "metadata": { "slideshow": { - "slide_type": "-" + "slide_type": "fragment" } }, "outputs": [ @@ -3657,7 +4448,7 @@ "True" ] }, - "execution_count": 121, + "execution_count": 142, "metadata": {}, "output_type": "execute_result" } @@ -3685,10 +4476,28 @@ } }, "source": [ - "To understand 
these subtleties, we need to look at the **[binary representation of floats](https://en.wikipedia.org/wiki/Double-precision_floating-point_format)** and review the basics of the **[IEEE 754](https://en.wikipedia.org/wiki/IEEE_754)** standard. On modern machines, floats are modeled in so-called double precision with $64$ bits that are grouped as in the figure below. The first bit determines the sign ($0$ for plus, $1$ for minus), the next $11$ bits represent an $exponent$ term, and the last $52$ bits resemble the actual significant digits, the so-called $fraction$ part. The three groups are put together like so:\n", - "\n", - "$$float = (-1)^{sign} * 1.fraction * 2^{exponent-1023}$$\n", - "\n", + "To understand these subtleties, we need to look at the **[binary representation of floats](https://en.wikipedia.org/wiki/Double-precision_floating-point_format)** and review the basics of the **[IEEE 754](https://en.wikipedia.org/wiki/IEEE_754)** standard. On modern machines, floats are modeled in so-called double precision with $64$ bits that are grouped as in the figure below. The first bit determines the sign ($0$ for plus, $1$ for minus), the next $11$ bits represent an $exponent$ term, and the last $52$ bits resemble the actual significant digits, the so-called $fraction$ part. The three groups are put together like so:" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "slideshow": { + "slide_type": "slide" + } + }, + "source": [ + "$$float = (-1)^{sign} * 1.fraction * 2^{exponent-1023}$$" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "slideshow": { + "slide_type": "skip" + } + }, + "source": [ "A $1.$ is implicitly prepended as the first digit, and both, $fraction$ and $exponent$, are stored in base $2$ representation (i.e., they both are interpreted like integers above). 
As $exponent$ is consequently non-negative, between $0_{10}$ and $2047_{10}$ to be precise, the $-1023$, called the exponent bias, centers the entire $2^{exponent-1023}$ term around $1$ and allows the period within the $1.fraction$ part be shifted into either direction by the same amount. Floating-point numbers received their name as the period, formally called the **[radix point](https://en.wikipedia.org/wiki/Radix_point)**, \"floats\" along the significant digits. As an aside, an $exponent$ of all $0$s or all $1$s is used to model the special values `nan` or `inf`.\n", "\n", "As the standard defines the exponent part to come as a power of $2$, we now see why `0.125` is a *precise* float: It can be represented as a power of $2$, i.e., $0.125 = (-1)^0 * 1.0 * 2^{1020-1023} = 2^{-3} = \\frac{1}{8}$. In other words, the floating-point representation of $0.125_{10}$ is $0_2$, $1111111100_2 = 1020_{10}$, and $0_2$ for the three groups, respectively." @@ -3698,7 +4507,7 @@ "cell_type": "markdown", "metadata": { "slideshow": { - "slide_type": "slide" + "slide_type": "-" } }, "source": [ @@ -3726,7 +4535,7 @@ }, { "cell_type": "code", - "execution_count": 122, + "execution_count": 143, "metadata": { "slideshow": { "slide_type": "slide" @@ -3739,10 +4548,10 @@ }, { "cell_type": "code", - "execution_count": 123, + "execution_count": 144, "metadata": { "slideshow": { - "slide_type": "-" + "slide_type": "fragment" } }, "outputs": [ @@ -3752,13 +4561,13 @@ "'0x1.0000000000000p-3'" ] }, - "execution_count": 123, + "execution_count": 144, "metadata": {}, "output_type": "execute_result" } ], "source": [ - "one_eighth.hex() # this basically says 2 ** (-3)" + "one_eighth.hex()" ] }, { @@ -3774,10 +4583,10 @@ }, { "cell_type": "code", - "execution_count": 124, + "execution_count": 145, "metadata": { "slideshow": { - "slide_type": "-" + "slide_type": "fragment" } }, "outputs": [ @@ -3787,7 +4596,7 @@ "(1, 8)" ] }, - "execution_count": 124, + "execution_count": 145, "metadata": {}, 
"output_type": "execute_result" } @@ -3798,10 +4607,10 @@ }, { "cell_type": "code", - "execution_count": 125, + "execution_count": 146, "metadata": { "slideshow": { - "slide_type": "fragment" + "slide_type": "slide" } }, "outputs": [ @@ -3811,7 +4620,7 @@ "'0x1.555475a31a4bep-2'" ] }, - "execution_count": 125, + "execution_count": 146, "metadata": {}, "output_type": "execute_result" } @@ -3822,10 +4631,10 @@ }, { "cell_type": "code", - "execution_count": 126, + "execution_count": 147, "metadata": { "slideshow": { - "slide_type": "-" + "slide_type": "fragment" } }, "outputs": [ @@ -3835,7 +4644,7 @@ "(3002369727582815, 9007199254740992)" ] }, - "execution_count": 126, + "execution_count": 147, "metadata": {}, "output_type": "execute_result" } @@ -3857,10 +4666,10 @@ }, { "cell_type": "code", - "execution_count": 127, + "execution_count": 148, "metadata": { "slideshow": { - "slide_type": "slide" + "slide_type": "skip" } }, "outputs": [], @@ -3870,10 +4679,10 @@ }, { "cell_type": "code", - "execution_count": 128, + "execution_count": 149, "metadata": { "slideshow": { - "slide_type": "-" + "slide_type": "skip" } }, "outputs": [ @@ -3883,7 +4692,7 @@ "'0x0.0p+0'" ] }, - "execution_count": 128, + "execution_count": 149, "metadata": {}, "output_type": "execute_result" } @@ -3894,10 +4703,10 @@ }, { "cell_type": "code", - "execution_count": 129, + "execution_count": 150, "metadata": { "slideshow": { - "slide_type": "-" + "slide_type": "skip" } }, "outputs": [ @@ -3907,7 +4716,7 @@ "(0, 1)" ] }, - "execution_count": 129, + "execution_count": 150, "metadata": {}, "output_type": "execute_result" } @@ -3929,7 +4738,7 @@ }, { "cell_type": "code", - "execution_count": 130, + "execution_count": 151, "metadata": { "slideshow": { "slide_type": "skip" @@ -3942,7 +4751,7 @@ "False" ] }, - "execution_count": 130, + "execution_count": 151, "metadata": {}, "output_type": "execute_result" } @@ -3953,7 +4762,7 @@ }, { "cell_type": "code", - "execution_count": 131, + "execution_count": 
152, "metadata": { "slideshow": { "slide_type": "skip" @@ -3966,7 +4775,7 @@ "True" ] }, - "execution_count": 131, + "execution_count": 152, "metadata": {}, "output_type": "execute_result" } @@ -3990,7 +4799,7 @@ }, { "cell_type": "code", - "execution_count": 132, + "execution_count": 153, "metadata": { "slideshow": { "slide_type": "skip" @@ -4003,7 +4812,7 @@ }, { "cell_type": "code", - "execution_count": 133, + "execution_count": 154, "metadata": { "slideshow": { "slide_type": "skip" @@ -4016,7 +4825,7 @@ "sys.float_info(max=1.7976931348623157e+308, max_exp=1024, max_10_exp=308, min=2.2250738585072014e-308, min_exp=-1021, min_10_exp=-307, dig=15, mant_dig=53, epsilon=2.220446049250313e-16, radix=2, rounds=1)" ] }, - "execution_count": 133, + "execution_count": 154, "metadata": {}, "output_type": "execute_result" } @@ -4044,14 +4853,14 @@ } }, "source": [ - "The [decimal](https://docs.python.org/3/library/decimal.html) module in the [standard library](https://docs.python.org/3/library/index.html) provides a [Decimal](https://docs.python.org/3/library/decimal.html#decimal.Decimal) type that may be used to represent any real number to a user-defined level of precision: \"User-defined\" does *not* mean an infinite or \"exact\" precision! The `Decimal` type merely allows us to work with a number of bits *different* from the $64$ as specified for the `float` type and also to customize the rounding rules and some other settings.\n", + "The [decimal](https://docs.python.org/3/library/decimal.html) module in the [standard library](https://docs.python.org/3/library/index.html) provides a [Decimal](https://docs.python.org/3/library/decimal.html#decimal.Decimal) type that may be used to represent any real number to a user-defined level of precision: \"User-defined\" does *not* mean an infinite or exact precision! 
The `Decimal` type merely allows us to work with a number of bits *different* from the $64$ as specified for the `float` type and also to customize the rounding rules and some other settings.\n", "\n", "We import the `Decimal` type and also the [getcontext()](https://docs.python.org/3/library/decimal.html#decimal.getcontext) function from the [decimal](https://docs.python.org/3/library/decimal.html) module." ] }, { "cell_type": "code", - "execution_count": 134, + "execution_count": 155, "metadata": { "slideshow": { "slide_type": "slide" @@ -4075,7 +4884,7 @@ }, { "cell_type": "code", - "execution_count": 135, + "execution_count": 156, "metadata": { "slideshow": { "slide_type": "fragment" @@ -4088,7 +4897,7 @@ "Context(prec=28, rounding=ROUND_HALF_EVEN, Emin=-999999, Emax=999999, capitals=1, clamp=0, flags=[], traps=[InvalidOperation, DivisionByZero, Overflow])" ] }, - "execution_count": 135, + "execution_count": 156, "metadata": {}, "output_type": "execute_result" } @@ -4110,7 +4919,7 @@ }, { "cell_type": "code", - "execution_count": 136, + "execution_count": 157, "metadata": { "slideshow": { "slide_type": "slide" @@ -4123,7 +4932,7 @@ "Decimal('42')" ] }, - "execution_count": 136, + "execution_count": 157, "metadata": {}, "output_type": "execute_result" } @@ -4134,10 +4943,10 @@ }, { "cell_type": "code", - "execution_count": 137, + "execution_count": 158, "metadata": { "slideshow": { - "slide_type": "-" + "slide_type": "fragment" } }, "outputs": [ @@ -4147,7 +4956,7 @@ "Decimal('0.1')" ] }, - "execution_count": 137, + "execution_count": 158, "metadata": {}, "output_type": "execute_result" } @@ -4158,26 +4967,26 @@ }, { "cell_type": "code", - "execution_count": 138, + "execution_count": 159, "metadata": { "slideshow": { - "slide_type": "-" + "slide_type": "fragment" } }, "outputs": [ { "data": { "text/plain": [ - "Decimal('1E+5')" + "Decimal('0.001')" ] }, - "execution_count": 138, + "execution_count": 159, "metadata": {}, "output_type": "execute_result" } ], 
"source": [ - "Decimal(\"1e5\")" + "Decimal(\"1e-3\")" ] }, { @@ -4193,7 +5002,7 @@ }, { "cell_type": "code", - "execution_count": 139, + "execution_count": 160, "metadata": { "slideshow": { "slide_type": "fragment" @@ -4206,13 +5015,13 @@ "Decimal('0.1000000000000000055511151231257827021181583404541015625')" ] }, - "execution_count": 139, + "execution_count": 160, "metadata": {}, "output_type": "execute_result" } ], "source": [ - "Decimal(0.1) # do not create a Decimal from a float" + "Decimal(0.1) # do not do this" ] }, { @@ -4228,7 +5037,7 @@ }, { "cell_type": "code", - "execution_count": 140, + "execution_count": 161, "metadata": { "slideshow": { "slide_type": "slide" @@ -4241,7 +5050,7 @@ "Decimal('0.3')" ] }, - "execution_count": 140, + "execution_count": 161, "metadata": {}, "output_type": "execute_result" } @@ -4252,10 +5061,10 @@ }, { "cell_type": "code", - "execution_count": 141, + "execution_count": 162, "metadata": { "slideshow": { - "slide_type": "-" + "slide_type": "fragment" } }, "outputs": [ @@ -4265,7 +5074,7 @@ "True" ] }, - "execution_count": 141, + "execution_count": 162, "metadata": {}, "output_type": "execute_result" } @@ -4287,7 +5096,7 @@ }, { "cell_type": "code", - "execution_count": 142, + "execution_count": 163, "metadata": { "slideshow": { "slide_type": "fragment" @@ -4300,7 +5109,7 @@ "Decimal('0.30000')" ] }, - "execution_count": 142, + "execution_count": 163, "metadata": {}, "output_type": "execute_result" } @@ -4311,10 +5120,10 @@ }, { "cell_type": "code", - "execution_count": 143, + "execution_count": 164, "metadata": { "slideshow": { - "slide_type": "-" + "slide_type": "skip" } }, "outputs": [ @@ -4324,7 +5133,7 @@ "True" ] }, - "execution_count": 143, + "execution_count": 164, "metadata": {}, "output_type": "execute_result" } @@ -4346,7 +5155,7 @@ }, { "cell_type": "code", - "execution_count": 144, + "execution_count": 165, "metadata": { "slideshow": { "slide_type": "slide" @@ -4359,42 +5168,42 @@ "Decimal('42')" ] }, - 
"execution_count": 144, + "execution_count": 165, "metadata": {}, "output_type": "execute_result" } ], "source": [ - "42 + Decimal(42) - 42" + "21 + Decimal(21)" ] }, { "cell_type": "code", - "execution_count": 145, + "execution_count": 166, "metadata": { "slideshow": { - "slide_type": "-" + "slide_type": "fragment" } }, "outputs": [ { "data": { "text/plain": [ - "Decimal('42')" + "Decimal('42.0')" ] }, - "execution_count": 145, + "execution_count": 166, "metadata": {}, "output_type": "execute_result" } ], "source": [ - "10 * Decimal(42) / 10" + "10 * Decimal(\"4.2\")" ] }, { "cell_type": "code", - "execution_count": 146, + "execution_count": 167, "metadata": { "slideshow": { "slide_type": "slide" @@ -4407,7 +5216,7 @@ "Decimal('0.1')" ] }, - "execution_count": 146, + "execution_count": 167, "metadata": {}, "output_type": "execute_result" } @@ -4429,7 +5238,7 @@ }, { "cell_type": "code", - "execution_count": 147, + "execution_count": 168, "metadata": { "slideshow": { "slide_type": "fragment" @@ -4442,7 +5251,7 @@ "'0.10000000000000000000000000000000000000000000000000'" ] }, - "execution_count": 147, + "execution_count": 168, "metadata": {}, "output_type": "execute_result" } @@ -4453,10 +5262,10 @@ }, { "cell_type": "code", - "execution_count": 148, + "execution_count": 169, "metadata": { "slideshow": { - "slide_type": "-" + "slide_type": "fragment" } }, "outputs": [ @@ -4466,7 +5275,7 @@ "'0.10000000000000000555111512312578270211815834045410'" ] }, - "execution_count": 148, + "execution_count": 169, "metadata": {}, "output_type": "execute_result" } @@ -4488,7 +5297,7 @@ }, { "cell_type": "code", - "execution_count": 149, + "execution_count": 170, "metadata": { "slideshow": { "slide_type": "slide" @@ -4502,7 +5311,7 @@ "traceback": [ "\u001b[0;31m---------------------------------------------------------------------------\u001b[0m", "\u001b[0;31mTypeError\u001b[0m Traceback (most recent call last)", - "\u001b[0;32m\u001b[0m in 
\u001b[0;36m\u001b[0;34m\u001b[0m\n\u001b[0;32m----> 1\u001b[0;31m \u001b[0;36m1.0\u001b[0m \u001b[0;34m*\u001b[0m \u001b[0mDecimal\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0;36m42\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0m", + "\u001b[0;32m\u001b[0m in \u001b[0;36m\u001b[0;34m\u001b[0m\n\u001b[0;32m----> 1\u001b[0;31m \u001b[0;36m1.0\u001b[0m \u001b[0;34m*\u001b[0m \u001b[0mDecimal\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0;36m42\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0m", "\u001b[0;31mTypeError\u001b[0m: unsupported operand type(s) for *: 'float' and 'decimal.Decimal'" ] } @@ -4519,15 +5328,15 @@ } }, "source": [ - "To preserve the precision for more advanced mathematical functions, `Decimal` objects come with many **methods bound** on them. For example, [ln()](https://docs.python.org/3/library/decimal.html#decimal.Decimal.ln) and [log10()](https://docs.python.org/3/library/decimal.html#decimal.Decimal.log10) take the logarithm while [sqrt()](https://docs.python.org/3/library/decimal.html#decimal.Decimal.sqrt) calculates the square root. In general, the functions in the [math](https://docs.python.org/3/library/math.html) module in the [standard library](https://docs.python.org/3/library/index.html) should only be used with `float` objects as they do *not* preserve precision." + "To preserve the precision for more advanced mathematical functions, `Decimal` objects come with many **methods bound** on them. For example, [ln()](https://docs.python.org/3/library/decimal.html#decimal.Decimal.ln) and [log10()](https://docs.python.org/3/library/decimal.html#decimal.Decimal.log10) take the logarithm while [sqrt()](https://docs.python.org/3/library/decimal.html#decimal.Decimal.sqrt) calculates the square root. The methods always return a *new* `Decimal` object. 
We must never use the functions in the [math](https://docs.python.org/3/library/math.html) module in the [standard library](https://docs.python.org/3/library/index.html) with `Decimal` objects as they do *not* preserve precision." ] }, { "cell_type": "code", - "execution_count": 150, + "execution_count": 171, "metadata": { "slideshow": { - "slide_type": "slide" + "slide_type": "skip" } }, "outputs": [ @@ -4537,7 +5346,7 @@ "Decimal('2')" ] }, - "execution_count": 150, + "execution_count": 171, "metadata": {}, "output_type": "execute_result" } @@ -4548,10 +5357,10 @@ }, { "cell_type": "code", - "execution_count": 151, + "execution_count": 172, "metadata": { "slideshow": { - "slide_type": "-" + "slide_type": "slide" } }, "outputs": [ @@ -4561,7 +5370,7 @@ "Decimal('1.414213562373095048801688724')" ] }, - "execution_count": 151, + "execution_count": 172, "metadata": {}, "output_type": "execute_result" } @@ -4585,7 +5394,7 @@ }, { "cell_type": "code", - "execution_count": 152, + "execution_count": 173, "metadata": { "slideshow": { "slide_type": "fragment" @@ -4598,7 +5407,7 @@ "Decimal('1.999999999999999999999999999')" ] }, - "execution_count": 152, + "execution_count": 173, "metadata": {}, "output_type": "execute_result" } @@ -4617,17 +5426,17 @@ } }, "source": [ - "The [quantize()](https://docs.python.org/3/library/decimal.html#decimal.Decimal.quantize) method allows us to [quantize](https://www.dictionary.com/browse/quantize) (i.e., \"round\") a `Decimal` number at any precision that is *smaller* than the set precision. It looks at the number of decimals (i.e., to the right of the period) of the numeric argument we pass in.\n", + "However, the [quantize()](https://docs.python.org/3/library/decimal.html#decimal.Decimal.quantize) method allows us to [quantize](https://www.dictionary.com/browse/quantize) (i.e., \"round\") a `Decimal` number at any precision that is *smaller* than the set precision. 
It takes the number of decimals to the right of the period of the `Decimal` argument we pass in and rounds accordingly.\n", "\n", - "For example, as the overall imprecise value of `two` still has an internal precision of `28` digits, we can correctly round it to *four* decimals (i.e., `Decimal(\"0.0001\")` has four decimals)." + "For example, as the overall imprecise value of `two` still has an internal precision of `28` digits, we can correctly round it to *four* decimals (i.e., `Decimal(\"0.0000\")` has four decimals)." ] }, { "cell_type": "code", - "execution_count": 153, + "execution_count": 174, "metadata": { "slideshow": { - "slide_type": "fragment" + "slide_type": "slide" } }, "outputs": [ @@ -4637,26 +5446,34 @@ "Decimal('2.0000')" ] }, - "execution_count": 153, + "execution_count": 174, "metadata": {}, "output_type": "execute_result" } ], "source": [ - "two.quantize(Decimal(\"0.0001\"))" + "two.quantize(Decimal(\"0.0000\"))" ] }, { "cell_type": "markdown", - "metadata": {}, + "metadata": { + "slideshow": { + "slide_type": "skip" + } + }, "source": [ "We can never round a `Decimal` number and obtain a greater precision than before: The `InvalidOperation` exception tells us that *loudly*." 
] }, { "cell_type": "code", - "execution_count": 154, - "metadata": {}, + "execution_count": 175, + "metadata": { + "slideshow": { + "slide_type": "fragment" + } + }, "outputs": [ { "ename": "InvalidOperation", @@ -4665,13 +5482,13 @@ "traceback": [ "\u001b[0;31m---------------------------------------------------------------------------\u001b[0m", "\u001b[0;31mInvalidOperation\u001b[0m Traceback (most recent call last)", - "\u001b[0;32m\u001b[0m in \u001b[0;36m\u001b[0;34m\u001b[0m\n\u001b[0;32m----> 1\u001b[0;31m \u001b[0mtwo\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0mquantize\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0mDecimal\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0;34m\"0.1\"\u001b[0m\u001b[0;34m)\u001b[0m \u001b[0;34m**\u001b[0m \u001b[0;36m28\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0m", + "\u001b[0;32m\u001b[0m in \u001b[0;36m\u001b[0;34m\u001b[0m\n\u001b[0;32m----> 1\u001b[0;31m \u001b[0mtwo\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0mquantize\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0mDecimal\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0;34m\"1e-28\"\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0m", "\u001b[0;31mInvalidOperation\u001b[0m: []" ] } ], "source": [ - "two.quantize(Decimal(\"0.1\") ** 28)" + "two.quantize(Decimal(\"1e-28\"))" ] }, { @@ -4687,7 +5504,7 @@ }, { "cell_type": "code", - "execution_count": 155, + "execution_count": 176, "metadata": { "slideshow": { "slide_type": "skip" @@ -4700,13 +5517,13 @@ "True" ] }, - "execution_count": 155, + "execution_count": 176, "metadata": {}, "output_type": "execute_result" } ], "source": [ - "two.quantize(Decimal(\"0.0001\")) == 2" + "two.quantize(Decimal(\"0.0000\")) == 2" ] }, { @@ -4722,10 +5539,10 @@ }, { "cell_type": "code", - "execution_count": 156, + "execution_count": 177, "metadata": { "slideshow": { - "slide_type": "fragment" + "slide_type": "skip" } }, "outputs": [ @@ -4735,13 +5552,13 @@ "True" ] }, - 
"execution_count": 156, + "execution_count": 177, "metadata": {}, "output_type": "execute_result" } ], "source": [ - "(Decimal(2).sqrt() ** 2).quantize(Decimal(\"0.0001\")) == 2" + "(Decimal(2).sqrt() ** 2).quantize(Decimal(\"0.0000\")) == 2" ] }, { @@ -4757,7 +5574,7 @@ }, { "cell_type": "code", - "execution_count": 157, + "execution_count": 178, "metadata": { "slideshow": { "slide_type": "skip" @@ -4770,7 +5587,7 @@ "Decimal('NaN')" ] }, - "execution_count": 157, + "execution_count": 178, "metadata": {}, "output_type": "execute_result" } @@ -4779,9 +5596,20 @@ "Decimal(\"nan\")" ] }, + { + "cell_type": "markdown", + "metadata": { + "slideshow": { + "slide_type": "skip" + } + }, + "source": [ + "`Decimal(\"nan\")`s never compare equal to anything, not even to themselves." + ] + }, { "cell_type": "code", - "execution_count": 158, + "execution_count": 179, "metadata": { "slideshow": { "slide_type": "skip" @@ -4794,18 +5622,29 @@ "False" ] }, - "execution_count": 158, + "execution_count": 179, "metadata": {}, "output_type": "execute_result" } ], "source": [ - "Decimal(\"nan\") == Decimal(\"nan\") # nan's never compare equal to anything, not even to themselves" + "Decimal(\"nan\") == Decimal(\"nan\")" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "slideshow": { + "slide_type": "skip" + } + }, + "source": [ + "Infinity is larger than any concrete number." 
] }, { "cell_type": "code", - "execution_count": 159, + "execution_count": 180, "metadata": { "slideshow": { "slide_type": "skip" @@ -4818,7 +5657,7 @@ "Decimal('Infinity')" ] }, - "execution_count": 159, + "execution_count": 180, "metadata": {}, "output_type": "execute_result" } @@ -4829,7 +5668,7 @@ }, { "cell_type": "code", - "execution_count": 160, + "execution_count": 181, "metadata": { "slideshow": { "slide_type": "skip" @@ -4842,7 +5681,7 @@ "Decimal('-Infinity')" ] }, - "execution_count": 160, + "execution_count": 181, "metadata": {}, "output_type": "execute_result" } @@ -4853,7 +5692,7 @@ }, { "cell_type": "code", - "execution_count": 161, + "execution_count": 182, "metadata": { "slideshow": { "slide_type": "skip" @@ -4866,18 +5705,18 @@ "Decimal('Infinity')" ] }, - "execution_count": 161, + "execution_count": 182, "metadata": {}, "output_type": "execute_result" } ], "source": [ - "Decimal(\"inf\") + 42 # Infinity is infinity, concrete numbers loose its meaning" + "Decimal(\"inf\") + 42" ] }, { "cell_type": "code", - "execution_count": 162, + "execution_count": 183, "metadata": { "slideshow": { "slide_type": "skip" @@ -4890,7 +5729,7 @@ "True" ] }, - "execution_count": 162, + "execution_count": 183, "metadata": {}, "output_type": "execute_result" } @@ -4907,12 +5746,12 @@ } }, "source": [ - "As with `float` objects, we cannot add infinities of different signs: Now get a module-specific `InvalidOperation` exception instead of a `nan` value. Here, **failing loudly** is a good thing as it prevents us from working with invalid results." + "As with `float` objects, we cannot add infinities of different signs: Now, we get a module-specific `InvalidOperation` exception instead of a `nan` value. Here, **failing loudly** is a good thing as it prevents us from working with invalid results."
] }, { "cell_type": "code", - "execution_count": 163, + "execution_count": 184, "metadata": { "slideshow": { "slide_type": "skip" @@ -4926,7 +5765,7 @@ "traceback": [ "\u001b[0;31m---------------------------------------------------------------------------\u001b[0m", "\u001b[0;31mInvalidOperation\u001b[0m Traceback (most recent call last)", - "\u001b[0;32m\u001b[0m in \u001b[0;36m\u001b[0;34m\u001b[0m\n\u001b[0;32m----> 1\u001b[0;31m \u001b[0mDecimal\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0;34m\"inf\"\u001b[0m\u001b[0;34m)\u001b[0m \u001b[0;34m+\u001b[0m \u001b[0mDecimal\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0;34m\"-inf\"\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0m", + "\u001b[0;32m\u001b[0m in \u001b[0;36m\u001b[0;34m\u001b[0m\n\u001b[0;32m----> 1\u001b[0;31m \u001b[0mDecimal\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0;34m\"inf\"\u001b[0m\u001b[0;34m)\u001b[0m \u001b[0;34m+\u001b[0m \u001b[0mDecimal\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0;34m\"-inf\"\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0m", "\u001b[0;31mInvalidOperation\u001b[0m: []" ] } @@ -4937,7 +5776,7 @@ }, { "cell_type": "code", - "execution_count": 164, + "execution_count": 185, "metadata": { "slideshow": { "slide_type": "skip" @@ -4951,7 +5790,7 @@ "traceback": [ "\u001b[0;31m---------------------------------------------------------------------------\u001b[0m", "\u001b[0;31mInvalidOperation\u001b[0m Traceback (most recent call last)", - "\u001b[0;32m\u001b[0m in \u001b[0;36m\u001b[0;34m\u001b[0m\n\u001b[0;32m----> 1\u001b[0;31m \u001b[0mDecimal\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0;34m\"inf\"\u001b[0m\u001b[0;34m)\u001b[0m \u001b[0;34m-\u001b[0m \u001b[0mDecimal\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0;34m\"inf\"\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0m", + "\u001b[0;32m\u001b[0m in \u001b[0;36m\u001b[0;34m\u001b[0m\n\u001b[0;32m----> 1\u001b[0;31m 
\u001b[0mDecimal\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0;34m\"inf\"\u001b[0m\u001b[0;34m)\u001b[0m \u001b[0;34m-\u001b[0m \u001b[0mDecimal\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0;34m\"inf\"\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0m", "\u001b[0;31mInvalidOperation\u001b[0m: []" ] } @@ -4997,7 +5836,7 @@ }, { "cell_type": "code", - "execution_count": 165, + "execution_count": 186, "metadata": { "slideshow": { "slide_type": "slide" @@ -5021,7 +5860,7 @@ }, { "cell_type": "code", - "execution_count": 166, + "execution_count": 187, "metadata": { "slideshow": { "slide_type": "fragment" @@ -5034,21 +5873,21 @@ "Fraction(1, 3)" ] }, - "execution_count": 166, + "execution_count": 187, "metadata": {}, "output_type": "execute_result" } ], "source": [ - "Fraction(1, 3) # this is now 1/3 with \"full\" precision" + "Fraction(1, 3) # 1/3 with \"full\" precision" ] }, { "cell_type": "code", - "execution_count": 167, + "execution_count": 188, "metadata": { "slideshow": { - "slide_type": "-" + "slide_type": "fragment" } }, "outputs": [ @@ -5058,21 +5897,21 @@ "Fraction(1, 3)" ] }, - "execution_count": 167, + "execution_count": 188, "metadata": {}, "output_type": "execute_result" } ], "source": [ - "Fraction(\"1/3\") # this is 1/3 with \"full\" precision again" + "Fraction(\"1/3\") # 1/3 with \"full\" precision" ] }, { "cell_type": "code", - "execution_count": 168, + "execution_count": 189, "metadata": { "slideshow": { - "slide_type": "-" + "slide_type": "fragment" } }, "outputs": [ @@ -5082,21 +5921,21 @@ "Fraction(3333333333, 10000000000)" ] }, - "execution_count": 168, + "execution_count": 189, "metadata": {}, "output_type": "execute_result" } ], "source": [ - "Fraction(\"0.3333333333\") # this is 1/3 with a precision of 10 significant digits" + "Fraction(\"0.3333333333\") # 1/3 with a precision of 10 significant digits" ] }, { "cell_type": "code", - "execution_count": 169, + "execution_count": 190, "metadata": { "slideshow": { - 
"slide_type": "-" + "slide_type": "skip" } }, "outputs": [ @@ -5106,13 +5945,13 @@ "Fraction(3333333333, 10000000000)" ] }, - "execution_count": 169, + "execution_count": 190, "metadata": {}, "output_type": "execute_result" } ], "source": [ - "Fraction(\"3333333333e-10\") # the same in scientific notation" + "Fraction(\"3333333333e-10\") # scientific notation is also allowed" ] }, { @@ -5128,7 +5967,7 @@ }, { "cell_type": "code", - "execution_count": 170, + "execution_count": 191, "metadata": { "slideshow": { "slide_type": "slide" @@ -5141,7 +5980,7 @@ "Fraction(3, 2)" ] }, - "execution_count": 170, + "execution_count": 191, "metadata": {}, "output_type": "execute_result" } @@ -5152,10 +5991,10 @@ }, { "cell_type": "code", - "execution_count": 171, + "execution_count": 192, "metadata": { "slideshow": { - "slide_type": "-" + "slide_type": "fragment" } }, "outputs": [ @@ -5165,7 +6004,7 @@ "Fraction(3, 2)" ] }, - "execution_count": 171, + "execution_count": 192, "metadata": {}, "output_type": "execute_result" } @@ -5187,7 +6026,7 @@ }, { "cell_type": "code", - "execution_count": 172, + "execution_count": 193, "metadata": { "slideshow": { "slide_type": "slide" @@ -5200,7 +6039,7 @@ "Fraction(1, 10)" ] }, - "execution_count": 172, + "execution_count": 193, "metadata": {}, "output_type": "execute_result" } @@ -5222,10 +6061,10 @@ }, { "cell_type": "code", - "execution_count": 173, + "execution_count": 194, "metadata": { "slideshow": { - "slide_type": "-" + "slide_type": "fragment" } }, "outputs": [ @@ -5235,7 +6074,7 @@ "Fraction(3602879701896397, 36028797018963968)" ] }, - "execution_count": 173, + "execution_count": 194, "metadata": {}, "output_type": "execute_result" } @@ -5257,7 +6096,7 @@ }, { "cell_type": "code", - "execution_count": 174, + "execution_count": 195, "metadata": { "slideshow": { "slide_type": "slide" @@ -5270,7 +6109,7 @@ "Fraction(7, 4)" ] }, - "execution_count": 174, + "execution_count": 195, "metadata": {}, "output_type": "execute_result" } @@ 
-5281,10 +6120,10 @@ }, { "cell_type": "code", - "execution_count": 175, + "execution_count": 196, "metadata": { "slideshow": { - "slide_type": "-" + "slide_type": "fragment" } }, "outputs": [ @@ -5294,7 +6133,7 @@ "Fraction(1, 2)" ] }, - "execution_count": 175, + "execution_count": 196, "metadata": {}, "output_type": "execute_result" } @@ -5305,10 +6144,10 @@ }, { "cell_type": "code", - "execution_count": 176, + "execution_count": 197, "metadata": { "slideshow": { - "slide_type": "-" + "slide_type": "fragment" } }, "outputs": [ @@ -5318,7 +6157,7 @@ "Fraction(1, 1)" ] }, - "execution_count": 176, + "execution_count": 197, "metadata": {}, "output_type": "execute_result" } @@ -5329,10 +6168,10 @@ }, { "cell_type": "code", - "execution_count": 177, + "execution_count": 198, "metadata": { "slideshow": { - "slide_type": "-" + "slide_type": "fragment" } }, "outputs": [ @@ -5342,7 +6181,7 @@ "Fraction(1, 1)" ] }, - "execution_count": 177, + "execution_count": 198, "metadata": {}, "output_type": "execute_result" } @@ -5364,7 +6203,7 @@ }, { "cell_type": "code", - "execution_count": 178, + "execution_count": 199, "metadata": { "slideshow": { "slide_type": "slide" @@ -5377,18 +6216,18 @@ "0.1" ] }, - "execution_count": 178, + "execution_count": 199, "metadata": {}, "output_type": "execute_result" } ], "source": [ - "10.0 * Fraction(1, 100)" + "10.0 * Fraction(1, 100) # do not do this" ] }, { "cell_type": "code", - "execution_count": 179, + "execution_count": 200, "metadata": { "slideshow": { "slide_type": "fragment" @@ -5401,7 +6240,7 @@ "'0.10000000000000000555111512312578270211815834045410'" ] }, - "execution_count": 179, + "execution_count": 200, "metadata": {}, "output_type": "execute_result" } @@ -5432,6 +6271,17 @@ "## The `complex` Type" ] }, + { + "cell_type": "markdown", + "metadata": { + "slideshow": { + "slide_type": "slide" + } + }, + "source": [ + "**What is the solution to $x^2 = -1$ ?**" + ] + }, { "cell_type": "markdown", "metadata": { @@ -5453,7 +6303,7 @@ 
"cell_type": "markdown", "metadata": { "slideshow": { - "slide_type": "slide" + "slide_type": "-" } }, "source": [ @@ -5475,7 +6325,7 @@ }, { "cell_type": "code", - "execution_count": 180, + "execution_count": 201, "metadata": { "slideshow": { "slide_type": "slide" @@ -5488,7 +6338,7 @@ }, { "cell_type": "code", - "execution_count": 181, + "execution_count": 202, "metadata": { "slideshow": { "slide_type": "fragment" @@ -5498,10 +6348,10 @@ { "data": { "text/plain": [ - "140166567772848" + "139663415663408" ] }, - "execution_count": 181, + "execution_count": 202, "metadata": {}, "output_type": "execute_result" } @@ -5512,10 +6362,10 @@ }, { "cell_type": "code", - "execution_count": 182, + "execution_count": 203, "metadata": { "slideshow": { - "slide_type": "-" + "slide_type": "fragment" } }, "outputs": [ @@ -5525,7 +6375,7 @@ "complex" ] }, - "execution_count": 182, + "execution_count": 203, "metadata": {}, "output_type": "execute_result" } @@ -5536,10 +6386,10 @@ }, { "cell_type": "code", - "execution_count": 183, + "execution_count": 204, "metadata": { "slideshow": { - "slide_type": "-" + "slide_type": "fragment" } }, "outputs": [ @@ -5549,7 +6399,7 @@ "1j" ] }, - "execution_count": 183, + "execution_count": 204, "metadata": {}, "output_type": "execute_result" } @@ -5571,7 +6421,7 @@ }, { "cell_type": "code", - "execution_count": 184, + "execution_count": 205, "metadata": { "slideshow": { "slide_type": "slide" @@ -5584,7 +6434,7 @@ "True" ] }, - "execution_count": 184, + "execution_count": 205, "metadata": {}, "output_type": "execute_result" } @@ -5606,7 +6456,7 @@ }, { "cell_type": "code", - "execution_count": 185, + "execution_count": 206, "metadata": { "slideshow": { "slide_type": "slide" @@ -5619,7 +6469,7 @@ "(2+0.5j)" ] }, - "execution_count": 185, + "execution_count": 206, "metadata": {}, "output_type": "execute_result" } @@ -5641,10 +6491,10 @@ }, { "cell_type": "code", - "execution_count": 186, + "execution_count": 207, "metadata": { "slideshow": { - 
"slide_type": "-" + "slide_type": "fragment" } }, "outputs": [ @@ -5654,21 +6504,32 @@ "(2+0.5j)" ] }, - "execution_count": 186, + "execution_count": 207, "metadata": {}, "output_type": "execute_result" } ], "source": [ - "complex(2, 0.5) # an integer and a float work just fine as arguments" + "complex(2, 0.5)" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "slideshow": { + "slide_type": "skip" + } + }, + "source": [ + "By omitting the second argument, we set the imaginary part to $0$." ] }, { "cell_type": "code", - "execution_count": 187, + "execution_count": 208, "metadata": { "slideshow": { - "slide_type": "-" + "slide_type": "skip" } }, "outputs": [ @@ -5678,18 +6539,29 @@ "(2+0j)" ] }, - "execution_count": 187, + "execution_count": 208, "metadata": {}, "output_type": "execute_result" } ], "source": [ - "complex(2) # omitting the second argument sets the imaginary part to 0" + "complex(2) " + ] + }, + { + "cell_type": "markdown", + "metadata": { + "slideshow": { + "slide_type": "skip" + } + }, + "source": [ + " The arguments to [complex()](https://docs.python.org/3/library/functions.html#complex) may be any numeric type." 
] }, { "cell_type": "code", - "execution_count": 188, + "execution_count": 209, "metadata": { "slideshow": { "slide_type": "skip" @@ -5702,18 +6574,18 @@ "(2+0.5j)" ] }, - "execution_count": 188, + "execution_count": 209, "metadata": {}, "output_type": "execute_result" } ], "source": [ - "complex(Decimal(\"2.0\"), Fraction(1, 2)) # the arguments may be any numeric type" + "complex(Decimal(\"2.0\"), Fraction(1, 2))" ] }, { "cell_type": "code", - "execution_count": 189, + "execution_count": 210, "metadata": { "slideshow": { "slide_type": "skip" @@ -5726,7 +6598,7 @@ "(2+0.5j)" ] }, - "execution_count": 189, + "execution_count": 210, "metadata": {}, "output_type": "execute_result" } @@ -5748,7 +6620,7 @@ }, { "cell_type": "code", - "execution_count": 190, + "execution_count": 211, "metadata": { "slideshow": { "slide_type": "slide" @@ -5762,7 +6634,7 @@ }, { "cell_type": "code", - "execution_count": 191, + "execution_count": 212, "metadata": { "slideshow": { "slide_type": "fragment" @@ -5775,7 +6647,7 @@ "(4+6j)" ] }, - "execution_count": 191, + "execution_count": 212, "metadata": {}, "output_type": "execute_result" } @@ -5786,10 +6658,10 @@ }, { "cell_type": "code", - "execution_count": 192, + "execution_count": 213, "metadata": { "slideshow": { - "slide_type": "-" + "slide_type": "fragment" } }, "outputs": [ @@ -5799,7 +6671,7 @@ "(-2-2j)" ] }, - "execution_count": 192, + "execution_count": 213, "metadata": {}, "output_type": "execute_result" } @@ -5810,7 +6682,7 @@ }, { "cell_type": "code", - "execution_count": 193, + "execution_count": 214, "metadata": { "slideshow": { "slide_type": "skip" @@ -5820,21 +6692,21 @@ { "data": { "text/plain": [ - "(40+2j)" + "(2+2j)" ] }, - "execution_count": 193, + "execution_count": 214, "metadata": {}, "output_type": "execute_result" } ], "source": [ - "c1 + 39" + "c1 + 1" ] }, { "cell_type": "code", - "execution_count": 194, + "execution_count": 215, "metadata": { "slideshow": { "slide_type": "skip" @@ -5844,21 +6716,21 @@ { 
"data": { "text/plain": [ - "-4j" + "(0.5-4j)" ] }, - "execution_count": 194, + "execution_count": 215, "metadata": {}, "output_type": "execute_result" } ], "source": [ - "3.0 - c2" + "3.5 - c2" ] }, { "cell_type": "code", - "execution_count": 195, + "execution_count": 216, "metadata": { "slideshow": { "slide_type": "skip" @@ -5871,7 +6743,7 @@ "(5+10j)" ] }, - "execution_count": 195, + "execution_count": 216, "metadata": {}, "output_type": "execute_result" } @@ -5882,7 +6754,7 @@ }, { "cell_type": "code", - "execution_count": 196, + "execution_count": 217, "metadata": { "slideshow": { "slide_type": "skip" @@ -5895,7 +6767,7 @@ "(0.5+0.6666666666666666j)" ] }, - "execution_count": 196, + "execution_count": 217, "metadata": {}, "output_type": "execute_result" } @@ -5906,7 +6778,7 @@ }, { "cell_type": "code", - "execution_count": 197, + "execution_count": 218, "metadata": { "slideshow": { "slide_type": "fragment" @@ -5919,7 +6791,7 @@ "(-5+10j)" ] }, - "execution_count": 197, + "execution_count": 218, "metadata": {}, "output_type": "execute_result" } @@ -5930,10 +6802,10 @@ }, { "cell_type": "code", - "execution_count": 198, + "execution_count": 219, "metadata": { "slideshow": { - "slide_type": "-" + "slide_type": "fragment" } }, "outputs": [ @@ -5943,7 +6815,7 @@ "(0.44+0.08j)" ] }, - "execution_count": 198, + "execution_count": 219, "metadata": {}, "output_type": "execute_result" } @@ -5965,7 +6837,7 @@ }, { "cell_type": "code", - "execution_count": 199, + "execution_count": 220, "metadata": { "slideshow": { "slide_type": "slide" @@ -5978,7 +6850,7 @@ "5.0" ] }, - "execution_count": 199, + "execution_count": 220, "metadata": {}, "output_type": "execute_result" } @@ -6000,7 +6872,7 @@ }, { "cell_type": "code", - "execution_count": 200, + "execution_count": 221, "metadata": { "slideshow": { "slide_type": "fragment" @@ -6013,7 +6885,7 @@ "1.0" ] }, - "execution_count": 200, + "execution_count": 221, "metadata": {}, "output_type": "execute_result" } @@ -6024,10 
+6896,10 @@ }, { "cell_type": "code", - "execution_count": 201, + "execution_count": 222, "metadata": { "slideshow": { - "slide_type": "-" + "slide_type": "fragment" } }, "outputs": [ @@ -6037,7 +6909,7 @@ "2.0" ] }, - "execution_count": 201, + "execution_count": 222, "metadata": {}, "output_type": "execute_result" } @@ -6059,7 +6931,7 @@ }, { "cell_type": "code", - "execution_count": 202, + "execution_count": 223, "metadata": { "slideshow": { "slide_type": "fragment" @@ -6072,7 +6944,7 @@ "(1-2j)" ] }, - "execution_count": 202, + "execution_count": 223, "metadata": {}, "output_type": "execute_result" } @@ -6160,7 +7032,7 @@ }, { "cell_type": "code", - "execution_count": 203, + "execution_count": 224, "metadata": { "slideshow": { "slide_type": "slide" @@ -6173,10 +7045,10 @@ }, { "cell_type": "code", - "execution_count": 204, + "execution_count": 225, "metadata": { "slideshow": { - "slide_type": "-" + "slide_type": "slide" } }, "outputs": [ @@ -6201,7 +7073,7 @@ " 'abstractmethod']" ] }, - "execution_count": 204, + "execution_count": 225, "metadata": {}, "output_type": "execute_result" } @@ -6227,7 +7099,7 @@ }, { "cell_type": "code", - "execution_count": 205, + "execution_count": 226, "metadata": { "scrolled": true, "slideshow": { @@ -6338,7 +7210,7 @@ }, { "cell_type": "code", - "execution_count": 206, + "execution_count": 227, "metadata": { "slideshow": { "slide_type": "skip" @@ -6490,6 +7362,17 @@ "help(complex)" ] }, + { + "cell_type": "markdown", + "metadata": { + "slideshow": { + "slide_type": "slide" + } + }, + "source": [ + "### Duck Typing" + ] + }, { "cell_type": "markdown", "metadata": { @@ -6498,14 +7381,16 @@ } }, "source": [ - "The primary purpose of ABCs is to classify the *concrete* data types and standardize how they behave.\n", + "The primary purpose of ABCs is to classify the *concrete* data types and standardize how they behave. This guides us as the programmers in what kind of behavior we should expect from objects of a given data type. 
In this context, ABCs are not reflected in code but only in our heads.\n", "\n", - "For, example, as all numeric data types are `Complex` numbers in the abstract sense, they all work with the built-in [abs()](https://docs.python.org/3/library/functions.html#abs) function (cf., [documentation](https://docs.python.org/3/library/numbers.html#numbers.Complex)). While it is intuitively clear what the [absolute value](https://en.wikipedia.org/wiki/Absolute_value) (i.e., \"distance\" to $0$) of an integer, a fraction, or any real number is, [abs()](https://docs.python.org/3/library/functions.html#abs) calculates the equivalent of that for complex numbers. That concept is called the [magnitude](https://en.wikipedia.org/wiki/Magnitude_%28mathematics%29) of a number, and is really a *generalization* of the absolute value." + "For example, as all numeric data types are `Complex` numbers in the abstract sense, they all work with the built-in [abs()](https://docs.python.org/3/library/functions.html#abs) function (cf., [documentation](https://docs.python.org/3/library/numbers.html#numbers.Complex)). While it is intuitively clear what the [absolute value](https://en.wikipedia.org/wiki/Absolute_value) (i.e., \"distance\" to $0$) of an integer, a fraction, or any real number is, [abs()](https://docs.python.org/3/library/functions.html#abs) calculates the equivalent of that for complex numbers. That concept is called the [magnitude](https://en.wikipedia.org/wiki/Magnitude_%28mathematics%29) of a number, and is really a *generalization* of the absolute value.\n", + "\n", + "Relating back to the concept of **duck typing** mentioned in [Chapter 4](https://nbviewer.jupyter.org/github/webartifex/intro-to-python/blob/master/04_iteration_00_lecture.ipynb#Type-Checking-&-Input-Validation), `int`, `float`, and `complex` objects \"walk\" and \"quack\" alike in the context of the [abs()](https://docs.python.org/3/library/functions.html#abs) function."
] }, { "cell_type": "code", - "execution_count": 207, + "execution_count": 228, "metadata": { "slideshow": { "slide_type": "slide" @@ -6515,48 +7400,48 @@ { "data": { "text/plain": [ - "42" + "1" ] }, - "execution_count": 207, + "execution_count": 228, "metadata": {}, "output_type": "execute_result" } ], "source": [ - "abs(-42)" + "abs(-1)" ] }, { "cell_type": "code", - "execution_count": 208, + "execution_count": 229, "metadata": { "slideshow": { - "slide_type": "-" + "slide_type": "fragment" } }, "outputs": [ { "data": { "text/plain": [ - "Decimal('0.1')" + "42.87" ] }, - "execution_count": 208, + "execution_count": 229, "metadata": {}, "output_type": "execute_result" } ], "source": [ - "abs(Decimal(\"-0.1\"))" + "abs(-42.87)" ] }, { "cell_type": "code", - "execution_count": 209, + "execution_count": 230, "metadata": { "slideshow": { - "slide_type": "-" + "slide_type": "fragment" } }, "outputs": [ @@ -6566,7 +7451,7 @@ "5.0" ] }, - "execution_count": 209, + "execution_count": 230, "metadata": {}, "output_type": "execute_result" } @@ -6588,7 +7473,31 @@ }, { "cell_type": "code", - "execution_count": 210, + "execution_count": 231, + "metadata": { + "slideshow": { + "slide_type": "slide" + } + }, + "outputs": [ + { + "data": { + "text/plain": [ + "100" + ] + }, + "execution_count": 231, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "round(123, -2)" + ] + }, + { + "cell_type": "code", + "execution_count": 232, "metadata": { "slideshow": { "slide_type": "fragment" @@ -6601,7 +7510,7 @@ "42" ] }, - "execution_count": 210, + "execution_count": 232, "metadata": {}, "output_type": "execute_result" } @@ -6610,30 +7519,6 @@ "round(42.1)" ] }, - { - "cell_type": "code", - "execution_count": 211, - "metadata": { - "slideshow": { - "slide_type": "-" - } - }, - "outputs": [ - { - "data": { - "text/plain": [ - "0" - ] - }, - "execution_count": 211, - "metadata": {}, - "output_type": "execute_result" - } - ], - "source": [ - 
"round(Decimal(\"0.1\"))" - ] - }, { "cell_type": "markdown", "metadata": { @@ -6642,15 +7527,15 @@ } }, "source": [ - "`Complex` numbers are two-dimensional. So, rounding makes no sense and raises a `TypeError`." + "`Complex` numbers are two-dimensional. So, rounding makes no sense here and leads to a `TypeError`. So, in the context of the [round()](https://docs.python.org/3/library/functions.html#round) function, `int` and `float` objects \"walk\" and \"quack\" alike whereas `complex` objects do not." ] }, { "cell_type": "code", - "execution_count": 212, + "execution_count": 233, "metadata": { "slideshow": { - "slide_type": "-" + "slide_type": "fragment" } }, "outputs": [ @@ -6661,311 +7546,13 @@ "traceback": [ "\u001b[0;31m---------------------------------------------------------------------------\u001b[0m", "\u001b[0;31mTypeError\u001b[0m Traceback (most recent call last)", - "\u001b[0;32m\u001b[0m in \u001b[0;36m\u001b[0;34m\u001b[0m\n\u001b[0;32m----> 1\u001b[0;31m \u001b[0mround\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0;36m4\u001b[0m \u001b[0;34m+\u001b[0m \u001b[0;36m3j\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0m", + "\u001b[0;32m\u001b[0m in \u001b[0;36m\u001b[0;34m\u001b[0m\n\u001b[0;32m----> 1\u001b[0;31m \u001b[0mround\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0;36m1\u001b[0m \u001b[0;34m+\u001b[0m \u001b[0;36m2j\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0m", "\u001b[0;31mTypeError\u001b[0m: type complex doesn't define __round__ method" ] } ], "source": [ - "round(4 + 3j)" - ] - }, - { - "cell_type": "markdown", - "metadata": { - "slideshow": { - "slide_type": "skip" - } - }, - "source": [ - "Knowing what ABCs a numeric type adheres to, is not only important in the context of built-ins. 
The [trunc()](https://docs.python.org/3/library/math.html#math.trunc) function from the [math](https://docs.python.org/3/library/math.html) module in the [standard library](https://docs.python.org/3/library/index.html), for example, only works with `Real` types (cf., [documentation](https://docs.python.org/3/library/numbers.html#numbers.Real))." - ] - }, - { - "cell_type": "code", - "execution_count": 213, - "metadata": { - "slideshow": { - "slide_type": "slide" - } - }, - "outputs": [], - "source": [ - "import math" - ] - }, - { - "cell_type": "markdown", - "metadata": { - "slideshow": { - "slide_type": "skip" - } - }, - "source": [ - "[trunc()](https://docs.python.org/3/library/math.html#math.trunc) cuts off a number's decimals." - ] - }, - { - "cell_type": "code", - "execution_count": 214, - "metadata": { - "slideshow": { - "slide_type": "-" - } - }, - "outputs": [ - { - "data": { - "text/plain": [ - "0" - ] - }, - "execution_count": 214, - "metadata": {}, - "output_type": "execute_result" - } - ], - "source": [ - "math.trunc(9 / 10)" - ] - }, - { - "cell_type": "markdown", - "metadata": { - "slideshow": { - "slide_type": "skip" - } - }, - "source": [ - "A `Complex` number leads to a `TypeError`." 
- ] - }, - { - "cell_type": "code", - "execution_count": 215, - "metadata": { - "slideshow": { - "slide_type": "fragment" - } - }, - "outputs": [ - { - "ename": "TypeError", - "evalue": "type complex doesn't define __trunc__ method", - "output_type": "error", - "traceback": [ - "\u001b[0;31m---------------------------------------------------------------------------\u001b[0m", - "\u001b[0;31mTypeError\u001b[0m Traceback (most recent call last)", - "\u001b[0;32m\u001b[0m in \u001b[0;36m\u001b[0;34m\u001b[0m\n\u001b[0;32m----> 1\u001b[0;31m \u001b[0mmath\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0mtrunc\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0;36m0.9\u001b[0m \u001b[0;34m+\u001b[0m \u001b[0;36m1j\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0m", - "\u001b[0;31mTypeError\u001b[0m: type complex doesn't define __trunc__ method" - ] - } - ], - "source": [ - "math.trunc(0.9 + 1j)" - ] - }, - { - "cell_type": "markdown", - "metadata": { - "slideshow": { - "slide_type": "skip" - } - }, - "source": [ - "Another way to use ABCs is in place of a *concrete* data type.\n", - "\n", - "For example, we may pass them as arguments to the built-in [isinstance()](https://docs.python.org/3/library/functions.html#isinstance) function and check in which of the five mathematical sets the object `1 / 10` is." 
- ] - }, - { - "cell_type": "code", - "execution_count": 216, - "metadata": { - "slideshow": { - "slide_type": "slide" - } - }, - "outputs": [ - { - "data": { - "text/plain": [ - "True" - ] - }, - "execution_count": 216, - "metadata": {}, - "output_type": "execute_result" - } - ], - "source": [ - "isinstance(1 / 10, float) # we know that from before" - ] - }, - { - "cell_type": "code", - "execution_count": 217, - "metadata": { - "slideshow": { - "slide_type": "fragment" - } - }, - "outputs": [ - { - "data": { - "text/plain": [ - "True" - ] - }, - "execution_count": 217, - "metadata": {}, - "output_type": "execute_result" - } - ], - "source": [ - "isinstance(1 / 10, numbers.Number) # a float object is a Number in the abstract sense" - ] - }, - { - "cell_type": "code", - "execution_count": 218, - "metadata": { - "slideshow": { - "slide_type": "-" - } - }, - "outputs": [ - { - "data": { - "text/plain": [ - "True" - ] - }, - "execution_count": 218, - "metadata": {}, - "output_type": "execute_result" - } - ], - "source": [ - "isinstance(1 / 10, numbers.Complex) # float objects are always also Complex numbers" - ] - }, - { - "cell_type": "code", - "execution_count": 219, - "metadata": { - "slideshow": { - "slide_type": "-" - } - }, - "outputs": [ - { - "data": { - "text/plain": [ - "True" - ] - }, - "execution_count": 219, - "metadata": {}, - "output_type": "execute_result" - } - ], - "source": [ - "isinstance(1 / 10, numbers.Real) # a float object's purpose is to model a Real number" - ] - }, - { - "cell_type": "markdown", - "metadata": { - "slideshow": { - "slide_type": "skip" - } - }, - "source": [ - "Due to the `float` type's inherent imprecision, `1 / 10` is *not* a rational number." 
- ] - }, - { - "cell_type": "code", - "execution_count": 220, - "metadata": { - "slideshow": { - "slide_type": "slide" - } - }, - "outputs": [ - { - "data": { - "text/plain": [ - "False" - ] - }, - "execution_count": 220, - "metadata": {}, - "output_type": "execute_result" - } - ], - "source": [ - "isinstance(1 / 10, numbers.Rational) # the type of `1 / 10` is what is important, not its value" - ] - }, - { - "cell_type": "markdown", - "metadata": { - "slideshow": { - "slide_type": "skip" - } - }, - "source": [ - "The `Fraction` type qualifies as a rational number; however, the `Decimal` type does not." - ] - }, - { - "cell_type": "code", - "execution_count": 221, - "metadata": { - "slideshow": { - "slide_type": "fragment" - } - }, - "outputs": [ - { - "data": { - "text/plain": [ - "True" - ] - }, - "execution_count": 221, - "metadata": {}, - "output_type": "execute_result" - } - ], - "source": [ - "isinstance(Fraction(1, 10), numbers.Rational)" - ] - }, - { - "cell_type": "code", - "execution_count": 222, - "metadata": { - "slideshow": { - "slide_type": "-" - } - }, - "outputs": [ - { - "data": { - "text/plain": [ - "False" - ] - }, - "execution_count": 222, - "metadata": {}, - "output_type": "execute_result" - } - ], - "source": [ - "isinstance(Decimal(\"0.1\"), numbers.Rational)" + "round(1 + 2j)" ] }, { @@ -6987,7 +7574,197 @@ } }, "source": [ - "Replacing *concrete* data types with ABCs is particularly valuable in the context of input validation: The revised version of the `factorial()` function below allows its user to take advantage of *duck typing*: If a real but non-integer argument `n` is passed in, `factorial()` tries to cast `n` as an `int` object with [math.trunc()](https://docs.python.org/3/library/math.html#math.trunc).\n", + "Another way to use ABCs is in place of a *concrete* data type.\n", + "\n", + "For example, we may pass them as arguments to the built-in [isinstance()](https://docs.python.org/3/library/functions.html#isinstance) function and 
check in which of the five mathematical sets the object `1 / 10` is." + ] + }, + { + "cell_type": "code", + "execution_count": 234, + "metadata": { + "slideshow": { + "slide_type": "slide" + } + }, + "outputs": [ + { + "data": { + "text/plain": [ + "True" + ] + }, + "execution_count": 234, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "isinstance(1 / 10, float)" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "slideshow": { + "slide_type": "skip" + } + }, + "source": [ + "A `float` object is a generic `Number` in the abstract sense but may also be seen as a `Complex` or `Real` number." + ] + }, + { + "cell_type": "code", + "execution_count": 235, + "metadata": { + "slideshow": { + "slide_type": "fragment" + } + }, + "outputs": [ + { + "data": { + "text/plain": [ + "True" + ] + }, + "execution_count": 235, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "isinstance(1 / 10, numbers.Number)" + ] + }, + { + "cell_type": "code", + "execution_count": 236, + "metadata": { + "slideshow": { + "slide_type": "fragment" + } + }, + "outputs": [ + { + "data": { + "text/plain": [ + "True" + ] + }, + "execution_count": 236, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "isinstance(1 / 10, numbers.Complex)" + ] + }, + { + "cell_type": "code", + "execution_count": 237, + "metadata": { + "slideshow": { + "slide_type": "fragment" + } + }, + "outputs": [ + { + "data": { + "text/plain": [ + "True" + ] + }, + "execution_count": 237, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "isinstance(1 / 10, numbers.Real)" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "slideshow": { + "slide_type": "skip" + } + }, + "source": [ + "The `float` type models `Real` numbers only, and does so with inherent imprecision. So, `1 / 10` is *not* a `Rational` number: What matters is the object's type, not its value."
+ ] + }, + { + "cell_type": "code", + "execution_count": 238, + "metadata": { + "slideshow": { + "slide_type": "fragment" + } + }, + "outputs": [ + { + "data": { + "text/plain": [ + "False" + ] + }, + "execution_count": 238, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "isinstance(1 / 10, numbers.Rational)" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "slideshow": { + "slide_type": "skip" + } + }, + "source": [ + "However, if we model `1 / 10` as a `Fraction`, it is recognized as a `Rational` number." + ] + }, + { + "cell_type": "code", + "execution_count": 239, + "metadata": { + "slideshow": { + "slide_type": "skip" + } + }, + "outputs": [ + { + "data": { + "text/plain": [ + "True" + ] + }, + "execution_count": 239, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "isinstance(Fraction(\"1/10\"), numbers.Rational)" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "slideshow": { + "slide_type": "skip" + } + }, + "source": [ + "Replacing *concrete* data types with ABCs is particularly valuable in the context of input validation: The revised version of the `factorial()` function below allows its user to take advantage of *duck typing*: If a real but non-integer argument `n` is passed in, `factorial()` tries to cast `n` as an `int` object with the [int()](https://docs.python.org/3/library/functions.html#int) built-in.\n", "\n", "Two popular and distinguished Pythonistas, [Luciano Ramalho](https://github.com/ramalho) and [Alex Martelli](https://en.wikipedia.org/wiki/Alex_Martelli), coin the term **goose typing** to specifically mean using the built-in [isinstance()](https://docs.python.org/3/library/functions.html#isinstance) function with an ABC (cf., Chapter 11 in this [book](https://www.amazon.com/Fluent-Python-Concise-Effective-Programming/dp/1491946008) or this [summary](https://dgkim5360.github.io/blog/python/2017/07/duck-typing-vs-goose-typing-pythonic-interfaces/) thereof)." 
] @@ -7005,7 +7782,7 @@ }, { "cell_type": "code", - "execution_count": 223, + "execution_count": 240, "metadata": { "slideshow": { "slide_type": "slide" @@ -7018,7 +7795,8 @@ "\n", " Args:\n", " n (int): number to calculate the factorial for; must be positive\n", - " strict (bool): if n must not contain decimals; defaults to True\n", + " strict (bool): if n must not contain decimals; defaults to True;\n", + " if set to False, the decimals in n are ignored\n", "\n", " Returns:\n", " factorial (int)\n", @@ -7029,15 +7807,15 @@ " \"\"\"\n", " if not isinstance(n, numbers.Integral):\n", " if isinstance(n, numbers.Real):\n", - " if n != math.trunc(n) and strict:\n", - " raise TypeError(\"n is not an integer-like value; it has decimals\")\n", - " n = math.trunc(n)\n", + " if n != int(n) and strict:\n", + " raise TypeError(\"n is not integer-like; it has decimals\")\n", + " n = int(n)\n", " else:\n", " raise TypeError(\"Factorial is only defined for integers\")\n", "\n", " if n < 0:\n", " raise ValueError(\"Factorial is not defined for negative integers\")\n", - " elif n == 0: # = base case\n", + " elif n == 0:\n", " return 1\n", " return n * factorial(n - 1)" ] @@ -7055,7 +7833,7 @@ }, { "cell_type": "code", - "execution_count": 224, + "execution_count": 241, "metadata": { "slideshow": { "slide_type": "slide" @@ -7068,7 +7846,7 @@ "1" ] }, - "execution_count": 224, + "execution_count": 241, "metadata": {}, "output_type": "execute_result" } @@ -7079,31 +7857,7 @@ }, { "cell_type": "code", - "execution_count": 225, - "metadata": { - "slideshow": { - "slide_type": "-" - } - }, - "outputs": [ - { - "data": { - "text/plain": [ - "6" - ] - }, - "execution_count": 225, - "metadata": {}, - "output_type": "execute_result" - } - ], - "source": [ - "factorial(3)" - ] - }, - { - "cell_type": "code", - "execution_count": 226, + "execution_count": 242, "metadata": { "slideshow": { "slide_type": "fragment" @@ -7116,7 +7870,31 @@ "6" ] }, - "execution_count": 226, + "execution_count": 
242, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "factorial(3)" + ] + }, + { + "cell_type": "code", + "execution_count": 243, + "metadata": { + "slideshow": { + "slide_type": "fragment" + } + }, + "outputs": [ + { + "data": { + "text/plain": [ + "6" + ] + }, + "execution_count": 243, "metadata": {}, "output_type": "execute_result" } @@ -7138,7 +7916,7 @@ }, { "cell_type": "code", - "execution_count": 227, + "execution_count": 244, "metadata": { "slideshow": { "slide_type": "slide" @@ -7147,14 +7925,14 @@ "outputs": [ { "ename": "TypeError", - "evalue": "n is not an integer-like value; it has decimals", + "evalue": "n is not integer-like; it has decimals", "output_type": "error", "traceback": [ "\u001b[0;31m---------------------------------------------------------------------------\u001b[0m", "\u001b[0;31mTypeError\u001b[0m Traceback (most recent call last)", - "\u001b[0;32m\u001b[0m in \u001b[0;36m\u001b[0;34m\u001b[0m\n\u001b[0;32m----> 1\u001b[0;31m \u001b[0mfactorial\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0;36m3.1\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0m", - "\u001b[0;32m\u001b[0m in \u001b[0;36mfactorial\u001b[0;34m(n, strict)\u001b[0m\n\u001b[1;32m 16\u001b[0m \u001b[0;32mif\u001b[0m \u001b[0misinstance\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0mn\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0mnumbers\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0mReal\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m:\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[1;32m 17\u001b[0m \u001b[0;32mif\u001b[0m \u001b[0mn\u001b[0m \u001b[0;34m!=\u001b[0m \u001b[0mmath\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0mtrunc\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0mn\u001b[0m\u001b[0;34m)\u001b[0m \u001b[0;32mand\u001b[0m \u001b[0mstrict\u001b[0m\u001b[0;34m:\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0;32m---> 18\u001b[0;31m \u001b[0;32mraise\u001b[0m 
\u001b[0mTypeError\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0;34m\"n is not an integer-like value; it has decimals\"\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0m\u001b[1;32m 19\u001b[0m \u001b[0mn\u001b[0m \u001b[0;34m=\u001b[0m \u001b[0mmath\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0mtrunc\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0mn\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[1;32m 20\u001b[0m \u001b[0;32melse\u001b[0m\u001b[0;34m:\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n", - "\u001b[0;31mTypeError\u001b[0m: n is not an integer-like value; it has decimals" + "\u001b[0;32m\u001b[0m in \u001b[0;36m\u001b[0;34m\u001b[0m\n\u001b[0;32m----> 1\u001b[0;31m \u001b[0mfactorial\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0;36m3.1\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0m", + "\u001b[0;32m\u001b[0m in \u001b[0;36mfactorial\u001b[0;34m(n, strict)\u001b[0m\n\u001b[1;32m 17\u001b[0m \u001b[0;32mif\u001b[0m \u001b[0misinstance\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0mn\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0mnumbers\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0mReal\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m:\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[1;32m 18\u001b[0m \u001b[0;32mif\u001b[0m \u001b[0mn\u001b[0m \u001b[0;34m!=\u001b[0m \u001b[0mint\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0mn\u001b[0m\u001b[0;34m)\u001b[0m \u001b[0;32mand\u001b[0m \u001b[0mstrict\u001b[0m\u001b[0;34m:\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0;32m---> 19\u001b[0;31m \u001b[0;32mraise\u001b[0m \u001b[0mTypeError\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0;34m\"n is not integer-like; it has decimals\"\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0m\u001b[1;32m 20\u001b[0m \u001b[0mn\u001b[0m \u001b[0;34m=\u001b[0m 
\u001b[0mint\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0mn\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[1;32m 21\u001b[0m \u001b[0;32melse\u001b[0m\u001b[0;34m:\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n", + "\u001b[0;31mTypeError\u001b[0m: n is not integer-like; it has decimals" ] } ], @@ -7175,7 +7953,7 @@ }, { "cell_type": "code", - "execution_count": 228, + "execution_count": 245, "metadata": { "slideshow": { "slide_type": "fragment" @@ -7188,7 +7966,7 @@ "6" ] }, - "execution_count": 228, + "execution_count": 245, "metadata": {}, "output_type": "execute_result" } @@ -7210,7 +7988,7 @@ }, { "cell_type": "code", - "execution_count": 229, + "execution_count": 246, "metadata": { "slideshow": { "slide_type": "slide" @@ -7224,8 +8002,8 @@ "traceback": [ "\u001b[0;31m---------------------------------------------------------------------------\u001b[0m", "\u001b[0;31mTypeError\u001b[0m Traceback (most recent call last)", - "\u001b[0;32m\u001b[0m in \u001b[0;36m\u001b[0;34m\u001b[0m\n\u001b[0;32m----> 1\u001b[0;31m \u001b[0mfactorial\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0;36m1\u001b[0m \u001b[0;34m+\u001b[0m \u001b[0;36m2j\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0m", - "\u001b[0;32m\u001b[0m in \u001b[0;36mfactorial\u001b[0;34m(n, strict)\u001b[0m\n\u001b[1;32m 19\u001b[0m \u001b[0mn\u001b[0m \u001b[0;34m=\u001b[0m \u001b[0mmath\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0mtrunc\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0mn\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[1;32m 20\u001b[0m \u001b[0;32melse\u001b[0m\u001b[0;34m:\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0;32m---> 21\u001b[0;31m \u001b[0;32mraise\u001b[0m \u001b[0mTypeError\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0;34m\"Factorial is only defined for 
integers\"\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0m\u001b[1;32m 22\u001b[0m \u001b[0;34m\u001b[0m\u001b[0m\n\u001b[1;32m 23\u001b[0m \u001b[0;32mif\u001b[0m \u001b[0mn\u001b[0m \u001b[0;34m<\u001b[0m \u001b[0;36m0\u001b[0m\u001b[0;34m:\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n", + "\u001b[0;32m\u001b[0m in \u001b[0;36m\u001b[0;34m\u001b[0m\n\u001b[0;32m----> 1\u001b[0;31m \u001b[0mfactorial\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0;36m1\u001b[0m \u001b[0;34m+\u001b[0m \u001b[0;36m2j\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0m", + "\u001b[0;32m\u001b[0m in \u001b[0;36mfactorial\u001b[0;34m(n, strict)\u001b[0m\n\u001b[1;32m 20\u001b[0m \u001b[0mn\u001b[0m \u001b[0;34m=\u001b[0m \u001b[0mint\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0mn\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[1;32m 21\u001b[0m \u001b[0;32melse\u001b[0m\u001b[0;34m:\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0;32m---> 22\u001b[0;31m \u001b[0;32mraise\u001b[0m \u001b[0mTypeError\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0;34m\"Factorial is only defined for integers\"\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0m\u001b[1;32m 23\u001b[0m \u001b[0;34m\u001b[0m\u001b[0m\n\u001b[1;32m 24\u001b[0m \u001b[0;32mif\u001b[0m \u001b[0mn\u001b[0m \u001b[0;34m<\u001b[0m \u001b[0;36m0\u001b[0m\u001b[0;34m:\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n", "\u001b[0;31mTypeError\u001b[0m: Factorial is only defined for integers" ] } @@ -7269,6 +8047,242 @@ "\n", "The **numerical tower** is Python's way of implementing various **abstract** ideas of what numbers are in mathematics." 
] + }, + { + "cell_type": "markdown", + "metadata": { + "slideshow": { + "slide_type": "skip" + } + }, + "source": [ + "## Further Resources" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "slideshow": { + "slide_type": "skip" + } + }, + "source": [ + "The two videos below show how addition and multiplication work with numbers in their binary representations. Subtraction is a bit more involved as we need to understand how negative numbers are represented in binary with the concept of [Two's Complement](https://en.wikipedia.org/wiki/Two%27s_complement) first. A video on that is shown further below. Division in binary is actually also quite simple." + ] + }, + { + "cell_type": "code", + "execution_count": 247, + "metadata": { + "slideshow": { + "slide_type": "skip" + } + }, + "outputs": [ + { + "data": { + "image/jpeg": "\n", + "text/html": [ + "\n", + " \n", + " " + ], + "text/plain": [ + "" + ] + }, + "execution_count": 247, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "from IPython.display import YouTubeVideo\n", + "YouTubeVideo(\"RgklPQ8rbkg\", width=\"60%\")" + ] + }, + { + "cell_type": "code", + "execution_count": 248, + "metadata": { + "slideshow": { + "slide_type": "skip" + } + }, + "outputs": [ + { + "data": { + "image/jpeg": "\n", + "text/html": [ + "\n", + " \n", + " " + ], + "text/plain": [ + "" + ] + }, + "execution_count": 248, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "YouTubeVideo(\"xHWKYFhhtJQ\", width=\"60%\")" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "slideshow": { + "slide_type": "skip" + } + }, + "source": [ + "The video below explains the idea behind [Two's Complement](https://en.wikipedia.org/wiki/Two%27s_complement). This is how most modern programming languages implement negative integers. The video also shows how subtraction in binary works."
+ ] + }, + { + "cell_type": "code", + "execution_count": 249, + "metadata": { + "slideshow": { + "slide_type": "skip" + } + }, + "outputs": [ + { + "data": { + "image/jpeg": "\n", + "text/html": [ + "\n", + " \n", + " " + ], + "text/plain": [ + "" + ] + }, + "execution_count": 249, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "YouTubeVideo(\"4qH4unVtJkE\", width=\"60%\")" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "slideshow": { + "slide_type": "skip" + } + }, + "source": [ + "This video by the YouTube channel [Computerphile](https://www.youtube.com/channel/UC9-y-6csu5WGm29I7JiwpnA) explains floating point numbers in an intuitive way with some numeric examples." + ] + }, + { + "cell_type": "code", + "execution_count": 250, + "metadata": { + "slideshow": { + "slide_type": "skip" + } + }, + "outputs": [ + { + "data": { + "image/jpeg": "\n", + "text/html": [ + "\n", + " \n", + " " + ], + "text/plain": [ + "" + ] + }, + "execution_count": 250, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "YouTubeVideo(\"PZRI1IfStY0\", width=\"60%\")" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "slideshow": { + "slide_type": "skip" + } + }, + "source": [ + "Below is a short introduction to [complex numbers](https://en.wikipedia.org/wiki/Complex_number) by [MIT](https://www.mit.edu) professor [Gilbert Strang](https://en.wikipedia.org/wiki/Gilbert_Strang) aimed at high school students." 
+ ] + }, + { + "cell_type": "code", + "execution_count": 251, + "metadata": { + "slideshow": { + "slide_type": "skip" + } + }, + "outputs": [ + { + "data": { + "image/jpeg": "\n", + "text/html": [ + "\n", + " \n", + " " + ], + "text/plain": [ + "" + ] + }, + "execution_count": 251, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "YouTubeVideo(\"Jkv-55ndVYY\", width=\"60%\")" + ] } ], "metadata": { diff --git a/05_numbers_02_exercises.ipynb b/05_numbers_02_exercises.ipynb index ab2499d..ff5e9a8 100644 --- a/05_numbers_02_exercises.ipynb +++ b/05_numbers_02_exercises.ipynb @@ -19,7 +19,7 @@ "cell_type": "markdown", "metadata": {}, "source": [ - "Read [Chapter 5](https://nbviewer.jupyter.org/github/webartifex/intro-to-python/blob/master/05_numbers_00_lecture.ipynb) of the book. Then, work through the exercises below." + "Read [Chapter 5](https://nbviewer.jupyter.org/github/webartifex/intro-to-python/blob/master/05_numbers_00_lecture.ipynb) of the book. Then, work through the exercises below. The `...` indicate where you need to fill in your answers. You should not need to create any additional code cells." 
] }, { @@ -33,20 +33,20 @@ "cell_type": "markdown", "metadata": {}, "source": [ - "**Q1** in [Chapter 2's Exercises](https://nbviewer.jupyter.org/github/webartifex/intro-to-python/blob/master/02_functions_02_exercises.ipynb#Volume-of-a-Sphere) section already revealed that we must consider the effects of the `float` type's imprecision.\n", + "The \"*Volume of a Sphere*\" problem in [Chapter 2's Exercises](https://nbviewer.jupyter.org/github/webartifex/intro-to-python/blob/master/02_functions_02_exercises.ipynb#Volume-of-a-Sphere) section revealed that we must consider the effects of the `float` type's imprecision.\n", "\n", "This becomes even more important when we deal with numeric data modeling accounting or finance data (cf., [this comment](https://stackoverflow.com/a/24976426) on \"falsehoods programmers believe about money\").\n", "\n", "In addition to the *inherent imprecision* of numbers in general, the topic of **[rounding numbers](https://en.wikipedia.org/wiki/Rounding)** is also not as trivial as we might expect! [This article](https://realpython.com/python-rounding/) summarizes everything the data science practitioner needs to know.\n", "\n", - "In this exercise, we revisit **Q1** from [Chapter 3's Exercises](https://nbviewer.jupyter.org/github/webartifex/intro-to-python/blob/master/03_conditionals_02_exercises.ipynb#Discounting-Customer-Orders) section, and make the `discounted_price()` function work *correctly* for real-life sales data." + "In this exercise, we revisit the \"*Discounting Customer Orders*\" problem from [Chapter 3's Exercises](https://nbviewer.jupyter.org/github/webartifex/intro-to-python/blob/master/03_conditionals_02_exercises.ipynb#Discounting-Customer-Orders) section and make the `discounted_price()` function work *correctly* for real-life sales data." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ - "**Q1.1**: Execute the code cells below! What results would you have *expected*, and why?" 
+ "**Q1**: Execute the code cells below! What results would you have *expected*, and why?" ] }, { @@ -87,7 +87,7 @@ "cell_type": "markdown", "metadata": {}, "source": [ - "**Q1.2**: The built-in [round()](https://docs.python.org/3/library/functions.html#round) function implements the \"**[round half to even](https://en.wikipedia.org/wiki/Rounding#Round_half_to_even)**\" strategy. Describe in one or two sentences what that means!" + "**Q2**: The built-in [round()](https://docs.python.org/3/library/functions.html#round) function implements the \"**[round half to even](https://en.wikipedia.org/wiki/Rounding#Round_half_to_even)**\" strategy. Describe in one or two sentences what that means!" ] }, { @@ -101,7 +101,7 @@ "cell_type": "markdown", "metadata": {}, "source": [ - "**Q1.3**: For the revised `discounted_price()` function, we have to tackle *two* issues: First, we have to replace the built-in `float` type with a data type that allows us to control the precision. Second, the discounted price should be rounded according to a more human-friendly rounding strategy, namely \"**[round half away from zero](https://en.wikipedia.org/wiki/Rounding#Round_half_away_from_zero)**.\"\n", + "**Q3**: For the revised `discounted_price()` function, we have to tackle *two* issues: First, we have to replace the built-in `float` type with a data type that allows us to control the precision. Second, the discounted price should be rounded according to a more human-friendly rounding strategy, namely \"**[round half away from zero](https://en.wikipedia.org/wiki/Rounding#Round_half_away_from_zero)**.\"\n", "\n", "Describe in one or two sentences how \"**[round half away from zero](https://en.wikipedia.org/wiki/Rounding#Round_half_away_from_zero)**\" is more in line with how humans think of rounding!" 
] @@ -117,7 +117,7 @@ "cell_type": "markdown", "metadata": {}, "source": [ - "**Q1.4**: We use the `Decimal` type from the [decimal](https://docs.python.org/3/library/decimal.html) module in the [standard library](https://docs.python.org/3/library/index.html) to tackle *both* issues simultaneously.\n", + "**Q4**: We use the `Decimal` type from the [decimal](https://docs.python.org/3/library/decimal.html) module in the [standard library](https://docs.python.org/3/library/index.html) to tackle *both* issues simultaneously.\n", "\n", "Assign `euro` a numeric object such that both `Decimal(\"1.5\")` and `Decimal(\"2.5\")` are rounded to `Decimal(\"2\")` (i.e., no decimal) with the [quantize()](https://docs.python.org/3/library/decimal.html#decimal.Decimal.quantize) method!" ] @@ -162,7 +162,7 @@ "cell_type": "markdown", "metadata": {}, "source": [ - "**Q1.5**: Obviously, the two preceding code cells still [round half to even](https://en.wikipedia.org/wiki/Rounding#Round_half_to_even).\n", + "**Q5**: Obviously, the two preceding code cells still [round half to even](https://en.wikipedia.org/wiki/Rounding#Round_half_to_even).\n", "\n", "The [decimal](https://docs.python.org/3/library/decimal.html) module defines a `ROUND_HALF_UP` flag that we can pass as the second argument to the [quantize()](https://docs.python.org/3/library/decimal.html#decimal.Decimal.quantize) method. Then, it [rounds half away from zero](https://en.wikipedia.org/wiki/Rounding#Round_half_away_from_zero).\n", "\n", @@ -200,7 +200,7 @@ "cell_type": "markdown", "metadata": {}, "source": [ - "**Q1.6**: Instead of `euro`, define `cents` such that rounding occurs to *two* decimals! `Decimal(\"2.675\")` should now be rounded to `Decimal(\"2.68\")`. Do *not* forget to include the `ROUND_HALF_UP` flag!" + "**Q6**: Instead of `euro`, define `cents` such that rounding occurs to *two* decimals! `Decimal(\"2.675\")` should now be rounded to `Decimal(\"2.68\")`. 
Do *not* forget to include the `ROUND_HALF_UP` flag!" ] }, { @@ -225,7 +225,7 @@ "cell_type": "markdown", "metadata": {}, "source": [ - "**Q1.7**: Rewrite the function `discounted_price()` from [Chapter 3's Exercises](https://nbviewer.jupyter.org/github/webartifex/intro-to-python/blob/master/03_conditionals_02_exercises.ipynb#Discounting-Customer-Orders) section!\n", + "**Q7**: Rewrite the function `discounted_price()` from [Chapter 3's Exercises](https://nbviewer.jupyter.org/github/webartifex/intro-to-python/blob/master/03_conditionals_02_exercises.ipynb#Discounting-Customer-Orders) section!\n", "\n", "It takes the *positional* arguments `unit_price` and `quantity` and implements a discount scheme for a line item in a customer order as follows:\n", "\n", @@ -264,7 +264,7 @@ "cell_type": "markdown", "metadata": {}, "source": [ - "**Q1.8**: Execute the code cells below and verify the final price for the following four test cases:\n", + "**Q8**: Execute the code cells below and verify the final price for the following four test cases:\n", "\n", "- $7$ smartphones @ $99.00$ USD\n", "- $3$ workstations @ $999.00$ USD\n", diff --git a/06_text_00_lecture.ipynb b/06_text_00_lecture.ipynb index fc0eeb2..3c23901 100644 --- a/06_text_00_lecture.ipynb +++ b/06_text_00_lecture.ipynb @@ -19,7 +19,7 @@ } }, "source": [ - "In this chapter, we continue the study of the built-in data types. Building on our knowledge of numbers, the next layer consists of textual data that are modeled primarily with the `str` type in Python. `str` objects are naturally more \"complex\" than numeric objects as any text consists of an arbitrary and possibly large number of individual characters that may be chosen from any alphabet in the history of humankind. Luckily, Python abstracts away most of this complexity." + "In this chapter, we continue the study of the built-in data types. The next layer on top of numbers consists of **textual data** that are modeled primarily with the `str` type in Python. 
`str` objects are more complex than the numeric objects in [Chapter 5](https://nbviewer.jupyter.org/github/webartifex/intro-to-python/blob/master/05_numbers_00_lecture.ipynb) as they *consist* of an *arbitrary* and possibly large number of *individual* characters that may be chosen from *any* alphabet in the history of humankind. Luckily, Python abstracts away most of this complexity from us. However, after looking at the `str` type in great detail, we briefly introduce the `bytes` type at the end of this chapter, and learn how characters are modeled in memory." ] }, { @@ -41,7 +41,7 @@ } }, "source": [ - "The `str` type is the default way of modeling **textual data**. To create a `str` object, we use a **literal notation** and type the text between enclosing **double quotes** `\"`." + "To create a `str` object, we use the *literal* notation and type the text between enclosing **double quotes** `\"`." ] }, { @@ -54,7 +54,7 @@ }, "outputs": [], "source": [ - "text = \"Lorem ipsum dolor sit amet, ...\"" + "text = \"Lorem ipsum dolor sit amet.\"" ] }, { @@ -65,7 +65,7 @@ } }, "source": [ - "Like everything in Python, `text` is an object." + "Like everything in Python, `text` is an object with an *identity*, a *type*, and a *value*." ] }, { @@ -80,7 +80,7 @@ { "data": { "text/plain": [ - "140483431254256" + "140461336295424" ] }, "execution_count": 2, @@ -97,7 +97,7 @@ "execution_count": 3, "metadata": { "slideshow": { - "slide_type": "-" + "slide_type": "fragment" } }, "outputs": [ @@ -124,9 +124,9 @@ } }, "source": [ - "A `str` object evaluates to itself in a literal notation with enclosing **single quotes** `'` by default. In [Chapter 1](https://nbviewer.jupyter.org/github/webartifex/intro-to-python/blob/master/01_elements_00_lecture.ipynb#Value-/-\"Meaning\"), we already specified the double quotes `\"` convention we stick to in this book. 
Yet, single quotes `'` and double quotes `\"` are *perfect* substitutes for all `str` objects that do *not* contain any of the two symbols in it. We could use the reverse convention, as well.\n", + "As seen before, a `str` object evaluates to itself in a literal notation with enclosing **single quotes** `'`.\n", "\n", - "As [this discussion](https://stackoverflow.com/questions/56011/single-quotes-vs-double-quotes-in-python) shows, many programmers have *strong* opinions about that and make up *new* conventions for their projects. Consequently, the discussion was \"closed as not constructive\" by the moderators." + "In [Chapter 1](https://nbviewer.jupyter.org/github/webartifex/intro-to-python/blob/master/01_elements_00_lecture.ipynb#Value-/-\"Meaning\"), we specify the double quotes `\"` convention this book follows. Yet, single quotes `'` and double quotes `\"` are *perfect* substitutes. We could use the reverse convention, as well. As [this discussion](https://stackoverflow.com/questions/56011/single-quotes-vs-double-quotes-in-python) shows, many programmers have *strong* opinions about such conventions. Consequently, the discussion was \"closed as not constructive\" by the moderators." ] }, { @@ -134,14 +134,14 @@ "execution_count": 4, "metadata": { "slideshow": { - "slide_type": "-" + "slide_type": "fragment" } }, "outputs": [ { "data": { "text/plain": [ - "'Lorem ipsum dolor sit amet, ...'" + "'Lorem ipsum dolor sit amet.'" ] }, "execution_count": 4, @@ -161,13 +161,11 @@ } }, "source": [ - "As the single quote `'` is often used in the English language as a shortener, we could make an argument in favor of using the double quotes `\"`: There are possibly fewer situations like in the two code cells below, in which we must revert to using a `\\` to **escape** a single quote `'` in a text (cf., the [Special Characters](https://nbviewer.jupyter.org/github/webartifex/intro-to-python/blob/master/06_text_00_lecture.ipynb#Special-Characters) section further below). 
However, double quotes `\"` are often used as well. So, this argument is somewhat not convincing.\n", + "As the single quote `'` is often used in the English language as a shortener, we could make an argument in favor of using the double quotes `\"`: There are possibly fewer situations like the two code cells below, where we must **escape** the kind of quote used as the `str` object's delimiter with a backslash `\"\\\"` inside the text (cf., also the \"*Unicode & (Special) Characters*\" section further below). However, double quotes `\"` are often used as well, for example, to indicate a quote like the one by [Albert Einstein](https://de.wikipedia.org/wiki/Albert_Einstein) below. So, such arguments are not convincing.\n", "\n", - "Many proponents of the single quote `'` usage claim that double quotes `\"` make more **visual noise** on the screen. This argument is also not convincing. On the contrary, one could claim that *two* single quotes `''` look so similar to *one* double quote `\"` that it might not be apparent right away what we are looking at. By sticking to double quotes `\"`, we avoid such danger of confusion.\n", + "Many proponents of the single quote `'` usage claim that double quotes `\"` cause more **visual noise** on the screen. However, this argument is also not convincing as, for example, one could claim that *two* single quotes `''` look so similar to *one* double quote `\"` that a reader may confuse an *empty* `str` object with a missing closing quote `\"`. With the double quotes `\"` convention we at least avoid such confusion (i.e., empty `str` objects are written as `\"\"`).\n", "\n", - "This discussion is an exellent example of a [flame war](https://en.wikipedia.org/wiki/Flaming_%28Internet%29#Flame_war) in the programming world that leads to *no* result.\n", - "\n", - "An *important* fact to know is that enclosing quotes of either kind are *not* part of the `str` object's *value*! 
They are merely *syntax* to make the text in a code cell a *literal* that Python converts into a `str` object upon reading." + "This discussion is an excellent example of a [flame war](https://en.wikipedia.org/wiki/Flaming_%28Internet%29#Flame_war) in the programming world: Everyone has an opinion and the discussion leads to *no* result." ] }, { @@ -182,7 +180,7 @@ { "data": { "text/plain": [ - "'It\\'s cool that \"strings\" are versatile'" + "'Einstein said, \"If you can\\'t explain it, you don\\'t understand it.\"'" ] }, "execution_count": 5, @@ -191,7 +189,7 @@ } ], "source": [ - "\"It's cool that \\\"strings\\\" are versatile\"" + "\"Einstein said, \\\"If you can't explain it, you don't understand it.\\\"\"" ] }, { @@ -199,14 +197,14 @@ "execution_count": 6, "metadata": { "slideshow": { - "slide_type": "-" + "slide_type": "fragment" } }, "outputs": [ { "data": { "text/plain": [ - "'It\\'s cool that \"strings\" are versatile'" + "'Einstein said, \"If you can\\'t explain it, you don\\'t understand it.\"'" ] }, "execution_count": 6, @@ -215,7 +213,7 @@ } ], "source": [ - "'It\\'s cool that \"strings\" are versatile'" + "'Einstein said, \"If you can\\'t explain it, you don\\'t understand it.\"'" ] }, { @@ -226,12 +224,67 @@ } }, "source": [ - "We can always use the [str()](https://docs.python.org/3/library/stdtypes.html#str) built-in to cast non-`str` objects as a `str`." + "An *important* fact to know is that enclosing quotes of either kind are *not* part of the `str` object's *value*! They are merely *syntax* indicating the literal notation.\n", + "\n", + "So, printing out the sentence with the built-in [print()](https://docs.python.org/3/library/functions.html#print) function does the same in both cases." 
] }, { "cell_type": "code", "execution_count": 7, + "metadata": { + "slideshow": { + "slide_type": "fragment" + } + }, + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "Einstein said, \"If you can't explain it, you don't understand it.\"\n" + ] + } + ], + "source": [ + "print(\"Einstein said, \\\"If you can't explain it, you don't understand it.\\\"\")" + ] + }, + { + "cell_type": "code", + "execution_count": 8, + "metadata": { + "slideshow": { + "slide_type": "skip" + } + }, + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "Einstein said, \"If you can't explain it, you don't understand it.\"\n" + ] + } + ], + "source": [ + "print('Einstein said, \"If you can\\'t explain it, you don\\'t understand it.\"')" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "slideshow": { + "slide_type": "skip" + } + }, + "source": [ + "As an alternative to the literal notation, we may use the built-in [str()](https://docs.python.org/3/library/stdtypes.html#str) constructor to cast non-`str` objects as `str` ones. As Chapter 10 reveals, basically any object in Python has a **text representation**. Because of that we may also pass `list` objects, the boolean `True` and `False`, or `None` to [str()](https://docs.python.org/3/library/stdtypes.html#str)." 
+ ] + }, + { + "cell_type": "code", + "execution_count": 9, "metadata": { "slideshow": { "slide_type": "slide" @@ -241,16 +294,147 @@ { "data": { "text/plain": [ - "'123'" + "'42'" ] }, - "execution_count": 7, + "execution_count": 9, "metadata": {}, "output_type": "execute_result" } ], "source": [ - "str(123)" + "str(42)" + ] + }, + { + "cell_type": "code", + "execution_count": 10, + "metadata": { + "slideshow": { + "slide_type": "fragment" + } + }, + "outputs": [ + { + "data": { + "text/plain": [ + "'42.87'" + ] + }, + "execution_count": 10, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "str(42.87)" + ] + }, + { + "cell_type": "code", + "execution_count": 11, + "metadata": { + "slideshow": { + "slide_type": "fragment" + } + }, + "outputs": [ + { + "data": { + "text/plain": [ + "'[1, 2, 3]'" + ] + }, + "execution_count": 11, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "str([1, 2, 3])" + ] + }, + { + "cell_type": "code", + "execution_count": 12, + "metadata": { + "slideshow": { + "slide_type": "fragment" + } + }, + "outputs": [ + { + "data": { + "text/plain": [ + "'True'" + ] + }, + "execution_count": 12, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "str(True)" + ] + }, + { + "cell_type": "code", + "execution_count": 13, + "metadata": { + "slideshow": { + "slide_type": "skip" + } + }, + "outputs": [ + { + "data": { + "text/plain": [ + "'False'" + ] + }, + "execution_count": 13, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "str(False)" + ] + }, + { + "cell_type": "code", + "execution_count": 14, + "metadata": { + "slideshow": { + "slide_type": "skip" + } + }, + "outputs": [ + { + "data": { + "text/plain": [ + "'None'" + ] + }, + "execution_count": 14, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "str(None)" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "slideshow": { + "slide_type": "slide" + } + 
}, + "source": [ + "#### User Input" ] }, { @@ -261,12 +445,103 @@ } }, "source": [ - "Another common situation where we obtain `str` objects is when reading the contents of a file with the [open()](https://docs.python.org/3/library/functions.html#open) built-in. In its simplest form, to open a [text file](https://en.wikipedia.org/wiki/Text_file) file in read-only mode, we pass in its path (i.e., \"filename\") as a `str` object." + "As shown in the \"*Guessing a Coin Toss*\" example in [Chapter 4](https://nbviewer.jupyter.org/github/webartifex/intro-to-python/blob/master/04_iteration_00_lecture.ipynb#Example:-Guessing-a-Coin-Toss), the built-in [input()](https://docs.python.org/3/library/functions.html#input) function displays a prompt to the user and returns whatever is entered as a `str` object. [input()](https://docs.python.org/3/library/functions.html#input) is in particular valuable when writing command-line tools." ] }, { "cell_type": "code", - "execution_count": 8, + "execution_count": 15, + "metadata": { + "slideshow": { + "slide_type": "slide" + } + }, + "outputs": [ + { + "name": "stdin", + "output_type": "stream", + "text": [ + "Whatever you enter is put in a new string: I will be a string\n" + ] + } + ], + "source": [ + "user_input = input(\"Whatever you enter is put in a new string: \")" + ] + }, + { + "cell_type": "code", + "execution_count": 16, + "metadata": { + "slideshow": { + "slide_type": "fragment" + } + }, + "outputs": [ + { + "data": { + "text/plain": [ + "str" + ] + }, + "execution_count": 16, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "type(user_input)" + ] + }, + { + "cell_type": "code", + "execution_count": 17, + "metadata": { + "slideshow": { + "slide_type": "fragment" + } + }, + "outputs": [ + { + "data": { + "text/plain": [ + "'I will be a string'" + ] + }, + "execution_count": 17, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "user_input" + ] + }, + { + "cell_type": 
"markdown", + "metadata": { + "slideshow": { + "slide_type": "slide" + } + }, + "source": [ + "#### Reading Files" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "slideshow": { + "slide_type": "skip" + } + }, + "source": [ + "A more common situation where we obtain `str` objects is when reading the contents of a file with the [open()](https://docs.python.org/3/library/functions.html#open) built-in. In its simplest usage form, to open a [text file](https://en.wikipedia.org/wiki/Text_file), we pass in its path (i.e., \"filename\") as a `str` object." + ] + }, + { + "cell_type": "code", + "execution_count": 18, "metadata": { "slideshow": { "slide_type": "slide" @@ -285,12 +560,36 @@ } }, "source": [ - "[open()](https://docs.python.org/3/library/functions.html#open) returns a **[proxy](https://en.wikipedia.org/wiki/Proxy_pattern)** object of type `TextIOWrapper` that allows us to interact with the file on disk." + "[open()](https://docs.python.org/3/library/functions.html#open) returns a **[proxy](https://en.wikipedia.org/wiki/Proxy_pattern)** object of type `TextIOWrapper` that allows us to interact with the file on disk. `mode='r'` shows that we opened the file in read-only mode and `encoding='UTF-8'` is explained in detail in the \"*The `bytes` Type*\" section at the end of this chapter." 
] }, { "cell_type": "code", - "execution_count": 9, + "execution_count": 19, + "metadata": { + "slideshow": { + "slide_type": "fragment" + } + }, + "outputs": [ + { + "data": { + "text/plain": [ + "_io.TextIOWrapper" + ] + }, + "execution_count": 19, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "type(file)" + ] + }, + { + "cell_type": "code", + "execution_count": 20, "metadata": { "slideshow": { "slide_type": "fragment" @@ -303,7 +602,7 @@ "<_io.TextIOWrapper name='lorem_ipsum.txt' mode='r' encoding='UTF-8'>" ] }, - "execution_count": 9, + "execution_count": 20, "metadata": {}, "output_type": "execute_result" } @@ -312,30 +611,6 @@ "file" ] }, - { - "cell_type": "code", - "execution_count": 10, - "metadata": { - "slideshow": { - "slide_type": "-" - } - }, - "outputs": [ - { - "data": { - "text/plain": [ - "_io.TextIOWrapper" - ] - }, - "execution_count": 10, - "metadata": {}, - "output_type": "execute_result" - } - ], - "source": [ - "type(file)" - ] - }, { "cell_type": "markdown", "metadata": { @@ -344,12 +619,121 @@ } }, "source": [ - "While `file` provides, for example, the [read()](https://docs.python.org/3/library/io.html#io.TextIOBase.read), [readline()](https://docs.python.org/3/library/io.html#io.TextIOBase.readline), and [readlines()](https://docs.python.org/3/library/io.html#io.IOBase.readlines) methods to access its contents, it is also *iterable*, and we may loop over the individual lines with a `for` statement." + "`TextIOWrapper` objects come with plenty of type-specific methods and attributes." 
] }, { "cell_type": "code", - "execution_count": 11, + "execution_count": 21, + "metadata": { + "slideshow": { + "slide_type": "slide" + } + }, + "outputs": [ + { + "data": { + "text/plain": [ + "True" + ] + }, + "execution_count": 21, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "file.readable()" + ] + }, + { + "cell_type": "code", + "execution_count": 22, + "metadata": { + "slideshow": { + "slide_type": "fragment" + } + }, + "outputs": [ + { + "data": { + "text/plain": [ + "False" + ] + }, + "execution_count": 22, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "file.writable()" + ] + }, + { + "cell_type": "code", + "execution_count": 23, + "metadata": { + "slideshow": { + "slide_type": "fragment" + } + }, + "outputs": [ + { + "data": { + "text/plain": [ + "'lorem_ipsum.txt'" + ] + }, + "execution_count": 23, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "file.name" + ] + }, + { + "cell_type": "code", + "execution_count": 24, + "metadata": { + "slideshow": { + "slide_type": "fragment" + } + }, + "outputs": [ + { + "data": { + "text/plain": [ + "'UTF-8'" + ] + }, + "execution_count": 24, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "file.encoding" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "slideshow": { + "slide_type": "skip" + } + }, + "source": [ + "So far, we have not yet read anything from the file (i.e., from disk)! That is intentional as, for example, the file could contain more data than could fit into our computer's memory. Therefore, we have to explicitly instruct the `file` object to read some of or all the data in the file.\n", + "\n", + "One way to do that, is to simply loop over the `file` object with the `for` statement as shown next: In each iteration, `line` is assigned the next line in the file. Because we may loop over `TextIOWrapper` objects, they are *iterables*." 
+ ] + }, + { + "cell_type": "code", + "execution_count": 25, "metadata": { "slideshow": { "slide_type": "slide" @@ -388,12 +772,12 @@ } }, "source": [ - "Once we looped over `file` the first time, it is **exhausted**: That means we do not see any output if we loop over it another time." + "Once we looped over the `file` object, it is **exhausted**: We can *not* loop over it a second time. So, the built-in [print()](https://docs.python.org/3/library/functions.html#print) function is *never* called in the code cell below!" ] }, { "cell_type": "code", - "execution_count": 12, + "execution_count": 26, "metadata": { "slideshow": { "slide_type": "skip" @@ -413,12 +797,12 @@ } }, "source": [ - "After the `for`-loop, `line` is still set to the last line in the file, and we verify that it is indeed a `str` object." + "After the `for`-loop, the `line` variable is still set and references the *last* line in the file. We verify that it is indeed a `str` object." ] }, { "cell_type": "code", - "execution_count": 13, + "execution_count": 27, "metadata": { "slideshow": { "slide_type": "slide" @@ -431,7 +815,7 @@ "'the 1960s with the release of Letraset sheets.\\n'" ] }, - "execution_count": 13, + "execution_count": 27, "metadata": {}, "output_type": "execute_result" } @@ -442,10 +826,10 @@ }, { "cell_type": "code", - "execution_count": 14, + "execution_count": 28, "metadata": { "slideshow": { - "slide_type": "-" + "slide_type": "fragment" } }, "outputs": [ @@ -455,7 +839,7 @@ "str" ] }, - "execution_count": 14, + "execution_count": 28, "metadata": {}, "output_type": "execute_result" } @@ -472,14 +856,14 @@ } }, "source": [ - "An important fact is that `file` is still associated with an *open* **[file descriptor](https://en.wikipedia.org/wiki/File_descriptor)**. 
Without going into any technical details, we note that an operating system can only handle a limited number of \"open files\" at the same time, and, therefore, we should always *close* the file once we are done processing it.\n", + "An *important* observation is that the `file` object is still associated with an *open* **[file descriptor](https://en.wikipedia.org/wiki/File_descriptor)**. Without going into any technical details, we note that an operating system can only handle a *limited* number of \"open files\" at the same time, and, therefore, we should always *close* the file once we are done processing it.\n", "\n", - "`file` has a `closed` attribute on it that shows us if a file descriptor is open or closed, and with the [close()](https://docs.python.org/3/library/io.html#io.IOBase.close) method, we can \"manually\" close it." + "`TextIOWrapper` objects have a `closed` attribute on them that indicates if the associated file descriptor is still open or has been closed. We can \"manually\" close any `TextIOWrapper` object with the [close()](https://docs.python.org/3/library/io.html#io.IOBase.close) method." 
] }, { "cell_type": "code", - "execution_count": 15, + "execution_count": 29, "metadata": { "slideshow": { "slide_type": "slide" @@ -492,7 +876,7 @@ "False" ] }, - "execution_count": 15, + "execution_count": 29, "metadata": {}, "output_type": "execute_result" } @@ -503,7 +887,7 @@ }, { "cell_type": "code", - "execution_count": 16, + "execution_count": 30, "metadata": { "slideshow": { "slide_type": "fragment" @@ -516,10 +900,10 @@ }, { "cell_type": "code", - "execution_count": 17, + "execution_count": 31, "metadata": { "slideshow": { - "slide_type": "-" + "slide_type": "fragment" } }, "outputs": [ @@ -529,7 +913,7 @@ "True" ] }, - "execution_count": 17, + "execution_count": 31, "metadata": {}, "output_type": "execute_result" } @@ -546,12 +930,12 @@ } }, "source": [ - "The more Pythonic way is to open a file with the `with` statement (cf., [reference](https://docs.python.org/3/reference/compound_stmts.html#the-with-statement)): The indented code block is said to be executed in the **context** of the header line that acts as a **[context manager](https://docs.python.org/3/reference/datamodel.html?highlight=context%20manager#with-statement-context-managers)**. Such objects may have many different purposes. Here, the context manager created with `with open(...) as file:` mainly ensures that the file descriptor gets automatically closed after the last line in the code block is executed." + "The more Pythonic way is to use [open()](https://docs.python.org/3/library/functions.html#open) within the compound `with` statement (cf., [reference](https://docs.python.org/3/reference/compound_stmts.html#the-with-statement)): In the example below, the indented code block is said to be executed within the **context** of the `file` object that now plays the role of a **[context manager](https://docs.python.org/3/reference/datamodel.html#with-statement-context-managers)**. Many different kinds of context managers exist in Python with different applications and purposes. 
Context managers returned from [open()](https://docs.python.org/3/library/functions.html#open) mainly ensure that file descriptors get automatically closed after the last line in the code block is executed." ] }, { "cell_type": "code", - "execution_count": 18, + "execution_count": 32, "metadata": { "slideshow": { "slide_type": "slide" @@ -585,7 +969,7 @@ }, { "cell_type": "code", - "execution_count": 19, + "execution_count": 33, "metadata": { "slideshow": { "slide_type": "fragment" @@ -598,7 +982,7 @@ "True" ] }, - "execution_count": 19, + "execution_count": 33, "metadata": {}, "output_type": "execute_result" } @@ -615,12 +999,12 @@ } }, "source": [ - "To use constructs familiar from [Chapter 3](https://nbviewer.jupyter.org/github/webartifex/intro-to-python/blob/master/03_conditionals_00_lecture.ipynb#The-try-Statement) to explain what `with open(...) as file:` does, below is a formulation with a `try` statement *equivalent* to the `with` statement above. The `finally`-branch is *always* executed, even if an exception is raised in the `for`-loop. So, `file` is sure to be closed too, with a somewhat less expressive formulation." + "Using syntax familiar from [Chapter 3](https://nbviewer.jupyter.org/github/webartifex/intro-to-python/blob/master/03_conditionals_00_lecture.ipynb#The-try-Statement) to explain what the `with open(...) as file:` does above, we provide an alternative formulation with a `try` statement below: The `finally`-branch is *always* executed, even if an exception is raised inside the `for`-loop. Therefore, `file` is sure to be closed too. However, this formulation is somewhat less expressive." 
] }, { "cell_type": "code", - "execution_count": 20, + "execution_count": 34, "metadata": { "slideshow": { "slide_type": "skip" @@ -657,7 +1041,7 @@ }, { "cell_type": "code", - "execution_count": 21, + "execution_count": 35, "metadata": { "slideshow": { "slide_type": "skip" @@ -670,7 +1054,7 @@ "True" ] }, - "execution_count": 21, + "execution_count": 35, "metadata": {}, "output_type": "execute_result" } @@ -687,17 +1071,258 @@ } }, "source": [ - "A subtlety to notice is that there is an empty line printed between each `line`. That is because each `line` ends with a `\"\\n\"` that results in a line break and that is explained further below. To print the text without empty lines in between, we pass a `end=\"\"` argument to the [print()](https://docs.python.org/3/library/functions.html#print) function." + "As an alternative to reading the contents of a file by looping over a `TextIOWrapper` object, we may also call one of the methods they come with.\n", + "\n", + "For example, the [read()](https://docs.python.org/3/library/io.html#io.TextIOBase.read) method takes a single `size` argument of type `int` and returns a `str` object with the specified number of characters." ] }, { "cell_type": "code", - "execution_count": 22, + "execution_count": 36, "metadata": { "slideshow": { "slide_type": "slide" } }, + "outputs": [], + "source": [ + "file = open(\"lorem_ipsum.txt\")" + ] + }, + { + "cell_type": "code", + "execution_count": 37, + "metadata": { + "slideshow": { + "slide_type": "fragment" + } + }, + "outputs": [ + { + "data": { + "text/plain": [ + "'Lorem Ipsum'" + ] + }, + "execution_count": 37, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "file.read(11)" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "slideshow": { + "slide_type": "skip" + } + }, + "source": [ + "When we call [read()](https://docs.python.org/3/library/io.html#io.TextIOBase.read) again, the returned `str` object begins where the previous one left off. 
This is because `TextIOWrapper` objects like `file` simply store a position at which the associated file on disk is being read. In other words, `file` is like a **cursor** pointing into a file." + ] + }, + { + "cell_type": "code", + "execution_count": 38, + "metadata": { + "slideshow": { + "slide_type": "skip" + } + }, + "outputs": [ + { + "data": { + "text/plain": [ + "' is simply '" + ] + }, + "execution_count": 38, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "file.read(11)" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "slideshow": { + "slide_type": "skip" + } + }, + "source": [ + "On the contrary, the [readline()](https://docs.python.org/3/library/io.html#io.TextIOBase.readline) method keeps reading until it hits a **newline character**. These are shown in `str` objects as `\"\\n\"`." + ] + }, + { + "cell_type": "code", + "execution_count": 39, + "metadata": { + "slideshow": { + "slide_type": "fragment" + } + }, + "outputs": [ + { + "data": { + "text/plain": [ + "'dummy text of the printing and typesetting industry.\\n'" + ] + }, + "execution_count": 39, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "file.readline()" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "slideshow": { + "slide_type": "skip" + } + }, + "source": [ + "When we call [readline()](https://docs.python.org/3/library/io.html#io.TextIOBase.readline) again, we obtain the next line." 
+ ] + }, + { + "cell_type": "code", + "execution_count": 40, + "metadata": { + "slideshow": { + "slide_type": "skip" + } + }, + "outputs": [ + { + "data": { + "text/plain": [ + "\"Lorem Ipsum has been the industry's standard dummy text ever since the 1500s\\n\"" + ] + }, + "execution_count": 40, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "file.readline()" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "slideshow": { + "slide_type": "skip" + } + }, + "source": [ + "Lastly, the [readlines()](https://docs.python.org/3/library/io.html#io.IOBase.readlines) method returns a `list` object that holds *all* lines in the `file` from the current position to the end of the file. The latter position is often abbreviated as **EOF** in the documentation. Let's always remember that [readlines()](https://docs.python.org/3/library/io.html#io.IOBase.readlines) has the potential to crash a computer with a `MemoryError`." + ] + }, + { + "cell_type": "code", + "execution_count": 41, + "metadata": { + "slideshow": { + "slide_type": "fragment" + } + }, + "outputs": [ + { + "data": { + "text/plain": [ + "['when an unknown printer took a galley of type and scrambled it to make a type\\n',\n", + " 'specimen book. It has survived not only five centuries but also the leap into\\n',\n", + " 'electronic typesetting, remaining essentially unchanged. It was popularised in\\n',\n", + " 'the 1960s with the release of Letraset sheets.\\n']" + ] + }, + "execution_count": 41, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "file.readlines()" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "slideshow": { + "slide_type": "skip" + } + }, + "source": [ + "Calling [readlines()](https://docs.python.org/3/library/io.html#io.IOBase.readlines) a second time, is as pointless as looping over `file` a second time." 
+ ] + }, + { + "cell_type": "code", + "execution_count": 42, + "metadata": { + "slideshow": { + "slide_type": "skip" + } + }, + "outputs": [ + { + "data": { + "text/plain": [ + "[]" + ] + }, + "execution_count": 42, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "file.readlines()" + ] + }, + { + "cell_type": "code", + "execution_count": 43, + "metadata": { + "slideshow": { + "slide_type": "fragment" + } + }, + "outputs": [], + "source": [ + "file.close()" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "slideshow": { + "slide_type": "skip" + } + }, + "source": [ + "Because every `str` object created by reading the contents of a file in any of the ways shown in this section ends with a `\"\\n\"`, we see empty lines printed between each `line` in the `for`-loops above. To print the entire text without empty lines in between, we pass a `end=\"\"` argument to the [print()](https://docs.python.org/3/library/functions.html#print) function." + ] + }, + { + "cell_type": "code", + "execution_count": 44, + "metadata": { + "slideshow": { + "slide_type": "skip" + } + }, "outputs": [ { "name": "stdout", @@ -726,7 +1351,7 @@ } }, "source": [ - "## A \"String\" of Characters" + "### A String of Characters" ] }, { @@ -739,21 +1364,23 @@ "source": [ "A **sequence** is yet another *abstract* concept (cf., the \"*Containers vs. Iterables*\" section in [Chapter 4](https://nbviewer.jupyter.org/github/webartifex/intro-to-python/blob/master/04_iteration_00_lecture.ipynb#Containers-vs.-Iterables)).\n", "\n", - "It unifies *four* [orthogonal](https://en.wikipedia.org/wiki/Orthogonality) (i.e., \"independent\") behaviors into one idea: Any data type, such as `str`, is considered a sequence if it simultaneously\n", + "It unifies *four* [orthogonal](https://en.wikipedia.org/wiki/Orthogonality) (i.e., \"independent\") concepts into one bigger idea: Any data type, such as `str`, is considered a sequence if it\n", "\n", - "1. 
**contains** other \"things,\"\n", - "2. is **iterable**, and \n", - "3. comes with a *predictable* **order** of its\n", - "4. **finite** number of \"things.\"\n", + "1. **contains**\n", + "2. a **finite** number of other \"things\" that\n", + "3. can be **iterated** over\n", + "4. in a *predictable* **order**.\n", "\n", - "[Chapter 7](https://nbviewer.jupyter.org/github/webartifex/intro-to-python/blob/master/07_sequences_00_lecture.ipynb#Collections-vs.-Sequences) formalizes sequences in great detail. Here, we keep our focus on the `str` type that historically received its name as it models a \"**[string of characters](https://en.wikipedia.org/wiki/String_%28computer_science%29)**,\" and a \"string\" is more formally called a sequence in the computer science literature.\n", + "[Chapter 7](https://nbviewer.jupyter.org/github/webartifex/intro-to-python/blob/master/07_sequences_00_lecture.ipynb#Collections-vs.-Sequences) formalizes these concepts in great detail. Here, we keep our focus on the `str` type that historically received its name as it models a **[string of characters](https://en.wikipedia.org/wiki/String_%28computer_science%29)**, and *string* is simply another term for *sequence* in the computer science literature.\n", "\n", - "Behaving like a sequence, `str` objects may be treated like `list` objects in many cases. For example, the built-in [len()](https://docs.python.org/3/library/functions.html#len) function tells us how many elements (i.e., characters) make up `text`. 
[len()](https://docs.python.org/3/library/functions.html#len) would not work with an *infinite* object: As anything modeled in a program must fit into a computer's finite memory at runtime, there cannot exist objects containing a truly infinite number of elements; however, [Chapter 7](https://nbviewer.jupyter.org/github/webartifex/intro-to-python/blob/master/07_sequences_00_lecture.ipynb#Iterators-vs.-Iterables) introduces iterable data types that can be used to model an *infinite* series of elements and that, consequently, have no concept of \"length.\"" + "Another example of a sequence is the `list` type. Because of that, `str` objects may be treated like `list` objects in many situations.\n", + "\n", + "Below, the built-in [len()](https://docs.python.org/3/library/functions.html#len) function tells us how many characters make up `text`. [len()](https://docs.python.org/3/library/functions.html#len) would not work with an \"infinite\" object. As anything modeled in a program must fit into a computer's finite memory, there cannot exist truly infinite objects; however, [Chapter 7](https://nbviewer.jupyter.org/github/webartifex/intro-to-python/blob/master/07_sequences_00_lecture.ipynb#Iterators-vs.-Iterables) introduces specialized iterable data types that can be used to model an *infinite* series of \"things\" and that, consequently, have no concept of \"length.\"" ] }, { "cell_type": "code", - "execution_count": 23, + "execution_count": 45, "metadata": { "slideshow": { "slide_type": "slide" @@ -763,10 +1390,34 @@ { "data": { "text/plain": [ - "31" + "'Lorem ipsum dolor sit amet.'" ] }, - "execution_count": 23, + "execution_count": 45, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "text" + ] + }, + { + "cell_type": "code", + "execution_count": 46, + "metadata": { + "slideshow": { + "slide_type": "fragment" + } + }, + "outputs": [ + { + "data": { + "text/plain": [ + "27" + ] + }, + "execution_count": 46, "metadata": {}, "output_type": 
"execute_result" } @@ -783,12 +1434,45 @@ } }, "source": [ - "Being iterable, we may loop over `text` and do something with the individual characters, for example, print them out with extra space in between them." + "Being iterable, we may loop over `text` and do something with the individual characters, for example, print them out with extra space in between them. If it were not for the appropriately chosen name of the `text` variable, we could not tell what *concrete* type of object the `for` statement is looping over." ] }, { "cell_type": "code", - "execution_count": 24, + "execution_count": 47, + "metadata": { + "slideshow": { + "slide_type": "slide" + } + }, + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "L o r e m i p s u m d o l o r s i t a m e t . " + ] + } + ], + "source": [ + "for character in text:\n", + " print(character, end=\" \")" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "slideshow": { + "slide_type": "skip" + } + }, + "source": [ + "With the [reversed()](https://docs.python.org/3/library/functions.html#reversed) built-in, we may loop over `text` in reversed order. Reversing `text` only works as it has a forward order to begin with." + ] + }, + { + "cell_type": "code", + "execution_count": 48, "metadata": { "slideshow": { "slide_type": "fragment" @@ -799,13 +1483,13 @@ "name": "stdout", "output_type": "stream", "text": [ - "L o r e m i p s u m d o l o r s i t a m e t , . . . " + ". t e m a t i s r o l o d m u s p i m e r o L " ] } ], "source": [ - "for letter in text:\n", - " print(letter, end=\" \")" + "for character in reversed(text):\n", + " print(character, end=\" \")" ] }, { @@ -818,12 +1502,36 @@ "source": [ "Being a container, we may check if a given `str` object is contained in `text` with the `in` operator.\n", "\n", - "The `in` operator has *two* usages: First, it checks if a *single* character is contained in a `str` object. 
Second, it may also check if a shorter `str` object, then called a **substring**, is contained in a longer one." + "The `in` operator has *two* distinct usages: First, it checks if a *single* character is contained in a `str` object. Second, it may also check if a shorter `str` object, then called a **substring**, is contained in a longer one." ] }, { "cell_type": "code", - "execution_count": 25, + "execution_count": 49, + "metadata": { + "slideshow": { + "slide_type": "slide" + } + }, + "outputs": [ + { + "data": { + "text/plain": [ + "True" + ] + }, + "execution_count": 49, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "\"L\" in text" + ] + }, + { + "cell_type": "code", + "execution_count": 50, "metadata": { "slideshow": { "slide_type": "fragment" @@ -836,31 +1544,7 @@ "True" ] }, - "execution_count": 25, - "metadata": {}, - "output_type": "execute_result" - } - ], - "source": [ - "\"L\" in text" - ] - }, - { - "cell_type": "code", - "execution_count": 26, - "metadata": { - "slideshow": { - "slide_type": "-" - } - }, - "outputs": [ - { - "data": { - "text/plain": [ - "True" - ] - }, - "execution_count": 26, + "execution_count": 50, "metadata": {}, "output_type": "execute_result" } @@ -871,10 +1555,10 @@ }, { "cell_type": "code", - "execution_count": 27, + "execution_count": 51, "metadata": { "slideshow": { - "slide_type": "-" + "slide_type": "fragment" } }, "outputs": [ @@ -884,7 +1568,7 @@ "False" ] }, - "execution_count": 27, + "execution_count": 51, "metadata": {}, "output_type": "execute_result" } @@ -901,7 +1585,7 @@ } }, "source": [ - "## Indexing" + "### Indexing" ] }, { @@ -912,12 +1596,12 @@ } }, "source": [ - "As `str` objects have the additional property of being *ordered*, we may **index** into them to obtain individual characters with the **indexing operator** `[]`. 
This is analogous to how we obtained individual elements of a `list` object in [Chapter 1](https://nbviewer.jupyter.org/github/webartifex/intro-to-python/blob/master/01_elements_00_lecture.ipynb#Who-am-I?-And-how-many?)." + "As `str` objects are *ordered* and *finite*, we may **index** into them to obtain individual characters with the **indexing operator** `[]`. This is analogous to how we obtained individual elements of a `list` object in [Chapter 1](https://nbviewer.jupyter.org/github/webartifex/intro-to-python/blob/master/01_elements_00_lecture.ipynb#Who-am-I?-And-how-many?)." ] }, { "cell_type": "code", - "execution_count": 28, + "execution_count": 52, "metadata": { "slideshow": { "slide_type": "slide" @@ -930,7 +1614,7 @@ "'L'" ] }, - "execution_count": 28, + "execution_count": 52, "metadata": {}, "output_type": "execute_result" } @@ -941,10 +1625,10 @@ }, { "cell_type": "code", - "execution_count": 29, + "execution_count": 53, "metadata": { "slideshow": { - "slide_type": "-" + "slide_type": "fragment" } }, "outputs": [ @@ -954,7 +1638,7 @@ "'o'" ] }, - "execution_count": 29, + "execution_count": 53, "metadata": {}, "output_type": "execute_result" } @@ -971,12 +1655,12 @@ } }, "source": [ - "The index must be of type `int` or we get a `TypeError`." + "The index must be of type `int`; otherwise, we get a `TypeError`." 
] }, { "cell_type": "code", - "execution_count": 30, + "execution_count": 54, "metadata": { "slideshow": { "slide_type": "skip" @@ -990,7 +1674,7 @@ "traceback": [ "\u001b[0;31m---------------------------------------------------------------------------\u001b[0m", "\u001b[0;31mTypeError\u001b[0m Traceback (most recent call last)", - "\u001b[0;32m\u001b[0m in \u001b[0;36m\u001b[0;34m\u001b[0m\n\u001b[0;32m----> 1\u001b[0;31m \u001b[0mtext\u001b[0m\u001b[0;34m[\u001b[0m\u001b[0;36m1.0\u001b[0m\u001b[0;34m]\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0m", + "\u001b[0;32m\u001b[0m in \u001b[0;36m\u001b[0;34m\u001b[0m\n\u001b[0;32m----> 1\u001b[0;31m \u001b[0mtext\u001b[0m\u001b[0;34m[\u001b[0m\u001b[0;36m1.0\u001b[0m\u001b[0;34m]\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0m", "\u001b[0;31mTypeError\u001b[0m: string indices must be integers" ] } @@ -1007,12 +1691,12 @@ } }, "source": [ - "The last index is one less than the above \"length\" of the `str` object as we start counting at 0." + "The last index is one less than the above \"length\" of the `str` object as we start counting at `0`." ] }, { "cell_type": "code", - "execution_count": 31, + "execution_count": 55, "metadata": { "slideshow": { "slide_type": "fragment" @@ -1025,13 +1709,13 @@ "'.'" ] }, - "execution_count": 31, + "execution_count": 55, "metadata": {}, "output_type": "execute_result" } ], "source": [ - "text[30] # = len(text) - 1" + "text[26] # == text[len(text) - 1]" ] }, { @@ -1042,15 +1726,15 @@ } }, "source": [ - "An `IndexError` is raised whenever the index is too large." + "An `IndexError` is raised whenever the index is out of range." 
] }, { "cell_type": "code", - "execution_count": 32, + "execution_count": 56, "metadata": { "slideshow": { - "slide_type": "-" + "slide_type": "fragment" } }, "outputs": [ @@ -1061,13 +1745,13 @@ "traceback": [ "\u001b[0;31m---------------------------------------------------------------------------\u001b[0m", "\u001b[0;31mIndexError\u001b[0m Traceback (most recent call last)", - "\u001b[0;32m\u001b[0m in \u001b[0;36m\u001b[0;34m\u001b[0m\n\u001b[0;32m----> 1\u001b[0;31m \u001b[0mtext\u001b[0m\u001b[0;34m[\u001b[0m\u001b[0;36m31\u001b[0m\u001b[0;34m]\u001b[0m \u001b[0;31m# = len(text)\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0m", + "\u001b[0;32m\u001b[0m in \u001b[0;36m\u001b[0;34m\u001b[0m\n\u001b[0;32m----> 1\u001b[0;31m \u001b[0mtext\u001b[0m\u001b[0;34m[\u001b[0m\u001b[0;36m27\u001b[0m\u001b[0;34m]\u001b[0m \u001b[0;31m# == text[len(text)]\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0m", "\u001b[0;31mIndexError\u001b[0m: string index out of range" ] } ], "source": [ - "text[31] # = len(text)" + "text[27] # == text[len(text)]" ] }, { @@ -1078,7 +1762,7 @@ } }, "source": [ - "We may use *negative* indexes to start counting from the end of the `str` object, as shown in the figure below. That only works because sequences are *finite*." + "We may use *negative* indexes to start counting from the end of the `str` object, as shown in the figure below. Note how this only works because sequences are *finite*." 
] }, { @@ -1089,18 +1773,18 @@ } }, "source": [ - "| Slot | 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 10| 11| 12| 13| 14| 15| 16| 17| 18| 19| 20| 21| 22| 23| 24| 25| 26| 27| 28| 29| 30|\n", - "|:---------:|:-:|:-:|:-:|:-:|:-:|:-:|:-:|:-:|:-:|:-:|:-:|:-:|:-:|:-:|:-:|:-:|:-:|:-:|:-:|:-:|:-:|:-:|:-:|:-:|:-:|:-:|:-:|:-:|:-:|:-:|:-:|\n", - "|**Reverse**|-31|-30|-29|-28|-27|-26|-25|-24|-23|-22|-21|-20|-19|-18|-17|-16|-15|-14|-13|-12|-11|-10|-9 |-8 |-7 |-6 |-5 |-4 |-3 |-2 |-1 |\n", - "| **Char** |`L`|`o`|`r`|`e`|`m`|` `|`i`|`p`|`s`|`u`|`m`|` `|`d`|`o`|`l`|`o`|`r`|` `|`s`|`i`|`t`|` `|`a`|`m`|`e`|`t`|`,`|` `|`.`|`.`|`.`|" + "| Index | 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 10| 11| 12| 13| 14| 15| 16| 17| 18| 19| 20| 21| 22| 23| 24| 25| 26|\n", + "|:---------:|:-:|:-:|:-:|:-:|:-:|:-:|:-:|:-:|:-:|:-:|:-:|:-:|:-:|:-:|:-:|:-:|:-:|:-:|:-:|:-:|:-:|:-:|:-:|:-:|:-:|:-:|:-:|\n", + "|**Reverse**|-27|-26|-25|-24|-23|-22|-21|-20|-19|-18|-17|-16|-15|-14|-13|-12|-11|-10|-9 |-8 |-7 |-6 |-5 |-4 |-3 |-2 |-1 |\n", + "| **Character** |`L`|`o`|`r`|`e`|`m`|` `|`i`|`p`|`s`|`u`|`m`|` `|`d`|`o`|`l`|`o`|`r`|` `|`s`|`i`|`t`|` `|`a`|`m`|`e`|`t`|`.`|" ] }, { "cell_type": "code", - "execution_count": 33, + "execution_count": 57, "metadata": { "slideshow": { - "slide_type": "-" + "slide_type": "fragment" } }, "outputs": [ @@ -1110,7 +1794,7 @@ "'.'" ] }, - "execution_count": 33, + "execution_count": 57, "metadata": {}, "output_type": "execute_result" } @@ -1121,7 +1805,7 @@ }, { "cell_type": "code", - "execution_count": 34, + "execution_count": 58, "metadata": { "slideshow": { "slide_type": "fragment" @@ -1134,13 +1818,13 @@ "'L'" ] }, - "execution_count": 34, + "execution_count": 58, "metadata": {}, "output_type": "execute_result" } ], "source": [ - "text[-31] # = -len(text)" + "text[-27] # == text[-len(text)]" ] }, { @@ -1151,15 +1835,15 @@ } }, "source": [ - "One reason why programmers like to start counting at 0 is that a positive index and its *corresponding* negative index always add up to the 
length of the sequence. Here, `6` and `25` add to `31`." + "One reason why programmers like to start counting at `0` is that a positive index and its *corresponding* negative index always add up to the length of the sequence. Here, `6` and `21` add to `27`." ] }, { "cell_type": "code", - "execution_count": 35, + "execution_count": 59, "metadata": { "slideshow": { - "slide_type": "fragment" + "slide_type": "skip" } }, "outputs": [ @@ -1169,7 +1853,7 @@ "'i'" ] }, - "execution_count": 35, + "execution_count": 59, "metadata": {}, "output_type": "execute_result" } @@ -1180,10 +1864,10 @@ }, { "cell_type": "code", - "execution_count": 36, + "execution_count": 60, "metadata": { "slideshow": { - "slide_type": "-" + "slide_type": "skip" } }, "outputs": [ @@ -1193,13 +1877,13 @@ "'i'" ] }, - "execution_count": 36, + "execution_count": 60, "metadata": {}, "output_type": "execute_result" } ], "source": [ - "text[-25]" + "text[-21]" ] }, { @@ -1210,7 +1894,7 @@ } }, "source": [ - "## Slicing" + "### Slicing" ] }, { @@ -1223,14 +1907,14 @@ "source": [ "A **slice** is a substring of a `str` object.\n", "\n", - "The **slicing operator** is a generalization of the indexing operator: We can put one, two, or three integers within the brackets, separated by colons `:`. The three integers are then referred to as the *start*, *end*, and *step* values.\n", + "The **slicing operator** is a generalization of the indexing operator: We put one, two, or three integers within the brackets `[]`, separated by colons `:`. The three integers are then referred to as the *start*, *stop*, and *step* values.\n", "\n", - "Let's start with two integers, *start* and *end*." + "Let's start with two integers, *start* and *stop*. Whereas the character at the *start* position is included in the returned `str` object, the one at the *stop* position is not. If both *start* and *stop* are positive, the difference \"*stop* minus *start*\" tells us how many characters the resulting slice has. 
So, below, `5 - 0 == 5` implies that `\"Lorem\"` consists of `5` characters. So, colloquially speaking, `text[0:5]` means \"taking the first `5 - 0 == 5` characters of `text`.\"" ] }, { "cell_type": "code", - "execution_count": 37, + "execution_count": 61, "metadata": { "slideshow": { "slide_type": "slide" @@ -1243,7 +1927,7 @@ "'Lorem'" ] }, - "execution_count": 37, + "execution_count": 61, "metadata": {}, "output_type": "execute_result" } @@ -1252,22 +1936,9 @@ "text[0:5]" ] }, - { - "cell_type": "markdown", - "metadata": { - "slideshow": { - "slide_type": "skip" - } - }, - "source": [ - "Whereas the *start* is always included in the result, the *end* is not. Counter-intuitive at first, this makes working with individual slices easier as they \"add\" up to the original `str` object again (cf., the \"*String Operations*\" sub-section below regarding the overloaded `+` operator). Because the *end* is *not* included, we end the second slice below with `len(text)` or `31` below.\n", - "\n", - "Not including the *end* has another advantage: The difference \"*end* minus *start*\" tells us how many elements the resulting slice has. Above, for example, `5 - 0` implies that `\"Lorem\"` consists of `5` characters. So, colloquially, `0:5` means \"taking the first five characters.\" That rule only works if both *start* and *end* are *positive*." - ] - }, { "cell_type": "code", - "execution_count": 38, + "execution_count": 62, "metadata": { "slideshow": { "slide_type": "fragment" @@ -1277,16 +1948,16 @@ { "data": { "text/plain": [ - "'Lorem ipsum dolor sit amet, ...'" + "'dolor sit amet.'" ] }, - "execution_count": 38, + "execution_count": 62, "metadata": {}, "output_type": "execute_result" } ], "source": [ - "text[0:5] + text[5:len(text)]" + "text[12:len(text)]" ] }, { @@ -1297,12 +1968,12 @@ } }, "source": [ - "By combining a *positive* start with a *negative* end index, we specify both ends of the slice *relative* to the ends of the entire `str` object. 
So, colloquially, `6:-5` below means \"drop the first six and last five characters.\" The length of the resulting slice can *not* be calculated from the indexes and depends only on the length of the original `str` object." + "If left out, *start* defaults to `0` and *stop* to the length of the `str` object (i.e., the end)." ] }, { "cell_type": "code", - "execution_count": 39, + "execution_count": 63, "metadata": { "slideshow": { "slide_type": "fragment" @@ -1312,7 +1983,145 @@ { "data": { "text/plain": [ - "'ipsum dolor sit amet'" + "'Lorem'" ] }, - "execution_count": 39, + "execution_count": 63, "metadata": {}, "output_type": "execute_result" } ], "source": [ - "text[6:-5]" + "text[:5]" + ] + }, + { + "cell_type": "code", + "execution_count": 64, + "metadata": { + "slideshow": { + "slide_type": "fragment" + } + }, + "outputs": [ + { + "data": { + "text/plain": [ + "'dolor sit amet.'" + ] + }, + "execution_count": 64, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "text[12:]" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "slideshow": { + "slide_type": "skip" + } + }, + "source": [ + "Not including the character at the *stop* position makes working with individual slices easier as they add up to the original `str` object again (cf., the \"*String Operations*\" section below regarding the overloaded `+` operator)." + ] + }, + { + "cell_type": "code", + "execution_count": 65, + "metadata": { + "slideshow": { + "slide_type": "skip" + } + }, + "outputs": [ + { + "data": { + "text/plain": [ + "'Lorem ipsum dolor sit amet.'" + ] + }, + "execution_count": 65, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "text[:5] + text[5:]" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "slideshow": { + "slide_type": "skip" + } + }, + "source": [ + "Slicing and indexing make it easy to obtain shorter versions of the original `str` object. 
A common application would be to **parse** out meaningful substrings from raw text data." + ] + }, + { + "cell_type": "code", + "execution_count": 66, + "metadata": { + "slideshow": { + "slide_type": "skip" + } + }, + "outputs": [ + { + "data": { + "text/plain": [ + "'Lorem ipsum sit amet.'" + ] + }, + "execution_count": 66, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "text[:11] + text[-10:]" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "slideshow": { + "slide_type": "skip" + } + }, + "source": [ + "By combining a positive *start* with a negative *stop* index, we specify both ends of the slice *relative* to the ends of the entire `str` object. So, colloquially speaking, `6:-10` below means \"drop the first six and last ten characters.\" The length of the resulting slice can then *not* be calculated from the indexes alone; it also depends on the length of the original `str` object!" + ] + }, + { + "cell_type": "code", + "execution_count": 67, + "metadata": { + "slideshow": { + "slide_type": "slide" + } + }, + "outputs": [ + { + "data": { + "text/plain": [ + "'ipsum dolor'" + ] + }, + "execution_count": 67, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "text[6:-10]" ] }, { @@ -1337,7 +2137,7 @@ }, { "cell_type": "code", - "execution_count": 40, + "execution_count": 68, "metadata": { "slideshow": { "slide_type": "skip" @@ -1347,16 +2147,16 @@ { "data": { "text/plain": [ - "'Lorem ipsum dolor sit amet, ...'" + "'Lorem ipsum dolor sit amet.'" ] }, - "execution_count": 40, + "execution_count": 68, "metadata": {}, "output_type": "execute_result" } ], "source": [ - "text[0:999]" + "text[-999:999]" ] }, { @@ -1367,12 +2167,12 @@ } }, "source": [ - "If left out, *start* defaults to `0` and *end* to the \"length\" of the `str` object. Here, we take a \"full\" slice that is essentially a copy of the original `str` object." 
+ "By leaving out both *start* and *stop*, we take a \"full\" slice that is essentially a *copy* of the original `str` object." ] }, { "cell_type": "code", - "execution_count": 41, + "execution_count": 69, "metadata": { "slideshow": { "slide_type": "fragment" @@ -1382,10 +2182,10 @@ { "data": { "text/plain": [ - "'Lorem ipsum dolor sit amet, ...'" + "'Lorem ipsum dolor sit amet.'" ] }, - "execution_count": 41, + "execution_count": 69, "metadata": {}, "output_type": "execute_result" } @@ -1402,47 +2202,12 @@ } }, "source": [ - "Slicing (and indexing) makes it easy to obtain shorter versions of the original `str` object." + "A *step* value of `i` can be used to obtain only every `i`th character." ] }, { "cell_type": "code", - "execution_count": 42, - "metadata": { - "slideshow": { - "slide_type": "skip" - } - }, - "outputs": [ - { - "data": { - "text/plain": [ - "'Lorem ipsum dolor sit amet.'" - ] - }, - "execution_count": 42, - "metadata": {}, - "output_type": "execute_result" - } - ], - "source": [ - "text[:26] + text[30]" - ] - }, - { - "cell_type": "markdown", - "metadata": { - "slideshow": { - "slide_type": "skip" - } - }, - "source": [ - "A *step* value of $i$ can be used to obtain only every $i$th letter." - ] - }, - { - "cell_type": "code", - "execution_count": 43, + "execution_count": 70, "metadata": { "slideshow": { "slide_type": "slide" @@ -1452,10 +2217,10 @@ { "data": { "text/plain": [ - "'Lrmismdlrstae,..'" + "'Lrmismdlrstae.'" ] }, - "execution_count": 43, + "execution_count": 70, "metadata": {}, "output_type": "execute_result" } @@ -1477,7 +2242,7 @@ }, { "cell_type": "code", - "execution_count": 44, + "execution_count": 71, "metadata": { "slideshow": { "slide_type": "fragment" @@ -1487,10 +2252,10 @@ { "data": { "text/plain": [ - "'... 
,tema tis rolod muspi meroL'" + "'.tema tis rolod muspi meroL'" ] }, - "execution_count": 44, + "execution_count": 71, "metadata": {}, "output_type": "execute_result" } @@ -1507,7 +2272,7 @@ } }, "source": [ - "## Immutability" + "### Immutability" ] }, { @@ -1518,16 +2283,16 @@ } }, "source": [ - "Whereas elements of a `list` object *may* be *re-assigned*, as shortly hinted at in [Chapter 1](https://nbviewer.jupyter.org/github/webartifex/intro-to-python/blob/master/01_elements_00_lecture.ipynb#Who-am-I?-And-how-many?), this is *not* allowed for `str` objects. Once created, they *cannot* be *changed*. Formally, we say that they are **immutable**. In that regard, `str` objects and all the numeric types in [Chapter 5](https://nbviewer.jupyter.org/github/webartifex/intro-to-python/blob/master/05_numbers_00_lecture.ipynb) are alike.\n", + "Whereas elements of a `list` object *may* be *re-assigned*, as briefly hinted at in [Chapter 1](https://nbviewer.jupyter.org/github/webartifex/intro-to-python/blob/master/01_elements_00_lecture.ipynb#Who-am-I?-And-how-many?), this is *not* allowed for the individual characters of `str` objects. Once created, they can *not* be changed. Formally, we say that `str` objects are **immutable**. In that regard, they are like the numeric types in [Chapter 5](https://nbviewer.jupyter.org/github/webartifex/intro-to-python/blob/master/05_numbers_00_lecture.ipynb).\n", "\n", - "On the contrary, objects that may be changed after creation, are called **mutable**. 
We already saw in [Chapter 1](https://nbviewer.jupyter.org/github/webartifex/intro-to-python/blob/master/01_elements_00_lecture.ipynb#Who-am-I?-And-how-many?) how mutable objects are more difficult to reason about for a beginner, in particular, if more than one variable references it. Yet, mutability does have its place in a programmer's toolbox, and we revisit this idea in the next chapters.\n", "\n", "The `TypeError` indicates that `str` objects are *immutable*: Assignment to an index or a slice are *not* supported." ] }, { "cell_type": "code", - "execution_count": 45, + "execution_count": 72, "metadata": { "slideshow": { "slide_type": "slide" @@ -1541,21 +2306,21 @@ "traceback": [ "\u001b[0;31m---------------------------------------------------------------------------\u001b[0m", "\u001b[0;31mTypeError\u001b[0m Traceback (most recent call last)", - "\u001b[0;32m\u001b[0m in \u001b[0;36m\u001b[0;34m\u001b[0m\n\u001b[0;32m----> 1\u001b[0;31m \u001b[0mtext\u001b[0m\u001b[0;34m[\u001b[0m\u001b[0;36m0\u001b[0m\u001b[0;34m]\u001b[0m \u001b[0;34m=\u001b[0m \u001b[0;34m\"Z\"\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0m", + "\u001b[0;32m\u001b[0m in \u001b[0;36m\u001b[0;34m\u001b[0m\n\u001b[0;32m----> 1\u001b[0;31m \u001b[0mtext\u001b[0m\u001b[0;34m[\u001b[0m\u001b[0;36m0\u001b[0m\u001b[0;34m]\u001b[0m \u001b[0;34m=\u001b[0m \u001b[0;34m\"X\"\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0m", "\u001b[0;31mTypeError\u001b[0m: 'str' object does not support item assignment" ] } ], "source": [ - "text[0] = \"Z\"" + "text[0] = \"X\"" ] }, { "cell_type": "code", - "execution_count": 46, + "execution_count": 73, "metadata": { "slideshow": { - "slide_type": "-" + "slide_type": "fragment" } }, "outputs": [ @@ -1566,7 +2331,7 @@ "traceback": [ "\u001b[0;31m---------------------------------------------------------------------------\u001b[0m", "\u001b[0;31mTypeError\u001b[0m Traceback (most recent call last)", - "\u001b[0;32m\u001b[0m 
in \u001b[0;36m\u001b[0;34m\u001b[0m\n\u001b[0;32m----> 1\u001b[0;31m \u001b[0mtext\u001b[0m\u001b[0;34m[\u001b[0m\u001b[0;34m:\u001b[0m\u001b[0;36m5\u001b[0m\u001b[0;34m]\u001b[0m \u001b[0;34m=\u001b[0m \u001b[0;34m\"random\"\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0m", + "\u001b[0;32m\u001b[0m in \u001b[0;36m\u001b[0;34m\u001b[0m\n\u001b[0;32m----> 1\u001b[0;31m \u001b[0mtext\u001b[0m\u001b[0;34m[\u001b[0m\u001b[0;34m:\u001b[0m\u001b[0;36m5\u001b[0m\u001b[0;34m]\u001b[0m \u001b[0;34m=\u001b[0m \u001b[0;34m\"random\"\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0m", "\u001b[0;31mTypeError\u001b[0m: 'str' object does not support item assignment" ] } @@ -1583,7 +2348,7 @@ } }, "source": [ - "## String Methods" + "### String Methods" ] }, { @@ -1596,12 +2361,12 @@ "source": [ "Objects of type `str` come with many **methods** bound on them (cf., the [documentation](https://docs.python.org/3/library/stdtypes.html#string-methods) for a full list). As seen before, they work like *normal* functions and are accessed via the **dot operator** `.`. Calling a method is also referred to as **method invocation**.\n", "\n", - "The [find()](https://docs.python.org/3/library/stdtypes.html#str.find) method returns the index of the first occurrence of a character or a substring. If no match is found, it returns `-1`." + "The [find()](https://docs.python.org/3/library/stdtypes.html#str.find) method returns the index of the first occurrence of a character or a substring. If no match is found, it returns `-1`. A mirrored version searching from the right called [rfind()](https://docs.python.org/3/library/stdtypes.html#str.rfind) exists as well. The [index()](https://docs.python.org/3/library/stdtypes.html#str.index) and [rindex()](https://docs.python.org/3/library/stdtypes.html#str.rindex) methods work in the same way but raise a `ValueError` if no match is found. So, we can control if a search fails *silently* or *loudly*." 
] }, { "cell_type": "code", - "execution_count": 47, + "execution_count": 74, "metadata": { "slideshow": { "slide_type": "slide" @@ -1611,10 +2376,10 @@ { "data": { "text/plain": [ - "'Lorem ipsum dolor sit amet, ...'" + "'Lorem ipsum dolor sit amet.'" ] }, - "execution_count": 47, + "execution_count": 74, "metadata": {}, "output_type": "execute_result" } @@ -1625,10 +2390,10 @@ }, { "cell_type": "code", - "execution_count": 48, + "execution_count": 75, "metadata": { "slideshow": { - "slide_type": "-" + "slide_type": "fragment" } }, "outputs": [ @@ -1638,7 +2403,7 @@ "22" ] }, - "execution_count": 48, + "execution_count": 75, "metadata": {}, "output_type": "execute_result" } @@ -1649,10 +2414,10 @@ }, { "cell_type": "code", - "execution_count": 49, + "execution_count": 76, "metadata": { "slideshow": { - "slide_type": "-" + "slide_type": "fragment" } }, "outputs": [ @@ -1662,18 +2427,18 @@ "-1" ] }, - "execution_count": 49, + "execution_count": 76, "metadata": {}, "output_type": "execute_result" } ], "source": [ - "text.find(\"z\")" + "text.find(\"b\")" ] }, { "cell_type": "code", - "execution_count": 50, + "execution_count": 77, "metadata": { "slideshow": { "slide_type": "fragment" @@ -1686,7 +2451,7 @@ "12" ] }, - "execution_count": 50, + "execution_count": 77, "metadata": {}, "output_type": "execute_result" } @@ -1708,10 +2473,10 @@ }, { "cell_type": "code", - "execution_count": 51, + "execution_count": 78, "metadata": { "slideshow": { - "slide_type": "slide" + "slide_type": "fragment" } }, "outputs": [ @@ -1721,7 +2486,7 @@ "1" ] }, - "execution_count": 51, + "execution_count": 78, "metadata": {}, "output_type": "execute_result" } @@ -1732,10 +2497,10 @@ }, { "cell_type": "code", - "execution_count": 52, + "execution_count": 79, "metadata": { "slideshow": { - "slide_type": "fragment" + "slide_type": "skip" } }, "outputs": [ @@ -1745,18 +2510,18 @@ "13" ] }, - "execution_count": 52, + "execution_count": 79, "metadata": {}, "output_type": "execute_result" } ], 
"source": [ - "text.find(\"o\", 2) # 2 not 1 as otherwise the same \"o\" is found again" + "text.find(\"o\", 2)" ] }, { "cell_type": "code", - "execution_count": 53, + "execution_count": 80, "metadata": { "slideshow": { "slide_type": "skip" @@ -1769,13 +2534,13 @@ "-1" ] }, - "execution_count": 53, + "execution_count": 80, "metadata": {}, "output_type": "execute_result" } ], "source": [ - "text.find(\"o\", 2, 12) # \"o\" does not occur in the specified slice" + "text.find(\"o\", 2, 12)" ] }, { @@ -1786,17 +2551,41 @@ } }, "source": [ - "[count()](https://docs.python.org/3/library/stdtypes.html#str.count) does what we expect." + "The [count()](https://docs.python.org/3/library/stdtypes.html#str.count) method does what we expect." ] }, { "cell_type": "code", - "execution_count": 54, + "execution_count": 81, "metadata": { "slideshow": { "slide_type": "slide" } }, + "outputs": [ + { + "data": { + "text/plain": [ + "'Lorem ipsum dolor sit amet.'" + ] + }, + "execution_count": 81, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "text" + ] + }, + { + "cell_type": "code", + "execution_count": 82, + "metadata": { + "slideshow": { + "slide_type": "fragment" + } + }, "outputs": [ { "data": { @@ -1804,7 +2593,7 @@ "1" ] }, - "execution_count": 54, + "execution_count": 82, "metadata": {}, "output_type": "execute_result" } @@ -1826,7 +2615,7 @@ }, { "cell_type": "code", - "execution_count": 55, + "execution_count": 83, "metadata": { "slideshow": { "slide_type": "fragment" @@ -1839,7 +2628,7 @@ "2" ] }, - "execution_count": 55, + "execution_count": 83, "metadata": {}, "output_type": "execute_result" } @@ -1856,12 +2645,12 @@ } }, "source": [ - "Alternatively, we may use the [upper()](https://docs.python.org/3/library/stdtypes.html#str.upper) method and search for `\"O\"`s." + "Alternatively, we can use the [upper()](https://docs.python.org/3/library/stdtypes.html#str.upper) method and search for `\"L\"`s." 
] }, { "cell_type": "code", - "execution_count": 56, + "execution_count": 84, "metadata": { "slideshow": { "slide_type": "skip" @@ -1874,7 +2663,7 @@ "2" ] }, - "execution_count": 56, + "execution_count": 84, "metadata": {}, "output_type": "execute_result" } @@ -1896,7 +2685,7 @@ }, { "cell_type": "code", - "execution_count": 57, + "execution_count": 85, "metadata": { "slideshow": { "slide_type": "slide" @@ -1909,20 +2698,20 @@ }, { "cell_type": "code", - "execution_count": 58, + "execution_count": 86, "metadata": { "slideshow": { - "slide_type": "-" + "slide_type": "skip" } }, "outputs": [ { "data": { "text/plain": [ - "140483525349552" + "140461430391152" ] }, - "execution_count": 58, + "execution_count": 86, "metadata": {}, "output_type": "execute_result" } @@ -1933,7 +2722,7 @@ }, { "cell_type": "code", - "execution_count": 59, + "execution_count": 87, "metadata": { "slideshow": { "slide_type": "fragment" @@ -1946,20 +2735,20 @@ }, { "cell_type": "code", - "execution_count": 60, + "execution_count": 88, "metadata": { "slideshow": { - "slide_type": "-" + "slide_type": "skip" } }, "outputs": [ { "data": { "text/plain": [ - "140483422704176" + "140461335949424" ] }, - "execution_count": 60, + "execution_count": 88, "metadata": {}, "output_type": "execute_result" } @@ -1981,7 +2770,7 @@ }, { "cell_type": "code", - "execution_count": 61, + "execution_count": 89, "metadata": { "slideshow": { "slide_type": "fragment" @@ -1994,7 +2783,7 @@ "False" ] }, - "execution_count": 61, + "execution_count": 89, "metadata": {}, "output_type": "execute_result" } @@ -2005,10 +2794,10 @@ }, { "cell_type": "code", - "execution_count": 62, + "execution_count": 90, "metadata": { "slideshow": { - "slide_type": "-" + "slide_type": "fragment" } }, "outputs": [ @@ -2018,7 +2807,7 @@ "True" ] }, - "execution_count": 62, + "execution_count": 90, "metadata": {}, "output_type": "execute_result" } @@ -2035,36 +2824,103 @@ } }, "source": [ - "Another popular string method is 
[split()](https://docs.python.org/3/library/stdtypes.html#str.split): It separates a longer `str` object into smaller ones contained in a `list` object. By default, groups of contiguous whitespace are used as the *separator*.\n", - "\n", - "As an example, we use [split()](https://docs.python.org/3/library/stdtypes.html#str.split) to print out the individual words in `text` on separate lines." + "Besides [upper()](https://docs.python.org/3/library/stdtypes.html#str.upper) and [lower()](https://docs.python.org/3/library/stdtypes.html#str.lower), there are also the [title()](https://docs.python.org/3/library/stdtypes.html#str.title) and [swapcase()](https://docs.python.org/3/library/stdtypes.html#str.swapcase) methods." ] }, { "cell_type": "code", - "execution_count": 63, + "execution_count": 91, "metadata": { "slideshow": { - "slide_type": "slide" + "slide_type": "skip" } }, "outputs": [ { - "name": "stdout", - "output_type": "stream", - "text": [ - "Lorem\n", - "ipsum\n", - "dolor\n", - "sit\n", - "amet,\n", - "...\n" - ] + "data": { + "text/plain": [ + "'lorem ipsum dolor sit amet.'" + ] + }, + "execution_count": 91, + "metadata": {}, + "output_type": "execute_result" } ], "source": [ - "for word in text.split():\n", - " print(word)" + "text.lower()" + ] + }, + { + "cell_type": "code", + "execution_count": 92, + "metadata": { + "slideshow": { + "slide_type": "skip" + } + }, + "outputs": [ + { + "data": { + "text/plain": [ + "'LOREM IPSUM DOLOR SIT AMET.'" + ] + }, + "execution_count": 92, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "text.upper()" + ] + }, + { + "cell_type": "code", + "execution_count": 93, + "metadata": { + "slideshow": { + "slide_type": "skip" + } + }, + "outputs": [ + { + "data": { + "text/plain": [ + "'Lorem Ipsum Dolor Sit Amet.'" + ] + }, + "execution_count": 93, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "text.title()" + ] + }, + { + "cell_type": "code", + "execution_count": 94, 
+ "metadata": { + "slideshow": { + "slide_type": "skip" + } + }, + "outputs": [ + { + "data": { + "text/plain": [ + "'lOREM IPSUM DOLOR SIT AMET.'" + ] + }, + "execution_count": 94, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "text.swapcase()" ] }, { @@ -2075,12 +2931,71 @@ } }, "source": [ - "The opposite of splitting is done with the [join()](https://docs.python.org/3/library/stdtypes.html#str.join) method. It is typically invoked on a `str` object that represents a separator (e.g., `\" \"` or `\", \"`) and connects the elements of an *iterable* argument passed in (e.g., `words` below) into one *new* `str` object." + "Another popular string method is [split()](https://docs.python.org/3/library/stdtypes.html#str.split): It separates a longer `str` object into smaller ones collected in a `list` object. By default, groups of contiguous whitespace characters are used as the *separator*.\n", + "\n", + "As an example, we use [split()](https://docs.python.org/3/library/stdtypes.html#str.split) to print out the individual words in `text` with more whitespace in between them." ] }, { "cell_type": "code", - "execution_count": 64, + "execution_count": 95, + "metadata": { + "slideshow": { + "slide_type": "slide" + } + }, + "outputs": [ + { + "data": { + "text/plain": [ + "['Lorem', 'ipsum', 'dolor', 'sit', 'amet.']" + ] + }, + "execution_count": 95, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "text.split()" + ] + }, + { + "cell_type": "code", + "execution_count": 96, + "metadata": { + "slideshow": { + "slide_type": "fragment" + } + }, + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "Lorem ipsum dolor sit amet. 
" + ] + } + ], + "source": [ + "for word in text.split():\n", + " print(word, end=\" \")" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "slideshow": { + "slide_type": "skip" + } + }, + "source": [ + "The opposite of splitting is done with the [join()](https://docs.python.org/3/library/stdtypes.html#str.join) method. It is typically invoked on a `str` object that represents a separator (e.g., `\" \"` or `\", \"`) and connects the elements provided by an *iterable* argument (e.g., `words` below) into one *new* `str` object." + ] + }, + { + "cell_type": "code", + "execution_count": 97, "metadata": { "slideshow": { "slide_type": "slide" @@ -2088,15 +3003,15 @@ }, "outputs": [], "source": [ - "words = [\"This\", \"will\", \"become\", \"a\", \"sentence\"]" + "words = [\"This\", \"will\", \"become\", \"a\", \"sentence.\"]" ] }, { "cell_type": "code", - "execution_count": 65, + "execution_count": 98, "metadata": { "slideshow": { - "slide_type": "-" + "slide_type": "fragment" } }, "outputs": [], @@ -2106,20 +3021,20 @@ }, { "cell_type": "code", - "execution_count": 66, + "execution_count": 99, "metadata": { "slideshow": { - "slide_type": "-" + "slide_type": "fragment" } }, "outputs": [ { "data": { "text/plain": [ - "'This will become a sentence'" + "'This will become a sentence.'" ] }, - "execution_count": 66, + "execution_count": 99, "metadata": {}, "output_type": "execute_result" } @@ -2136,12 +3051,12 @@ } }, "source": [ - "As the `str` object `\"abcde\"` below is an *iterable* itself, its elements (i.e., characters) are joined together with a space `\" \"` in between." + "As the `str` object `\"abcde\"` below is an *iterable* itself, its characters (!) are joined together with a space `\" \"` in between." 
] }, { "cell_type": "code", - "execution_count": 67, + "execution_count": 100, "metadata": { "slideshow": { "slide_type": "fragment" @@ -2154,7 +3069,7 @@ "'a b c d e'" ] }, - "execution_count": 67, + "execution_count": 100, "metadata": {}, "output_type": "execute_result" } @@ -2176,7 +3091,7 @@ }, { "cell_type": "code", - "execution_count": 68, + "execution_count": 101, "metadata": { "slideshow": { "slide_type": "slide" @@ -2186,10 +3101,10 @@ { "data": { "text/plain": [ - "'This is a sentence'" + "'This is a sentence.'" ] }, - "execution_count": 68, + "execution_count": 101, "metadata": {}, "output_type": "execute_result" } @@ -2198,6 +3113,242 @@ "sentence.replace(\"will become\", \"is\")" ] }, + { + "cell_type": "markdown", + "metadata": { + "slideshow": { + "slide_type": "skip" + } + }, + "source": [ + "Note how `sentence` itself remains unchanged. Bound to an immutable object, [replace()](https://docs.python.org/3/library/stdtypes.html#str.replace) must create *new* objects." + ] + }, + { + "cell_type": "code", + "execution_count": 102, + "metadata": { + "slideshow": { + "slide_type": "fragment" + } + }, + "outputs": [ + { + "data": { + "text/plain": [ + "'This will become a sentence.'" + ] + }, + "execution_count": 102, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "sentence" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "slideshow": { + "slide_type": "skip" + } + }, + "source": [ + "As seen previously, the [strip()](https://docs.python.org/3/library/stdtypes.html#str.strip) method is often helpful in removing unnecessary leading and trailing whitespace from text data that come from unreliable sources like user input. The [lstrip()](https://docs.python.org/3/library/stdtypes.html#str.lstrip) and [rstrip()](https://docs.python.org/3/library/stdtypes.html#str.rstrip) methods are specialized versions of it."
+ ] + }, + { + "cell_type": "code", + "execution_count": 103, + "metadata": { + "slideshow": { + "slide_type": "slide" + } + }, + "outputs": [ + { + "data": { + "text/plain": [ + "'text with whitespace'" + ] + }, + "execution_count": 103, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "\" text with whitespace \".strip()" + ] + }, + { + "cell_type": "code", + "execution_count": 104, + "metadata": { + "slideshow": { + "slide_type": "fragment" + } + }, + "outputs": [ + { + "data": { + "text/plain": [ + "'text with whitespace '" + ] + }, + "execution_count": 104, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "\" text with whitespace \".lstrip()" + ] + }, + { + "cell_type": "code", + "execution_count": 105, + "metadata": { + "slideshow": { + "slide_type": "fragment" + } + }, + "outputs": [ + { + "data": { + "text/plain": [ + "' text with whitespace'" + ] + }, + "execution_count": 105, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "\" text with whitespace \".rstrip()" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "slideshow": { + "slide_type": "skip" + } + }, + "source": [ + "When justifying a `str` object for output, the [ljust()](https://docs.python.org/3/library/stdtypes.html#str.ljust) and [rjust()](https://docs.python.org/3/library/stdtypes.html#str.rjust) methods may be helpful." + ] + }, + { + "cell_type": "code", + "execution_count": 106, + "metadata": { + "slideshow": { + "slide_type": "slide" + } + }, + "outputs": [ + { + "data": { + "text/plain": [ + "'This will become a sentence. 
'" + ] + }, + "execution_count": 106, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "sentence.ljust(40)" + ] + }, + { + "cell_type": "code", + "execution_count": 107, + "metadata": { + "slideshow": { + "slide_type": "fragment" + } + }, + "outputs": [ + { + "data": { + "text/plain": [ + "' This will become a sentence.'" + ] + }, + "execution_count": 107, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "sentence.rjust(40)" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "slideshow": { + "slide_type": "skip" + } + }, + "source": [ + "Similarly, the [zfill()](https://docs.python.org/3/library/stdtypes.html#str.zfill) method can be used to pad a `str` representation of a number with leading `0`s for justified output." + ] + }, + { + "cell_type": "code", + "execution_count": 108, + "metadata": { + "slideshow": { + "slide_type": "fragment" + } + }, + "outputs": [ + { + "data": { + "text/plain": [ + "'0000042.87'" + ] + }, + "execution_count": 108, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "\"42.87\".zfill(10)" + ] + }, + { + "cell_type": "code", + "execution_count": 109, + "metadata": { + "slideshow": { + "slide_type": "fragment" + } + }, + "outputs": [ + { + "data": { + "text/plain": [ + "'-000042.87'" + ] + }, + "execution_count": 109, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "\"-42.87\".zfill(10)" + ] + }, { "cell_type": "markdown", "metadata": { @@ -2206,7 +3357,7 @@ } }, "source": [ - "## String Operations" + "### String Operations" ] }, { @@ -2222,7 +3373,7 @@ }, { "cell_type": "code", - "execution_count": 69, + "execution_count": 110, "metadata": { "slideshow": { "slide_type": "slide" @@ -2235,7 +3386,7 @@ "'Hello Lore'" ] }, - "execution_count": 69, + "execution_count": 110, "metadata": {}, "output_type": "execute_result" } @@ -2246,26 +3397,26 @@ }, { "cell_type": "code", - "execution_count": 70, + "execution_count": 111, 
"metadata": { "slideshow": { - "slide_type": "-" + "slide_type": "fragment" } }, "outputs": [ { "data": { "text/plain": [ - "'Lorem ipsum Lorem ipsum Lorem ipsum Lorem ipsum Lorem ipsum '" + "'Lorem ipsum Lorem ipsum Lorem ipsum Lorem ipsum Lorem ipsum ...'" ] }, - "execution_count": 70, + "execution_count": 111, "metadata": {}, "output_type": "execute_result" } ], "source": [ - "5 * text[:12]" + "5 * text[:12] + \"...\"" ] }, { @@ -2276,7 +3427,7 @@ } }, "source": [ - "### String Comparison" + "#### String Comparison" ] }, { @@ -2287,32 +3438,17 @@ } }, "source": [ - "The *relational* operators also work with `str` objects, another example of operator overloading. Comparison is done one character at a time until the first pair differs or one operand ends. However, `str` objects are sorted in a \"weird\" way. The reason for this is that computers store characters internally as numbers (i.e., $0$s and $1$s). Depending on the character encoding, these numbers vary. Commonly, characters and symbols used in American English are encoded with the numbers 0 through 127, the so-called [ASCII standard](https://en.wikipedia.org/wiki/ASCII). However, Python works with the more general [Unicode/UTF-8 standard](https://en.wikipedia.org/wiki/UTF-8) that understands every language ever used by humans, even emojis." + "The *relational* operators also work with `str` objects, another example of operator overloading. Comparison is done one character at a time in a pairwise fashion until the first pair differs or one operand ends. However, `str` objects are sorted in a \"weird\" way. For example, all upper case characters come before all lower case characters. The reason for that is given in the \"*Characters are Numbers with a Convention*\" sub-section further below." 
] }, { "cell_type": "code", - "execution_count": 71, + "execution_count": 112, "metadata": { "slideshow": { "slide_type": "slide" } }, - "outputs": [], - "source": [ - "A = \"Apple\" # ignore snake_case for variable names in this example\n", - "a = \"apple\"\n", - "B = \"Banana\"" - ] - }, - { - "cell_type": "code", - "execution_count": 72, - "metadata": { - "slideshow": { - "slide_type": "fragment" - } - }, "outputs": [ { "data": { @@ -2320,21 +3456,21 @@ "True" ] }, - "execution_count": 72, + "execution_count": 112, "metadata": {}, "output_type": "execute_result" } ], "source": [ - "A < B" + "\"Apple\" < \"Banana\"" ] }, { "cell_type": "code", - "execution_count": 73, + "execution_count": 113, "metadata": { "slideshow": { - "slide_type": "-" + "slide_type": "fragment" } }, "outputs": [ @@ -2344,29 +3480,18 @@ "False" ] }, - "execution_count": 73, + "execution_count": 113, "metadata": {}, "output_type": "execute_result" } ], "source": [ - "a < B" - ] - }, - { - "cell_type": "markdown", - "metadata": { - "slideshow": { - "slide_type": "skip" - } - }, - "source": [ - "One way to fix this is to only compare lower-cased `str` objects." + "\"apple\" < \"Banana\" # upper case letters come before lower case ones" ] }, { "cell_type": "code", - "execution_count": 74, + "execution_count": 114, "metadata": { "slideshow": { "slide_type": "fragment" @@ -2379,13 +3504,13 @@ "True" ] }, - "execution_count": 74, + "execution_count": 114, "metadata": {}, "output_type": "execute_result" } ], "source": [ - "a < B.lower()" + "\"apple\" < \"Banana\".lower()" ] }, { @@ -2396,57 +3521,31 @@ } }, "source": [ - "To provide a simple intuition for the \"weird\" sorting above, let's think of the alphabet as being represented by the numbers as listed below. Then `\"Banana\"` is clearly \"smaller\" than `\"apple\"`. In general, all the upper case letters are \"smaller\" than all the lower case letters."
+ "Below is an example with typical German last names that shows how characters other than the first decide the ordering." ] }, { "cell_type": "code", - "execution_count": 75, + "execution_count": 115, "metadata": { "slideshow": { - "slide_type": "slide" + "slide_type": "fragment" } }, "outputs": [ { - "name": "stdout", - "output_type": "stream", - "text": [ - "A -> 65 \t a -> 97\n", - "B -> 66 \t b -> 98\n", - "C -> 67 \t c -> 99\n", - "D -> 68 \t d -> 100\n", - "E -> 69 \t e -> 101\n", - "F -> 70 \t f -> 102\n", - "G -> 71 \t g -> 103\n", - "H -> 72 \t h -> 104\n", - "I -> 73 \t i -> 105\n", - "J -> 74 \t j -> 106\n", - "K -> 75 \t k -> 107\n", - "L -> 76 \t l -> 108\n", - "M -> 77 \t m -> 109\n", - "N -> 78 \t n -> 110\n", - "O -> 79 \t o -> 111\n", - "P -> 80 \t p -> 112\n", - "Q -> 81 \t q -> 113\n", - "R -> 82 \t r -> 114\n", - "S -> 83 \t s -> 115\n", - "T -> 84 \t t -> 116\n", - "U -> 85 \t u -> 117\n", - "V -> 86 \t v -> 118\n", - "W -> 87 \t w -> 119\n", - "X -> 88 \t x -> 120\n", - "Y -> 89 \t y -> 121\n", - "Z -> 90 \t z -> 122\n" - ] + "data": { + "text/plain": [ + "True" + ] + }, + "execution_count": 115, + "metadata": {}, + "output_type": "execute_result" } ], "source": [ - "for lower_i in range(65, 91):\n", - " upper_i = lower_i + 32 # all the upper case characters are offset by 32\n", - " lower_char = chr(lower_i) # from their lower case counterpart\n", - " upper_char = chr(upper_i)\n", - " print(f\"{lower_char} -> {lower_i} \\t {upper_char} -> {upper_i}\")" + "\"Mai\" < \"Maier\" < \"Mayer\" < \"Meier\" < \"Meyer\"" ] }, { @@ -2457,7 +3556,7 @@ } }, "source": [ - "## String Interpolation" + "### String Interpolation" ] }, { @@ -2468,9 +3567,7 @@ } }, "source": [ - "The previous code cell shows an example of a so-called **f-string**, as introduced by [PEP 498](https://www.python.org/dev/peps/pep-0498/) only in 2016, that is passed as the argument to the [print()](https://docs.python.org/3/library/functions.html#print) function.\n", - "\n", - "The 
\"f\" stands for \"formatted\", and we can think of the `str` object as a text \"draft\" that is filled in with values determined at runtime. This concept is formally called **string interpolation**, and there are three ways to achieve that in Python." + "Often, we want to use `str` objects as drafts in the source code that are filled in with concrete text only at runtime. This approach is called **string interpolation**. There are three ways to do that in Python." ] }, { @@ -2481,7 +3578,7 @@ } }, "source": [ - "### f-strings" + "#### f-strings" ] }, { @@ -2492,12 +3589,12 @@ } }, "source": [ - "f-strings, formally called **[formatted string literals](https://docs.python.org/3/reference/lexical_analysis.html#formatted-string-literals)**, are the least recently added and most readable way: We prepend a `str` in literal notation with an `f`, and put variables, or more generally, expressions, within curly braces. These are then filled in when a `str` object is evaluated." + "**[Formatted string literals](https://docs.python.org/3/reference/lexical_analysis.html#formatted-string-literals)**, or **f-strings** for short, are the most recently added (cf., [PEP 498](https://www.python.org/dev/peps/pep-0498/) in 2016) and most readable way: We simply prepend a `str` in its literal notation with an `f`, and put variables, or more generally, expressions, within curly braces `{}`. These are then filled in when the string literal is evaluated." ] }, { "cell_type": "code", - "execution_count": 76, + "execution_count": 116, "metadata": { "slideshow": { "slide_type": "slide" @@ -2511,10 +3608,10 @@ }, { "cell_type": "code", - "execution_count": 77, + "execution_count": 117, "metadata": { "slideshow": { - "slide_type": "-" + "slide_type": "fragment" } }, "outputs": [ @@ -2524,7 +3621,7 @@ "'Hello Alexander! 
Good morning.'" ] }, - "execution_count": 77, + "execution_count": 117, "metadata": {}, "output_type": "execute_result" } @@ -2541,15 +3638,15 @@ } }, "source": [ - "Separated by a colon `:`, various formatting options are available. In the beginning, the ability to round may be particularly useful: This can be achieved by adding `:.2f` to the variable name inside the curly braces, which casts the number as a `float` and rounds it to two digits. The `:.2f` is a so-called format specifier, and there exists a whole **[format specification mini-language](https://docs.python.org/3/library/string.html#formatspec)** to govern how specifiers work." + "Separated by a colon `:`, various formatting options are available. In the beginning, the ability to round numbers for output may be particularly useful: This can be achieved by adding `:.2f` to the variable name inside the curly braces, which casts the number as a `float` and rounds it to two digits. The `:.2f` is a so-called format specifier, and there exists a whole **[format specification mini-language](https://docs.python.org/3/library/string.html#formatspec)** to govern how specifiers work." 
] }, { "cell_type": "code", - "execution_count": 78, + "execution_count": 118, "metadata": { "slideshow": { - "slide_type": "slide" + "slide_type": "fragment" } }, "outputs": [], @@ -2559,10 +3656,10 @@ }, { "cell_type": "code", - "execution_count": 79, + "execution_count": 119, "metadata": { "slideshow": { - "slide_type": "-" + "slide_type": "fragment" } }, "outputs": [ @@ -2572,7 +3669,7 @@ "'Pi is 3.14'" ] }, - "execution_count": 79, + "execution_count": 119, "metadata": {}, "output_type": "execute_result" } @@ -2581,30 +3678,6 @@ "f\"Pi is {pi:.2f}\"" ] }, - { - "cell_type": "code", - "execution_count": 80, - "metadata": { - "slideshow": { - "slide_type": "-" - } - }, - "outputs": [ - { - "data": { - "text/plain": [ - "'Pi is 3.142'" - ] - }, - "execution_count": 80, - "metadata": {}, - "output_type": "execute_result" - } - ], - "source": [ - "f\"Pi is {pi:.3f}\"" - ] - }, { "cell_type": "markdown", "metadata": { @@ -2613,7 +3686,7 @@ } }, "source": [ - "### [format()](https://docs.python.org/3/library/stdtypes.html#str.format) Method" + "#### [format()](https://docs.python.org/3/library/stdtypes.html#str.format) Method" ] }, { @@ -2624,12 +3697,12 @@ } }, "source": [ - "`str` objects also provide a [format()](https://docs.python.org/3/library/stdtypes.html#str.format) method that accepts an arbitrary number of *positional* arguments that are inserted into the `str` object in the same order replacing empty curly brackets. String interpolation with the [format()](https://docs.python.org/3/library/stdtypes.html#str.format) method is a more traditional and probably the most common way one as of today. While f-strings are the recommended way going forward, usage of the [format()](https://docs.python.org/3/library/stdtypes.html#str.format) method is likely not declining any time soon." 
+ "`str` objects also provide a [format()](https://docs.python.org/3/library/stdtypes.html#str.format) method that accepts an arbitrary number of *positional* arguments that are inserted into the `str` object in the same order replacing empty curly brackets `{}`. String interpolation with the [format()](https://docs.python.org/3/library/stdtypes.html#str.format) method is a more traditional and probably the most common way as of today. While f-strings are the recommended way going forward, usage of the [format()](https://docs.python.org/3/library/stdtypes.html#str.format) method is likely not declining any time soon." ] }, { "cell_type": "code", - "execution_count": 81, + "execution_count": 120, "metadata": { "slideshow": { "slide_type": "slide" @@ -2642,7 +3715,7 @@ "'Hello Alexander! Good morning.'" ] }, - "execution_count": 81, + "execution_count": 120, "metadata": {}, "output_type": "execute_result" } @@ -2664,7 +3737,7 @@ }, { "cell_type": "code", - "execution_count": 82, + "execution_count": 121, "metadata": { "slideshow": { "slide_type": "fragment" @@ -2677,7 +3750,7 @@ "'Good morning, Alexander'" ] }, - "execution_count": 82, + "execution_count": 121, "metadata": {}, "output_type": "execute_result" } @@ -2699,7 +3772,7 @@ }, { "cell_type": "code", - "execution_count": 83, + "execution_count": 122, "metadata": { "slideshow": { "slide_type": "fragment" @@ -2712,7 +3785,7 @@ "'Hello Alexander! 
Good morning.'" ] }, - "execution_count": 83, + "execution_count": 122, "metadata": {}, "output_type": "execute_result" } @@ -2734,7 +3807,7 @@ }, { "cell_type": "code", - "execution_count": 84, + "execution_count": 123, "metadata": { "slideshow": { "slide_type": "fragment" @@ -2747,7 +3820,7 @@ "'Pi is 3.14'" ] }, - "execution_count": 84, + "execution_count": 123, "metadata": {}, "output_type": "execute_result" } @@ -2760,11 +3833,11 @@ "cell_type": "markdown", "metadata": { "slideshow": { - "slide_type": "skip" + "slide_type": "slide" } }, "source": [ - "### `%` Operator" + "#### `%` Operator" ] }, { @@ -2782,10 +3855,10 @@ }, { "cell_type": "code", - "execution_count": 85, + "execution_count": 124, "metadata": { "slideshow": { - "slide_type": "skip" + "slide_type": "slide" } }, "outputs": [ @@ -2795,7 +3868,7 @@ "'Pi is 3.14'" ] }, - "execution_count": 85, + "execution_count": 124, "metadata": {}, "output_type": "execute_result" } @@ -2817,10 +3890,10 @@ }, { "cell_type": "code", - "execution_count": 86, + "execution_count": 125, "metadata": { "slideshow": { - "slide_type": "skip" + "slide_type": "fragment" } }, "outputs": [ @@ -2830,7 +3903,7 @@ "'Hello Alexander! Good morning.'" ] }, - "execution_count": 86, + "execution_count": 125, "metadata": {}, "output_type": "execute_result" } @@ -2847,7 +3920,7 @@ } }, "source": [ - "## Special Characters" + "### Unicode & (Special) Characters" ] }, { @@ -2858,19 +3931,43 @@ } }, "source": [ - "Some symbols have a special meaning within `str` objects. Popular examples are the newline `\\n` and tab `\\t` \"characters.\" The backslash symbol `\\` is also referred to as an **escape character** in this context, indicating that the following character has a meaning other than its literal meaning.\n", + "As previously seen, some characters have a special meaning when following the **escape character** `\"\\\"`. 
Besides escaping the kind of quote used as the `str` object's delimiter, `'` or `\"`, most of these **escape sequences** (i.e., `\"\\\"` with the subsequent character), act as a **control character** that moves the \"cursor\" in the output *without* generating any pixel on the screen. Because of that, we only see the effect of such escape sequences when used with the [print()](https://docs.python.org/3/library/functions.html#print) function. The [documentation](https://docs.python.org/3/reference/lexical_analysis.html#string-and-bytes-literals) lists all available escape sequences, of which we show the most important ones below.\n", "\n", - "The built-in [print()](https://docs.python.org/3/library/functions.html#print) function then \"prints\" out these special characters accordingly." + "The most common escape sequence is `\"\\n\"` that \"prints\" a [newline character](https://en.wikipedia.org/wiki/Newline) that is also called the line feed character or LF for short." ] }, { "cell_type": "code", - "execution_count": 87, + "execution_count": 126, "metadata": { "slideshow": { "slide_type": "slide" } }, + "outputs": [ + { + "data": { + "text/plain": [ + "'This is a sentence\\nthat is printed\\non three lines.'" + ] + }, + "execution_count": 126, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "\"This is a sentence\\nthat is printed\\non three lines.\"" + ] + }, + { + "cell_type": "code", + "execution_count": 127, + "metadata": { + "slideshow": { + "slide_type": "fragment" + } + }, "outputs": [ { "name": "stdout", @@ -2887,11 +3984,22 @@ ] }, { - "cell_type": "code", - "execution_count": 88, + "cell_type": "markdown", "metadata": { "slideshow": { - "slide_type": "fragment" + "slide_type": "skip" + } + }, + "source": [ + "`\"\\b\"` is the [backspace character](https://en.wikipedia.org/wiki/Backspace), or BS for short, that moves the cursor back by one character." 
+ ] + }, + { + "cell_type": "code", + "execution_count": 128, + "metadata": { + "slideshow": { + "slide_type": "slide" } }, "outputs": [ @@ -2899,12 +4007,33 @@ "name": "stdout", "output_type": "stream", "text": [ - "Words\taligned\twith\ttabs.\n" + "ABX\n" ] } ], "source": [ - "print(\"Words\\taligned\\twith\\ttabs.\")" + "print(\"ABC\\bX\")" + ] + }, + { + "cell_type": "code", + "execution_count": 129, + "metadata": { + "slideshow": { + "slide_type": "skip" + } + }, + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "ABXY\n" + ] + } + ], + "source": [ + "print(\"ABC\\bXY\")" ] }, { @@ -2915,12 +4044,12 @@ } }, "source": [ - "As emojis are important as well, they can be inserted with the corresponding **unicode code point** number starting with `\\U`. See this [list](https://en.wikipedia.org/wiki/List_of_Unicode_characters) of unicode characters for an overview." + "Similarly, `\"\\r\"` is the [carriage return character](https://en.wikipedia.org/wiki/Carriage_return), or CR for short, that moves the cursor back to the beginning of the line." ] }, { "cell_type": "code", - "execution_count": 89, + "execution_count": 130, "metadata": { "slideshow": { "slide_type": "fragment" @@ -2931,12 +4060,33 @@ "name": "stdout", "output_type": "stream", "text": [ - "😄\n" + "XBC\n" ] } ], "source": [ - "print(\"\\U0001f604\")" + "print(\"ABC\\rX\")" + ] + }, + { + "cell_type": "code", + "execution_count": 131, + "metadata": { + "slideshow": { + "slide_type": "skip" + } + }, + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "XYC\n" + ] + } + ], + "source": [ + "print(\"ABC\\rXY\")" ] }, { @@ -2947,12 +4097,12 @@ } }, "source": [ - "Outside the [print()](https://docs.python.org/3/library/functions.html#print) function, the special characters are not treated any different from non-special ones." + "While Linux and modern MacOS systems use solely `\"\\n\"` to express a new line, Windows systems default to using `\"\\r\\n\"`. 
This may lead to \"weird\" bugs on software projects where people using both kinds of operating systems collaborate." ] }, { "cell_type": "code", - "execution_count": 90, + "execution_count": 132, "metadata": { "slideshow": { "slide_type": "skip" @@ -2960,18 +4110,50 @@ }, "outputs": [ { - "data": { - "text/plain": [ - "'This is a sentence\\nthat is printed\\non three lines.'" - ] - }, - "execution_count": 90, - "metadata": {}, - "output_type": "execute_result" + "name": "stdout", + "output_type": "stream", + "text": [ + "This is a sentence\n", + "that is printed\n", + "on three lines.\n" + ] } ], "source": [ - "\"This is a sentence\\nthat is printed\\non three lines.\"" + "print(\"This is a sentence\\r\\nthat is printed\\r\\non three lines.\")" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "slideshow": { + "slide_type": "skip" + } + }, + "source": [ + "`\"\\t\"` makes the cursor \"jump\" in equidistant tab stops. That may be useful for formatting the output of a program with lengthy, tabular results." + ] + }, + { + "cell_type": "code", + "execution_count": 133, + "metadata": { + "slideshow": { + "slide_type": "slide" + } + }, + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "Jump\tfrom\ttab\tstop\tto\ttab\tstop.\n", + "The\tsecond\tline\tdoes\tso\ttoo.\n" + ] + } + ], + "source": [ + "print(\"Jump\\tfrom\\ttab\\tstop\\tto\\ttab\\tstop.\\nThe\\tsecond\\tline\\tdoes\\tso\\ttoo.\")" ] }, { @@ -2982,7 +4164,7 @@ } }, "source": [ - "## Raw Strings" + "#### Raw Strings" ] }, { @@ -2993,14 +4175,12 @@ } }, "source": [ - "Sometimes we do *not* want the backslash `\\` and its following character be interpreted as special characters.\n", - "\n", - "For example, let's print a typical installation path on a Windows systems. Obviously, the newline character `\\n` does *not* makes sense here." + "Sometimes we do *not* want the backslash `\"\\\"` and its subsequent character to be interpreted as an escape sequence. 
For example, let's print a typical installation path on a Windows system. Obviously, the newline character `\"\\n\"` does *not* make sense here." ] }, { "cell_type": "code", - "execution_count": 91, + "execution_count": 134, "metadata": { "slideshow": { "slide_type": "slide" @@ -3028,12 +4208,12 @@ } }, "source": [ - "Some `str` objects even produce a `SyntaxError` because the `\\U` *cannot* be interpreted as a unicode code point." + "Some `str` objects even produce a `SyntaxError` because the `\"\\U\"` can *not* be interpreted as a Unicode code point (cf., next section)." ] }, { "cell_type": "code", - "execution_count": 92, + "execution_count": 135, "metadata": { "slideshow": { "slide_type": "fragment" @@ -3042,15 +4222,15 @@ "outputs": [ { "ename": "SyntaxError", - "evalue": "(unicode error) 'unicodeescape' codec can't decode bytes in position 2-3: truncated \\UXXXXXXXX escape (, line 1)", + "evalue": "(unicode error) 'unicodeescape' codec can't decode bytes in position 2-3: truncated \\UXXXXXXXX escape (, line 1)", "output_type": "error", "traceback": [ - "\u001b[0;36m File \u001b[0;32m\"\"\u001b[0;36m, line \u001b[0;32m1\u001b[0m\n\u001b[0;31m print(\"C:\\Users\\Administrator\\Desktop\\Project_Folder\")\u001b[0m\n\u001b[0m ^\u001b[0m\n\u001b[0;31mSyntaxError\u001b[0m\u001b[0;31m:\u001b[0m (unicode error) 'unicodeescape' codec can't decode bytes in position 2-3: truncated \\UXXXXXXXX escape\n" + "\u001b[0;36m File \u001b[0;32m\"\"\u001b[0;36m, line \u001b[0;32m1\u001b[0m\n\u001b[0;31m print(\"C:\\Users\\Administrator\\Desktop\\Project\")\u001b[0m\n\u001b[0m ^\u001b[0m\n\u001b[0;31mSyntaxError\u001b[0m\u001b[0;31m:\u001b[0m (unicode error) 'unicodeescape' codec can't decode bytes in position 2-3: truncated \\UXXXXXXXX escape\n" ] } ], "source": [ - "print(\"C:\\Users\\Administrator\\Desktop\\Project_Folder\")" + "print(\"C:\\Users\\Administrator\\Desktop\\Project\")" ] }, { @@ -3061,12 +4241,12 @@ } }, "source": [ - "A simple solution would be to escape the 
escape character with a *second* backslash `\\`." + "A simple solution would be to escape the escape character with a *second* backslash `\"\\\"`." ] }, { "cell_type": "code", - "execution_count": 93, + "execution_count": 136, "metadata": { "slideshow": { "slide_type": "slide" @@ -3087,10 +4267,10 @@ }, { "cell_type": "code", - "execution_count": 94, + "execution_count": 137, "metadata": { "slideshow": { - "slide_type": "-" + "slide_type": "fragment" } }, "outputs": [ @@ -3098,12 +4278,12 @@ "name": "stdout", "output_type": "stream", "text": [ - "C:\\Users\\Administrator\\Desktop\\Project_Folder\n" + "C:\\Users\\Administrator\\Desktop\\Project\n" ] } ], "source": [ - "print(\"C:\\\\Users\\\\Administrator\\\\Desktop\\\\Project_Folder\")" + "print(\"C:\\\\Users\\\\Administrator\\\\Desktop\\\\Project\")" ] }, { @@ -3114,12 +4294,12 @@ } }, "source": [ - "However, this is tedious to remember and type. Luckily, Python allows treating any string literal as \"raw,\" and this is indicated in the string literal by the prefix `r`." + "However, this is tedious to remember and type. For such use cases, Python allows prefixing any string literal with an `r`. The literal is then interpreted in a \"raw\" way."
] }, { "cell_type": "code", - "execution_count": 95, + "execution_count": 138, "metadata": { "slideshow": { "slide_type": "slide" @@ -3140,10 +4320,10 @@ }, { "cell_type": "code", - "execution_count": 96, + "execution_count": 139, "metadata": { "slideshow": { - "slide_type": "-" + "slide_type": "fragment" } }, "outputs": [ @@ -3151,12 +4331,12 @@ "name": "stdout", "output_type": "stream", "text": [ - "C:\\Users\\Administrator\\Desktop\\Project_Folder\n" + "C:\\Users\\Administrator\\Desktop\\Project\n" ] } ], "source": [ - "print(r\"C:\\Users\\Administrator\\Desktop\\Project_Folder\")" + "print(r\"C:\\Users\\Administrator\\Desktop\\Project\")" ] }, { @@ -3167,7 +4347,7 @@ } }, "source": [ - "## Multi-line Strings" + "#### Characters are Numbers with a Convention" ] }, { @@ -3178,12 +4358,693 @@ } }, "source": [ - "Sometimes, it is convenient to split text across multiple lines in source code. For example, to make lines fit into the 79 characters requirement of [PEP 8](https://www.python.org/dev/peps/pep-0008/) or because the text naturally contains many newlines. Using double quotes `\"` around multiple lines results in a `SyntaxError`." + "So far, we used the term **character** without any further consideration. In this section, we briefly look into what characters are and how they are modeled in software.\n", + "\n", + "[Chapter 5](https://nbviewer.jupyter.org/github/webartifex/intro-to-python/blob/master/05_numbers_00_lecture.ipynb) gives us an idea of how individual **bits** are used to express all types of numbers, from \"simple\" `int` objects to \"complex\" `float` ones. To model characters, another **layer of abstraction** is put on top of whole numbers. So, just as bits are used to express integers, integers themselves are used to express characters."
+ ] + }, + { + "cell_type": "markdown", + "metadata": { + "slideshow": { + "slide_type": "slide" + } + }, + "source": [ + "##### ASCII" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "slideshow": { + "slide_type": "skip" + } + }, + "source": [ + "Many conventions have been developed as to which integer is associated with which character. The most basic one that was also adopted around the world is the so-called [American Standard Code for Information Interchange](https://en.wikipedia.org/wiki/ASCII), or **ASCII** for short. It uses 7 bits of information to map the unprintable control characters as well as the printable letters of the alphabet, numbers, and common symbols to the numbers `0` through `127`.\n", + "\n", + "A mapping from characters to numbers is referred to by the technical term **encoding**. We may use the built-in [ord()](https://docs.python.org/3/library/functions.html#ord) function to **encode** any single character. The inverse to that is the built-in [chr()](https://docs.python.org/3/library/functions.html#chr) function, which **decodes** a number into a character." ] }, { "cell_type": "code", - "execution_count": 97, + "execution_count": 140, + "metadata": { + "slideshow": { + "slide_type": "slide" + } + }, + "outputs": [ + { + "data": { + "text/plain": [ + "65" + ] + }, + "execution_count": 140, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "ord(\"A\")" + ] + }, + { + "cell_type": "code", + "execution_count": 141, + "metadata": { + "slideshow": { + "slide_type": "fragment" + } + }, + "outputs": [ + { + "data": { + "text/plain": [ + "'A'" + ] + }, + "execution_count": 141, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "chr(65)" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "slideshow": { + "slide_type": "skip" + } + }, + "source": [ + "Of course, unprintable escape sequences like `\"\\n\"` count as only *one* character."
+ ] + }, + { + "cell_type": "code", + "execution_count": 142, + "metadata": { + "slideshow": { + "slide_type": "fragment" + } + }, + "outputs": [ + { + "data": { + "text/plain": [ + "10" + ] + }, + "execution_count": 142, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "ord(\"\\n\")" + ] + }, + { + "cell_type": "code", + "execution_count": 143, + "metadata": { + "slideshow": { + "slide_type": "fragment" + } + }, + "outputs": [ + { + "data": { + "text/plain": [ + "'\\n'" + ] + }, + "execution_count": 143, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "chr(10)" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "slideshow": { + "slide_type": "skip" + } + }, + "source": [ + "In ASCII, the numbers `0` through `31` (and `127`) are mapped to all kinds of unprintable control characters. The decimal digits are encoded with the numbers `48` through `57`, the upper case letters with `65` through `90`, and the lower case letters with `97` through `122`. While this seems random as first, there is of course a \"sophisticated\" system behind it. That can immediately be seen when looking at the encoded numbers in their *binary* representations.\n", + "\n", + "For example, the digit `5` is mapped to the number `53` in ASCII. The binary representation of `53` is `0b_11_0101` and the least significant four bits, `0101`, mean $5$. Similarly, the letter `\"E\"` is the fifth letter in the alphabet. It is encoded with the number `69` in ASCII, which is `0b_100_0101` in binary. And, the least significant bits, `0_0101`, mean $5$. Analogously, `\"e\"` is encoded with `101` in ASCII, which is `0b_110_0101` in binary. And, the least significant bits, `0_0101`, mean $5$ again. 
This encoding was chosen mainly because programmers \"in the old days\" needed to implement these encodings \"by hand.\" Python abstracts that logic away from its users.\n", + "\n", + "This encoding scheme is also the cause for the \"weird\" sorting in the \"*String Comparison*\" section above, where `\"apple\"` comes *after* `\"Banana\"`. As `\"a\"` is encoded with `97` and `\"B\"` with `66`, `\"Banana\"` must of course be \"smaller\" than `\"apple\"` when comparison is done in a pairwise fashion of the individual characters." + ] + }, + { + "cell_type": "code", + "execution_count": 144, + "metadata": { + "slideshow": { + "slide_type": "slide" + } + }, + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "48 0b110000 -> 0\n", + "49 0b110001 -> 1\n", + "50 0b110010 -> 2\n", + "51 0b110011 -> 3\n", + "52 0b110100 -> 4\n", + "53 0b110101 -> 5\n", + "54 0b110110 -> 6\n", + "55 0b110111 -> 7\n", + "56 0b111000 -> 8\n", + "57 0b111001 -> 9\n" + ] + } + ], + "source": [ + "for number in range(48, 58):\n", + " print(number, bin(number), \"-> \", chr(number))" + ] + }, + { + "cell_type": "code", + "execution_count": 145, + "metadata": { + "slideshow": { + "slide_type": "slide" + } + }, + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "65 0b1000001 -> A\t66 0b1000010 -> B\t67 0b1000011 -> C\n", + "68 0b1000100 -> D\t69 0b1000101 -> E\t70 0b1000110 -> F\n", + "71 0b1000111 -> G\t72 0b1001000 -> H\t73 0b1001001 -> I\n", + "74 0b1001010 -> J\t75 0b1001011 -> K\t76 0b1001100 -> L\n", + "77 0b1001101 -> M\t78 0b1001110 -> N\t79 0b1001111 -> O\n", + "80 0b1010000 -> P\t81 0b1010001 -> Q\t82 0b1010010 -> R\n", + "83 0b1010011 -> S\t84 0b1010100 -> T\t85 0b1010101 -> U\n", + "86 0b1010110 -> V\t87 0b1010111 -> W\t88 0b1011000 -> X\n", + "89 0b1011001 -> Y\t90 0b1011010 -> Z\t" + ] + } + ], + "source": [ + "for i, number in enumerate(range(65, 91), start=1):\n", + " end = \"\\n\" if i % 3 == 0 else \"\\t\"\n", + " 
print(number, bin(number), \"-> \", chr(number), end=end)" + ] + }, + { + "cell_type": "code", + "execution_count": 146, + "metadata": { + "slideshow": { + "slide_type": "slide" + } + }, + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + " 97 0b1100001 -> a\t 98 0b1100010 -> b\t 99 0b1100011 -> c\n", + "100 0b1100100 -> d\t101 0b1100101 -> e\t102 0b1100110 -> f\n", + "103 0b1100111 -> g\t104 0b1101000 -> h\t105 0b1101001 -> i\n", + "106 0b1101010 -> j\t107 0b1101011 -> k\t108 0b1101100 -> l\n", + "109 0b1101101 -> m\t110 0b1101110 -> n\t111 0b1101111 -> o\n", + "112 0b1110000 -> p\t113 0b1110001 -> q\t114 0b1110010 -> r\n", + "115 0b1110011 -> s\t116 0b1110100 -> t\t117 0b1110101 -> u\n", + "118 0b1110110 -> v\t119 0b1110111 -> w\t120 0b1111000 -> x\n", + "121 0b1111001 -> y\t122 0b1111010 -> z\t" + ] + } + ], + "source": [ + "for i, number in enumerate(range(97, 123), start=1):\n", + " end = \"\\n\" if i % 3 == 0 else \"\\t\"\n", + " print(str(number).rjust(3), bin(number), \"-> \", chr(number), end=end)" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "slideshow": { + "slide_type": "skip" + } + }, + "source": [ + "The remaining `symbols` encoded in ASCII are encoded with the numbers still unused, which is why they are scattered." 
+ ] + }, + { + "cell_type": "code", + "execution_count": 147, + "metadata": { + "slideshow": { + "slide_type": "slide" + } + }, + "outputs": [], + "source": [ + "symbols = (\n", + " list(range(32, 48))\n", + " + list(range(58, 65))\n", + " + list(range(91, 97))\n", + " + list(range(123, 127))\n", + ")" + ] + }, + { + "cell_type": "code", + "execution_count": 148, + "metadata": { + "slideshow": { + "slide_type": "slide" + } + }, + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + " 32 0b100000 -> \t 33 0b100001 -> !\t 34 0b100010 -> \"\n", + " 35 0b100011 -> #\t 36 0b100100 -> $\t 37 0b100101 -> %\n", + " 38 0b100110 -> &\t 39 0b100111 -> '\t 40 0b101000 -> (\n", + " 41 0b101001 -> )\t 42 0b101010 -> *\t 43 0b101011 -> +\n", + " 44 0b101100 -> ,\t 45 0b101101 -> -\t 46 0b101110 -> .\n", + " 47 0b101111 -> /\t 58 0b111010 -> :\t 59 0b111011 -> ;\n", + " 60 0b111100 -> <\t 61 0b111101 -> =\t 62 0b111110 -> >\n", + " 63 0b111111 -> ?\t 64 0b1000000 -> @\t 91 0b1011011 -> [\n", + " 92 0b1011100 -> \\\t 93 0b1011101 -> ]\t 94 0b1011110 -> ^\n", + " 95 0b1011111 -> _\t 96 0b1100000 -> `\t123 0b1111011 -> {\n", + "124 0b1111100 -> |\t125 0b1111101 -> }\t126 0b1111110 -> ~\n" + ] + } + ], + "source": [ + "for i, number in enumerate(symbols, start=1):\n", + " end = \"\\n\" if i % 3 == 0 else \"\\t\"\n", + " print(str(number).rjust(3), bin(number).rjust(10), \"-> \", chr(number), end=end)" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "slideshow": { + "slide_type": "skip" + } + }, + "source": [ + "As the ASCII character set does not work for many languages other than English, various encodings were developed. Popular examples are [ISO 8859-1](https://en.wikipedia.org/wiki/ISO/IEC_8859-1) for western European letters or [Windows 1250](https://en.wikipedia.org/wiki/Windows-1250) for Latin ones. 
Many of these encodings use 8-bit numbers (i.e., `0` through `255`) to map the multitude of non-English letters (e.g., the German [umlauts](https://en.wikipedia.org/wiki/Umlaut_%28linguistics%29) `\"ä\"`, `\"ö\"`, `\"ü\"`, or `\"ß\"`)." + ] + }, + { + "cell_type": "markdown", + "metadata": { + "slideshow": { + "slide_type": "slide" + } + }, + "source": [ + "##### Unicode" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "slideshow": { + "slide_type": "skip" + } + }, + "source": [ + "However, none of these specialized encodings can map *all* characters of *all* languages around the world from *all* times in human history. To achieve that, a truly global standard called **[Unicode](https://en.wikipedia.org/wiki/Unicode)** was developed and its first version released in 1991. Since then, Unicode has been amended with many other \"characters,\" the most popular among them being [emojis](https://en.wikipedia.org/wiki/Emoji) or the [Klingon](https://en.wikipedia.org/wiki/Klingon_scripts) language (from the science fiction series [Star Trek](https://en.wikipedia.org/wiki/Star_Trek)). In Unicode, every character is given an identity referred to as the **code point**. Code points are hexadecimal numbers from `0x0000` through `0x10ffff`, written as U+0000 and U+10FFFF outside of Python. Consequently, there exist at most $1,114,112$ code points, of which only about 10% are currently in use, allowing lots of room for new characters to be invented. The first `128` code points are identical to the ASCII encoding for reasons explained in the \"*The `bytes` Type*\" section further below. There exist plenty of lists of all Unicode characters on the web (e.g., [Wikipedia](https://en.wikipedia.org/wiki/List_of_Unicode_characters)).\n", + "\n", + "All we need to know to print a character is its code point. Python uses the escape sequence `\"\\U\"` that is followed by eight hexadecimal digits. 
Underscore separators are unfortunately *not* allowed here.\n", + "\n", + "So, to print a smiley, we just need to look up the corresponding number (e.g., [here](https://en.wikipedia.org/wiki/Emoji#Unicode_blocks))." + ] + }, + { + "cell_type": "code", + "execution_count": 149, + "metadata": { + "slideshow": { + "slide_type": "slide" + } + }, + "outputs": [ + { + "data": { + "text/plain": [ + "'😄'" + ] + }, + "execution_count": 149, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "\"\\U0001f604\"" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "slideshow": { + "slide_type": "skip" + } + }, + "source": [ + "Every Unicode character also has a descriptive name that we can use with the escape sequence `\"\\N\"` and within curly braces `{}`." + ] + }, + { + "cell_type": "code", + "execution_count": 150, + "metadata": { + "slideshow": { + "slide_type": "fragment" + } + }, + "outputs": [ + { + "data": { + "text/plain": [ + "'😂'" + ] + }, + "execution_count": 150, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "\"\\N{FACE WITH TEARS OF JOY}\"" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "slideshow": { + "slide_type": "skip" + } + }, + "source": [ + "Whenever the code point can be expressed with just four hexadecimal digits, we may use the escape sequence `\"\\u\"` for brevity." 
+ ] + }, + { + "cell_type": "code", + "execution_count": 151, + "metadata": { + "slideshow": { + "slide_type": "fragment" + } + }, + "outputs": [ + { + "data": { + "text/plain": [ + "'A'" + ] + }, + "execution_count": 151, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "\"\\U00000041\" # hex(65) == 0x41" + ] + }, + { + "cell_type": "code", + "execution_count": 152, + "metadata": { + "slideshow": { + "slide_type": "fragment" + } + }, + "outputs": [ + { + "data": { + "text/plain": [ + "'A'" + ] + }, + "execution_count": 152, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "\"\\u0041\"" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "slideshow": { + "slide_type": "skip" + } + }, + "source": [ + "Analogously, if the code point can be expressed with two hexadecimal digits, we may use the escape sequence `\"\\x\"` for even conciser code." + ] + }, + { + "cell_type": "code", + "execution_count": 153, + "metadata": { + "slideshow": { + "slide_type": "fragment" + } + }, + "outputs": [ + { + "data": { + "text/plain": [ + "'A'" + ] + }, + "execution_count": 153, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "\"\\x41\"" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "slideshow": { + "slide_type": "skip" + } + }, + "source": [ + "As the `str` type is based on Unicode, a `str` object's behavior is more in line with how humans view text and not how it is expressed in source code.\n", + "\n", + "For example, while it is obvious that `len(\"A\")` evaluates to `1`, ..." 
+ ] + }, + { + "cell_type": "code", + "execution_count": 154, + "metadata": { + "slideshow": { + "slide_type": "slide" + } + }, + "outputs": [ + { + "data": { + "text/plain": [ + "1" + ] + }, + "execution_count": 154, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "len(\"A\")" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "slideshow": { + "slide_type": "skip" + } + }, + "source": [ + "... what should `len(\"\\N{SNAKE}\")` evaluate to? As the idea of a snake is expressed as *one* \"character,\" [len()](https://docs.python.org/3/library/functions.html#len) also returns `1` here." + ] + }, + { + "cell_type": "code", + "execution_count": 155, + "metadata": { + "slideshow": { + "slide_type": "fragment" + } + }, + "outputs": [ + { + "data": { + "text/plain": [ + "'🐍'" + ] + }, + "execution_count": 155, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "\"\\N{SNAKE}\"" + ] + }, + { + "cell_type": "code", + "execution_count": 156, + "metadata": { + "slideshow": { + "slide_type": "fragment" + } + }, + "outputs": [ + { + "data": { + "text/plain": [ + "1" + ] + }, + "execution_count": 156, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "len(\"\\N{SNAKE}\")" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "slideshow": { + "slide_type": "skip" + } + }, + "source": [ + "Many of the built-in `str` methods also consider Unicode. For example, in contrast to [lower()](https://docs.python.org/3/library/stdtypes.html#str.lower), the [casefold()](https://docs.python.org/3/library/stdtypes.html#str.casefold) method knows that the German `\"ß\"` is commonly converted to `\"ss\"`. So, when searching for exact matches, normalizing text with [casefold()](https://docs.python.org/3/library/stdtypes.html#str.casefold) may yield better results than with [lower()](https://docs.python.org/3/library/stdtypes.html#str.lower)." 
+ ] + }, + { + "cell_type": "code", + "execution_count": 157, + "metadata": { + "slideshow": { + "slide_type": "skip" + } + }, + "outputs": [ + { + "data": { + "text/plain": [ + "'straße'" + ] + }, + "execution_count": 157, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "\"Straße\".lower()" + ] + }, + { + "cell_type": "code", + "execution_count": 158, + "metadata": { + "slideshow": { + "slide_type": "skip" + } + }, + "outputs": [ + { + "data": { + "text/plain": [ + "'strasse'" + ] + }, + "execution_count": 158, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "\"Straße\".casefold()" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "slideshow": { + "slide_type": "skip" + } + }, + "source": [ + "Many other methods like [isdecimal()](https://docs.python.org/3/library/stdtypes.html#str.isdecimal), [isdigit()](https://docs.python.org/3/library/stdtypes.html#str.isdigit), [isnumeric()](https://docs.python.org/3/library/stdtypes.html#str.isnumeric), [isprintable()](https://docs.python.org/3/library/stdtypes.html#str.isprintable), [isidentifier()](https://docs.python.org/3/library/stdtypes.html#str.isidentifier), and many more may be worthwhile to know for the data science practitioner, especially when it comes to data cleaning." + ] + }, + { + "cell_type": "markdown", + "metadata": { + "slideshow": { + "slide_type": "slide" + } + }, + "source": [ + "### Multi-line Strings" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "slideshow": { + "slide_type": "skip" + } + }, + "source": [ + "Sometimes, it is convenient to split text across multiple lines in source code. For example, to make lines fit into the 79 characters requirement of [PEP 8](https://www.python.org/dev/peps/pep-0008/) or because the text consists of many lines and typing out `\"\\n\"` is tedious. However, using single double quotes `\"` around multiple lines results in a `SyntaxError`." 
+ ] + }, + { + "cell_type": "code", + "execution_count": 159, "metadata": { "slideshow": { "slide_type": "slide" @@ -3192,10 +5053,10 @@ "outputs": [ { "ename": "SyntaxError", - "evalue": "EOL while scanning string literal (, line 1)", + "evalue": "EOL while scanning string literal (, line 1)", "output_type": "error", "traceback": [ - "\u001b[0;36m File \u001b[0;32m\"\"\u001b[0;36m, line \u001b[0;32m1\u001b[0m\n\u001b[0;31m \"\u001b[0m\n\u001b[0m ^\u001b[0m\n\u001b[0;31mSyntaxError\u001b[0m\u001b[0;31m:\u001b[0m EOL while scanning string literal\n" + "\u001b[0;36m File \u001b[0;32m\"\"\u001b[0;36m, line \u001b[0;32m1\u001b[0m\n\u001b[0;31m \"\u001b[0m\n\u001b[0m ^\u001b[0m\n\u001b[0;31mSyntaxError\u001b[0m\u001b[0;31m:\u001b[0m EOL while scanning string literal\n" ] } ], @@ -3213,22 +5074,22 @@ } }, "source": [ - "However, by enclosing a string literal with either **triple-double** quotes `\"\"\"` or **triple-single** quotes `'''`, Python creates a \"plain\" `str` object. Docstrings are precisely that, and, by convention, always written in triple-double quotes `\"\"\"`." + "Instead, we may enclose a string literal with either **triple double** quotes `\"\"\"` or **triple single** quotes `'''`. Then, newline characters in the source code are converted into `\"\\n\"` characters in the resulting `str` object. Docstrings are precisely that, and, by convention, always written within triple double quotes `\"\"\"`." ] }, { "cell_type": "code", - "execution_count": 98, + "execution_count": 160, "metadata": { "slideshow": { - "slide_type": "fragment" + "slide_type": "slide" } }, "outputs": [], "source": [ "multi_line = \"\"\"\n", "I am a multi-line string\n", - "consisting of 4 lines.\n", + "consisting of four lines.\n", "\"\"\"" ] }, @@ -3240,25 +5101,25 @@ } }, "source": [ - "Line breaks are kept and implicitly converted into `\\n` characters." 
+ "A caveat is that `\"\\n\"` characters are often inserted at the beginning or end of the text when we try to format the source code nicely." ] }, { "cell_type": "code", - "execution_count": 99, + "execution_count": 161, "metadata": { "slideshow": { - "slide_type": "fragment" + "slide_type": "skip" } }, "outputs": [ { "data": { "text/plain": [ - "'\\nI am a multi-line string\\nconsisting of 4 lines.\\n'" + "'\\nI am a multi-line string\\nconsisting of four lines.\\n'" ] }, - "execution_count": 99, + "execution_count": 161, "metadata": {}, "output_type": "execute_result" } @@ -3267,20 +5128,9 @@ "multi_line" ] }, - { - "cell_type": "markdown", - "metadata": { - "slideshow": { - "slide_type": "skip" - } - }, - "source": [ - "The built-in [print()](https://docs.python.org/3/library/functions.html#print) function correctly prints out the `\\n` characters." - ] - }, { "cell_type": "code", - "execution_count": 100, + "execution_count": 162, "metadata": { "slideshow": { "slide_type": "fragment" @@ -3293,7 +5143,7 @@ "text": [ "\n", "I am a multi-line string\n", - "consisting of 4 lines.\n", + "consisting of four lines.\n", "\n" ] } @@ -3310,15 +5160,15 @@ } }, "source": [ - "Using the [split()](https://docs.python.org/3/library/stdtypes.html#str.split) method with the optional *sep* argument, we confirm that `multi_line` consists of *four* lines with the first and last line breaks being the first and last characters in the `str` object." + "Using the [split()](https://docs.python.org/3/library/stdtypes.html#str.split) method with the optional `sep` argument, we confirm that `multi_line` consists of *four* lines with the first and last line being empty." 
] }, { "cell_type": "code", - "execution_count": 101, + "execution_count": 163, "metadata": { "slideshow": { - "slide_type": "slide" + "slide_type": "fragment" } }, "outputs": [ @@ -3326,15 +5176,15 @@ "name": "stdout", "output_type": "stream", "text": [ - "0 \n", - "1 I am a multi-line string\n", - "2 consisting of 4 lines.\n", - "3 \n" + "1 \n", + "2 I am a multi-line string\n", + "3 consisting of four lines.\n", + "4 \n" ] } ], "source": [ - "for i, line in enumerate(multi_line.split(\"\\n\")):\n", + "for i, line in enumerate(multi_line.split(\"\\n\"), start=1):\n", " print(i, line)" ] }, @@ -3346,12 +5196,75 @@ } }, "source": [ - "The next code cell puts several constructs from this chapter together to create a multi-line `str` object `content`: The `with` statement provides a context that ensures `file` is not left open. Then, the [readlines()](https://docs.python.org/3/library/io.html#io.IOBase.readlines) method returns the contents of `file` as a `list` object holding as many `str` objects as there are lines in the file on disk. Lastly, we concatenate these together with the [join()](https://docs.python.org/3/library/stdtypes.html#str.join) method to obtain `content`. We do so on an empty `str` object `\"\"` as each line already ends with a `\"\\n\"`." + "To mitigate that, we often see the [strip()](https://docs.python.org/3/library/stdtypes.html#str.strip) method in source code." 
] }, { "cell_type": "code", - "execution_count": 102, + "execution_count": 164, + "metadata": { + "slideshow": { + "slide_type": "skip" + } + }, + "outputs": [], + "source": [ + "multi_line = \"\"\"\n", + "I am a multi-line string\n", + "consisting of two lines.\n", + "\"\"\".strip()" + ] + }, + { + "cell_type": "code", + "execution_count": 165, + "metadata": { + "slideshow": { + "slide_type": "skip" + } + }, + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "1 I am a multi-line string\n", + "2 consisting of two lines.\n" + ] + } + ], + "source": [ + "for i, line in enumerate(multi_line.split(\"\\n\"), start=1):\n", + " print(i, line)" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "slideshow": { + "slide_type": "slide" + } + }, + "source": [ + "## The `bytes` Type" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "slideshow": { + "slide_type": "skip" + } + }, + "source": [ + "To end this chapter, we want to briefly look at the `bytes` data type, which conceptually is a sequence of bytes. That data format is probably one of the most generic ways of exchanging data between any two programs or computers (e.g., a web browser obtains its data from a web server in this format).\n", + "\n", + "Let's open a binary file in read-only mode (i.e., `mode=\"rb\"`) and read in all of its contents." + ] + }, + { + "cell_type": "code", + "execution_count": 166, "metadata": { "slideshow": { "slide_type": "slide" @@ -3359,40 +5272,619 @@ }, "outputs": [], "source": [ - "with open(\"lorem_ipsum.txt\") as file:\n", - " content = \"\".join(file.readlines())" + "with open(\"full_house.bin\", mode=\"rb\") as binary_file:\n", + " data = binary_file.read()" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "slideshow": { + "slide_type": "skip" + } + }, + "source": [ + "`data` is an object of type `bytes`." 
] }, { "cell_type": "code", - "execution_count": 103, + "execution_count": 167, "metadata": { "slideshow": { - "slide_type": "-" + "slide_type": "fragment" } }, "outputs": [ { "data": { "text/plain": [ - "\"Lorem Ipsum is simply dummy text of the printing and typesetting industry.\\nLorem Ipsum has been the industry's standard dummy text ever since the 1500s\\nwhen an unknown printer took a galley of type and scrambled it to make a type\\nspecimen book. It has survived not only five centuries but also the leap into\\nelectronic typesetting, remaining essentially unchanged. It was popularised in\\nthe 1960s with the release of Letraset sheets.\\n\"" + "140461335555696" ] }, - "execution_count": 103, + "execution_count": 167, "metadata": {}, "output_type": "execute_result" } ], "source": [ - "content" + "id(data)" ] }, { "cell_type": "code", - "execution_count": 104, + "execution_count": 168, "metadata": { "slideshow": { - "slide_type": "-" + "slide_type": "fragment" + } + }, + "outputs": [ + { + "data": { + "text/plain": [ + "bytes" + ] + }, + "execution_count": 168, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "type(data)" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "slideshow": { + "slide_type": "skip" + } + }, + "source": [ + "Its value is given out in the literal bytes notation with a `b` prefix (cf., the [reference](https://docs.python.org/3/reference/lexical_analysis.html#string-and-bytes-literals)). Every byte is expressed in hexadecimal representation with the escape sequence `\"\\x\"`. This representation is commonly chosen as we can *not* tell what kind of information is hidden in the `data` by just looking at the bytes. Instead, we must be told by some other source how to **decode** the raw bytes into information we can interpret." 
+ ] + }, + { + "cell_type": "code", + "execution_count": 169, + "metadata": { + "slideshow": { + "slide_type": "fragment" + } + }, + "outputs": [ + { + "data": { + "text/plain": [ + "b'\\xf0\\x9f\\x82\\xa7\\xf0\\x9f\\x82\\xb7\\xf0\\x9f\\x83\\x97\\xf0\\x9f\\x83\\x8e\\xf0\\x9f\\x83\\x9e'" + ] + }, + "execution_count": 169, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "data" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "slideshow": { + "slide_type": "skip" + } + }, + "source": [ + "`bytes` objects work like `str` objects in many ways. In particular, they are *sequences* as well: The number of bytes is *finite* and we may *iterate* over them in *order*." + ] + }, + { + "cell_type": "code", + "execution_count": 170, + "metadata": { + "slideshow": { + "slide_type": "slide" + } + }, + "outputs": [ + { + "data": { + "text/plain": [ + "20" + ] + }, + "execution_count": 170, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "len(data)" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "slideshow": { + "slide_type": "skip" + } + }, + "source": [ + "Consisting of 8 bits, a single byte can always be interpreted as a whole number between `0` through `255`. That is exactly what we see when we loop over the `data` ..." + ] + }, + { + "cell_type": "code", + "execution_count": 171, + "metadata": { + "slideshow": { + "slide_type": "fragment" + } + }, + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "240 159 130 167 240 159 130 183 240 159 131 151 240 159 131 142 240 159 131 158 " + ] + } + ], + "source": [ + "for byte in data:\n", + " print(byte, end=\" \")" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "slideshow": { + "slide_type": "skip" + } + }, + "source": [ + "... or index into them." 
+ ] + }, + { + "cell_type": "code", + "execution_count": 172, + "metadata": { + "slideshow": { + "slide_type": "fragment" + } + }, + "outputs": [ + { + "data": { + "text/plain": [ + "158" + ] + }, + "execution_count": 172, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "data[-1]" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "slideshow": { + "slide_type": "skip" + } + }, + "source": [ + "Slicing returns another `bytes` object." + ] + }, + { + "cell_type": "code", + "execution_count": 173, + "metadata": { + "slideshow": { + "slide_type": "fragment" + } + }, + "outputs": [ + { + "data": { + "text/plain": [ + "b'\\xf0\\x82\\xf0\\x82\\xf0\\x83\\xf0\\x83\\xf0\\x83'" + ] + }, + "execution_count": 173, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "data[::2]" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "slideshow": { + "slide_type": "slide" + } + }, + "source": [ + "### Character Encodings" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "slideshow": { + "slide_type": "skip" + } + }, + "source": [ + "Luckily, `data` consists of bytes encoded with the [UTF-8](https://en.wikipedia.org/wiki/UTF-8) encoding. That is the most common way of mapping a Unicode character's code point to a sequence of bytes.\n", + "\n", + "To obtain a `str` object out of a given `bytes` object, we decode it with the `bytes` type's [decode()](https://docs.python.org/3/library/stdtypes.html#bytes.decode) method." 
+ ] + }, + { + "cell_type": "code", + "execution_count": 174, + "metadata": { + "slideshow": { + "slide_type": "slide" + } + }, + "outputs": [], + "source": [ + "cards = data.decode()" + ] + }, + { + "cell_type": "code", + "execution_count": 175, + "metadata": { + "slideshow": { + "slide_type": "fragment" + } + }, + "outputs": [ + { + "data": { + "text/plain": [ + "str" + ] + }, + "execution_count": 175, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "type(cards)" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "slideshow": { + "slide_type": "skip" + } + }, + "source": [ + "So, `data` consisted of a [full house](https://en.wikipedia.org/wiki/List_of_poker_hands#Full_house) hand in a poker game." + ] + }, + { + "cell_type": "code", + "execution_count": 176, + "metadata": { + "slideshow": { + "slide_type": "fragment" + } + }, + "outputs": [ + { + "data": { + "text/plain": [ + "'🂧🂷🃗🃎🃞'" + ] + }, + "execution_count": 176, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "cards" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "slideshow": { + "slide_type": "skip" + } + }, + "source": [ + "To go the opposite direction and encode a given `str` object, we use the `str` type's [encode()](https://docs.python.org/3/library/stdtypes.html#str.encode) method." 
+ ] + }, + { + "cell_type": "code", + "execution_count": 177, + "metadata": { + "slideshow": { + "slide_type": "slide" + } + }, + "outputs": [], + "source": [ + "place = \"Café Kastanientörtchen\"" + ] + }, + { + "cell_type": "code", + "execution_count": 178, + "metadata": { + "slideshow": { + "slide_type": "fragment" + } + }, + "outputs": [ + { + "data": { + "text/plain": [ + "b'Caf\\xc3\\xa9 Kastanient\\xc3\\xb6rtchen'" + ] + }, + "execution_count": 178, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "place.encode()" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "slideshow": { + "slide_type": "skip" + } + }, + "source": [ + "By default, [encode()](https://docs.python.org/3/library/stdtypes.html#str.encode) and [decode()](https://docs.python.org/3/library/stdtypes.html#bytes.decode) use an `encoding=\"utf-8\"` argument. We may use another encoding like, for example, `\"iso-8859-1\"`, which can deal with ASCII and western European letters." + ] + }, + { + "cell_type": "code", + "execution_count": 179, + "metadata": { + "slideshow": { + "slide_type": "fragment" + } + }, + "outputs": [ + { + "data": { + "text/plain": [ + "b'Caf\\xe9 Kastanient\\xf6rtchen'" + ] + }, + "execution_count": 179, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "place.encode(\"iso-8859-1\")" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "slideshow": { + "slide_type": "skip" + } + }, + "source": [ + "However, we must use the *same* encoding for the decoding step as for the encoding step. Otherwise, a `UnicodeDecodeError` is raised." 
+ ] + }, + { + "cell_type": "code", + "execution_count": 180, + "metadata": { + "slideshow": { + "slide_type": "slide" + } + }, + "outputs": [ + { + "ename": "UnicodeDecodeError", + "evalue": "'utf-8' codec can't decode byte 0xe9 in position 3: invalid continuation byte", + "output_type": "error", + "traceback": [ + "\u001b[0;31m---------------------------------------------------------------------------\u001b[0m", + "\u001b[0;31mUnicodeDecodeError\u001b[0m Traceback (most recent call last)", + "\u001b[0;32m\u001b[0m in \u001b[0;36m\u001b[0;34m\u001b[0m\n\u001b[0;32m----> 1\u001b[0;31m \u001b[0mplace\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0mencode\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0;34m\"iso-8859-1\"\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0mdecode\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0m", + "\u001b[0;31mUnicodeDecodeError\u001b[0m: 'utf-8' codec can't decode byte 0xe9 in position 3: invalid continuation byte" + ] + } + ], + "source": [ + "place.encode(\"iso-8859-1\").decode()" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "slideshow": { + "slide_type": "skip" + } + }, + "source": [ + "Not all encodings map all Unicode code points. For example `\"iso-8859-1\"` does not know Czech letters. Below, [encode()](https://docs.python.org/3/library/stdtypes.html#str.encode) raises a `UnicodeEncodeError` because of that." 
+ ] + }, + { + "cell_type": "code", + "execution_count": 181, + "metadata": { + "slideshow": { + "slide_type": "fragment" + } + }, + "outputs": [ + { + "ename": "UnicodeEncodeError", + "evalue": "'latin-1' codec can't encode character '\\u0159' in position 12: ordinal not in range(256)", + "output_type": "error", + "traceback": [ + "\u001b[0;31m---------------------------------------------------------------------------\u001b[0m", + "\u001b[0;31mUnicodeEncodeError\u001b[0m Traceback (most recent call last)", + "\u001b[0;32m\u001b[0m in \u001b[0;36m\u001b[0;34m\u001b[0m\n\u001b[0;32m----> 1\u001b[0;31m \u001b[0;34m\"Dobrý den, přátelé!\"\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0mencode\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0;34m\"iso-8859-1\"\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0m", + "\u001b[0;31mUnicodeEncodeError\u001b[0m: 'latin-1' codec can't encode character '\\u0159' in position 12: ordinal not in range(256)" + ] + } + ], + "source": [ + "\"Dobrý den, přátelé!\".encode(\"iso-8859-1\")" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "slideshow": { + "slide_type": "slide" + } + }, + "source": [ + "### Reading Files (continued)" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "slideshow": { + "slide_type": "skip" + } + }, + "source": [ + "The [open()](https://docs.python.org/3/library/functions.html#open) function takes an optional `encoding` argument as well." 
+ ] + }, + { + "cell_type": "code", + "execution_count": 182, + "metadata": { + "slideshow": { + "slide_type": "slide" + } + }, + "outputs": [ + { + "ename": "UnicodeDecodeError", + "evalue": "'utf-8' codec can't decode byte 0xe4 in position 9: invalid continuation byte", + "output_type": "error", + "traceback": [ + "\u001b[0;31m---------------------------------------------------------------------------\u001b[0m", + "\u001b[0;31mUnicodeDecodeError\u001b[0m Traceback (most recent call last)", + "\u001b[0;32m\u001b[0m in \u001b[0;36m\u001b[0;34m\u001b[0m\n\u001b[1;32m 1\u001b[0m \u001b[0;32mwith\u001b[0m \u001b[0mopen\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0;34m\"umlauts.txt\"\u001b[0m\u001b[0;34m)\u001b[0m \u001b[0;32mas\u001b[0m \u001b[0mfile\u001b[0m\u001b[0;34m:\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0;32m----> 2\u001b[0;31m \u001b[0mprint\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0;34m\"\"\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0mjoin\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0mfile\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0mreadlines\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0m", + "\u001b[0;32m~/.pyenv/versions/anaconda3-2019.10/lib/python3.7/codecs.py\u001b[0m in \u001b[0;36mdecode\u001b[0;34m(self, input, final)\u001b[0m\n\u001b[1;32m 320\u001b[0m \u001b[0;31m# decode input (taking the buffer into account)\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[1;32m 321\u001b[0m \u001b[0mdata\u001b[0m \u001b[0;34m=\u001b[0m \u001b[0mself\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0mbuffer\u001b[0m \u001b[0;34m+\u001b[0m \u001b[0minput\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0;32m--> 322\u001b[0;31m \u001b[0;34m(\u001b[0m\u001b[0mresult\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0mconsumed\u001b[0m\u001b[0;34m)\u001b[0m \u001b[0;34m=\u001b[0m 
\u001b[0mself\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0m_buffer_decode\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0mdata\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0mself\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0merrors\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0mfinal\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0m\u001b[1;32m 323\u001b[0m \u001b[0;31m# keep undecoded input until the next call\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[1;32m 324\u001b[0m \u001b[0mself\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0mbuffer\u001b[0m \u001b[0;34m=\u001b[0m \u001b[0mdata\u001b[0m\u001b[0;34m[\u001b[0m\u001b[0mconsumed\u001b[0m\u001b[0;34m:\u001b[0m\u001b[0;34m]\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n", + "\u001b[0;31mUnicodeDecodeError\u001b[0m: 'utf-8' codec can't decode byte 0xe4 in position 9: invalid continuation byte" + ] + } + ], + "source": [ + "with open(\"umlauts.txt\") as file:\n", + " print(\"\".join(file.readlines()))" + ] + }, + { + "cell_type": "code", + "execution_count": 183, + "metadata": { + "slideshow": { + "slide_type": "slide" + } + }, + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "Lerchen-Lärchen-Ähnlichkeiten\n", + "fehlen. Dieses abzustreiten\n", + "mag im Klang der Worte liegen.\n", + "Merke, eine Lerch' kann fliegen,\n", + "Lärchen nicht, was kaum verwundert,\n", + "denn nicht eine unter hundert\n", + "ist geflügelt. 
Auch im Singen\n", + "sind die Bäume zu bezwingen.\n", + "Die Bätrachtung sollte reichen,\n", + "Rächtschreibfählern auszuweichen.\n", + "Leicht gälingt's, zu unterscheiden,\n", + "wär ist wär nun von dän beiden.\n" + ] + } + ], + "source": [ + "with open(\"umlauts.txt\", encoding=\"iso-8859-1\") as file:\n", + " print(\"\".join(file.readlines()))" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "slideshow": { + "slide_type": "slide" + } + }, + "source": [ + "#### Best Practice: Use UTF-8 explicitly" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "slideshow": { + "slide_type": "skip" + } + }, + "source": [ + "A best practice is to *always* specify the `encoding`, especially on computers running on Windows (cf., the talk by Łukasz Langa in the \"*Further Resources*\" section below).\n", + "\n", + "Below is the first example involving [open()](https://docs.python.org/3/library/functions.html#open) one last time: It shows how *all* the contents of a text file should be read into one `str` object." + ] + }, + { + "cell_type": "code", + "execution_count": 184, + "metadata": { + "slideshow": { + "slide_type": "slide" + } + }, + "outputs": [], + "source": [ + "with open(\"lorem_ipsum.txt\", encoding=\"utf-8\") as file:\n", + " content = \"\".join(file.readlines())" + ] + }, + { + "cell_type": "code", + "execution_count": 185, + "metadata": { + "slideshow": { + "slide_type": "fragment" } }, "outputs": [ @@ -3435,7 +5927,294 @@ "source": [ "Textual data is modeled with the **immutable** `str` type.\n", "\n", - "The `str` type supports *four* orthogonal **abstract concepts** that together constitute the idea of a **sequence**: Every `str` object is an iterable container of a finite number of ordered characters." 
+ "The `str` type supports *four* orthogonal **abstract concepts** that together constitute the idea of a **sequence**: Every `str` object is an *iterable container* of a *finite* number of *ordered* characters.\n", + "\n", + "A single **character** in a `str` object follows the idea of a **Unicode** character. It is mapped to a *unique* **code point** that is encoded into **bytes** with a dedicated character encoding, for example, **UTF-8**." + ] + }, + { + "cell_type": "markdown", + "metadata": { + "slideshow": { + "slide_type": "skip" + } + }, + "source": [ + "## Further Resources" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "slideshow": { + "slide_type": "skip" + } + }, + "source": [ + "We refer to the official [Unicode HOWTO](https://docs.python.org/3/howto/unicode.html) in the Python documentation. Furthermore, the [unicodedata](https://docs.python.org/3/library/unicodedata.html) module in the [standard library](https://docs.python.org/3/library/index.html) provides a lot of utility functions around the Unicode standard.\n", + "\n", + "Next is a brief summary video by the YouTube channel [Computerphile](https://www.youtube.com/channel/UC9-y-6csu5WGm29I7JiwpnA) titled \"*Characters, Symbols and the Unicode Miracle*\"." 
+ ] + }, + { + "cell_type": "code", + "execution_count": 186, + "metadata": { + "slideshow": { + "slide_type": "skip" + } + }, + "outputs": [ + { + "data": { + "image/jpeg": "/9j/4AAQSkZJRgABAQAAAQABAAD/2wCEAAUDBAYKBgUFBggHBwYFBQcFBQcHBwgHBwcHBwcHBwcHBwcIChAMBwgOCQcHDBUMDhERExMTCAwWGBYSGBASExIBBQUFCAcIDwkJDRIMDwwUEhISFBQSEhQSEhISEhQSFBISFBISFBIUFBIUEhQUFBIUFBQSFBQSFBQUFBQSFBQUFP/AABEIAWgB4AMBIgACEQEDEQH/xAAcAAABBQEBAQAAAAAAAAAAAAAAAgMEBQYBBwj/xABWEAABAwMBAwcIBQkFBwICCwACAAEDBBESBSEiMQYTMkFCUWEHI1JicYGRoRRysdHwFTNDU4KSwdLTFySTouEWNGNzssLxCCVEsyY1ZHR1doOEo7TE/8QAGgEBAAMBAQEAAAAAAAAAAAAAAAECAwQFBv/EADERAAICAQQBAgUDBAIDAQAAAAABAhEDBBIhMUETUQUUImFxMoGxQpGhwTNSFSPwBv/aAAwDAQACEQMRAD8A+MkIQgBCEIAQhCAEIQgBCEIAQhCAEIQgBCEIAQhCAEIQgBCEIAQhCAEIQgBCEIAQhCAEIQgBCEIAQhCAEIQgBCEIAQhCAEIQgBCEIAQhCAEIQgBCEIAQhCAEIQgBCEIAQhCAEIQgBCEIAQhCAEIQgBCEIAQhCAEIQgBCEIAQhCAEIQgBCEIAQhCAEIQgBCEIAQhCAEIQgBCEIAQhCAEIQgBCEIAQhCAEIQgBCEIAQhCAEIQgBCEIAQhCAEIQgBCEIAQhCA6hXg8mat+DxfEv5V0eS9W/B4t3jvF/KqerD3J2sorour1uS9X3xfvF/IutyWq3bK8NvrH/ACJ6sPcnayhQr9+StX3w/vH/ACI/2TrO+Hu6R/yJ6sPcbWUFkWV7NyXqxbJ3i9zl/KocmkTi+JM1/a/3KVki/JG1leuXVgOkzuQju3Lhtf7lKpOTlTIWAvExdzuX8BR5IryTtZSoWri5CagXROm/fk/po/2D1De36fd9eT+mq+tD3G1mVXFqg5DV79un/fk/pqWHk21Rxy52kt/zJf6SlZYPhMODRiULcN5MtU487R/4k39Fd/sy1T9bR935yb+irbkRRhkLcP5MtU/W0f8AiTf0V1/Jjq362j/xJv6KbkKMMhbj+zLVOHO0X+JN/RR/Zlqn62j/AMSb+im5CjDoW5byY6r+tov8Sb+iuf2Zap+to/8AEm/opuQow6FuG8mWqcedov8AEm/oofyY6p+tov8AEm/opuQow6FuH8mWqcedov8AEm/orv8AZhqv62j2/wDEm/opuQowyFuP7MdU/W0X+JN/RR/Zlqn62i/xJv6KbkKMOhbh/Jlqn62j/wASb+ik/wBmeqfraP8Afm/opuQoxKFtm8muqfrKTZ/xJf6K43k11T9ZSf4kv9JNyFGKQts3k11THLnaO3/Ml/pLj+TXVOlzlJb/AJkv9JNyFGKQtq/k31RtvOUn+JL/AEkF5N9Ub9JSbf8AiS/0k3IgxSFsv7OtS/WUv+JL/SXP7O9S4c5S/wCJL/STciaMchbD+zzUv1lL/iS/0kP5PtS/WUvHH85Lx/wk3IgyK5Za1uQOofrKbZ68n9NcbkDqH6ym/fk/pqNyJpmTQvTuSXkS5Q1+ZUr0YRR9KaeSYInfZuiQwk7vZ78P4Xuan/05cogLmyrNGcu4Kisf/wDxqrzQXbLrFJ+DxhC9erv/AE/6/F06vSPY1RVO/C/D6LdUsnkj1YXcSqKDZ0vOVOzx/wB34KPmMfuifRn7HnaF6DN5KNUHG9Rp752xtLP1vb9QnKfyRaucAzhNQWIyjw52fNnYc
mu3MW3trNt7LqfXh7oejP2POkL0xvIxrPSKp05m6neWp291v7uu03kY1k3IfpOnCQXyE5aln2NfZam8H+Cr8zj/AOyHoT9jzJdXpsnkW1lnH+86ZZyxyaWpceNrv/duF3QHkW1hyERqtLfISIX5+os+PU3922v4KfmMf/ZD0Z+x5ihekVHke1cMsqnTd3j56o+G2n4qIfku1RiIGmoSlEMubaWVjfZdhbKFmyt4txT5iHugsM34MEuKbW6fUxSyU08ZRyxm8ZgWx2JuqyY5g+78NxWqaZm1QyhO8weQjbaXBdenk7uihAyhPBBI5Ys21lwoiZsrbOCA9HGKRsuP470oBkx3bpqblJLKRebijHHqbb8VCDUpcRxJt2R8upeWtPN9m/qRJ/nGHrxSmaTHrxVKNbOQSZSdrdSSrpcfzj48139av8tIr6qL12kx68Ure3cv2Vm6iqlxLzj44tjtSSmk80RSPkNutPlX7k+sjUmxP0v2VU6tRyZZDfJQBqpchykfpvjt8Ni4c0uUQlI7l2t/xV4aaS8kPKiMDyMfXkKn0E0gnlty7SjkNjkxfeIcvmuMZc7uv2d7b4LWWGyFlRutJnIhGQf2lY1GTjkN155R6hUh5uKS2MnhwUqPlFXtiPONjkeV2ZcktHLwaeqrNeBE31VdaZKThiS82g5R1eBZOD72Q7GVjScs5g3ijBx5tuD22rP5XJF2izyxaPRSy/ZXHclk25exYlnCbcODs/yU6Hllp782RuYFs4t9y7EmY2i/fLxQ+WXrKBTcotPPHGePLbxe3sU6KriIo8JAf2Ez/wAVIs6Ll+0u73il23ixddbpl9VBY2LkhssfVS4m6Y33V0X3R7tqE2N5Fj6q6+WPXius25x7SeZ/3cWQixkmLFcfJSZO19ZsUOO8JKLFkaxeKGyy8U+X6P0krDfUiyKzFkS4LEpYhv8AqojDeIeyosiyIzF2UnEsfVUwA6Q33VwQ3SUiyEQkw+qkll2lMcNxIkj3BSxZFLJIPLLryUuWPdH0lyaPol2ksghExZeskOxZespxDv8AHpCkGP73NoLIPnMi45JFpPFTSbpe5Jv6P6xBZX+c6rq75Fcnpa2coyc46WAcqmVmva/RAb7Mif7HVf6P1nyXo3JesjpdNxEOlFz5Xb84bsz7fC+z3MubU5vTjx2zq02J5Jfg182o01HSxUVOOIwRYxhfciZu0WzaRO9+vInVS3KCRyKQisXSHOz9fTtazd6ws+rySFzhO7nITnt4Z9Z29jP8UydSWBZdIh4u+2z9TrzlbPS2JG0m5Uwn5g5TkLol8erBrMoFeIyDzo7RyxF9j28Ls+z4JXIHya6tqBRzWenosmLnZGs8reoHF28V7tyV8l2m0odF5DK3OET8f2W2WVtsn0Ppj2fPUmh1cnRhNwHrtsd+G3w6vep1DpUkYkJhYZJ4itt4hd7Pd9u37V9O1Gi0gBiMYNj4MsHyv0uJuiLel3WtwVMlovBxl0eIyVJMHNF+il5u3DrO38f3WUCWqHIpBd8u/wBl7g/uWg5T6Rics4bMhyw8Wf8A1f4rz+tlIS/a5z57fdxVIOzRwpF4WrCYZC7AY251uotl2Nm4df46qwdWxnEb3DLdt48W+H2rN1FZgUno9pvB+Nvf9qjSVJfnB2jj7bf6eK6Yxo5mauTVsso+Mu0Ru13e3Btux9rM6o6yvkc5Km7tKMmRdXF9l29r2VXNVWIZCd/OCxEzvf5pRvnvdrHr2M/hd9l/9FptRnbGeVFAFdEBGQR18AsMU+1mIW4RzW6tr723FZar5A62PnQiapEeuCUTdr8Nx8T+S0oniXHD0mte3u4srai1sQxyZjHo2fNnb57G+9aQz5MapdESwwyO5dnkVVR1cUpRTxyxTiO8EgFGbN1bCa9k0IS5Fxy7S9q1vTqbUaXm6dgjrx3oju1isz+bZyK4Ze/qXmtXQSxyz004vFPETBKBbDZ24s67cGpWRezODU4PSfujPiMjZ
Y39ZcYJHEuOPaV6FPc5Nu6QptqfcxHsk+S6NxybhkWJ8sX6KVEBOOSXSgLEWb9nFORCO7ttjI5CpokjYFjl6JJXMFjl+17k/GwsMmT9IuCHMceO9hjZKBFkiJsfW6KW9MX+bFPM8eAjfeHeT1x6V75SMSAiPSybvrJb0sn+t1LBhYh233nL4rrh0Rv2X+aAhDTll+zkh4SyL1VLaIejfsY396W8QvkN/Q2+xTYogBTk5EPrYrrUMhfj4qwYY/3ZMvalxGO6V7Yk/wA07JuitHT5Mcvx4pbafJjl+1ZWQTDul6IuNvah5t3L/hsNlBFle9ATdLwSjordL6qmyzdIvSJuru4qPUSDjjfpSZexANDBiYj2lIEZGPEHcC8HsuSc25Rlf6zp05I8xK/rElWTZJo9SrRLEJj3ep3v9qtaTlLXt0sZMe9rP8lQNNHzuV91SKc48yK9gxTYmNxpKPlgLiXOxO2PSwdvsdXFFr9FIIlzmGXp7FgmeHznoluio3ODiO3o32e1VeJEqZ6xBJGY5AQF2tjsnrE2PrLx6OokHzgETEMbjZnf71Z0PKmvjAR5znMS4G1/ms3jZbceouJLriSydBy0if8APx2yJtobW+Dq9pdYpJBxCUciLg72f4OquLQsn4b2KWwEuOQ5br9lcYh6N+zioJOuJZYrgDdDEOfHdx/gjIf82SAMCSHbdySmMf3SdIEhxLJ0IOE27khxJDEOHHeXDkHv6VkJOSCTJDsWWKJTHexfpFkgiHIf832IBBAWSbsTkXqpZdIdvRXDxyL1hQgZIC3hTTBdSGxzy9X+Flwcf82SAiuJYl6q3XLOUY6OhiD9JTNlb2NdZKhhjknjiJ7DPKEfszJmv81o+WZ3GDbfm5Dx/b/8N8F5mul9cUep8Pi6kylgLIt3YI+bH6oW2/Z817f5HPJkMgxavqwXArHSUxcHbi0srP1dwrzryMcmirdXjjlbOjoh+l1fc++/NRftO1/qi6+rKZxEREW6PR7mWEXbOqfBYU4xAIxgzCI7os2z4IlqhZQjkVfWS33VvLJS4OdY7fI7XVwvlt3VRau0ZCfXu/bx/Hgu1ZFlj9+xQK6bmwIifex72XG5WzsjCjz7lLT4lIMrt7O9up/x3LxvlcAjLLh0csh/iz+/7V63yvmI8i27pbr+H3LyjlRFfnPS6Q+zrVIfqN2vpMTqp3HLgJbpfD/x8FXw1Fsh7OO832qRWlvc2W0fsdVs4Flu/wCi9GC4o4ZDlRJ6N8ey/H4t3JcFaTbuwvh9qjRyD22/HvSpKbtBb8eC148mTXlFsFXG47z7oj3Xe3c9+De9OQT0z9K3q22bfbfZ1rPSMX/cX44KRTlu5Yg/Zv1fLrTYU3OzTUxQC+QXb0nfZ/mFv4KH5QqXn4oNUi2lALUlSeeVwb8yZbL32439noqFDUj+b5zm/YDYO/tHa/wVmBZxTxXbzsTxSA3QLMXfMfqkzP8AtLNXCSkXlH1IuLMFCBP0fqpyGIssi6OWKYYhYiG9sZPs4rvPi48d4TfZ7V61HhuJMho7jkT23sU4NHuyf8NSBlFshJu1knqeYW5wibpK6Ysg/QrgRLn5LG2V97HKynDURiJDbpLrVI4+tjipsWQX08WHjvDbL3pT0Q5R4v0lLmmFxL0it8kyVQPmxFuioJs4NGOQ4v2nH4JRUwtjt3SSufFsfrOXxTM9UOUezopyLFDCOUmXRFBUw723d2fNMSVouRbN0hSQrxyLZu7PkgFlT7xZP2sRXI4vS9JxH3Lr1t+k197IUl6m/ZfdJyH3qaBzs9xYuXwTBluDIL+qTLhS+kz5YuPxTLyiwc3be7/uUEktqmwx8HEvBdOG5R49pRozFxEbe9TvpQ4x4t0VdcgZmEWx27pFim4g35BJ+iKemOMsfREslHqqgWLhukLioYOY7+Iv0h3U5Sbctu8N1HhmHMStuiOIp5qqMSIre5QgSYiuJbej1Jo47gUg9IepRYaqxSFbpKbR1A4ls
6SNgik25lftYkylw0+7xuWLFb2rjOOJDbeJWGmUk85jTUcMs1RJYRjiBzN7eA7VWwRqjmxH6thJJhhLMcXuJby9d5M+Q6vmiKfVpQoAIec5oGaebY2zJhfEfZd1R8t/JlqlAI1cQ/TaCMcvpNOD3ALcZoX3gbx2ipbbHRlaTUauLDCU8ejY95vg60Gm8qhyKOqjtj2x/lWWhqI90TbdFOVMsWWQdrdJUlFMlSZ6RR1lJIXmpAcscrcH+D7VIGK/72K8qGfzvO7cvbZ1d6XyoqY90/OhlkLHxb3rJwZazc8x/wBSaen3clB03lPSS4iT80eXAuHudWoSjj35KCbGRp93j2ckHAOI+snXlHH9nFDyDjigGDhFv3sVw4rF6qdIxfq7WS7IYvjs6KAjvDvEPopmreGMcpSYAxyu/wDButVuu8poIyKOKxn0Sfqb39brFajrBSGRG7vkOP47lrDE32WUbNb+XaTnyjK7Bjuyeu/DZ6P3q3gGMxGQX3ZOisZyb0opz52dnaAbeF3bsN963AOLDiLWEeizcGs2xlWddIS4OUwYnHIXZk+xW2qheCIumJXkjNtt+DfH7lT84P2r0DyF8mvp2oZSv/c9LOKpqQfax5ZOAW9Y4Wy9VnXm63DdT9jv0Ofa9vue0eRnkmNDpMGY2qq21XVv1s5s2Eb/AFRs1vrLfNGLCqXUNYgj3ecZseprKtHlIL9F7j0fvXHGUY9na4SlyaOoMeiopMKqT1Xdy9L5Mo8+siI7z+t9l7qzmmSsbRZTiLfW8LNZZzWhjfIb3IRbhZ/ds/GxV9byphIyxkbHs/Lh8VXSavFJuiTZdLi3u6+K55tPo6IRrsrNYoxfL8e1ef8AKbTxYSEbX6Q/d+O9ej1hE47va61j+UAC2WXSWa4ZoeHa/DjOQ8BL5fh1SEdt0v8Ax7F6Byp08TyL9q6xNTTExEJdId1eljmmjlyY6ZCJr442cS6+5/YkFVExdePwfY1k7JHioxtvcF0RpnLO0hznxfLjvbpeG2/8E9HJ2R6JKIwkxCVvmnDmkfeszEPR4Ws3Vbu+9XMvuSd1seG90mvZn6r7dl/FWmkDcxEtvZz6sH4Pfh3sqMJsuk290firHQpLmIl0h3S+Lfj3LHMntZtif1IreUOnC1ZU47MvO2bxbb87v71VR0m6JetustTyqIQrMib85TRl8nZ/mypI5Rfdt2nIV26ebeNP7Hkapbcsl9x2mESIsuljkKVkLCOXSInFJYiAix2qIcxdr0sl1NGJ0pMgPFnyEvkkue5kN8hLeXY5bAQ2beTfOljzdt0lFEkiOW4xlb63xUmZhxLZ0SxUEDJhxslSVfZV7IoVWPYh2PiVlGIBzLjjjkKflnyx2NuqPPKWWSqyUdxjyj2fnBSIYt8hJu/FIGcssrdHosltOTFzlt5ASKQR7fS5zFTwqBwkHm2ch6LqmGokyLZ0iy96dp6ghy9brUpiiVLIOPRbeBy+ChtvAMgj2t5Pc+WOPq4pISWDmxtipFC5ubESxb83b5pg3uUWLOwlZLlkIt23d8u9OFl5vZ0einIo4Eg+bK27zjiScqIYnPGz9JNkxbuzoklvITkJW6KWKG4aYWPmybpXxQFOO9k3bYVJCQnLK28tfyO5Ba3XllS0zhTyFvVNR5qFrdbO7XP9lnUMlGJClHzo23h6KuOTnJuvqijioqaaoMuk8YO4B9cuiDeJOy+iORnkW0umxn1Enr6jYRC7YUwP3MF7n+0/uXpVFRRRAMFPHHFEO7HHEIgDewRayqQeG8jfIRcQn1ua3aKmpn2+w53bZ7Bb3r2Lk5yd02ii5jTqaKnHtODecPxOV94n96uMF1gUgTdJmYnAhB2YiHdcgyb3jfb8U9gjm0B5T5RPJFQVWVdQPHRVpb0gWxpZjfvH9CRP1j38F4Nr3J+toqwqKvhkgPLdv0DbhkBtsMfEV9myQCQ4mzEPc7M7bOGx+tVfKfQqSrgGmrKYKkCkbi7CcV/0oG+0X
H1UYPjaKPfKMm78UqGP84JNvYr1HyheSqvo5SraC9XRETD1NNT3ezc63Ag2tvj78UzoHIakw52tmM5SHejhcRAPUyJid/kubJlUOzoxYXPo81jjuMmzeFT9L1SpiDcJ3ES6D7Q+fBep0vJ3Qo8v7tzhF1ySyP8A91vkpQ6PoDjj9EhbLpWc2+eS53qYnT8pIxGl6/BIA5+aPLEr8H9jq5JhcRIVfU/Jnk8e6NKDD2maWVv+9TdRo9JoqCepGEMIIjOON5DfnHZrsDOTk7bftSOZN0istM12ZGpOMBGQ3YA7Tv8Aw73WO5Qcoed81BeOLJxLqM7N19zeCreUnKGeplzJhjAS83FG1owbuZn4+1VkISmcYi28RboM213fYvQjjUeWY7RBtmQkLPvC+z2bFoeTnJjM4552dgIchbrf2K25PcnubxnqGbPHdj6m+t3v4LR3tvW6PRZZzy+EG6GYKeMB5sWsI2EWbqXXjFv3sU5ziTzhf9yxspRFIN0/VW20PW5dMoygoxP6RWi0lbMG3gz4xDbazCxP+05eCx+RNls6S13KflLFSwQRQRRmckASySSM7s3OCxNZuviy4tZKXEUd2ihHlvwRoeVOpSyiROePRLPbs9vUt5ye1YnEc37sm8V5bp/KeQzEZYQbKxX5sgaz7GsXD7FstLkuQSB0CLebufrXnZoyj4PWwyjJcHodRqQhFzhP2evvt1LFcoOUu6WL72LrS8qdNnj00Z/Sj3W9y8Lr9Rk504yfeyWcU5G1KrJmo63U5Fg7tx2+1rbPmoEGu1seRER+17u7dy7JUQRBztQ/rW638FDHljRZYnS3HLdfIWdm8R4rshDjhHNkkk+zV6V5QJRAY5Xvj0dnVbjd+vips3KWCcR512Yyvjbh/osqVfpco4iPNkXfa3xZ9irK+hx/NO7ej1qjxp/YmLfa5LfXnx3he+XR/gsvURZIermbzZ3fs/6qRFsEvx4qyVGqe4p6mlLEuv8AH2KmmGxcOiS1VSO6s5W5ZS5ehl8104nZyZ40Qib2/wDhckftcfxsun5OiReqEny2ppx7Q3xLpe3x/HWuk4hqOT3Ky0wt7L2F9igMFyL8cdj/AME/RPYcfWYfndUy9GmN8k3laGQU0/HIZIvZhjj7e0s3EVhEu0RP8ld61OX0eLHb58vs6viqenEnLEu0WS30vGOmcGta9RslU1SPnCLsjupUO+JjZujuqtxLexTsEsgju3XZTOTaSuAiNu/JJgG8WVuiXyUd5JMd2+K6EsmGO3FCaJVV0S2dEWxUcoSfmit3ZJJEbjvJRPJu5X9VRQoWbWxK36TFOSRb+VuzkLe5NXkchyvl2Ut2kyHjl2VKFCoxHpYNlzeXzT3MjvFZstzZ7UyPOZdeScF5MuvJSkGJel35BFv9EU8HREmvkTiSUxyNkW31lwJC7KJAXFS5AQ2bISTo0IvEJE1sS+SjtLIw9eK60s2ON3x96twiKZKnhFhLFm3bYrjxb0RW3StktHyY5A8oa0QKnpjCnk/+IqPMxWbrbLeJvqs69a5L+RWij5uXVp5KwxsXNRXgp2frZ3bfNr9dxSyaPD9M06pqZY4KWCSolIn83FGRv7XYW2N4r0Xkz5E9Ql5qfUZIqIP0kTWmmdv2XwF/e6940rTKSniGCjhip4h7MQMPve2138VMVCaMZyV8mGgUWMg0zVU/62qtM973uIWwD4LcxbN0WsPdw+zgm2ZBuTDkLXLuva/ft71NCiUy6KYAktiUURQ6lMyaySZJCYSIWzLsszs1/e/BkoUSk1VTYRHLa/Ng5W77N39SYilkfpNbHpP3v6rd3jsTjklUKHBcsd7DLuZ3+2yI5BLLHpRljI3c9r7fc7JoWsOIu7D2eH8WXY2ER3e0WRd7u/W79aCjIeVLUMIIKIXfKcnkkZn2c2FrZN171v3XXj9fWyQnznZlJo5G8ey/w2e5lreW+olNW1c4vuR+Yi/5cb2Z/wBp7l+0sJyklIqeQR6WP
zbgvIzy3zPb02Jwgh2XWP8AMTqB+WJC5wRfo/8AhZ4Ya1wikFwMZ4+cuEg7NpC7EL7RK4vst3Kfp2nSR5S1T9KxCLOzt3te3HipWFMtPLtRq9Fq5ea589wCHIXfidvRv1eKgcqNa5wTjO7hjiPXs6/cqXW9ZlwItrj0RbgzNbrfqZZyXUp5cY7W7P4Z1vDGocnFOUsjJsWjySyxFTszgRNk77uF+9nXqnI3k9pcUGJg0tUe7JM7WcPCNuyyxHJ3TZ2GPK63emxSNjxyUSzbuLLxx7ROq6ZJEWVs4tuJM3wYm6nVf2h2dIVsoDzDmza4kONnWX1mgkiPIbuBdF+7wdIyMMmOuUQsLZbN7YkOG9INklzJIEy3lcyoWA9HZ35LT6hyTlrdOpqkdglQxQRvttlEzQ3J228AWWYixXumkaHUho2jRQStGRUMcsjOzu7PUXndn7n87b3Lj1ctqTO7RK5NfY8x5IcgJ4IKmOeYJpZA5vfc3szbGFnMbM21eg+TXkNOUsVNUGzgJtIRjtbm2e5Be+1+DX2cVa6TyYkI8pZDlLLe7Ife69H02lhoaCWptvuGzvt1NbqWDl6nMjt2rHxH+ozPlhr6aGh+itbLDEW7tmxfLg6bLLWGQN0pN3u2ut95SdelqK0hPNhz6/akaTRxuEeDbxdJ1yepUrO/0nHHXkwuu8jtUlljGKO8A2ylkMRG19rgLvd34qnk5D6sJ/RiKIKIpWnkZjBycwe13dmydrt3r3kdIKWAREjbEcSduPu/HWs1rGi6hFkQHzodnvb2rtx6hx6PPngjk/UeZ67oUke9E2Bj1tezt47NqiaRUFvQS9LHdv1P4eqtfqoVZbpibfJUcull+cJrF2XvtUTyqXZvDDs6IE0GQl6QpsRLHFWAUkjERXuos4lksbN0iHUturOajjkZeiL/AOn2LS1LbqyeoSb04+z7V06fs5dXwhIlux/V5svt+9dgDdIeyQ5e9nUOGS4lHw/SD7lNiLdyvu9K3wyb8dy7WjzojJMTFznER3S9nV+PBLIbCMg9HnA+bbPsTpjbd9Pr/HWuUgXPmi7W7+2G1nWU5G0EOVg4wQDZnylOT3Pi38HVfA45SbOiW78VzXquT6QUYXYYhYP+5/mSq46mT9nLeXVhh9CPJ1PORkmlcnH9pvgpkEV8/RJQIYyfoqQLFiXq9JdVlCQIbo+jtyQw7vq8380wEZOOSVgWOXZS0RQ8Me4JF2SSqnolt7TYpl4SxXXhJRvCQ/hY4yvvY4/JBMWUXpZOmXjJsUpgLLHtKu8ttH8CYpMekQ/xSsSyLF97cStP02pmlGCnikmlk6Ixg5k/sYW4L0Xkr5GtUm5uWvkjoYi6Qfnqi1tm4L4t+0XVwTc2Np5yEVzlHskKtOTnJfVKvnIqCllmykaPnGB2ib60z7gt719DcmPJhoFKMZFB9MnG2UtY/Os78btD+bb4LcRMIiIgzCI7oszWZm8GbgrUweEcmPIbVmAlq1SFOJbxRU/npfY8r7gv9XJep8m+QGiUQj9HpY5JRw89UefluLbCZz2RPxfdZuK0zOlWVqKnLIdDMkziTgQg7MXZd2u3wuoAuyZrJZg5shFji/Tvd7g1xa7Mzb3F3/ZXKJp2yGdwf9WYdbet63uU2NW6AmDEhyHol0bs7fb1JzFN1U+BRDa/Py81d3szbrlt2eFvenA53eyxfe4Ndtntfi/wVRZH1JyGCQxuxCLcLM/Fut22MneZJh3H3u53cmd/a+1k5FJHIBbLjkcUgu3B22EBMh6YXHm8pMOjgzttbuytlb3qb8CxFKecUco9GQcvZ+HTuCdjAREYxZmERxFm4MzbGZkqyhv2FjGC5gpDskk3u8e74qLAwYliWPS7N+Hy6lkuWGsRwhPAN/ptWGIs+3mgtjcH7uLj71faxPHTU8tbLI7iA9f5yQ+zGxPsZr9Qiy8qpNXKSskrqjpzyOXsBtjMN+DdX7K59Tm2Kl2zr0uHe9z6REno5zEsY
Zn3eLRnt+DLF1tNUyHLTQM+UZ+cct0A+s78H8F6rHr472L7uT/anQ1Smk3ZYoj9LIBL7W2rzI0nyepOTapHkdNSRwiOTsR5P1tbZt2X49/4yVBq2qZDLGD5llvPxbjZ2az8V7hU6PoUu9LSU7l3sGD8Ldh/F/ioX+xHJreIaJt6+Vpp+/8A5i6VlijkeKT7PCBGUg3jcucDhx2urLQtLlY4iNuiXBe0Q8lNDj3oKWMS8XM3+JlsRNpdMPQijb2MypPNZrCCRnaGEmHh2mWipAsUZfvJlqW3UpMUe8sEavomxCO6namkEwIS2iQ8FGiAslNhb0vRWibMtqZidY02SKfviLon/B/FVgBvEK9Iq6UZAISZnEt3b+NixGsaRJERcXiy3X/g63hkvhnLlw7eUL5I6WVVqOm6ePRqasY5PCO95T/ZjYn9y+rxoocB2NiI4jwszN1My+bPIuAjyhopC/Rx1JD7eYNvsd175V6wIAUhPuiuDW5KnR2aLE3FtF1S/RozEjtiO8LdXvUPllr1MMBZOz7u6L8PevJuWXLLeiLnMBK48bNdiLZfhwxWE5Ucp5zH847BjvXfgueOVuNI9KGmW5SZacuKyCUpJAjbKO5XBU/JHWB+kRxSs7ZF17PtWXo+UA86OEzGXoPsd/cXFWevcoISijkGJo54ybeZrdduLKmxrhna6ceD3jR6eNx3X6SkanRRkOJW6/x+O5YTkTyhIoAzfeEW67fhldahrlhLam/bwcbhbKbXNMhy47cnHwWS1HSh3tvwv/4Vzq2pX6/WWdr9R6W1Qm2XUWivqaYQHddUFZjkp9bV36KqpjWitFiPUBuksVq42OX1t35/6LbkqbUtOEzy/aXTgybXyc2qxbo8GRbYQ+qpVM9ixs7j0u+3h7FZ1ml9suiNvtVhptPGAc5xL+C65Z1VnFDSu6KyCYccSa4jcfZ4pM0RAYSDtHPIX8HfhdWeuRxCMdWLM2V45Wbg52yZ/bZRmm/u8kn6sWxZ/Tfo8eq6opblaE47G0/HJSarkU88g/rQy9wsz/NlW82Xn/Ryb/qZPyhJkXq7xLlLF2uzkvTh9MUjwpzttiYZB7+0xJ+GUfOZP0lBBr5J2ELjxsrGhNCUcR29G6AkHDef1lEYN3L0VpND5E6pUCJBE8QFbzlReMLP1sztkTfVZ1FAqSmHv6Vk7AxGYxxMUhySNiEbORv7Gba69R0DyXUEeMlbLJVGJZYh5mL2O28RfFlvtG06ipwwo4IoR/4YMzv7S4lwbr6lKgRZ4/oXk41qpxIo2pYiLPnKl8H9jQs2d/cy9E0LySaWBDJWzS1R4tkAeYi6r9F83+LcVsY5U+EqtsRFkvRdPoqYOaooYqcC6TRAw3+s/EnVkEyqAlTwSqSC2GVOBKqCunqW5ooMHHnG51n42u3DwtduHWymU00jkWbMwbMNu+/fdm2M3Dr71aiC3GRRqsZ3ngniJ3CMTGSK7Mzu4uwn4vd+/uTYyJ4TROgShqLDHzriBl1Z7L9bC7tt+CdAxfouz+zaqWJ8aoufd3yLKkd+DXFhKNu4tl/Wy9VPVcmNRSc1+dkkxlZuuHF8nNvVe21W2psEnUSkY4N5wgkyjnMLM4O7Ng7k7bo3u1/FlKYIxHuER6WTts73K6UK41NFu7ofuN8VFoDdBJzgSZtnFzr8w5tZzja1j+N8X8GUlhLHHnD/AMrv8XFdZkmZ5GEiBmMh6nfG/vtxUNk0OU4RgPNg1h2l3u7vtd3d9rvdL54chG7ZF0Wvte3Gzdaj004mO7diHdkB9hg/cTdSXLCJDibXHpex+p2fqfxUV7iiRzij4kJZRPul0o36HtF+w/giOMm63cfHa/xS7J0KHWkXc0w7IUURRifKhKTy0UZ/7rHFLPbsSTM7CzF34s7Pbxdef18WfNTi/Z+zqUvyscui/KkWmhY9OgvHUmFnM6h26bP6Md7W8S8Fn4tapMCEaiJxy3d5rt4O17suTUY+eT0NPL6UkPO5D
zmKPp5Nltdt5lQ1fKCHMuaLnCG+xme3xTdBygExLnYrfUe77OOx2XG8Z2Jmli1cmItu6pNJrxMJZEqClmppRIoCZyHpBwJvaLrp0+7kLqrgTZqR1u4dK5fBOtq1wEb3JY0mIRyulxzk2PrKriyeDZlWi4jt3lICrHd2/WWWgn6PokpfP23b9JQODURzi+O3sp+OUch29FZmKfexv0VNil8ezkpK0jRRzj0bpcwRyAUZMz73WqAZbdf4dSaesLol6WKWKQ9yW00qfV6SeJ7xSSSR+IZRGPwu7Lbcp6uTmsB6Rd3j7FmeT04lVUwl2pwEfsW4ChE6iMj6MZZ+21rN8fsXDq+ZI6tKlFOil1DkkMmjEEsbHLILyRu7XcDfbsfivH+UVJJF5oxfd3R2d38F9Ccp+U1NTiNNYJJ8ciB3sEbO2xn738PFeU8rOUOmzERfRX53JxJ+ebB/djdvZdTijR1re1Z4xXTWPdCxD0Xsu0ozzSxCbme82LbGb5cVrtTpqQzy5to/Y90vStLjAxkGz49bLeUuCKafJodJaSIYvqsJKfqFUTju3UennjIcb9EvBK1FhYMh+r71xS7LXyUddUlvZKommJ1LrHv/ANyridbJItYxK6juyfkdRzdS0QhuQkxK9v8AqTlt5RdUqBjiLLpSbot1+L/BWSsrIiz1UcpjSXxES3X73bjdSypCjgky2iKo449/nB9LIfx3rUEJFSxRlsKWQB92xyf4MtJrjgrBclFrgl9CgHtS1eQ99gazv8XVJq1UIRDSXsRedk8G7Ifx+Cv9dqYyMiF/MUUfNj7mdydvG9m9ywlYJSSyym/SLL48Gbwts9y7tHjvvx/J4/xDKraX4JIVMbnIRPu44/JJCpHHdfoyPsUL6PvSDfeEclwIt3nL9rdZejSPHpGp0vkfqUhZFE0QkPGY2H4i2+3wWm0ryex7pVU7v2sIRt/nPi37LLSDOpEc6tRcf0TRNPpv93hBj2ZSPvm9rbblw792yvI51RhUJ8J0IL0J0+NQLb17Y9+xUUc6VUHI/NlETMQk+QnfAwdrOxW23VuCDQw1Qv0XZ8d0rPe3Xttw2WUh5yxLC2WO7fY1+q7t1LK0lOLEMuwJY7iODkUeG1mZxJ9j4u/RVvHOjrwCRLUVP0iLF94o3IgZ35l2F9uTY3Fyy47egraklkwHnXZz7Ttsa972bw6vcqgJ0y1RI8ssUshxjJb6M4YtdrbzZON87qe0DThMkV9VKMBFE1zG3BruzX2kw9p7bbeCpaiXm4pJRkdijjct98mO3U+W1v2bKfSVNwjImsRRsRN3O7bWRccgsIJ5OaGQJWl3cruzMD+xx6PzUvT60ZIo5xuwyDlZ+LKoYInyyAHy6Wxt/wCt6SmRyj0VDaBbZi44kzOJdJn2s/udP0/Nj0WZsulZmb42VG9cIlibOA7MZHtzb+Duz7v7VlOjmVQXQSJMFYLmURM4GPRErNk3pi7PYmUCOdPc4JbpMzj4szoC0Y04BCq4JRTwTICZgLlkTNl322+504zKKEyVFUC+WLs+JYlZ72fufudQCTZcsktIusSEHDxbpbFkfKnHqkmm/RNJeMJ6ucIp5Cl5l46ezubgXG92Fvqk61kzCQlGTM4lukztdnbrZ2fiszr2nyOIgG9BlwkN2kh8Yj7Q+qXx7KdFoqzzbRfJdRNBH+VpjllEssKeTGJm9DIhzP27qj+VCh02DTaSkooI4hGr5zcba9gdicifeJ+jtJ+plP1usqaWUozd5YtpC934LG8qNaGeIRLZh5wfst+O5VyTTib4YtTTZjo5Yxl5wm3U7ptRHvyY9K+PsuolSGcuIg75Dls+KTpdTGAyRytbKTEX6mvfY64tvB6XqJMkPURsZyDcDLue1/hwdWkPKYYwjjKNz3fOPnd38XvxJVtJSxHLkTO8Qk+Vnt1Wbb7Xv7lAroRxlxvuybr+HBlCiRJpvg3OnanTSxbjs5dpusH8WUkhHHEV5KUhBjLE5gccn
FnstPpPKwhHm6hnfHDzrN3+kKl4jPekzZFUCwiNuilHWC+Kiw1UUgc4Ds45cWdEsQ5D1D/osXAupInfTxyH/MptNqI/5cVQW6JdkkCViLK+IquwspI3FHVRl0vBTQYX/wCpYGGtITxH0ch+Cn0muyN0vSxVaZPButEPGspPVqYv+tl6DrdbzQ5js7XuZv8AVeK6fyltLGRfo5QL/M33L2HlJEMmEfpD9q4tUuUdWB9ni/KbWSkrakpZmiApzKWU32CF9nutZaGq5IUQDQyHXNjV25s8wYJXeIj3HvZ+C8/8qvIzUozqZOnEXnBZr3t1M1uLqvpPJtI1HldzqBj5wws9263sPcuiOOO1Ozoyzyt1HhGg5XNRU9YNINdFkUTyE0soNbF2azvwF9vD1XVRpPKKIiljikYxiLGS21vaL9bePgshJyPrSMowiN8Sx4cPb3JZ8ja2Pe2Cf19rfurf08ddmUZZr6tHpejVe/xuJbw2VtPV3yj49pYvkDp+oDP/AHrbFi+L9fBaqpGxZX6I4rhyRVm8XZWzybxesoUprtbJv7vpJl0RdibKNIpbDdR5B3lLAwWxQxohOfnyfPsxt1NbYpFblkHo5bycoxjAiK7vl0W9qlcBqyRQUsYlkUQZD0VXcpNQFj5gCvLjiVv0YP1e0lW8peUMoy/RKezF+kk4u1+puplUUJWGSU39PJ347eL+3iumGF1uZxZtTFXGI3rNSIhzX625SexrW92V/wB1VDSC+Wzd2fJPVU0ZmUpXx5vIW7mvsZRxaPPHskOQr1cUNsaPnc098mx2M48ykJulurgvG+7btOQrkOOO9+sxSTcRH1s3EfcrmZ6qEyfjnVSEqdCValy4CdPhOqcJk/HMqguY51ICdUoTKQEyAtZzIwkjEnAiHHMXs7IoJhxGKVgGUR3rXa9u2BXu7eN1BjnTzSC/SZn9u1WTKllR1Zc/LFfMIxAhfi4u97g79fC/vU/nBIcSZnHufaypYZBboszeyzfJLgrt/mzbAv0e27G3qv1v4Ke+gXYNHu5NfHeFjciZn72Z3spEtbgIlZzHLzjhtcG9O3F2VVHOpEcyiwXEFSLiJC7OJdF2e7P7FKjnVBFixZBuZdLB7M/7PC/jZTY5kBchKnKfEehsHuZ3w9w8BVZHMpEcyiwW8cyfjmVSEyfjlUAtgmSa+tKOnnnBsyijcxDvs3X4KvlrI4xyM7CRMI9bu78GZm2u/gpUcokPqkPB26n6nZ9rexWT8sEp5Jzp9yQI5Tj3ZGByBr22sLv3P3pnSIqkB5o3jiCPdvFvnMfalIj6OXdtVfHFPGcYA7nRx33APGUOGIOT7SjHb0XYlNpZZufxFj+j82+Ty2vnfYwdq3HpK76Jsn0tfJ9KkoisfNwNPzjNa1yxwNuF+u+xWQyqoooI4ylkF3c55M5DN2d3tsYNjdEW6vFTRNUk14IJuai1oXHe+C6Jpd1RlkzCcrdOkPLEd3HuXlGt6YQmWIX7NrL6EracX6W3wWe1TRIzyyjH4LOXRtjdOzwd6eRi5y2BCOI29ipaqAmOTc6RZbV7TXclos9xnyL4KhruSsjCRWB97u22WPB0KR5zo1TzXPiYNIMm8Ldz7fx7kmLI4ixbdLrWnr9HIBLGPtPldupUVFJzYyxGO7zmQ24t3+5VkqRrimr5MzqMBCPNi3SJRSIscbd3yWj1AecLdCwYbr9d1DkoSwjK31leL4KZGnLgr6XUZ4iyifDeyJuo/ay0tBytjLm4525sh7fEH/iKpaun3S2dGRsVHqKexCQt0rKskmQnRvwrBIRkGzj2bbWTnP36VlgITliMiiJ2HFyw4jf6r7FbUWvSP+dEGLm2LwWbi0WTTNIZ72Q2TJSllw7WXvUeGvF8tnofNTQkF+59/wCCgvZG58hIse0vpLRZOfp9JqSt52jp5y8c4xf+K+dDDo4sziROvoHyUmMuh6XIPSGmeD/BlOH/ALVwa2P0pnVppcsd1uqoJCkpqwWyj
nyjNrMex9lnfi3DYoktXSCXOBczG+12EOPSu7Ks5ZafI8pSWfJYqsetHINqwx5LVM9Zyg0rRouUeuQOJRRCEeV83Btrv4vxWQlISLK3vdPhSk+8fS8UkordS0lIq8nhEmnnEIt7pEqesqS3sdg/NLqZC3vVVZNJf95VivJlJ+w1K9yXXSMt5dckolMcF7Jsm3kg5LIc93dTwLI9QFy+qu7oiReiLl8Epv8AqXKmPzUnrC4/JSJdHn4yZTyyF0pJH2+9R9VqiARiHpSbxN3B1fP7FLpKcilxH9Zj77qJWQ3qpC4jt5v2A1m+TL18VN/g+e1E9q/JV/SCy4dnGy4NQWfOW7OKseZHOLY28O8ybGnxI9jPwx97rrs8+yG1SW9s6RZJPPE4427WSnjCLnINuz80mODcxt2nEn+CWLNsMqdCVVoyJwZFoXLQJk+EyqQkTwSoC2CdPhOqcZU8EyAuQnT4TqmCZPBMgLkJ085iQ4ltH8bW7nVOEyfjmQFzTyk27d39tr/FuKlR1Co45lIjnQF5HUKRHUKjjnT8c6Avo6hLqJZCiOMCwMo/Nne1j6ttuF1TBOpMc6gEzTK6dxxveWLdnilszs/eBi20C4jsf2qwodWiPrcC5x47HsubPZ2Euif7LuqCsjIxyiLmpxFxjkbufiBd4qRpcnmBgOPDmx5uRna4H3mz8Ca+33q7qrKmjnPIcgaPn4xfmCNr4O+y7bLt/ondLAYwxvkcm9PI/TkPrd37vBVFPIIiMY7BHot3KVHOqX4BfR1CfjnVFHOpEdQoBdjOnQnVKE6fCdAXITJ0ZlUDMnhmQFq0ialAX6TqKMqdCRQ0WTI89NfotbxVdPQj9cleO90nD0W96o4I0U6MpU6WRZZMzCqGv5K0xZYxNkXWzL0UqYe09/BNFS3HdawrN4zRZTxvU+RpN0X3e7g6o63k5UiOOLsPs2L3SbT48eFyUSo0q472zwUbWi+9Hz/WaRK26TO5exV9Vp8/bZ29FfQNVosZY+bZ/F2a6qqzkzA/TbeyfgqtEqR4LNTS5cHyUWanky3r5Yr2iv5IX6Nm3X4ss5qHJSUTHFs8R3nbb8lBazzWKepAt25j3f6qZTazI3SuxK9q9FkApB4bzKrqKHeLFv0m97EpeSbJFJre7x3e9fQf/px1qOSilortlSVL7P8AhzWMX/faX4L5i/J/53bbe6vatH5NeVFTpuo01XdypSHmq0Ot48m327yF9vxbtLHPh3w4NcWXa+T7Q1bRIJC5w3ZxLqb+Kxer8loMyIb4/Lu2KRDyrjkiilA2KKSNijMH2ODtdnZ+trOzqNPr9yLb6u38bV5e1V9z0VNlTW8n4QHg3tusxq1PHGPHeV5yi5RDGOIvcsfg68+1PWCkIiJSoWWWT3G62Qd5Ukp7yemqslBMt5aUTvTHmNJI02zrjulE2KckkS7KS7rsLXFVdFkvI+CW+0cU26XTvvCs5M1SswdYZU5T7fOnIYxN18XbNUnPSZDxy7KveXVGUeojKXQlHzb9T8b/AGqiYi3culif+i93TJOCkvJ8trL9Rxfgc56TLryTPPyZFi75dpdyLOLbvY7y5jYpRF97YS6KOShDTSMREN/WUmgIn+rlvJqIbyy+iQ/NPU74gIj2ZHyRol9F8zpQkmxe45JTLQuPCacGRRbrrEgJoyJ4JVXMacGRAWQyp4JVWDInBlQFqEqfCVVISp4JUBbBMn45lUBMngnQFzHMnwmVNHOn451UF1HMpEcypAnUiOdAXccykRzKkjnUiOdCpeRzKRHMqOOoUmOoQF3HMn45lSx1CfjnQF1HMpMcypI51IjqEBdjKn45FShOpMc6ElwEieCRVITp0J0ILcJE9GaqI51JimQFmLpTtfpOoITpwZkJskuHosuPTj2tpJAzJ0JFFE2xk6Yn8BTJ0Q9lrkp/OJYkPZUOJO4op9Mv0lAqNJv0W961bhH2tqbOHLo7B+Co4WXWRo881Lk9CWXOsz/jvWV1T
kYJZ81u+1ew1FNHvYtclXz6fkW98Fm8fsaLJZ4FqHJKUcsWd/Ztb5LO1mjyiJETdHqX0dUaXHjwZh73s3zdZLXqCmcSEcHKTdzYLv7r/btVG9vZ2YdPky/pR55yR5S1NHF9EN3kpdpCF/ORXfbhfZj4eK2/5ZkOCOpG7wSi5RSNwfDizv1EPd0uCrqfRKIDy5tpMR3im37/ALLtj8lquSUkTV9HFiLQc2cRR2bF+eYm2jw6h+K48yi7aR6uLQ5IRubRj6+tzLK7qslZegcseTkISySQCwb3QDh8OpY+ajJi4LnhkT6MpYpIqhDpJp41YlATdSbeIlZszSZAICXGAu0rBwFJKG6q2aJMhOCcjCydGEskthWZuuhg02B2JOVDejsUSU1Vqy8XQzroDPEUBszllmDvwYmb426lipKekcijOOSIw3ScScrP7Dfa3v61rqmWwkXorG1kxHLJJ6S7tFdV4PP1ijJ20N1OmCPNyRSc4BbpPZxIH7iG+z2qLHTXKQb9HpOp9NKTF97M/wAnT2MZZbMCId63B/G3UvRU30zysmm4uH9iCNF0tvRt80qGnJiLb+bU2aMmyImuGzaFi4d9uCiNUDlIXpdH4t9yunZxyjJdot45hIOcDf8Al7n7k4L+7wWU06uKMsh6P6QO9aSmqRkDnInbLufqfxWxcfQuN6ysdB0StrJ+YooTlMd6R2s0cQ8MppS3Yh8SdlWU1FXJ8FZzjBXJ0ivRdXfLTTdNoYI6caxqvVSlb6THTt/dYQs9wzdspDvbbu9exUrwyjBBUyxyRwVefMSEBNHLzb4nzRO1jxK7bvc6zx5oTVrr+fwZYtRDIty669r/AAdY0sTXaKlmlIo6eOSUxjOUgiAjdowa5G7C2wRa73UcSF1ru5o2tXRMGRODIoTEn6OKWSWOCCM5ZZSaOKOMHOQ3fgwiO0n8FLdcsluuSWMqeCVXfLTkrVxxaJ+T6Kqknl02OXVGGOWU2qDsRZj+ifba1m4LNVUc8J8xVRS08+LFzcoEBWfg9ia9uK5sGrx5v0v3/wAOjl0+sxZ19D9+PPDonhKnwmVUEqeCVdJ1FsEyeGdK5HaRNV1tNAMcr0v0mGOumjB3anikKxGRWxB8RPHL0X8Va8sOUdVSyzRwaNTRaeEpU1NPLTO4vxYX58h84WzjdyXDn1myaxxVyfi0jztRr1DIsUFvk/FpfyV8c/ipEc6yminLvFK98vxsVyEy7Uegi6jnUgKjxVLHMtFqmtDR6Bp9aFPTTS1OoVEUhSxibuzMNt52v1fN1zarO8MU0rbdUcus1LwQUlG22lXRyOq8VIjnWY8olTJFrwwRWCIqSmkIQZhC5x5PZma3/hWVPPuir6fL6uNTqr8F9Nn9bGp1V+C+jnT4VCpI51a6RFGYVdTUSc1S0NM9TUyM2R2uzMEbekTv8n998mSOOLlLpGmXLHFBzl0idHOpMdQs6Gp0U8FaelyyyS0NMVXLFKLNeIHZidnYeNyZveyZ5Pa2M4ZD0u0sdPqoZr23x4apmGm1kM97bTXaaaf+TYBUJ8J1SRzKRHMug6i6jmUmOdUmqtOGl1dTFFI9QMtL9EbAneSMyLnnAe3us23xVfoup1Iwc5XxS0xEWI87GUTX7rk3Hj8FzQ1eKU3BPlcHLDW4pzeNSVp14NiM6djnWR5Z6kUeh/Tad7H+UIhza13DHoX9Ha+z2JXJOuneijqaoJQEh5wTOMmZw9Nndto+KtHUwbkm6p1/ixHVwcpJuqdc+eL/ANmyCZPhMqKkr4z3gJn9imFOQ9IT6OXQfh38OC0nkjD9To2nmjBXJpfktxlToyqhpdUgMsQJnLu6/H2KadVGA5m7CPe/2N3v4K+5VZpD662/UWwyJRSjjkT2Eeld9nvvwWUquUYtlgzeq5cXfwFur3qn1nWiKLeI3IujwsHsEdl/HxWE9RFdcnq6f4Vkn+rhGxk16iYiAH5wh6Vtge8n+5U+pcp8SIQYMuzsu3z4rESan
gJCPSL7bKA1ZchG773S61ySyzl5PfwfDMGPxf5NJqGpSHjIZuZylzcTX4d7s3BVlUViL1d0UyE1z5zsxR4x+1+LqPJU3L1clmeiqSpHSYsSx6UhY/F09RVQx1UUo/oqkC/w3Zv4JoJP0hdne+5VcM+8Jet/FSlZlmkkezcoaUTApB7W8sDqdEORbFvdLqRl0vT6kdoy0URF7cWZ/myzOtU45F+8vJa2yPMjIyE9PZV1Q1t1aGeEnVdU0a2jINIpnFKBlImhsksNloZ2MuybNhTh9JJcbpRO6iBN2lCkFWFUyY5vtej0lVotuM9yklwixHpS7v3rKsys+UdXzlQWPQj82P8AF/iq+Md5epghtiedmlukLEbCliuElxjdbFaJEL28E62nwSFvjYvSjfE/ss7+5JjFTaXpKjZssalw0YAVKo6ogPIHsXyfwdk29NJ2Uw+TLsPBNnoldBIcAz5AHOh9JYHsfN5Nnh443Xq/Lj6bLpzR8lDhHSY75wUVwn8Xly3zPG3SfNfPMM5CXG2PRdavkvylqYpRlglOnqOjmL7sjegYvuk3gTLi1ulnl2yg+Y+H0/yeb8Q0mTLtljfMOafMX+RumpGYi5y/OiWJsbOxM/Xdn2s62nlCcv8AZfkTIPZjr4vhqFTb5WVfy05SR1dPER0kUWpRyNnVRPiEkbNtZx43vbZt4K6oh07UOT2maXU1sVFVaRU1BXmbZJHNKUzGFya+03a1+z6zLHLllshOcdlS5rnxV8eDDNlmoQnkg4bJcpc+KtV4OeR5/P6t/wDl2t+2FYDSyuUmXpL1fk1qHJqiCv0+lqefrT0qoGpr5bBE+6xPTUwXs2Ts3pEWPHsrx6iMsi+s6tps3qZZySaVKrVX2aaPM82ec0mlSq1V9mw1nk5qVLANXWUxwwFjvG4fpGZwZxYshez8LKv0+tljljq6WQ454iyikB7OL26n9n2qy5b8utRr6P6DLFDGxFEchx5Xd42tsZ32d6zunMTBiXZW+mlknF+qkuX17HVpvWnCXrxSdvhe3g9C8p/KLVIKbkwVLUyRSVmjQVNS4285KYg5m+y19qwdTNUzz/S6yY5pSFhzN9tm4Ns6lu9ZptM1Sg0Jy1GChn0qhDT6mKdtt4rCxC2TXEmFnv6yzXLk9IgDTdP0s2qqiKIy1KrFyxmkMriwxuRCIiO7u921ceheKEtqhUrl/TVK35o874c8OOWxQqdy/pqlbfdCtC0utqpSgoojmljieeRgxbCMHEXMnJ7M2RC3HtMm9RilgqJKKqDmp47c5G7i7tdrttF7PsdP8jeUtTQyz1MEcchVNG9JIEl7YPJDNdrPxygD5qn1+vqazUJdRnYAKXFsQvizCLC1r+DLu3ZfWqlsrvzZ6G/P6+2l6dd+bNt5M9QnHVtNoglMKeu1Cmjq4mfclZidmyb2GbX9Z1neWuq10+qahp808hUtPXSc1E7tiOJELW9ymeT+rhj1vRp5yaOKLUqeSWQ3sAAxtcyfqbxVDrU4nrOpTg+cUlZKQG3B2c3dnbwWLwweq37V+nuvNmEtPB6tTcVe3uvN+5a0j4iIq9bR9Qaj/KXMH9DKI5xldwZnjAnFzYXLJxyF24dSzISq/ruWde+lDowRw4DSFR85vZ4EZne17X33b3MttTLKkvSSbtXft5N9XPNGK9FJu1d+F5GtPnGQ44wf84bR37ruzLW8u9U5PQF/sxVR1soaZUmdxcWKSQ2bI3JuDWdtll5xybyiKIjvjHIEhd9mdnW45Z6LpdXrNbq46xSRBVzc7HGY5PbFma++Nn2cLLh+I05xU21Hl8X3xXX7nn/Fac4rI5KPL+m+110mR+X8tJV0X+1NKM0ZwVkWmyDJZmdgiF2s31SHbfvUTSa7OKOT1U7y0rNPg5Pfkinq4q2on1b6Z5lrMAc2Abd4vQ+aouT5kMArX4YqhJJtxvi+6/f7m/wlVCSTbjf033Vff7mpCdaDklJJJLLp4xc9BXRc1WhsbCJnu
\n", + "text/html": [ + "\n", + " \n", + " " + ], + "text/plain": [ + "" + ] + }, + "execution_count": 186, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "from IPython.display import YouTubeVideo\n", + "YouTubeVideo(\"MijmeoH9LT4\", width=\"60%\")" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "slideshow": { + "slide_type": "skip" + } + }, + "source": [ + "In his [PyCon Australia 2018](https://2018.pycon-au.org/) talk titled \"*Unicode and Python: The absolute minimum you need to know*\" [Raphaël Merx](https://www.linkedin.com/in/raphaelmerx/) explains some caveats and best practices regarding Unicode." + ] + }, + { + "cell_type": "code", + "execution_count": 187, + "metadata": { + "slideshow": { + "slide_type": "skip" + } + }, + "outputs": [ + { + "data": { + "image/jpeg": "\n", + "text/html": [ + "\n", + " \n", + " " + ], + 
"text/plain": [ + "" + ] + }, + "execution_count": 187, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "YouTubeVideo(\"oXVmZGN6plY\", width=\"60%\")" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "slideshow": { + "slide_type": "skip" + } + }, + "source": [ + "In a similar talk at [PyCon 2017](https://us.pycon.org/2017/) titled \"*Unicode what is the big deal*\" [Łukasz Langa](https://www.linkedin.com/in/llanga/) provides further lessons learned regarding Unicode." + ] + }, + { + "cell_type": "code", + "execution_count": 188, + "metadata": { + "slideshow": { + "slide_type": "skip" + } + }, + "outputs": [ + { + "data": { + "image/jpeg": "\n", + "text/html": [ + "\n", + " \n", + " " + ], + "text/plain": [ + "" + ] + }, + "execution_count": 188, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "YouTubeVideo(\"7m5JA3XaZ4k\", width=\"60%\")" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "slideshow": { + "slide_type": "skip" + } + }, + "source": [ + "In a \"classic\" talk from PyCon 2012 titled \"*Pragmatic Unicode, or, How do I stop the pain?*\" [Ned Batchelder](https://nedbatchelder.com/) explains, among other things, the concept of a \"Unicode Sandwich.\"" + ] + }, + { + "cell_type": "code", + "execution_count": 189, + "metadata": { + "slideshow": { + "slide_type": "skip" + } + }, + "outputs": [ + { + "data": { + "image/jpeg": "\n", + "text/html": [ + "\n", + " \n", + " " + ], + "text/plain": [ + "" + ] + }, + "execution_count": 189, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "YouTubeVideo(\"sgHbC6udIqc\", width=\"60%\")" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "slideshow": { + "slide_type": "skip" + } + }, + "source": [ + "Lastly, in his entertaining talk at [PyCon.DE 2019](https://de.pycon.org/) titled \"*Your Name is Invalid!*\" [Miroslav Šedivý](https://www.linkedin.com/in/%C5%A1ediv%C3%BD/) shows how hard it actually is to write software 
that can process any name a human can possibly have. Miroslav also gave a lightning talk where he shows how he uses only one keyboard for the 12 (!!!) languages he speaks." + ] + }, + { + "cell_type": "code", + "execution_count": 190, + "metadata": { + "slideshow": { + "slide_type": "skip" + } + }, + "outputs": [ + { + "data": { + "image/jpeg": "\n", + "text/html": [ + "\n", + " \n", + " " + ], + "text/plain": [ + "" + ] + }, + "execution_count": 190, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "YouTubeVideo(\"pBuS7EUPnQA\", width=\"60%\")" + ] + }, + { + "cell_type": "code", + "execution_count": 191, + "metadata": { + "slideshow": { + "slide_type": "skip" + } + }, + "outputs": [ + { + "data": { + "image/jpeg": "\n", + "text/html": [ + "\n", + " \n", + " " + ], + "text/plain": [ + "" + ] + }, + "execution_count": 191, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "YouTubeVideo(\"-4QjII981sM\", width=\"60%\")" ] } ], diff --git a/full_house.bin b/full_house.bin new file mode 100644 index 0000000..4ad5e35 --- /dev/null +++ b/full_house.bin @@ -0,0 +1 @@ +🂧🂷🃗🃎🃞 \ No newline at end of file diff --git a/umlauts.txt b/umlauts.txt new file mode 100644 index 0000000..9c0f911 --- /dev/null +++ b/umlauts.txt @@ -0,0 +1,12 @@ +Lerchen-Lärchen-Ähnlichkeiten +fehlen. Dieses abzustreiten +mag im Klang der Worte liegen. +Merke, eine Lerch' kann fliegen, +Lärchen nicht, was kaum verwundert, +denn nicht eine unter hundert +ist geflügelt. Auch im Singen +sind die Bäume zu bezwingen. +Die Bätrachtung sollte reichen, +Rächtschreibfählern auszuweichen. +Leicht gälingt's, zu unterscheiden, +wär ist wär nun von dän beiden. \ No newline at end of file