From 04d53956a3644375996a69f48847cbabcbd3d603 Mon Sep 17 00:00:00 2001 From: Alexander Hess Date: Wed, 30 Oct 2019 11:04:59 +0100 Subject: [PATCH] Streamline previous content --- 00_start_up.ipynb | 9 +- 01_elements.ipynb | 32 +- 02_functions.ipynb | 32 +- 03_conditionals.ipynb | 225 +++-- 04_iteration.ipynb | 32 +- 04_iteration_review_and_exercises.ipynb | 2 +- 05_numbers.ipynb | 67 +- 05_numbers_review_and_exercises.ipynb | 4 +- 06_text.ipynb | 1088 ++++++++++++++++++----- 06_text_review_and_exercises.ipynb | 10 +- lorem_ipsum.txt | 6 + 11 files changed, 1097 insertions(+), 410 deletions(-) create mode 100644 lorem_ipsum.txt diff --git a/00_start_up.ipynb b/00_start_up.ipynb index 9ace23c..97fe84b 100644 --- a/00_start_up.ipynb +++ b/00_start_up.ipynb @@ -792,13 +792,12 @@ "**Part 2: Managing Data and Memory**\n", "\n", "- How is data stored in memory?\n", - " 5. [Numbers](https://nbviewer.jupyter.org/github/webartifex/intro-to-python/blob/master/05_numbers.ipynb)\n", - " 6. Text\n", - " 7. Sequences\n", + " 5. [Bits & Numbers](https://nbviewer.jupyter.org/github/webartifex/intro-to-python/blob/master/05_numbers.ipynb)\n", + " 6. [Bytes & Text](https://nbviewer.jupyter.org/github/webartifex/intro-to-python/blob/master/06_text.ipynb)\n", + " 7. [Sequential Data](https://nbviewer.jupyter.org/github/webartifex/intro-to-python/blob/master/07_sequences.ipynb)\n", " 8. Mappings & Sets\n", - " 9. Arrays\n", "- How can we create custom data types?\n", - " 10. Object-Orientation" + " 9. 
Object-Orientation" ] }, { diff --git a/01_elements.ipynb b/01_elements.ipynb index c941e43..f9c6131 100644 --- a/01_elements.ipynb +++ b/01_elements.ipynb @@ -46,11 +46,11 @@ } }, "source": [ - "As our introductory example, we want to calculate the *average* of all *even* numbers from *one* through *ten*.\n", + "As our introductory example, we want to calculate the *average* of all *even* numbers from `1` through `10`.\n", "\n", "While we could come up with an [analytical solution](https://math.stackexchange.com/questions/935405/what-s-the-difference-between-analytical-and-numerical-approaches-to-problems/935446#935446) (i.e., derive some equation with \"pen and paper\" from, e.g., one of [Faulhaber's formulas](https://en.wikipedia.org/wiki/Faulhaber%27s_formula)), we instead solve the task programmatically.\n", "\n", - "We start by creating a **list** called `numbers` that holds all the individual numbers." + "We start by creating a **list** called `numbers` that holds all the individual numbers between **brackets** `[` and `]`." ] }, { @@ -988,7 +988,7 @@ { "data": { "text/plain": [ - "139829040614096" + "139878568414128" ] }, "execution_count": 28, @@ -1012,7 +1012,7 @@ { "data": { "text/plain": [ - "139829040789352" + "139878568593256" ] }, "execution_count": 29, @@ -1036,7 +1036,7 @@ { "data": { "text/plain": [ - "139829040474736" + "139878567760368" ] }, "execution_count": 30, @@ -1080,7 +1080,7 @@ } }, "source": [ - "`a` and `d` indeed have the same value as is checked with the **equality operator** `==`. The resulting `True` (and the `False` further below) is yet another data type, a so-called **boolean**. We look into them closely in [Chapter 3](https://nbviewer.jupyter.org/github/webartifex/intro-to-python/blob/master/03_conditionals.ipynb)." + "`a` and `d` indeed have the same value as is checked with the **equality operator** `==`. The resulting `True` (and the `False` further below) is yet another data type, a so-called **boolean**. 
We look into them closely in [Chapter 3](https://nbviewer.jupyter.org/github/webartifex/intro-to-python/blob/master/03_conditionals.ipynb#Boolean-Expressions)." ] }, { @@ -1222,7 +1222,7 @@ "source": [ "Different types imply different behaviors for the objects. The `b` object, for example, can be \"asked\" if it could also be interpreted as an `int` with the [is_integer()](https://docs.python.org/3/library/stdtypes.html#float.is_integer) \"functionality\" that comes with every `float` object.\n", "\n", - "Formally, we call such type-specific functionalities **methods** (to differentiate them from functions) and we formally introduce them in Chapter 10. For now, it suffices to know that we access them using the **dot operator** `.`. Of course, `b` could be converted into an `int`, which the boolean value `True` tells us." + "Formally, we call such type-specific functionalities **methods** (to differentiate them from functions) and we formally introduce them in Chapter 9. For now, it suffices to know that we access them using the **dot operator** `.`. Of course, `b` could be converted into an `int`, which the boolean value `True` tells us." ] }, { @@ -1817,7 +1817,7 @@ " " ], "text/plain": [ - "" + "" ] }, "execution_count": 51, @@ -1838,7 +1838,7 @@ } }, "source": [ - "For example, while the above code to calculate the average of the even numbers from 1 through 10 is correct, a Pythonista would re-write it in a more \"Pythonic\" way and use the [sum()](https://docs.python.org/3/library/functions.html#sum) and [len()](https://docs.python.org/3/library/functions.html#len) (= \"length\") built-in functions (cf., [Chapter 2](https://nbviewer.jupyter.org/github/webartifex/intro-to-python/blob/master/02_functions.ipynb)) as well as a so-called **list comprehension** (cf., Chapter 7). Pythonic code runs faster in many cases and is less error-prone." 
+ "For example, while the above code to calculate the average of the even numbers from 1 through 10 is correct, a Pythonista would re-write it in a more \"Pythonic\" way and use the [sum()](https://docs.python.org/3/library/functions.html#sum) and [len()](https://docs.python.org/3/library/functions.html#len) (= \"length\") [built-in functions](https://docs.python.org/3/library/functions.html) (cf., [Chapter 2](https://nbviewer.jupyter.org/github/webartifex/intro-to-python/blob/master/02_functions.ipynb#Built-in-Functions)) as well as a so-called **list comprehension** (cf., [Chapter 7](https://nbviewer.jupyter.org/github/webartifex/intro-to-python/blob/master/07_sequences.ipynb#List-Comprehensions)). Pythonic code runs faster in many cases and is less error-prone." ] }, { @@ -2040,7 +2040,7 @@ "\n", "At the same time, for a beginner's course, it is often easier to code linearly.\n", "\n", - "In real data science projects, one would probably employ a mixed approach and put re-usable code into so-called Python modules (i.e., *.py* files; cf., [Chapter 2](https://nbviewer.jupyter.org/github/webartifex/intro-to-python/blob/master/02_functions.ipynb)) and then use Jupyter notebooks to build up a linear report or storyline for a business argument to be made." + "In real data science projects, one would probably employ a mixed approach and put re-usable code into so-called Python modules (i.e., *.py* files; cf., [Chapter 2](https://nbviewer.jupyter.org/github/webartifex/intro-to-python/blob/master/02_functions.ipynb#Local-Modules-and-Packages)) and then use Jupyter notebooks to build up a linear report or storyline for a business argument to be made." ] }, { @@ -2330,7 +2330,7 @@ } }, "source": [ - "Some variables magically exist when we start a Python process or are added by Jupyter. We may safely ignore the former until Chapter 10 and the latter for good." + "Some variables magically exist when we start a Python process or are added by Jupyter. 
We may safely ignore the former until Chapter 9 and the latter for good." ] }, { @@ -2779,7 +2779,7 @@ "source": [ "Let's change the first element of `x`.\n", "\n", - "Chapter 7 discusses lists in more depth. For now, let's view a `list` object as some sort of **container** that holds an arbitrary number of pointers to other objects and treat the brackets `[]` attached to it as just another operator, called the **indexing operator**. `x[0]` instructs Python to first follow the pointer from the global list of all names to the `x` object. Then, it follows the first pointer it finds there to the `1` object. The indexing operator must be an operator as we merely read the first element and do not change anything in memory.\n", + "[Chapter 7](https://nbviewer.jupyter.org/github/webartifex/intro-to-python/blob/master/07_sequences.ipynb#The-list-Type) discusses lists in more depth. For now, let's view a `list` object as some sort of **container** that holds an arbitrary number of pointers to other objects and treat the brackets `[]` attached to it as just another operator, called the **indexing operator**. `x[0]` instructs Python to first follow the pointer from the global list of all names to the `x` object. Then, it follows the first pointer it finds there to the `1` object. The indexing operator must be an operator as we merely read the first element and do not change anything in memory.\n", "\n", "Note how Python **begins counting at 0**. This is not the case for many other languages, for example, [MATLAB](https://en.wikipedia.org/wiki/MATLAB), [R](https://en.wikipedia.org/wiki/R_%28programming_language%29), or [Stata](https://en.wikipedia.org/wiki/Stata). To understand why this makes sense, see this short [note](https://www.cs.utexas.edu/users/EWD/transcriptions/EWD08xx/EWD831.html) by one of the all-time greats in computer science, the late [Edsger Dijkstra](https://en.wikipedia.org/wiki/Edsger_W._Dijkstra)." 
] @@ -3189,7 +3189,7 @@ " " ], "text/plain": [ - "" + "" ] }, "execution_count": 94, @@ -3579,12 +3579,12 @@ "name": "stdout", "output_type": "stream", "text": [ - "I change the display of the computer\n" + "I change the state of the computer's display\n" ] } ], "source": [ - "print(\"I change the display of the computer\")" + "print(\"I change the state of the computer's display\")" ] }, { @@ -3686,7 +3686,7 @@ } }, "source": [ - "We end each chapter with a summary of the main points. The essence in this first chapter is that just like a sentence in a real language like English may be decomposed into its parts (e.g., subject, predicate, and objects), the same may be done with programming languages." + "We end each chapter with a summary of the main points (i.e., **TL;DR** = \"too long; didn't read\"). The essence in this first chapter is that just as a sentence in a real language like English may be decomposed into its parts (e.g., subject, predicate, and objects), the same may be done with programming languages." ] }, { diff --git a/02_functions.ipynb b/02_functions.ipynb index 941c71b..809982f 100644 --- a/02_functions.ipynb +++ b/02_functions.ipynb @@ -45,7 +45,7 @@ } }, "source": [ - "So-called **[user-defined functions](https://docs.python.org/3/reference/compound_stmts.html#function-definitions)** may be created with the `def` statement. To extend an already familiar example, we re-use the introductory example from [Chapter 1](https://nbviewer.jupyter.org/github/webartifex/intro-to-python/blob/master/01_elements.ipynb) in its final Pythonic version and transform it into the function `average_evens()` below. \n", + "So-called **[user-defined functions](https://docs.python.org/3/reference/compound_stmts.html#function-definitions)** may be created with the `def` statement. 
To extend an already familiar example, we re-use the introductory example from [Chapter 1](https://nbviewer.jupyter.org/github/webartifex/intro-to-python/blob/master/01_elements.ipynb#Best-Practices) in its final Pythonic version and transform it into the function `average_evens()` below. \n", "\n", "A function's **name** must be chosen according to the same naming rules as ordinary variables as Python manages function names like variables. In this book, we further adopt the convention of ending function names with parentheses `()` in text cells for faster comprehension when reading (i.e., `average_evens()` vs. `average_evens`). These are *not* part of the name but must always be written out in the `def` statement for syntactic reasons.\n", "\n", @@ -55,9 +55,9 @@ "\n", "Together, the name and the list of parameters are also referred to as the function's **[signature](https://en.wikipedia.org/wiki/Type_signature)** (i.e., `average_evens(numbers)` below).\n", "\n", - "A function may come with an *explicit* **[return value](https://docs.python.org/3/reference/simple_stmts.html#the-return-statement)** (i.e., \"result\" or \"output\") specified with the `return` statement: Functions that have one are considered **fruitful**; otherwise, they are **void**. Functions of the latter kind are still useful because of their **side effects** (e.g., the [print()](https://docs.python.org/3/library/functions.html#print) built-in). Strictly speaking, they also have an *implicit* return value of `None` that is different from the `False` we saw in Chapter 1.\n", + "A function may come with an *explicit* **[return value](https://docs.python.org/3/reference/simple_stmts.html#the-return-statement)** (i.e., \"result\" or \"output\") specified with the `return` statement: Functions that have one are considered **fruitful**; otherwise, they are **void**. 
Functions of the latter kind are still useful because of their **side effects** (e.g., the [print()](https://docs.python.org/3/library/functions.html#print) built-in). Strictly speaking, they also have an *implicit* return value of `None` that is different from the `False` we saw in [Chapter 1](https://nbviewer.jupyter.org/github/webartifex/intro-to-python/blob/master/01_elements.ipynb#Identity-/-\"Memory-Location\").\n", "\n", - "A function should define a **docstring** that describes what it does in a short subject line, what parameters it expects (i.e., their types), and what it returns (if anything). A docstring is a syntactically valid multi-line string (i.e., type `str`) defined within **triple-double quotes** `\"\"\"`. Strings are covered in depth in [Chapter 6](https://nbviewer.jupyter.org/github/webartifex/intro-to-python/blob/master/06_text.ipynb). Widely adopted standards as to how to format a docstring are [PEP 257](https://www.python.org/dev/peps/pep-0257/) and section 3.8 in [Google's Python Style Guide](https://github.com/google/styleguide/blob/gh-pages/pyguide.md)." + "A function should define a **docstring** that describes what it does in a short subject line, what parameters it expects (i.e., their types), and what it returns (if anything). A docstring is a syntactically valid multi-line string (i.e., type `str`) defined within **triple-double quotes** `\"\"\"`. Strings are covered in depth in [Chapter 6](https://nbviewer.jupyter.org/github/webartifex/intro-to-python/blob/master/06_text.ipynb#The-str-Type). Widely adopted standards as to how to format a docstring are [PEP 257](https://www.python.org/dev/peps/pep-0257/) and section 3.8 in [Google's Python Style Guide](https://github.com/google/styleguide/blob/gh-pages/pyguide.md)." 
] }, { @@ -142,7 +142,7 @@ { "data": { "text/plain": [ - "140519730762208" + "139829511291360" ] }, "execution_count": 3, @@ -657,7 +657,7 @@ "source": [ "[PythonTutor](http://pythontutor.com/visualize.html#code=nums%20%3D%20%5B1,%202,%203,%204,%205,%206,%207,%208,%209,%2010%5D%0A%0Adef%20average_wrong%28numbers%29%3A%0A%20%20%20%20evens%20%3D%20%5Bn%20for%20n%20in%20nums%20if%20n%20%25%202%20%3D%3D%200%5D%0A%20%20%20%20average%20%3D%20sum%28evens%29%20/%20len%28evens%29%0A%20%20%20%20return%20average%0A%0Arv%20%3D%20average_wrong%28%5B123,%20456,%20789%5D%29&cumulative=false&curInstr=0&heapPrimitives=nevernest&mode=display&origin=opt-frontend.js&py=3&rawInputLstJSON=%5B%5D&textReferences=false) is again helpful at visualizing the error interactively: Creating the `list` object `evens` eventually points to takes *12* computational steps, namely one for setting up an empty `list` object, *ten* for filling it with elements derived from `nums` in the global scope, and one to make `evens` point at it (cf., steps 6-18).\n", "\n", - "The frames logic shown by PythonTutor is the mechanism by which Python not only manages the names inside *one* function call but also for *many* potentially simultaneous* calls, as revealed in [Chapter 4](https://nbviewer.jupyter.org/github/webartifex/intro-to-python/blob/master/04_iteration.ipynb). It is the reason why we may re-use the same names for the parameters and variables inside both `average_evens()` and `average_wrong()` without Python mixing them up. So, as we already read in the [Zen of Python](https://www.python.org/dev/peps/pep-0020/), \"namespaces are one honking great idea\" (cf., `import this`), and a frame is just a special kind of namespace." 
+ "The frames logic shown by PythonTutor is the mechanism by which Python not only manages the names inside *one* function call but also for *many* potentially *simultaneous* calls, as revealed in [Chapter 4](https://nbviewer.jupyter.org/github/webartifex/intro-to-python/blob/master/04_iteration.ipynb#Trivial-Example:-Countdown). It is the reason why we may re-use the same names for the parameters and variables inside both `average_evens()` and `average_wrong()` without Python mixing them up. So, as we already read in the [Zen of Python](https://www.python.org/dev/peps/pep-0020/), \"namespaces are one honking great idea\" (cf., `import this`), and a frame is just a special kind of namespace." ] }, { @@ -1170,7 +1170,7 @@ } }, "source": [ - "So far, we have only specified one parameter in each of our user-defined functions. In [Chapter 1](https://nbviewer.jupyter.org/github/webartifex/intro-to-python/blob/master/01_elements.ipynb), however, we saw the built-in function [divmod()](https://docs.python.org/3/library/functions.html#divmod) take two arguments. And, the order of the numbers passed in mattered! Whenever we call a function and list its arguments in a comma separated manner, we say that we pass in the arguments by position or refer to them as **positional arguments**." + "So far, we have only specified one parameter in each of our user-defined functions. In [Chapter 1](https://nbviewer.jupyter.org/github/webartifex/intro-to-python/blob/master/01_elements.ipynb#%28Arithmetic%29-Operators), however, we saw the built-in function [divmod()](https://docs.python.org/3/library/functions.html#divmod) take two arguments. And, the order of the numbers passed in mattered! Whenever we call a function and list its arguments in a comma separated manner, we say that we pass in the arguments by position or refer to them as **positional arguments**." 
] }, { @@ -1941,7 +1941,7 @@ "source": [ "The main point of having functions without a name is to use them in a situation where we know ahead of time that we use the function *once* only.\n", "\n", - "Popular applications of lambda expressions are with the **map-filter-reduce** paradigm in Chapter 7 or when we do \"number crunching\" with **arrays** and **data frames** in Chapter 9." + "Popular applications of lambda expressions occur in combination with the **map-filter-reduce** paradigm or when we do \"number crunching\" with **arrays** and **data frames**. We look at both in detail in [Chapter 7](https://nbviewer.jupyter.org/github/webartifex/intro-to-python/blob/master/07_sequences.ipynb)." ] }, { @@ -2084,7 +2084,7 @@ { "data": { "text/plain": [ - "140519828111832" + "139829616976344" ] }, "execution_count": 58, @@ -2356,7 +2356,7 @@ "source": [ "Observe how the arguments passed to functions do not need to be just variables or simple literals. Instead, we may pass in any *expression* that evaluates to a *new* object of the type the function expects.\n", "\n", - "So just as a reminder from the expression vs. statement discussion in [Chapter 1](https://nbviewer.jupyter.org/github/webartifex/intro-to-python/blob/master/01_elements.ipynb): An expression is *any* syntactically correct combination of variables and literals with operators. And the call operator `()` is yet another operator. So both of the next two code cells are just expressions! They have no permanent side effects in memory. We may execute them as often as we want *without* changing the state of the program (i.e., this Jupyter notebook).\n", + "So just as a reminder from the expression vs. statement discussion in [Chapter 1](https://nbviewer.jupyter.org/github/webartifex/intro-to-python/blob/master/01_elements.ipynb#Expressions): An expression is *any* syntactically correct combination of variables and literals with operators. And the call operator `()` is yet another operator. 
So both of the next two code cells are just expressions! They have no permanent side effects in memory. We may execute them as often as we want *without* changing the state of the program (i.e., this Jupyter notebook).\n", "\n", "So, regarding the very next cell in particular: Although the `2 ** 2` creates a *new* object `4` in memory that is then immediately passed into the [math.sqrt()](https://docs.python.org/3/library/math.html#math.sqrt) function, once that function call returns, \"all is lost\" and the newly created `4` object is forgotten again, as well as the return value of [math.sqrt()](https://docs.python.org/3/library/math.html#math.sqrt)." ] @@ -2537,7 +2537,7 @@ } }, "source": [ - "Besides the usual dunder-style attributes, the built-in [dir()](https://docs.python.org/3/library/functions.html#dir) function lists some attributes in an upper case naming convention and many others starting with a *single* underscore `_`. To understand the former, we must wait until Chapter 10, while the latter is explained further below." + "Besides the usual dunder-style attributes, the built-in [dir()](https://docs.python.org/3/library/functions.html#dir) function lists some attributes in an upper case naming convention and many others starting with a *single* underscore `_`. To understand the former, we must wait until Chapter 9, while the latter is explained further below." ] }, { @@ -2697,7 +2697,7 @@ { "data": { "text/plain": [ - "0.5291407120147841" + "0.4165845384207939" ] }, "execution_count": 75, @@ -2732,7 +2732,7 @@ { "data": { "text/plain": [ - ">" + ">" ] }, "execution_count": 76, @@ -2781,7 +2781,7 @@ { "data": { "text/plain": [ - "4" + "2" ] }, "execution_count": 78, @@ -2927,9 +2927,9 @@ } }, "source": [ - "[numpy](http://www.numpy.org/) is the de-facto standard in the Python world for handling **array-like** data. That is a fancy word for data that can be put into a matrix or vector format. 
We look at it in depth in Chapter 9.\n", + "[numpy](http://www.numpy.org/) is the de-facto standard in the Python world for handling **array-like** data. That is a fancy word for data that can be put into a matrix or vector format. We look at it in depth in [Chapter 7](https://nbviewer.jupyter.org/github/webartifex/intro-to-python/blob/master/07_sequences.ipynb).\n", "\n", - "As [numpy](http://www.numpy.org/) is *not* in the [standard library](https://docs.python.org/3/library/index.html), it must be *manually* installed, for example, with the [pip](https://pip.pypa.io/en/stable/) tool. As mentioned in [Chapter 0](https://nbviewer.jupyter.org/github/webartifex/intro-to-python/blob/master/00_start_up.ipynb), to execute terminal commands from within a Jupyter notebook, we start a code cell with an exclamation mark.\n", + "As [numpy](http://www.numpy.org/) is *not* in the [standard library](https://docs.python.org/3/library/index.html), it must be *manually* installed, for example, with the [pip](https://pip.pypa.io/en/stable/) tool. As mentioned in [Chapter 0](https://nbviewer.jupyter.org/github/webartifex/intro-to-python/blob/master/00_start_up.ipynb#Markdown-vs.-Code-Cells), to execute terminal commands from within a Jupyter notebook, we start a code cell with an exclamation mark.\n", "\n", "If you are running this notebook with an installation of the [Anaconda Distribution](https://www.anaconda.com/distribution/), then [numpy](http://www.numpy.org/) is probably already installed. Running the cell below confirms that." ] @@ -3433,7 +3433,7 @@ } }, "source": [ - "Packages are a generalization of modules, and we look at one in detail in Chapter 10. You may, however, already look at a [sample package](https://github.com/webartifex/intro-to-python/tree/master/sample_package) in the repository, which is nothing but a folder with *.py* files in it.\n", + "Packages are a generalization of modules, and we look at one in detail in Chapter 9. 
You may, however, already look at a [sample package](https://github.com/webartifex/intro-to-python/tree/master/sample_package) in the repository, which is nothing but a folder with *.py* files in it.\n", "\n", "As a further reading on modules and packages, we refer to the [official tutorial](https://docs.python.org/3/tutorial/modules.html)." ] diff --git a/03_conditionals.ipynb b/03_conditionals.ipynb index d0ed61d..90da64f 100644 --- a/03_conditionals.ipynb +++ b/03_conditionals.ipynb @@ -139,7 +139,7 @@ } }, "source": [ - "There are, however, cases where even well-behaved Python does not make us happy. [Chapter 5](https://nbviewer.jupyter.org/github/webartifex/intro-to-python/blob/master/05_numbers.ipynb) provides more insights into this \"bug.\"" + "There are, however, cases where even well-behaved Python does not make us happy. [Chapter 5](https://nbviewer.jupyter.org/github/webartifex/intro-to-python/blob/master/05_numbers.ipynb#Imprecision) provides more insights into this \"bug.\"" ] }, { @@ -189,7 +189,7 @@ { "data": { "text/plain": [ - "94906834637792" + "94731133531104" ] }, "execution_count": 5, @@ -213,7 +213,7 @@ { "data": { "text/plain": [ - "94906834637760" + "94731133531072" ] }, "execution_count": 6, @@ -281,7 +281,7 @@ } }, "source": [ - "Let's not confuse the boolean `False` with `None`, another built-in object! We saw the latter before in [Chapter 2](https://nbviewer.jupyter.org/github/webartifex/intro-to-python/blob/master/02_functions.ipynb) as the *implicit* return value of a function without a `return` statement.\n", + "Let's not confuse the boolean `False` with `None`, another built-in object! 
We saw the latter before in [Chapter 2](https://nbviewer.jupyter.org/github/webartifex/intro-to-python/blob/master/02_functions.ipynb#Function-Definition) as the *implicit* return value of a function without a `return` statement.\n", "\n", "We might think of `None` in a boolean context indicating a \"maybe\" or even an \"unknown\" answer; however, for Python, there are no \"maybe\" or \"unknown\" objects, as we see further below!\n", "\n", @@ -313,7 +313,7 @@ { "data": { "text/plain": [ - "94906834624752" + "94731133518064" ] }, "execution_count": 10, @@ -357,7 +357,7 @@ } }, "source": [ - "`True`, `False`, and `None` have the property that they each exist in memory only *once*. Objects designed this way are so-called **singletons**. This **[design pattern](https://en.wikipedia.org/wiki/Design_Patterns)** was originally developed to keep a program's memory usage at a minimum. It may only be employed in situations where we know that an object does *not* mutate its value (i.e., to re-use the bag analogy from [Chapter 1](https://nbviewer.jupyter.org/github/webartifex/intro-to-python/blob/master/01_elements.ipynb), no flipping of $0$s and $1$s in the bag is allowed). In languages \"closer\" to the memory like C, we would have to code this singleton logic ourselves, but Python has this built in for *some* types.\n", + "`True`, `False`, and `None` have the property that they each exist in memory only *once*. Objects designed this way are so-called **singletons**. This **[design pattern](https://en.wikipedia.org/wiki/Design_Patterns)** was originally developed to keep a program's memory usage at a minimum. It may only be employed in situations where we know that an object does *not* mutate its value (i.e., to re-use the bag analogy from [Chapter 1](https://nbviewer.jupyter.org/github/webartifex/intro-to-python/blob/master/01_elements.ipynb#Objects-vs.-Types-vs.-Values), no flipping of $0$s and $1$s in the bag is allowed). 
In languages \"closer\" to the memory like C, we would have to code this singleton logic ourselves, but Python has this built in for *some* types.\n", "\n", "We verify this with either the `is` operator or by comparing memory addresses." ] @@ -897,9 +897,11 @@ } }, "source": [ - "The operands of the logical operators do not need to be *boolean* expressions as defined above but may be *any* expression. If a sub-expression does *not* evaluate to an object of type `bool`, Python automatically casts it as such.\n", + "The operands of the logical operators do not need to be *boolean* expressions but may be *any* expression. If a sub-expression does *not* evaluate to an object of type `bool`, Python automatically casts it as such.\n", "\n", - "For example, any non-zero numeric object becomes `True`. While this behavior allows writing more concise and thus more \"beautiful\" code, it is also a common source of confusion. `(x - 9)` is cast as `True` and then the overall expression evaluates to `True` as well." + "For example, any non-zero numeric object becomes `True`. While this behavior allows writing more concise and thus more \"beautiful\" code, it is also a common source of confusion.\n", + "\n", + "So, `(x - 9)` is cast as `True` and then the overall expression evaluates to `True` as well." ] }, { @@ -1122,7 +1124,7 @@ } }, "source": [ - "Pythonistas often use the terms **truthy** or **falsy** to describe a non-boolean expression's behavior when used in place of a boolean one." + "Pythonistas use the terms **truthy** or **falsy** to describe a non-boolean expression's behavior when evaluated in a boolean context." ] }, { @@ -1144,14 +1146,18 @@ } }, "source": [ - "When evaluating boolean expressions with logical operators in it, Python follows the **[short-circuiting](https://en.wikipedia.org/wiki/Short-circuit_evaluation)** strategy: First, the inner-most sub-expressions are evaluated. 
Second, with identical **[operator precedence](https://docs.python.org/3/reference/expressions.html#operator-precedence)**, evaluation goes from left to right. Once it is clear what the overall truth value is, no more sub-expressions are evaluated, and the result is *immediately* returned.\n", + "When evaluating expressions involving the `and` and `or` operators, Python follows the **[short-circuiting](https://en.wikipedia.org/wiki/Short-circuit_evaluation)** strategy: Once it is clear what the overall truth value is, no more sub-expressions are evaluated, and the result is *immediately* returned.\n", "\n", - "In summary, data science practitioners must know *how* the following two generic expressions are evaluated:\n", + "Also, if such expressions are evaluated in a non-boolean context, the result is returned as is and *not* cast as a `bool` type.\n", "\n", - "- `x or y`: The `y` expression is evaluated *only if* `x` evaluates to `False`, in which case `y` is returned; otherwise, `x` is returned *without* even looking at `y`.\n", - "- `x and y`: The `y` expression is evaluated *only if* `x` evaluates to `True`. Then, if `y` also evaluates to `True`, it is returned; otherwise, `x` is returned.\n", + "The two rules can be summarized as:\n", "\n", - "Let's look at a couple of examples." + "- `x or y`: If `x` is truthy, it is returned *without* evaluating `y`. Otherwise, `y` is evaluated *and* returned.\n", + "- `x and y`: If `x` is falsy, it is returned *without* evaluating `y`. Otherwise, `y` is evaluated *and* returned.\n", + "\n", + "The rules may also be chained or combined.\n", + "\n", + "Let's look at a couple of examples below. To visualize which sub-expressions are evaluated, we define a helper function `expr()` that prints out the only argument it is passed before returning it." 
] }, { @@ -1163,32 +1169,6 @@ } }, "outputs": [], - "source": [ - "x = 0\n", - "y = 1\n", - "z = 2" - ] - }, - { - "cell_type": "markdown", - "metadata": { - "slideshow": { - "slide_type": "skip" - } - }, - "source": [ - "We define a helper function `expr()` that prints out the only argument it is passed before returning it. With `expr()`, we can see if a sub-expression is evaluated or not." - ] - }, - { - "cell_type": "code", - "execution_count": 37, - "metadata": { - "slideshow": { - "slide_type": "fragment" - } - }, - "outputs": [], "source": [ "def expr(arg):\n", " \"\"\"Print and return the only argument.\"\"\"\n", @@ -1204,12 +1184,12 @@ } }, "source": [ - "With the `or` operator, the first sub-expression that evaluates to `True` is returned." + "With the `or` operator, the first truthy sub-expression is returned." ] }, { "cell_type": "code", - "execution_count": 38, + "execution_count": 37, "metadata": { "slideshow": { "slide_type": "slide" @@ -1230,18 +1210,18 @@ "1" ] }, - "execution_count": 38, + "execution_count": 37, "metadata": {}, "output_type": "execute_result" } ], "source": [ - "expr(x) or expr(y)" + "expr(0) or expr(1)" ] }, { "cell_type": "code", - "execution_count": 39, + "execution_count": 38, "metadata": { "slideshow": { "slide_type": "-" @@ -1261,18 +1241,18 @@ "1" ] }, - "execution_count": 39, + "execution_count": 38, "metadata": {}, "output_type": "execute_result" } ], "source": [ - "expr(y) or expr(z)" + "expr(1) or expr(2) # 2 is not evaluated due to short-circuiting" ] }, { "cell_type": "code", - "execution_count": 40, + "execution_count": 39, "metadata": { "slideshow": { "slide_type": "-" @@ -1293,13 +1273,13 @@ "1" ] }, - "execution_count": 40, + "execution_count": 39, "metadata": {}, "output_type": "execute_result" } ], "source": [ - "expr(x) or expr(y) or expr(z)" + "expr(0) or expr(1) or expr(2) # 2 is not evaluated due to short-circuiting" ] }, { @@ -1310,12 +1290,12 @@ } }, "source": [ - "If all sub-expressions evaluate to 
`False`, the last one is the result." + "If all sub-expressions are falsy, the last one is returned." ] }, { "cell_type": "code", - "execution_count": 41, + "execution_count": 40, "metadata": { "slideshow": { "slide_type": "fragment" @@ -1337,13 +1317,13 @@ "0" ] }, - "execution_count": 41, + "execution_count": 40, "metadata": {}, "output_type": "execute_result" } ], "source": [ - "expr(False) or expr([]) or expr(x)" + "expr(False) or expr([]) or expr(0)" ] }, { @@ -1354,12 +1334,12 @@ } }, "source": [ - "With the `and` operator, the first sub-expression that evaluates to `False` is returned." + "With the `and` operator, the first falsy sub-expression is returned." ] }, { "cell_type": "code", - "execution_count": 42, + "execution_count": 41, "metadata": { "slideshow": { "slide_type": "slide" @@ -1373,6 +1353,38 @@ "Arg: 0\n" ] }, + { + "data": { + "text/plain": [ + "0" + ] + }, + "execution_count": 41, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "expr(0) and expr(1) # 1 is not evaluated due to short-circuiting" + ] + }, + { + "cell_type": "code", + "execution_count": 42, + "metadata": { + "slideshow": { + "slide_type": "-" + } + }, + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "Arg: 1\n", + "Arg: 0\n" + ] + }, { "data": { "text/plain": [ @@ -1385,7 +1397,7 @@ } ], "source": [ - "expr(x) and expr(y)" + "expr(1) and expr(0)" ] }, { @@ -1417,40 +1429,7 @@ } ], "source": [ - "expr(y) and expr(x)" - ] - }, - { - "cell_type": "code", - "execution_count": 44, - "metadata": { - "slideshow": { - "slide_type": "-" - } - }, - "outputs": [ - { - "name": "stdout", - "output_type": "stream", - "text": [ - "Arg: 2\n", - "Arg: 1\n", - "Arg: 0\n" - ] - }, - { - "data": { - "text/plain": [ - "0" - ] - }, - "execution_count": 44, - "metadata": {}, - "output_type": "execute_result" - } - ], - "source": [ - "expr(z) and expr(y) and expr(x)" + "expr(1) and expr(0) and expr(2) # 2 is not evaluated due to short-circuiting" 
] }, { @@ -1461,12 +1440,12 @@ } }, "source": [ - "If all sub-expressions evaluate to `True`, the last one is returned." + "If all sub-expressions are truthy, the last one is returned." ] }, { "cell_type": "code", - "execution_count": 45, + "execution_count": 44, "metadata": { "slideshow": { "slide_type": "fragment" @@ -1487,13 +1466,13 @@ "2" ] }, - "execution_count": 45, + "execution_count": 44, "metadata": {}, "output_type": "execute_result" } ], "source": [ - "expr(y) and expr(z)" + "expr(1) and expr(2)" ] }, { @@ -1543,7 +1522,7 @@ }, { "cell_type": "code", - "execution_count": 46, + "execution_count": 45, "metadata": { "code_folding": [], "slideshow": { @@ -1557,7 +1536,7 @@ }, { "cell_type": "code", - "execution_count": 47, + "execution_count": 46, "metadata": { "code_folding": [], "slideshow": { @@ -1599,7 +1578,7 @@ }, { "cell_type": "code", - "execution_count": 48, + "execution_count": 47, "metadata": { "slideshow": { "slide_type": "slide" @@ -1612,13 +1591,21 @@ }, { "cell_type": "code", - "execution_count": 49, + "execution_count": 48, "metadata": { "slideshow": { "slide_type": "-" } }, - "outputs": [], + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "You read this just as often as you see heads when tossing a coin\n" + ] + } + ], "source": [ "if random.random() > 0.5:\n", " print(\"You read this just as often as you see heads when tossing a coin\")" @@ -1637,7 +1624,7 @@ }, { "cell_type": "code", - "execution_count": 50, + "execution_count": 49, "metadata": { "slideshow": { "slide_type": "skip" @@ -1674,7 +1661,7 @@ }, { "cell_type": "code", - "execution_count": 51, + "execution_count": 50, "metadata": { "slideshow": { "slide_type": "slide" @@ -1717,7 +1704,7 @@ }, { "cell_type": "code", - "execution_count": 52, + "execution_count": 51, "metadata": { "slideshow": { "slide_type": "slide" @@ -1790,7 +1777,7 @@ }, { "cell_type": "code", - "execution_count": 53, + "execution_count": 52, "metadata": { "slideshow": { 
"slide_type": "slide" @@ -1814,7 +1801,7 @@ }, { "cell_type": "code", - "execution_count": 54, + "execution_count": 53, "metadata": { "slideshow": { "slide_type": "-" @@ -1830,7 +1817,7 @@ }, { "cell_type": "code", - "execution_count": 55, + "execution_count": 54, "metadata": { "slideshow": { "slide_type": "-" @@ -1843,7 +1830,7 @@ "9" ] }, - "execution_count": 55, + "execution_count": 54, "metadata": {}, "output_type": "execute_result" } @@ -1865,7 +1852,7 @@ }, { "cell_type": "code", - "execution_count": 56, + "execution_count": 55, "metadata": { "slideshow": { "slide_type": "fragment" @@ -1878,7 +1865,7 @@ }, { "cell_type": "code", - "execution_count": 57, + "execution_count": 56, "metadata": { "slideshow": { "slide_type": "skip" @@ -1891,7 +1878,7 @@ "9" ] }, - "execution_count": 57, + "execution_count": 56, "metadata": {}, "output_type": "execute_result" } @@ -1913,7 +1900,7 @@ }, { "cell_type": "code", - "execution_count": 58, + "execution_count": 57, "metadata": { "slideshow": { "slide_type": "fragment" @@ -1926,7 +1913,7 @@ }, { "cell_type": "code", - "execution_count": 59, + "execution_count": 58, "metadata": { "slideshow": { "slide_type": "skip" @@ -1939,7 +1926,7 @@ "9" ] }, - "execution_count": 59, + "execution_count": 58, "metadata": {}, "output_type": "execute_result" } @@ -1956,7 +1943,7 @@ } }, "source": [ - "Conditional expressions may not only be used in the way described in this section. We already saw them as part of a *list comprehension* in [Chapter 1](https://nbviewer.jupyter.org/github/webartifex/intro-to-python/blob/master/01_elements.ipynb) and [Chapter 2](https://nbviewer.jupyter.org/github/webartifex/intro-to-python/blob/master/02_functions.ipynb) and revisit this in Chapter 7 in greater detail." + "Conditional expressions may not only be used in the way described in this section. 
We already saw them as part of a *list comprehension* in [Chapter 1](https://nbviewer.jupyter.org/github/webartifex/intro-to-python/blob/master/01_elements.ipynb) and [Chapter 2](https://nbviewer.jupyter.org/github/webartifex/intro-to-python/blob/master/02_functions.ipynb) and revisit this construct in greater detail in [Chapter 7](https://nbviewer.jupyter.org/github/webartifex/intro-to-python/blob/master/07_sequences.ipynb#List-Comprehensions)." ] }, { @@ -1987,7 +1974,7 @@ }, { "cell_type": "code", - "execution_count": 60, + "execution_count": 59, "metadata": { "slideshow": { "slide_type": "slide" @@ -2000,7 +1987,7 @@ }, { "cell_type": "code", - "execution_count": 61, + "execution_count": 60, "metadata": { "slideshow": { "slide_type": "-" @@ -2039,7 +2026,7 @@ }, { "cell_type": "code", - "execution_count": 62, + "execution_count": 61, "metadata": { "slideshow": { "slide_type": "fragment" @@ -2078,7 +2065,7 @@ }, { "cell_type": "code", - "execution_count": 63, + "execution_count": 62, "metadata": { "slideshow": { "slide_type": "slide" @@ -2089,7 +2076,7 @@ "name": "stdout", "output_type": "stream", "text": [ - "Yes, division worked smoothly.\n", + "Oops. Division by 0. How does that work?\n", "I am always printed\n" ] } diff --git a/04_iteration.ipynb b/04_iteration.ipynb index b760a25..3a2e42d 100644 --- a/04_iteration.ipynb +++ b/04_iteration.ipynb @@ -8,7 +8,7 @@ } }, "source": [ - "# Chapter 4: Iteration" + "# Chapter 4: Recursion & Looping" ] }, { @@ -859,7 +859,7 @@ "name": "stdout", "output_type": "stream", "text": [ - "39.9 µs ± 5.5 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)\n" + "69.6 µs ± 25.8 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)\n" ] } ], @@ -881,7 +881,7 @@ "name": "stdout", "output_type": "stream", "text": [ - "1.61 ms ± 8.84 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)\n" + "1.55 ms ± 11.9 µs per loop (mean ± std. dev. 
of 7 runs, 100 loops each)\n" ] } ], @@ -903,7 +903,7 @@ "name": "stdout", "output_type": "stream", "text": [ - "199 ms ± 0 ns per loop (mean ± std. dev. of 1 run, 1 loop each)\n" + "189 ms ± 0 ns per loop (mean ± std. dev. of 1 run, 1 loop each)\n" ] } ], @@ -925,7 +925,7 @@ "name": "stdout", "output_type": "stream", "text": [ - "2.21 s ± 0 ns per loop (mean ± std. dev. of 1 run, 1 loop each)\n" + "2.07 s ± 0 ns per loop (mean ± std. dev. of 1 run, 1 loop each)\n" ] } ], @@ -947,7 +947,7 @@ "name": "stdout", "output_type": "stream", "text": [ - "5.8 s ± 0 ns per loop (mean ± std. dev. of 1 run, 1 loop each)\n" + "5.41 s ± 0 ns per loop (mean ± std. dev. of 1 run, 1 loop each)\n" ] } ], @@ -4289,7 +4289,7 @@ } }, "source": [ - "With everything *officially* introduced so far (i.e., without the introductory example in [Chapter 1](https://nbviewer.jupyter.org/github/webartifex/intro-to-python/blob/master/01_elements.ipynb) that only served as an overview), Python would be what is called **[Turing complete](https://en.wikipedia.org/wiki/Turing_completeness)**. That means that anything that could be formulated as an algorithm could be expressed with all the language features we have seen. Note that, in particular, we have *not* yet formally *introduced* the `for` and `while` statements!" + "With everything *officially* introduced so far, Python would be what is called **[Turing complete](https://en.wikipedia.org/wiki/Turing_completeness)**. That means that anything that could be formulated as an algorithm could be expressed with all the language features we have seen. Note that, in particular, we have *not* yet formally *introduced* the `for` and `while` statements!" ] }, { @@ -4565,7 +4565,7 @@ "name": "stdout", "output_type": "stream", "text": [ - "4.69 s ± 0 ns per loop (mean ± std. dev. of 1 run, 1 loop each)\n" + "4.9 s ± 0 ns per loop (mean ± std. dev. 
of 1 run, 1 loop each)\n" ] } ], @@ -4877,7 +4877,7 @@ } }, "source": [ - "For sequences of integers, the [range()](https://docs.python.org/3/library/functions.html#func-range) built-in makes the `for` statement even more convenient: It creates a list-like object of type `range` that generates integers \"on the fly,\" and we look closely at the underlying effects in memory in Chapter 7." + "For sequences of integers, the [range()](https://docs.python.org/3/library/functions.html#func-range) built-in makes the `for` statement even more convenient: It creates a list-like object of type `range` that generates integers \"on the fly,\" and we look closely at the underlying effects in memory in [Chapter 7](https://nbviewer.jupyter.org/github/webartifex/intro-to-python/blob/master/07_sequences.ipynb#Mapping)." ] }, { @@ -5008,13 +5008,13 @@ "\n", "Now, just as we classify objects by their types, we also classify these **concrete data types** (e.g., `int`, `float`, `str`, or `list`) into **abstract concepts**.\n", "\n", - "We did this already in [Chapter 1](https://nbviewer.jupyter.org/github/webartifex/intro-to-python/blob/master/01_elements.ipynb) when we described a `list` object as \"some sort of container that holds [...] pointers to other objects\". So, abstractly speaking, **containers** are any objects that are \"composed\" of other objects and also \"manage\" how these objects are organized. `list` objects, for example, have the property that they model an internal order associated with its elements. There exist, however, other container types, many of which do *not* come with an order. So, containers primarily \"contain\" other objects and have *nothing* to do with looping.\n", + "We did this already in [Chapter 1](https://nbviewer.jupyter.org/github/webartifex/intro-to-python/blob/master/01_elements.ipynb#Who-am-I?-And-how-many?) when we described a `list` object as \"some sort of container that holds [...] pointers to other objects\". 
So, abstractly speaking, **containers** are any objects that are \"composed\" of other objects and also \"manage\" how these objects are organized. `list` objects, for example, have the property that they model an internal order associated with its elements. There exist, however, other container types, many of which do *not* come with an order. So, containers primarily \"contain\" other objects and have *nothing* to do with looping.\n", "\n", "On the contrary, the abstract concept of **iterables** is all about looping: Any object that we can loop over is, by definition, an iterable. So, `range` objects, for example, are iterables that do *not* contain other objects. Moreover, looping does *not* have to occur in a *predictable* order, although this is the case for both `list` and `range` objects.\n", "\n", - "Typically, containers are iterables, and iterables are containers. Yet, only because these two concepts coincide often, we must not think of them as the same. Chapter 10 finally gives an explanation as to how abstract concepts are implemented and play together.\n", + "Typically, containers are iterables, and iterables are containers. Yet, only because these two concepts coincide often, we must not think of them as the same. Chapter 9 finally gives an explanation as to how abstract concepts are implemented and play together.\n", "\n", - "So, `list` objects like `first_names` below are iterable containers. They implement even more abstract concepts, as Chapter 7 reveals." + "So, `list` objects like `first_names` below are iterable containers. They implement even more abstract concepts, as [Chapter 7](https://nbviewer.jupyter.org/github/webartifex/intro-to-python/blob/master/07_sequences.ipynb#Collections-vs.-Sequences) reveals." 
] }, { @@ -5165,7 +5165,7 @@ } }, "source": [ - "If we must have an index variable in the loop's body, we use the [enumerate()](https://docs.python.org/3/library/functions.html#enumerate) built-in that takes an *iterable* as its argument and then generates a \"stream\" of \"pairs\" consisting of an index variable and an object provided by the iterable. There is *no* need to ever revert to the `while` statement to loop over an iterable object." + "If we must have an index variable in the loop's body, we use the [enumerate()](https://docs.python.org/3/library/functions.html#enumerate) built-in that takes an *iterable* as its argument and then generates a \"stream\" of \"pairs\" of an index variable, `i` below, and an object provided by the iterable, `name`, separated by a `,`. There is *no* need to ever revert to the `while` statement with an explicitly managed index variable to loop over an iterable object." ] }, { @@ -5288,7 +5288,7 @@ } }, "source": [ - "In contrast to its recursive counterpart, the iterative `fibonacci()` function below is somewhat harder to read. For example, it is not so obvious as to how many iterations through the `for`-loop we need to make when implementing it. There is an increased risk of making an *off-by-one* error. Moreover, we need to track a `temp` variable along, at least until we have worked through Chapter 7. Do you understand what `temp` does?\n", + "In contrast to its recursive counterpart, the iterative `fibonacci()` function below is somewhat harder to read. For example, it is not so obvious as to how many iterations through the `for`-loop we need to make when implementing it. There is an increased risk of making an *off-by-one* error. Moreover, we need to track a `temp` variable along.\n", "\n", "However, one advantage of calculating Fibonacci numbers in a **forward** fashion with a `for` statement is that we could list the entire sequence in ascending order as we calculate the desired number. 
To show this, we added `print()` statements in `fibonacci()` below.\n", "\n", @@ -5843,7 +5843,7 @@ } }, "source": [ - "Often, we process some iterable with numeric data, for example, a list of numbers as in the introductory example in [Chapter 1](https://nbviewer.jupyter.org/github/webartifex/intro-to-python/blob/master/01_elements.ipynb) or, more realistically, data from a CSV file with many rows and columns.\n", + "Often, we process some iterable with numeric data, for example, a list of numbers as in the introductory example in [Chapter 1](https://nbviewer.jupyter.org/github/webartifex/intro-to-python/blob/master/01_elements.ipynb#Example:-Average-of-a-Subset-of-Numbers) or, more realistically, data from a CSV file with many rows and columns.\n", "\n", "Processing numeric data usually comes down to operations that may be grouped into one of the following three categories:\n", "\n", @@ -5851,7 +5851,7 @@ "- **filtering**: throw away individual samples (e.g., statistical outliers)\n", "- **reducing**: collect individual samples into summary statistics\n", "\n", - "We study this **map-filter-reduce** paradigm extensively in Chapter 7 after introducing more advanced data types that are needed to work with \"big\" data.\n", + "We study this **map-filter-reduce** paradigm extensively in [Chapter 7](https://nbviewer.jupyter.org/github/webartifex/intro-to-python/blob/master/07_sequences.ipynb#The-Map-Filter-Reduce-Paradigm) after introducing more advanced data types that are needed to work with \"big\" data.\n", "\n", "In the remainder of this section, we focus on *filtering out* some samples within a `for`-loop." 
] diff --git a/04_iteration_review_and_exercises.ipynb b/04_iteration_review_and_exercises.ipynb index a953796..64d6903 100644 --- a/04_iteration_review_and_exercises.ipynb +++ b/04_iteration_review_and_exercises.ipynb @@ -5,7 +5,7 @@ "metadata": {}, "source": [ "\n", - "# Chapter 4: Iteration" + "# Chapter 4: Recursion & Looping" ] }, { diff --git a/05_numbers.ipynb b/05_numbers.ipynb index 73daebb..c545cde 100644 --- a/05_numbers.ipynb +++ b/05_numbers.ipynb @@ -8,7 +8,7 @@ } }, "source": [ - "# Chapter 5: Numbers" + "# Chapter 5: Bits & Numbers" ] }, { @@ -21,9 +21,9 @@ "source": [ "After learning about the basic building blocks of expressing and structuring the business logic in programs, we focus our attention on the **data types** Python offers us, both built-in and available via the [standard library](https://docs.python.org/3/library/index.html) or third-party packages.\n", "\n", - "We start with the \"simple\" ones: Numeric types in this chapter and textual data in [Chapter 6](https://nbviewer.jupyter.org/github/webartifex/intro-to-python/blob/master/06_text.ipynb). An important fact that holds for all objects of these types is that they are **immutable**. To re-use the bag analogy from [Chapter 1](https://nbviewer.jupyter.org/github/webartifex/intro-to-python/blob/master/01_elements.ipynb#Objects-vs.-Types-vs.-Values), this means that the $0$s and $1$s making up an object's *value* cannot be changed once the bag is created in memory, implying that any operation with or method on the object creates a *new* object in a *different* memory location.\n", + "We start with the \"simple\" ones: Numerical types in this chapter and textual data in [Chapter 6](https://nbviewer.jupyter.org/github/webartifex/intro-to-python/blob/master/06_text.ipynb). An important fact that holds for all objects of these types is that they are **immutable**. 
To re-use the bag analogy from [Chapter 1](https://nbviewer.jupyter.org/github/webartifex/intro-to-python/blob/master/01_elements.ipynb#Objects-vs.-Types-vs.-Values), this means that the $0$s and $1$s making up an object's *value* cannot be changed once the bag is created in memory, implying that any operation with or method on the object creates a *new* object in a *different* memory location.\n", "\n", - "Chapters 7, 8, and 9 then cover the more \"complex\" data types, including, for example, the `list` type. Finally, Chapter 10 completes the picture by introducing language constructs to create custom types.\n", + "[Chapter 7](https://nbviewer.jupyter.org/github/webartifex/intro-to-python/blob/master/07_sequences.ipynb) and Chapter 8 then cover the more \"complex\" data types, including, for example, the `list` type. Finally, Chapter 9 completes the picture by introducing language constructs to create custom types.\n", "\n", "We have already seen many hints indicating that numbers are not as trivial to work with as it seems at first sight:\n", "\n", @@ -31,7 +31,7 @@ "- [Chapter 3](https://nbviewer.jupyter.org/github/webartifex/intro-to-python/blob/master/03_conditionals.ipynb#Boolean-Expressions) raises questions regarding the **limited precision** of `float` numbers (e.g., `42 == 42.000000000000001` evaluates to `True`), and\n", "- [Chapter 4](https://nbviewer.jupyter.org/github/webartifex/intro-to-python/blob/master/04_iteration.ipynb#Infinite-Recursion) shows that sometimes a `float` \"walks\" and \"quacks\" like an `int`, whereas the reverse is true in other cases.\n", "\n", - "This chapter introduces all the [built-in numeric types](https://docs.python.org/3/library/stdtypes.html#numeric-types-int-float-complex): `int`, `float`, and `complex`. 
To mitigate the limited precision of floating-point numbers, we also look at two replacements for the `float` type in the [standard library](https://docs.python.org/3/library/index.html), namely the `Decimal` type in the [decimals](https://docs.python.org/3/library/decimal.html#decimal.Decimal) and the `Fraction` type in the [fractions](https://docs.python.org/3/library/fractions.html#fractions.Fraction) module." + "This chapter introduces all the [built-in numerical types](https://docs.python.org/3/library/stdtypes.html#numeric-types-int-float-complex): `int`, `float`, and `complex`. To mitigate the limited precision of floating-point numbers, we also look at two replacements for the `float` type in the [standard library](https://docs.python.org/3/library/index.html), namely the `Decimal` type in the [decimal](https://docs.python.org/3/library/decimal.html#decimal.Decimal) and the `Fraction` type in the [fractions](https://docs.python.org/3/library/fractions.html#fractions.Fraction) module." ] }, { @@ -53,7 +53,7 @@ } }, "source": [ - "The simplest numeric type is the `int` type: It behaves like an [integer in ordinary math](https://en.wikipedia.org/wiki/Integer) (i.e., the set $\\mathbb{Z}$) and supports operators in the way we saw in the section on arithmetic operators in [Chapter 1](https://nbviewer.jupyter.org/github/webartifex/intro-to-python/blob/master/01_elements.ipynb#%28Arithmetic%29-Operators)." + "The simplest numerical type is the `int` type: It behaves like an [integer in ordinary math](https://en.wikipedia.org/wiki/Integer) (i.e., the set $\\mathbb{Z}$) and supports operators in the way we saw in the section on arithmetic operators in [Chapter 1](https://nbviewer.jupyter.org/github/webartifex/intro-to-python/blob/master/01_elements.ipynb#%28Arithmetic%29-Operators)."
] }, { @@ -92,7 +92,7 @@ { "data": { "text/plain": [ - "139928698315984" + "139630197982416" ] }, "execution_count": 2, @@ -160,7 +160,7 @@ } }, "source": [ - "A nice feature is using underscores `_` as (thousands) separator in numeric literals. For example, `1_000_000` evaluates to `1000000` in memory." + "A nice feature is using underscores `_` as (thousands) separator in numerical literals. For example, `1_000_000` evaluates to `1000000` in memory." ] }, { @@ -1165,7 +1165,7 @@ } }, "source": [ - "Whereas the boolean literals `True` and `False` are commonly *not* regarded as numeric types, they behave like `1` and `0` in an arithmetic context." + "Whereas the boolean literals `True` and `False` are commonly *not* regarded as numerical types, they behave like `1` and `0` in an arithmetic context." ] }, { @@ -2115,7 +2115,7 @@ { "data": { "text/plain": [ - "139928698489496" + "139630198164096" ] }, "execution_count": 63, @@ -2218,7 +2218,7 @@ } }, "source": [ - "In cases where the dot `.` is unnecessary from a mathematical point of view, we either need to end the number with it nevertheless or use the [float()](https://docs.python.org/3/library/functions.html#float) built-in to cast the number explicitly. [float()](https://docs.python.org/3/library/functions.html#float) can process any numeric object or a properly formatted `str` object." + "In cases where the dot `.` is unnecessary from a mathematical point of view, we either need to end the number with it nevertheless or use the [float()](https://docs.python.org/3/library/functions.html#float) built-in to cast the number explicitly. [float()](https://docs.python.org/3/library/functions.html#float) can process any numerical object or a properly formatted `str` object." ] }, { @@ -2242,7 +2242,7 @@ } ], "source": [ - "1. # on the contrary, 1 creates an int object" + "1. 
# 1 without a dot creates an int object" ] }, { @@ -3517,7 +3517,7 @@ "source": [ "The [format()](https://docs.python.org/3/library/functions.html#format) function does *not* round a `float` object in the mathematical sense! It just allows us to show an arbitrary number of the digits as stored in memory, and it also does *not* change these.\n", "\n", - "On the contrary, the built-in [round()](https://docs.python.org/3/library/functions.html#round) function creates a *new* numeric object that is a rounded version of the one passed in as the argument. It adheres to the common rules of math.\n", + "On the contrary, the built-in [round()](https://docs.python.org/3/library/functions.html#round) function creates a *new* numerical object that is a rounded version of the one passed in as the argument. It adheres to the common rules of math.\n", "\n", "For example, let's round `1 / 3` to five decimals. The obtained value for `roughly_a_third` is also *imprecise* but different from the \"exact\" representation of `1 / 3` above." ] @@ -4617,7 +4617,7 @@ } }, "source": [ - "The [quantize()](https://docs.python.org/3/library/decimal.html#decimal.Decimal.quantize) method allows us to [quantize](https://www.dictionary.com/browse/quantize) (i.e., \"round\") a `Decimal` number at any precision that is *smaller* than the set precision. It looks at the number of decimals (i.e., to the right of the period) of the numeric argument we pass in.\n", + "The [quantize()](https://docs.python.org/3/library/decimal.html#decimal.Decimal.quantize) method allows us to [quantize](https://www.dictionary.com/browse/quantize) (i.e., \"round\") a `Decimal` number at any precision that is *smaller* than the set precision. 
It looks at the number of decimals (i.e., to the right of the period) of the numerical argument we pass in.\n", "\n", "For example, as the overall imprecise value of `two` still has an internal precision of `28` digits, we can correctly round it to *four* decimals (i.e., `Decimal(\"0.0001\")` has four decimals)." ] @@ -5498,7 +5498,7 @@ { "data": { "text/plain": [ - "139928697740272" + "139630197402224" ] }, "execution_count": 176, @@ -5636,7 +5636,7 @@ } }, "source": [ - "Alternatively, we may use the [complex()](https://docs.python.org/3/library/functions.html#complex) built-in: This takes two parameters where the second is optional and defaults to `0`. We may either call it with one or two arguments of any numeric type or a `str` object in the format of the previous code cell without any spaces." + "Alternatively, we may use the [complex()](https://docs.python.org/3/library/functions.html#complex) built-in: This takes two parameters where the second is optional and defaults to `0`. We may either call it with one or two arguments of any numerical type or a `str` object in the format of the previous code cell without any spaces." ] }, { @@ -5708,7 +5708,7 @@ } ], "source": [ - "complex(Decimal(\"2.0\"), Fraction(1, 2)) # the arguments may be any numeric type" + "complex(Decimal(\"2.0\"), Fraction(1, 2)) # the arguments may be any numerical type" ] }, { @@ -5743,7 +5743,7 @@ } }, "source": [ - "Arithmetic expressions work with `complex` numbers. They may be mixed with the other numeric types, and the result is always a `complex` number." + "Arithmetic expressions work with `complex` numbers. They may be mixed with the other numerical types, and the result is always a `complex` number." 
] }, { @@ -6100,7 +6100,7 @@ } }, "source": [ - "## The Numeric Tower" + "## The Numerical Tower" ] }, { @@ -6111,7 +6111,7 @@ } }, "source": [ - "Analogous to the discussion of containers and iterables in [Chapter 4](https://nbviewer.jupyter.org/github/webartifex/intro-to-python/blob/master/04_iteration.ipynb#Containers-vs.-Iterables), we contrast the *concrete* numeric data types in this chapter with the *abstract* ideas behind [numbers in mathematics](https://en.wikipedia.org/wiki/Number).\n", + "Analogous to the discussion of containers and iterables in [Chapter 4](https://nbviewer.jupyter.org/github/webartifex/intro-to-python/blob/master/04_iteration.ipynb#Containers-vs.-Iterables), we contrast the *concrete* numerical data types in this chapter with the *abstract* ideas behind [numbers in mathematics](https://en.wikipedia.org/wiki/Number).\n", "\n", "The figure below summarizes five *major* sets of [numbers in mathematics](https://en.wikipedia.org/wiki/Number) as we know them from high school:\n", "\n", @@ -6149,7 +6149,7 @@ "\n", "For the other types, in particular, the `float` type, the implications of their imprecision are discussed in detail above.\n", "\n", - "The abstract concepts behind the four outer-most mathematical sets are part of Python since [PEP 3141](https://www.python.org/dev/peps/pep-3141/) in 2007. The [numbers](https://docs.python.org/3/library/numbers.html) module in the [standard library](https://docs.python.org/3/library/index.html) defines what Pythonistas call the **numeric tower**, a collection of five **[abstract data types](https://en.wikipedia.org/wiki/Abstract_data_type)**, or **abstract base classes** as they are called in Python jargon:\n", + "The abstract concepts behind the four outer-most mathematical sets are part of Python since [PEP 3141](https://www.python.org/dev/peps/pep-3141/) in 2007. 
The [numbers](https://docs.python.org/3/library/numbers.html) module in the [standard library](https://docs.python.org/3/library/index.html) defines what programmers call the **[numerical tower](https://en.wikipedia.org/wiki/Numerical_tower)**, a collection of five **[abstract data types](https://en.wikipedia.org/wiki/Abstract_data_type)**, or **abstract base classes** as they are called in Python jargon:\n", "\n", "- `numbers.Number`: \"any number\" (cf., [documentation](https://docs.python.org/3/library/numbers.html#numbers.Number))\n", "- `numbers.Complex`: \"all complex numbers\" (cf., [documentation](https://docs.python.org/3/library/numbers.html#numbers.Complex))\n", @@ -6712,7 +6712,7 @@ } }, "source": [ - "#### Example: [Factorial](https://en.wikipedia.org/wiki/Factorial) (revisited)" + "### Goose Typing" ] }, { @@ -6723,7 +6723,9 @@ } }, "source": [ - "Replacing *concrete* data types with *abstract* ones is particularly valuable in the context of input validation: The revised version of the `factorial()` function below allows its user to *take advantage of duck typing*. If a real but non-integer argument `n` is passed in, `factorial()` tries to cast `n` as an `int` object with the [trunc()](https://docs.python.org/3/library/math.html#math.trunc) function from the [math](https://docs.python.org/3/library/math.html) module in the [standard library](https://docs.python.org/3/library/index.html). [trunc()](https://docs.python.org/3/library/math.html#math.trunc) cuts off all decimals and any *concrete* numeric type implementing the *abstract* `numbers.Real` type supports it (cf., [documentation](https://docs.python.org/3/library/numbers.html#numbers.Real))." 
+ "Replacing *concrete* data types with *abstract* ones is particularly valuable in the context of input validation: The revised version of the `factorial()` function below allows its user to take advantage of *duck typing*: If a real but non-integer argument `n` is passed in, `factorial()` tries to cast `n` as an `int` object with the [trunc()](https://docs.python.org/3/library/math.html#math.trunc) function from the [math](https://docs.python.org/3/library/math.html) module in the [standard library](https://docs.python.org/3/library/index.html). [trunc()](https://docs.python.org/3/library/math.html#math.trunc) cuts off all decimals and any *concrete* numerical type implementing the *abstract* `numbers.Real` type supports it (cf., [documentation](https://docs.python.org/3/library/numbers.html#numbers.Real)).\n", + "\n", + "Two popular and distinguished Pythonistas, [Luciano Ramalho](https://github.com/ramalho) and [Alex Martelli](https://en.wikipedia.org/wiki/Alex_Martelli), coin the term **goose typing** to specifically mean using the built-in [isinstance()](https://docs.python.org/3/library/functions.html#isinstance) function with an *abstract base class* (cf., Chapter 11 in this [book](https://www.amazon.com/Fluent-Python-Concise-Effective-Programming/dp/1491946008) or this [summary](https://dgkim5360.github.io/blog/python/2017/07/duck-typing-vs-goose-typing-pythonic-interfaces/) thereof)." ] }, { @@ -6763,6 +6765,17 @@ "math.trunc(1 / 10)" ] }, + { + "cell_type": "markdown", + "metadata": { + "slideshow": { + "slide_type": "skip" + } + }, + "source": [ + "#### Example: [Factorial](https://en.wikipedia.org/wiki/Factorial) (revisited)" + ] + }, { "cell_type": "code", "execution_count": 211, @@ -6893,7 +6906,7 @@ } }, "source": [ - "With the keyword-only argument `strict`, we can control whether or not a passed in `float` object may be rounded. By default, this is not allowed." 
+ "With the keyword-only argument `strict`, we can control whether or not a passed in `float` object may be rounded. By default, this is not allowed and results in a `TypeError`." ] }, { @@ -7002,21 +7015,21 @@ } }, "source": [ - "There exist three numeric types in core Python:\n", - "- `int`: a perfect model for whole numbers (i.e., the set $\\mathbb{Z}$); inherently precise\n", + "There exist three numerical types in core Python:\n", + "- `int`: a near-perfect model for whole numbers (i.e., the set $\\mathbb{Z}$); inherently precise\n", "- `float`: the \"gold\" standard to approximate real numbers (i.e., the set $\\mathbb{R}$); inherently imprecise\n", "- `complex`: layer on top of the `float` type; therefore inherently imprecise\n", "\n", "Furthermore, the [standard library](https://docs.python.org/3/library/index.html) adds two more types that can be used as substitutes for `float` objects:\n", "- `Decimal`: similar to `float` but allows customizing the precision; still inherently imprecise\n", - "- `Fraction`: a perfect model for rational numbers (i.e., the set $\\mathbb{Q}$); built on top of the `int` type and therefore inherently precise\n", + "- `Fraction`: a near-perfect model for rational numbers (i.e., the set $\\mathbb{Q}$); built on top of the `int` type and therefore inherently precise\n", "\n", "The *important* takeaways for the data science practitioner are:\n", "\n", "1. **Do not mix** precise and imprecise data types, and\n", "2. actively expect `nan` results when working with `float` numbers as there are no **loud failures**.\n", "\n", - "The **numeric tower** is Python's way of implementing the various **abstract** ideas of what a number is in mathematics." + "The **numerical tower** is Python's way of implementing the various **abstract** ideas of what numbers are in mathematics." 
] } ], diff --git a/05_numbers_review_and_exercises.ipynb b/05_numbers_review_and_exercises.ipynb index 4246965..90983d4 100644 --- a/05_numbers_review_and_exercises.ipynb +++ b/05_numbers_review_and_exercises.ipynb @@ -5,7 +5,7 @@ "metadata": {}, "source": [ "\n", - "# Chapter 5: Numbers" + "# Chapter 5: Bits & Numbers" ] }, { @@ -156,7 +156,7 @@ "cell_type": "markdown", "metadata": {}, "source": [ - "**Q9**: How can **abstract data types**, for example, as defined in the **numeric tower**, be helpful in enabling **duck typing**?" + "**Q9**: How can **abstract data types**, for example, as defined in the **numerical tower**, be helpful in enabling **duck typing**?" ] }, { diff --git a/06_text.ipynb b/06_text.ipynb index 4cdd4ab..9092188 100644 --- a/06_text.ipynb +++ b/06_text.ipynb @@ -8,7 +8,7 @@ } }, "source": [ - "# Chapter 6: Text" + "# Chapter 6: Bytes & Text" ] }, { @@ -19,7 +19,7 @@ } }, "source": [ - "In this chapter, we continue the study of the built-in data types. Building on our knowledge of numbers, the next layer consists of textual data that are modeled primarily with the `str` type in Python. `str` objects are naturally more \"complex\" than numeric objects as any text consists of an arbitrary and possibly large number of individual characters that may be chosen from any alphabet in the history of humankind. Luckily, Python abstracts away most of this complexity." + "In this chapter, we continue the study of the built-in data types. Building on our knowledge of numbers, the next layer consists of textual data that are modeled primarily with the `str` type in Python. `str` objects are naturally more \"complex\" than numerical objects as any text consists of an arbitrary and possibly large number of individual characters that may be chosen from any alphabet in the history of humankind. Luckily, Python abstracts away most of this complexity." 
] }, { @@ -80,7 +80,7 @@ { "data": { "text/plain": [ - "140148947850512" + "140133793916176" ] }, "execution_count": 2, @@ -124,7 +124,7 @@ } }, "source": [ - "A `str` object evaluates to itself in a literal notation with enclosing **single quotes** `'` by default. In [Chapter 1](https://nbviewer.jupyter.org/github/webartifex/intro-to-python/blob/master/01_elements.ipynb), we already specified the double quotes `\"` convention we stick to in this book. Yet, single quotes `'` and double quotes `\"` are *perfect* substitutes for all `str` objects that do *not* contain any of the two symbols in it. We could have used the reverse convention, as well.\n", + "A `str` object evaluates to itself in a literal notation with enclosing **single quotes** `'` by default. In [Chapter 1](https://nbviewer.jupyter.org/github/webartifex/intro-to-python/blob/master/01_elements.ipynb#Value), we already specified the double quotes `\"` convention we stick to in this book. Yet, single quotes `'` and double quotes `\"` are *perfect* substitutes for all `str` objects that do *not* contain any of the two symbols in it. We could use the reverse convention, as well.\n", "\n", "As [this discussion](https://stackoverflow.com/questions/56011/single-quotes-vs-double-quotes-in-python) shows, many programmers have *strong* opinions about that and make up *new* conventions for their projects. Consequently, the discussion was \"closed as not constructive\" by the moderators." ] @@ -161,7 +161,7 @@ } }, "source": [ - "As the single quote `'` is often used in the English language as a shortener, we could make an argument in favor of using the double quotes `\"`: There are possibly fewer situations like in the two code cells below, in which we must revert to using a `\\` to **escape** a single quote `'` in a text (cf., the section on special characters further below). However, double quotes `\"` are often used as well. 
So, this argument is somewhat not convincing.\n", + "As the single quote `'` is often used in the English language as a shortener, we could make an argument in favor of using the double quotes `\"`: There are possibly fewer situations like in the two code cells below, in which we must revert to using a `\\` to **escape** a single quote `'` in a text (cf., the [Special Characters](https://nbviewer.jupyter.org/github/webartifex/intro-to-python/blob/master/06_text.ipynb#Special-Characters) section further below). However, double quotes `\"` are often used as well. So, this argument is somewhat not convincing.\n", "\n", "Many proponents of the single quote `'` usage claim that double quotes `\"` make more **visual noise** on the screen. This argument is also not convincing. On the contrary, one could claim that *two* single quotes `''` look so similar to *one* double quote `\"` that it might not be apparent right away what we are looking at. By sticking to double quotes `\"`, we avoid such danger of confusion.\n", "\n", @@ -226,7 +226,7 @@ } }, "source": [ - "Alternatively, we can use the [str()](https://docs.python.org/3/library/stdtypes.html#str) built-in to **cast** non-`str` objects as a `str`." + "We can always use the [str()](https://docs.python.org/3/library/stdtypes.html#str) built-in to cast non-`str` objects as a `str`." ] }, { @@ -253,6 +253,471 @@ "str(123)" ] }, + { + "cell_type": "markdown", + "metadata": { + "slideshow": { + "slide_type": "skip" + } + }, + "source": [ + "Another common situation where we obtain `str` objects is when reading the contents of a file with the [open()](https://docs.python.org/3/library/functions.html#open) built-in. In its simplest form, to open a [text file](https://en.wikipedia.org/wiki/Text_file) in read-only mode, we pass in its path (i.e., \"filename\") as a `str` object."
+ ] + }, + { + "cell_type": "code", + "execution_count": 8, + "metadata": { + "slideshow": { + "slide_type": "slide" + } + }, + "outputs": [], + "source": [ + "file = open(\"lorem_ipsum.txt\")" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "slideshow": { + "slide_type": "skip" + } + }, + "source": [ + "[open()](https://docs.python.org/3/library/functions.html#open) returns a **[proxy](https://en.wikipedia.org/wiki/Proxy_pattern)** object of type `TextIOWrapper` that allows us to interact with the file on disk." + ] + }, + { + "cell_type": "code", + "execution_count": 9, + "metadata": { + "slideshow": { + "slide_type": "fragment" + } + }, + "outputs": [ + { + "data": { + "text/plain": [ + "<_io.TextIOWrapper name='lorem_ipsum.txt' mode='r' encoding='UTF-8'>" + ] + }, + "execution_count": 9, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "file" + ] + }, + { + "cell_type": "code", + "execution_count": 10, + "metadata": { + "slideshow": { + "slide_type": "-" + } + }, + "outputs": [ + { + "data": { + "text/plain": [ + "_io.TextIOWrapper" + ] + }, + "execution_count": 10, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "type(file)" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "slideshow": { + "slide_type": "skip" + } + }, + "source": [ + "While `file` provides, for example, the [read()](https://docs.python.org/3/library/io.html#io.TextIOBase.read), [readline()](https://docs.python.org/3/library/io.html#io.TextIOBase.readline), and [readlines()](https://docs.python.org/3/library/io.html#io.IOBase.readlines) methods to access its contents, it is also *iterable*, and we may loop over the individual lines with a `for` statement." 
+ ] + }, + { + "cell_type": "code", + "execution_count": 11, + "metadata": { + "slideshow": { + "slide_type": "slide" + } + }, + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "Lorem Ipsum is simply dummy text of the printing and typesetting industry.\n", + "\n", + "Lorem Ipsum has been the industry's standard dummy text ever since the 1500s\n", + "\n", + "when an unknown printer took a galley of type and scrambled it to make a type\n", + "\n", + "specimen book. It has survived not only five centuries but also the leap into\n", + "\n", + "electronic typesetting, remaining essentially unchanged. It was popularised in\n", + "\n", + "the 1960s with the release of Letraset sheets.\n", + "\n" + ] + } + ], + "source": [ + "for line in file:\n", + " print(line)" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "slideshow": { + "slide_type": "skip" + } + }, + "source": [ + "Once we looped over `file` the first time, it is **exhausted**: That means we do not see any output if we loop over it another time." + ] + }, + { + "cell_type": "code", + "execution_count": 12, + "metadata": { + "slideshow": { + "slide_type": "skip" + } + }, + "outputs": [], + "source": [ + "for line in file:\n", + " print(line)" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "slideshow": { + "slide_type": "skip" + } + }, + "source": [ + "After the `for`-loop, `line` is still set to the last line in the file, and we verify that it is indeed a `str` object." 
+ ] + }, + { + "cell_type": "code", + "execution_count": 13, + "metadata": { + "slideshow": { + "slide_type": "slide" + } + }, + "outputs": [ + { + "data": { + "text/plain": [ + "'the 1960s with the release of Letraset sheets.\\n'" + ] + }, + "execution_count": 13, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "line" + ] + }, + { + "cell_type": "code", + "execution_count": 14, + "metadata": { + "slideshow": { + "slide_type": "-" + } + }, + "outputs": [ + { + "data": { + "text/plain": [ + "str" + ] + }, + "execution_count": 14, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "type(line)" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "slideshow": { + "slide_type": "skip" + } + }, + "source": [ + "An important fact is that `file` is still associated with an *open* **[file descriptor](https://en.wikipedia.org/wiki/File_descriptor)**. Without going into any technical details, we note that an operating system can only handle a limited number of \"open files\" at the same time, and, therefore, we should always *close* the file once we are done processing it.\n", + "\n", + "`file` has a `closed` attribute on it that shows us if a file descriptor is open or closed, and with the [close()](https://docs.python.org/3/library/io.html#io.IOBase.close) method, we can \"manually\" close it." 
+ ] + }, + { + "cell_type": "code", + "execution_count": 15, + "metadata": { + "slideshow": { + "slide_type": "slide" + } + }, + "outputs": [ + { + "data": { + "text/plain": [ + "False" + ] + }, + "execution_count": 15, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "file.closed" + ] + }, + { + "cell_type": "code", + "execution_count": 16, + "metadata": { + "slideshow": { + "slide_type": "fragment" + } + }, + "outputs": [], + "source": [ + "file.close()" + ] + }, + { + "cell_type": "code", + "execution_count": 17, + "metadata": { + "slideshow": { + "slide_type": "-" + } + }, + "outputs": [ + { + "data": { + "text/plain": [ + "True" + ] + }, + "execution_count": 17, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "file.closed" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "slideshow": { + "slide_type": "skip" + } + }, + "source": [ + "The more Pythonic way is to open a file with the `with` statement (cf., [reference](https://docs.python.org/3/reference/compound_stmts.html#the-with-statement)): The indented code block is said to be executed in the **context** of the header line that acts as a **[context manager](https://docs.python.org/3/reference/datamodel.html?highlight=context%20manager#with-statement-context-managers)**. Such objects may have many different purposes. Here, the context manager created with `with open(...) as file:` mainly ensures that the file descriptor gets automatically closed after the last line in the code block is executed." 
+ ] + }, + { + "cell_type": "code", + "execution_count": 18, + "metadata": { + "slideshow": { + "slide_type": "slide" + } + }, + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "Lorem Ipsum is simply dummy text of the printing and typesetting industry.\n", + "\n", + "Lorem Ipsum has been the industry's standard dummy text ever since the 1500s\n", + "\n", + "when an unknown printer took a galley of type and scrambled it to make a type\n", + "\n", + "specimen book. It has survived not only five centuries but also the leap into\n", + "\n", + "electronic typesetting, remaining essentially unchanged. It was popularised in\n", + "\n", + "the 1960s with the release of Letraset sheets.\n", + "\n" + ] + } + ], + "source": [ + "with open(\"lorem_ipsum.txt\") as file:\n", + " for line in file:\n", + " print(line)" + ] + }, + { + "cell_type": "code", + "execution_count": 19, + "metadata": { + "slideshow": { + "slide_type": "fragment" + } + }, + "outputs": [ + { + "data": { + "text/plain": [ + "True" + ] + }, + "execution_count": 19, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "file.closed" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "slideshow": { + "slide_type": "skip" + } + }, + "source": [ + "To use constructs familiar from [Chapter 3](https://nbviewer.jupyter.org/github/webartifex/intro-to-python/blob/master/03_conditionals.ipynb#The-try-Statement) to explain what `with open(...) as file:` does, below is a formulation with a `try` statement *equivalent* to the `with` statement above. The `finally`-branch is always executed, even if an exception is raised in the `for`-loop. So, `file` is sure to be closed too, with a somewhat less expressive formulation." 
+ ] + }, + { + "cell_type": "code", + "execution_count": 20, + "metadata": { + "slideshow": { + "slide_type": "slide" + } + }, + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "Lorem Ipsum is simply dummy text of the printing and typesetting industry.\n", + "\n", + "Lorem Ipsum has been the industry's standard dummy text ever since the 1500s\n", + "\n", + "when an unknown printer took a galley of type and scrambled it to make a type\n", + "\n", + "specimen book. It has survived not only five centuries but also the leap into\n", + "\n", + "electronic typesetting, remaining essentially unchanged. It was popularised in\n", + "\n", + "the 1960s with the release of Letraset sheets.\n", + "\n" + ] + } + ], + "source": [ + "try:\n", + " file = open(\"lorem_ipsum.txt\")\n", + " for line in file:\n", + " print(line)\n", + "finally:\n", + " file.close()" + ] + }, + { + "cell_type": "code", + "execution_count": 21, + "metadata": { + "slideshow": { + "slide_type": "fragment" + } + }, + "outputs": [ + { + "data": { + "text/plain": [ + "True" + ] + }, + "execution_count": 21, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "file.closed" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "slideshow": { + "slide_type": "skip" + } + }, + "source": [ + "A subtlety to notice is that there is an empty line printed between each `line`. That is because each `line` ends with a `\"\\n\"` that results in a line break and that is explained further below. To print the text without empty lines in between, we pass an `end=\"\"` argument to the [print()](https://docs.python.org/3/library/functions.html#print) function."
+ ] + }, + { + "cell_type": "code", + "execution_count": 22, + "metadata": { + "slideshow": { + "slide_type": "slide" + } + }, + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "Lorem Ipsum is simply dummy text of the printing and typesetting industry.\n", + "Lorem Ipsum has been the industry's standard dummy text ever since the 1500s\n", + "when an unknown printer took a galley of type and scrambled it to make a type\n", + "specimen book. It has survived not only five centuries but also the leap into\n", + "electronic typesetting, remaining essentially unchanged. It was popularised in\n", + "the 1960s with the release of Letraset sheets.\n" + ] + } + ], + "source": [ + "with open(\"lorem_ipsum.txt\") as file:\n", + " for line in file:\n", + " print(line, end=\"\")" + ] + }, { "cell_type": "markdown", "metadata": { @@ -274,21 +739,21 @@ "source": [ "The idea of a **sequence** is yet another *abstract* concept.\n", "\n", - "It unifies *four* orthogonal *abstract* concepts into one: Any *concrete* data type, such as `str` in this chapter, is considered a sequence if it simultaneously\n", + "It unifies *four* [orthogonal](https://en.wikipedia.org/wiki/Orthogonality) (i.e., \"independent\") *abstract* concepts into one: Any *concrete* data type, such as `str`, is considered a sequence if it simultaneously\n", "\n", - "1. behaves like a **container** and\n", - "2. an **iterable**, and \n", - "3. comes with a predictable **order** of its\n", + "1. **contains** other \"things,\"\n", + "2. is **iterable**, and \n", + "3. comes with a *predictable* **order** of its\n", "4. **finite** number of elements.\n", "\n", - "Chapter 7 discusses sequences in a generic sense and in greater detail. 
Here, we keep our focus on the `str` type that historically received its name as it models a \"**[string of characters](https://en.wikipedia.org/wiki/String_%28computer_science%29)**,\" and a \"string\" is more formally called a sequence in the computer science literature.\n", + "[Chapter 7](https://nbviewer.jupyter.org/github/webartifex/intro-to-python/blob/master/07_sequences.ipynb#Collections-vs.-Sequences) formalizes sequences in great detail. Here, we keep our focus on the `str` type that historically received its name as it models a \"**[string of characters](https://en.wikipedia.org/wiki/String_%28computer_science%29)**,\" and a \"string\" is more formally called a sequence in the computer science literature.\n", "\n", - "Behaving like a sequence, `str` objects may be treated like `list` objects in many cases. For example, the built-in [len()](https://docs.python.org/3/library/functions.html#len) function tells us how many elements (i.e., characters) make up `school`. [len()](https://docs.python.org/3/library/functions.html#len) would not work on an \"*infinite*\" object: As anything modeled in a program must fit into a computer's finite memory at runtime, there cannot exist objects containing a truly infinite number of elements; however, Chapter 7 introduces a *concrete* iterable data type that can be used to model an *infinite* series of elements and that, consequently, has no concept of \"length.\"" + "Behaving like a sequence, `str` objects may be treated like `list` objects in many cases. For example, the built-in [len()](https://docs.python.org/3/library/functions.html#len) function tells us how many elements (i.e., characters) make up `school`. 
[len()](https://docs.python.org/3/library/functions.html#len) would not work on an \"*infinite*\" object: As anything modeled in a program must fit into a computer's finite memory at runtime, there cannot exist objects containing a truly infinite number of elements; however, [Chapter 7](https://nbviewer.jupyter.org/github/webartifex/intro-to-python/blob/master/07_sequences.ipynb#Collections-vs.-Sequences#Mapping) introduces *concrete* iterable data types that can be used to model an *infinite* series of elements and that, consequently, have no concept of \"length.\"" ] }, { "cell_type": "code", - "execution_count": 8, + "execution_count": 23, "metadata": { "slideshow": { "slide_type": "slide" @@ -301,7 +766,7 @@ "40" ] }, - "execution_count": 8, + "execution_count": 23, "metadata": {}, "output_type": "execute_result" } @@ -325,7 +790,7 @@ }, { "cell_type": "code", - "execution_count": 9, + "execution_count": 24, "metadata": { "slideshow": { "slide_type": "fragment" @@ -353,12 +818,12 @@ } }, "source": [ - "Being a container, we can check if a given object is a member of a sequence with the `in` operator. In the context of strings, the `in` operator has *two* usages: First, it checks if a *single* character is contained in a `str` object. Second, it may also check if a shorter `str` object, then called a **substring**, is contained in a longer one." + "Being a container, we can check if a given object is a member of a sequence with the `in` operator. In the context of `str` objects, the `in` operator has *two* usages: First, it checks if a *single* character is contained in a `str` object. Second, it may also check if a shorter `str` object, then called a **substring**, is contained in a longer one." 
] }, { "cell_type": "code", - "execution_count": 10, + "execution_count": 25, "metadata": { "slideshow": { "slide_type": "fragment" @@ -371,7 +836,7 @@ "True" ] }, - "execution_count": 10, + "execution_count": 25, "metadata": {}, "output_type": "execute_result" } @@ -382,7 +847,7 @@ }, { "cell_type": "code", - "execution_count": 11, + "execution_count": 26, "metadata": { "slideshow": { "slide_type": "-" @@ -395,7 +860,7 @@ "True" ] }, - "execution_count": 11, + "execution_count": 26, "metadata": {}, "output_type": "execute_result" } @@ -406,7 +871,7 @@ }, { "cell_type": "code", - "execution_count": 12, + "execution_count": 27, "metadata": { "slideshow": { "slide_type": "-" @@ -419,7 +884,7 @@ "False" ] }, - "execution_count": 12, + "execution_count": 27, "metadata": {}, "output_type": "execute_result" } @@ -452,7 +917,7 @@ }, { "cell_type": "code", - "execution_count": 13, + "execution_count": 28, "metadata": { "slideshow": { "slide_type": "slide" @@ -465,7 +930,7 @@ "'W'" ] }, - "execution_count": 13, + "execution_count": 28, "metadata": {}, "output_type": "execute_result" } @@ -476,7 +941,7 @@ }, { "cell_type": "code", - "execution_count": 14, + "execution_count": 29, "metadata": { "slideshow": { "slide_type": "-" @@ -489,7 +954,7 @@ "'H'" ] }, - "execution_count": 14, + "execution_count": 29, "metadata": {}, "output_type": "execute_result" } @@ -511,7 +976,7 @@ }, { "cell_type": "code", - "execution_count": 15, + "execution_count": 30, "metadata": { "slideshow": { "slide_type": "slide" @@ -525,7 +990,7 @@ "traceback": [ "\u001b[0;31m---------------------------------------------------------------------------\u001b[0m", "\u001b[0;31mTypeError\u001b[0m Traceback (most recent call last)", - "\u001b[0;32m\u001b[0m in \u001b[0;36m\u001b[0;34m\u001b[0m\n\u001b[0;32m----> 1\u001b[0;31m \u001b[0mschool\u001b[0m\u001b[0;34m[\u001b[0m\u001b[0;36m1.0\u001b[0m\u001b[0;34m]\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0m", + "\u001b[0;32m\u001b[0m in 
\u001b[0;36m\u001b[0;34m\u001b[0m\n\u001b[0;32m----> 1\u001b[0;31m \u001b[0mschool\u001b[0m\u001b[0;34m[\u001b[0m\u001b[0;36m1.0\u001b[0m\u001b[0;34m]\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0m", "\u001b[0;31mTypeError\u001b[0m: string indices must be integers" ] } @@ -547,7 +1012,7 @@ }, { "cell_type": "code", - "execution_count": 16, + "execution_count": 31, "metadata": { "slideshow": { "slide_type": "slide" @@ -560,7 +1025,7 @@ "'t'" ] }, - "execution_count": 16, + "execution_count": 31, "metadata": {}, "output_type": "execute_result" } @@ -582,7 +1047,7 @@ }, { "cell_type": "code", - "execution_count": 17, + "execution_count": 32, "metadata": { "slideshow": { "slide_type": "-" @@ -596,7 +1061,7 @@ "traceback": [ "\u001b[0;31m---------------------------------------------------------------------------\u001b[0m", "\u001b[0;31mIndexError\u001b[0m Traceback (most recent call last)", - "\u001b[0;32m\u001b[0m in \u001b[0;36m\u001b[0;34m\u001b[0m\n\u001b[0;32m----> 1\u001b[0;31m \u001b[0mschool\u001b[0m\u001b[0;34m[\u001b[0m\u001b[0;36m40\u001b[0m\u001b[0;34m]\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0m", + "\u001b[0;32m\u001b[0m in \u001b[0;36m\u001b[0;34m\u001b[0m\n\u001b[0;32m----> 1\u001b[0;31m \u001b[0mschool\u001b[0m\u001b[0;34m[\u001b[0m\u001b[0;36m40\u001b[0m\u001b[0;34m]\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0m", "\u001b[0;31mIndexError\u001b[0m: string index out of range" ] } @@ -618,7 +1083,7 @@ }, { "cell_type": "code", - "execution_count": 18, + "execution_count": 33, "metadata": { "slideshow": { "slide_type": "slide" @@ -631,7 +1096,7 @@ "'t'" ] }, - "execution_count": 18, + "execution_count": 33, "metadata": {}, "output_type": "execute_result" } @@ -653,7 +1118,7 @@ }, { "cell_type": "code", - "execution_count": 19, + "execution_count": 34, "metadata": { "slideshow": { "slide_type": "fragment" @@ -666,7 +1131,7 @@ "'O'" ] }, - "execution_count": 19, + "execution_count": 
34, "metadata": {}, "output_type": "execute_result" } @@ -677,7 +1142,7 @@ }, { "cell_type": "code", - "execution_count": 20, + "execution_count": 35, "metadata": { "slideshow": { "slide_type": "-" @@ -690,7 +1155,7 @@ "'O'" ] }, - "execution_count": 20, + "execution_count": 35, "metadata": {}, "output_type": "execute_result" } @@ -725,7 +1190,7 @@ }, { "cell_type": "code", - "execution_count": 21, + "execution_count": 36, "metadata": { "slideshow": { "slide_type": "slide" @@ -738,7 +1203,7 @@ "'WHU'" ] }, - "execution_count": 21, + "execution_count": 36, "metadata": {}, "output_type": "execute_result" } @@ -760,7 +1225,7 @@ }, { "cell_type": "code", - "execution_count": 22, + "execution_count": 37, "metadata": { "slideshow": { "slide_type": "fragment" @@ -773,7 +1238,7 @@ "'WHU - Otto Beisheim School of Management'" ] }, - "execution_count": 22, + "execution_count": 37, "metadata": {}, "output_type": "execute_result" } @@ -790,12 +1255,12 @@ } }, "source": [ - "For convenience, the indexes do not need to lie in the range from 0 to the string's \"length\" when slicing. This is *not* the case for indexing as the `IndexError` above shows." + "For convenience, the indexes do not need to lie in the range from 0 to the `str` object's \"length\" when slicing. This is *not* the case for indexing as the `IndexError` above shows." 
] }, { "cell_type": "code", - "execution_count": 23, + "execution_count": 38, "metadata": { "slideshow": { "slide_type": "fragment" @@ -808,7 +1273,7 @@ "'WHU - Otto Beisheim School of Management'" ] }, - "execution_count": 23, + "execution_count": 38, "metadata": {}, "output_type": "execute_result" } @@ -830,7 +1295,7 @@ }, { "cell_type": "code", - "execution_count": 24, + "execution_count": 39, "metadata": { "slideshow": { "slide_type": "fragment" @@ -843,7 +1308,7 @@ "'WHU - Otto Beisheim School of Management'" ] }, - "execution_count": 24, + "execution_count": 39, "metadata": {}, "output_type": "execute_result" } @@ -865,7 +1330,7 @@ }, { "cell_type": "code", - "execution_count": 25, + "execution_count": 40, "metadata": { "slideshow": { "slide_type": "skip" @@ -878,7 +1343,7 @@ "'WHU Otto Beisheim School'" ] }, - "execution_count": 25, + "execution_count": 40, "metadata": {}, "output_type": "execute_result" } @@ -900,7 +1365,7 @@ }, { "cell_type": "code", - "execution_count": 26, + "execution_count": 41, "metadata": { "slideshow": { "slide_type": "slide" @@ -913,7 +1378,7 @@ "'WU-Ot esemSho fMngmn'" ] }, - "execution_count": 26, + "execution_count": 41, "metadata": {}, "output_type": "execute_result" } @@ -935,7 +1400,7 @@ }, { "cell_type": "code", - "execution_count": 27, + "execution_count": 42, "metadata": { "slideshow": { "slide_type": "fragment" @@ -948,7 +1413,7 @@ "'tnemeganaM fo loohcS miehsieB ottO - UHW'" ] }, - "execution_count": 27, + "execution_count": 42, "metadata": {}, "output_type": "execute_result" } @@ -976,7 +1441,7 @@ } }, "source": [ - "Whereas elements of a `list` object *may* be *re-assigned*, as shortly hinted at in [Chapter 1](https://nbviewer.jupyter.org/github/webartifex/intro-to-python/blob/master/01_elements.ipynb#Who-am-I?-And-how-many?), this is *not* allowed for `str` objects. Once created, they *cannot* be *changed*. Formally, we say that they are **immutable**. 
In that regard, `str` objects and all the numeric types in [Chapter 5](https://nbviewer.jupyter.org/github/webartifex/intro-to-python/blob/master/05_numbers.ipynb) are alike.\n", + "Whereas elements of a `list` object *may* be *re-assigned*, as shortly hinted at in [Chapter 1](https://nbviewer.jupyter.org/github/webartifex/intro-to-python/blob/master/01_elements.ipynb#Who-am-I?-And-how-many?), this is *not* allowed for `str` objects. Once created, they *cannot* be *changed*. Formally, we say that they are **immutable**. In that regard, `str` objects and all the numerical types in [Chapter 5](https://nbviewer.jupyter.org/github/webartifex/intro-to-python/blob/master/05_numbers.ipynb) are alike.\n", "\n", "On the contrary, objects that may be changed after creation, are called **mutable**. We already saw in [Chapter 1](https://nbviewer.jupyter.org/github/webartifex/intro-to-python/blob/master/01_elements.ipynb#Who-am-I?-And-how-many?) how mutable objects are more difficult to reason about for a beginner, in particular, if more than *one* variable point to one. 
Yet, mutability does have its place in a programmer's toolbox, and we revisit this idea in the next chapters.\n", "\n", @@ -985,7 +1450,7 @@ }, { "cell_type": "code", - "execution_count": 28, + "execution_count": 43, "metadata": { "slideshow": { "slide_type": "slide" @@ -999,7 +1464,7 @@ "traceback": [ "\u001b[0;31m---------------------------------------------------------------------------\u001b[0m", "\u001b[0;31mTypeError\u001b[0m Traceback (most recent call last)", - "\u001b[0;32m\u001b[0m in \u001b[0;36m\u001b[0;34m\u001b[0m\n\u001b[0;32m----> 1\u001b[0;31m \u001b[0mschool\u001b[0m\u001b[0;34m[\u001b[0m\u001b[0;36m0\u001b[0m\u001b[0;34m]\u001b[0m \u001b[0;34m=\u001b[0m \u001b[0;34m\"E\"\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0m", + "\u001b[0;32m\u001b[0m in \u001b[0;36m\u001b[0;34m\u001b[0m\n\u001b[0;32m----> 1\u001b[0;31m \u001b[0mschool\u001b[0m\u001b[0;34m[\u001b[0m\u001b[0;36m0\u001b[0m\u001b[0;34m]\u001b[0m \u001b[0;34m=\u001b[0m \u001b[0;34m\"E\"\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0m", "\u001b[0;31mTypeError\u001b[0m: 'str' object does not support item assignment" ] } @@ -1021,7 +1486,7 @@ }, { "cell_type": "code", - "execution_count": 29, + "execution_count": 44, "metadata": { "slideshow": { "slide_type": "slide" @@ -1034,7 +1499,7 @@ }, { "cell_type": "code", - "execution_count": 30, + "execution_count": 45, "metadata": { "slideshow": { "slide_type": "fragment" @@ -1047,7 +1512,7 @@ "'EBS - Otto Beisheim School of Management'" ] }, - "execution_count": 30, + "execution_count": 45, "metadata": {}, "output_type": "execute_result" } @@ -1058,7 +1523,7 @@ }, { "cell_type": "code", - "execution_count": 31, + "execution_count": 46, "metadata": { "slideshow": { "slide_type": "-" @@ -1068,10 +1533,10 @@ { "data": { "text/plain": [ - "140148946876144" + "140133784610832" ] }, - "execution_count": 31, + "execution_count": 46, "metadata": {}, "output_type": "execute_result" } @@ -1082,7 +1547,7 @@ }, { 
"cell_type": "code", - "execution_count": 32, + "execution_count": 47, "metadata": { "slideshow": { "slide_type": "-" @@ -1092,10 +1557,10 @@ { "data": { "text/plain": [ - "140148947850512" + "140133793916176" ] }, - "execution_count": 32, + "execution_count": 47, "metadata": {}, "output_type": "execute_result" } @@ -1128,7 +1593,7 @@ }, { "cell_type": "code", - "execution_count": 33, + "execution_count": 48, "metadata": { "slideshow": { "slide_type": "slide" @@ -1141,7 +1606,7 @@ }, { "cell_type": "code", - "execution_count": 34, + "execution_count": 49, "metadata": { "slideshow": { "slide_type": "-" @@ -1154,7 +1619,7 @@ "'Hello WHU'" ] }, - "execution_count": 34, + "execution_count": 49, "metadata": {}, "output_type": "execute_result" } @@ -1165,7 +1630,7 @@ }, { "cell_type": "code", - "execution_count": 35, + "execution_count": 50, "metadata": { "slideshow": { "slide_type": "-" @@ -1178,7 +1643,7 @@ "'WHU WHU WHU WHU WHU WHU WHU WHU WHU WHU '" ] }, - "execution_count": 35, + "execution_count": 50, "metadata": {}, "output_type": "execute_result" } @@ -1213,7 +1678,7 @@ }, { "cell_type": "code", - "execution_count": 36, + "execution_count": 51, "metadata": { "slideshow": { "slide_type": "slide" @@ -1226,7 +1691,7 @@ "6" ] }, - "execution_count": 36, + "execution_count": 51, "metadata": {}, "output_type": "execute_result" } @@ -1237,7 +1702,7 @@ }, { "cell_type": "code", - "execution_count": 37, + "execution_count": 52, "metadata": { "slideshow": { "slide_type": "-" @@ -1250,7 +1715,7 @@ "-1" ] }, - "execution_count": 37, + "execution_count": 52, "metadata": {}, "output_type": "execute_result" } @@ -1261,7 +1726,7 @@ }, { "cell_type": "code", - "execution_count": 38, + "execution_count": 53, "metadata": { "slideshow": { "slide_type": "fragment" @@ -1274,7 +1739,7 @@ "11" ] }, - "execution_count": 38, + "execution_count": 53, "metadata": {}, "output_type": "execute_result" } @@ -1296,7 +1761,7 @@ }, { "cell_type": "code", - "execution_count": 39, + 
"execution_count": 54, "metadata": { "slideshow": { "slide_type": "slide" @@ -1309,7 +1774,7 @@ "12" ] }, - "execution_count": 39, + "execution_count": 54, "metadata": {}, "output_type": "execute_result" } @@ -1320,7 +1785,7 @@ }, { "cell_type": "code", - "execution_count": 40, + "execution_count": 55, "metadata": { "slideshow": { "slide_type": "fragment" @@ -1333,7 +1798,7 @@ "16" ] }, - "execution_count": 40, + "execution_count": 55, "metadata": {}, "output_type": "execute_result" } @@ -1344,7 +1809,7 @@ }, { "cell_type": "code", - "execution_count": 41, + "execution_count": 56, "metadata": { "slideshow": { "slide_type": "-" @@ -1357,7 +1822,7 @@ "-1" ] }, - "execution_count": 41, + "execution_count": 56, "metadata": {}, "output_type": "execute_result" } @@ -1379,7 +1844,7 @@ }, { "cell_type": "code", - "execution_count": 42, + "execution_count": 57, "metadata": { "slideshow": { "slide_type": "slide" @@ -1392,7 +1857,7 @@ "4" ] }, - "execution_count": 42, + "execution_count": 57, "metadata": {}, "output_type": "execute_result" } @@ -1414,7 +1879,7 @@ }, { "cell_type": "code", - "execution_count": 43, + "execution_count": 58, "metadata": { "slideshow": { "slide_type": "fragment" @@ -1427,7 +1892,7 @@ "5" ] }, - "execution_count": 43, + "execution_count": 58, "metadata": {}, "output_type": "execute_result" } @@ -1449,7 +1914,7 @@ }, { "cell_type": "code", - "execution_count": 44, + "execution_count": 59, "metadata": { "slideshow": { "slide_type": "skip" @@ -1462,7 +1927,7 @@ "5" ] }, - "execution_count": 44, + "execution_count": 59, "metadata": {}, "output_type": "execute_result" } @@ -1484,7 +1949,7 @@ }, { "cell_type": "code", - "execution_count": 45, + "execution_count": 60, "metadata": { "slideshow": { "slide_type": "slide" @@ -1497,7 +1962,7 @@ }, { "cell_type": "code", - "execution_count": 46, + "execution_count": 61, "metadata": { "slideshow": { "slide_type": "-" @@ -1507,10 +1972,10 @@ { "data": { "text/plain": [ - "140149036799680" + "140133891261864" ] }, 
- "execution_count": 46, + "execution_count": 61, "metadata": {}, "output_type": "execute_result" } @@ -1521,7 +1986,7 @@ }, { "cell_type": "code", - "execution_count": 47, + "execution_count": 62, "metadata": { "slideshow": { "slide_type": "fragment" @@ -1534,7 +1999,7 @@ }, { "cell_type": "code", - "execution_count": 48, + "execution_count": 63, "metadata": { "slideshow": { "slide_type": "-" @@ -1544,10 +2009,10 @@ { "data": { "text/plain": [ - "140148946626016" + "140133784910848" ] }, - "execution_count": 48, + "execution_count": 63, "metadata": {}, "output_type": "execute_result" } @@ -1569,7 +2034,7 @@ }, { "cell_type": "code", - "execution_count": 49, + "execution_count": 64, "metadata": { "slideshow": { "slide_type": "fragment" @@ -1582,7 +2047,7 @@ "False" ] }, - "execution_count": 49, + "execution_count": 64, "metadata": {}, "output_type": "execute_result" } @@ -1593,7 +2058,7 @@ }, { "cell_type": "code", - "execution_count": 50, + "execution_count": 65, "metadata": { "slideshow": { "slide_type": "-" @@ -1606,7 +2071,7 @@ "True" ] }, - "execution_count": 50, + "execution_count": 65, "metadata": {}, "output_type": "execute_result" } @@ -1628,7 +2093,7 @@ }, { "cell_type": "code", - "execution_count": 51, + "execution_count": 66, "metadata": { "slideshow": { "slide_type": "slide" @@ -1667,7 +2132,7 @@ }, { "cell_type": "code", - "execution_count": 52, + "execution_count": 67, "metadata": { "slideshow": { "slide_type": "slide" @@ -1680,7 +2145,7 @@ }, { "cell_type": "code", - "execution_count": 53, + "execution_count": 68, "metadata": { "slideshow": { "slide_type": "-" @@ -1693,7 +2158,7 @@ }, { "cell_type": "code", - "execution_count": 54, + "execution_count": 69, "metadata": { "slideshow": { "slide_type": "-" @@ -1706,7 +2171,7 @@ "'This will become a sentence'" ] }, - "execution_count": 54, + "execution_count": 69, "metadata": {}, "output_type": "execute_result" } @@ -1728,7 +2193,7 @@ }, { "cell_type": "code", - "execution_count": 55, + 
"execution_count": 70, "metadata": { "slideshow": { "slide_type": "fragment" @@ -1741,7 +2206,7 @@ "'a b c d e'" ] }, - "execution_count": 55, + "execution_count": 70, "metadata": {}, "output_type": "execute_result" } @@ -1763,7 +2228,7 @@ }, { "cell_type": "code", - "execution_count": 56, + "execution_count": 71, "metadata": { "slideshow": { "slide_type": "slide" @@ -1776,7 +2241,7 @@ "'This is a sentence'" ] }, - "execution_count": 56, + "execution_count": 71, "metadata": {}, "output_type": "execute_result" } @@ -1809,7 +2274,7 @@ }, { "cell_type": "code", - "execution_count": 57, + "execution_count": 72, "metadata": { "slideshow": { "slide_type": "slide" @@ -1824,7 +2289,7 @@ }, { "cell_type": "code", - "execution_count": 58, + "execution_count": 73, "metadata": { "slideshow": { "slide_type": "fragment" @@ -1837,7 +2302,7 @@ "True" ] }, - "execution_count": 58, + "execution_count": 73, "metadata": {}, "output_type": "execute_result" } @@ -1848,7 +2313,7 @@ }, { "cell_type": "code", - "execution_count": 59, + "execution_count": 74, "metadata": { "slideshow": { "slide_type": "-" @@ -1861,7 +2326,7 @@ "False" ] }, - "execution_count": 59, + "execution_count": 74, "metadata": {}, "output_type": "execute_result" } @@ -1878,12 +2343,12 @@ } }, "source": [ - "One way to fix this is to only compare lower-case strings." + "One way to fix this is to only compare lower-cased `str` objects." 
] }, { "cell_type": "code", - "execution_count": 60, + "execution_count": 75, "metadata": { "slideshow": { "slide_type": "fragment" @@ -1896,7 +2361,7 @@ "True" ] }, - "execution_count": 60, + "execution_count": 75, "metadata": {}, "output_type": "execute_result" } @@ -1918,7 +2383,7 @@ }, { "cell_type": "code", - "execution_count": 61, + "execution_count": 76, "metadata": { "slideshow": { "slide_type": "slide" @@ -1985,9 +2450,9 @@ } }, "source": [ - "The previous code cell shows an example of a so-called **f-string**, as introduced by [PEP 498](https://www.python.org/dev/peps/pep-0498/) only in 2016.\n", + "The previous code cell shows an example of a so-called **f-string**, as introduced by [PEP 498](https://www.python.org/dev/peps/pep-0498/) only in 2016, that is passed as the argument to the [print()](https://docs.python.org/3/library/functions.html#print) function.\n", "\n", - "So far, we have used the built-in [print()](https://docs.python.org/3/library/functions.html#print) function only with \"plain\" `str` objects (e.g., `\"example\"`) or variables. Sometimes, it is more convenient to fill a value determined at runtime in a \"draft\" `str` object. This is called **string interpolation**. There are three ways to achieve that in Python, but only two are commonly used." + "The \"f\" stands for \"formatted\", and we can think of the `str` object as a text \"draft\" that is filled in with values determined at runtime. This concept is formally called **string interpolation**, and there are three ways to achieve that in Python." ] }, { @@ -2009,12 +2474,12 @@ } }, "source": [ - "f-strings are the new and most readable way. Prepend the literal notation with an `f`, and put variables/expressions within curly braces. The latter are then filled in when the [print()](https://docs.python.org/3/library/functions.html#print) function is executed." 
+ "f-strings, formally called **[formatted string literals](https://docs.python.org/3/reference/lexical_analysis.html#formatted-string-literals)**, are the most recently added and most readable way: We prepend a `str` in literal notation with an `f`, and put variables, or more generally, expressions, within curly braces. These are then filled in when a `str` object is evaluated." ] }, { "cell_type": "code", - "execution_count": 62, + "execution_count": 77, "metadata": { "slideshow": { "slide_type": "slide" @@ -2028,7 +2493,7 @@ }, { "cell_type": "code", - "execution_count": 63, + "execution_count": 78, "metadata": { "slideshow": { "slide_type": "-" @@ -2036,15 +2501,18 @@ }, "outputs": [ { - "name": "stdout", - "output_type": "stream", - "text": [ - "Hello Alexander! Good morning.\n" - ] + "data": { + "text/plain": [ + "'Hello Alexander! Good morning.'" + ] + }, + "execution_count": 78, + "metadata": {}, + "output_type": "execute_result" } ], "source": [ - "print(f\"Hello {name}! Good {time_of_day}.\")" + "f\"Hello {name}! Good {time_of_day}.\"" ] }, { @@ -2055,12 +2523,12 @@ } }, "source": [ - "Separated by a colon `:`, various formatting options are available. In the beginning, only the ability to round is useful, and this can be achieved by adding `:.2f` to the variable name to cast the number as a float and round it to two digits." + "Separated by a colon `:`, various formatting options are available. In the beginning, the ability to round may be particularly useful: This can be achieved by adding `:.2f` to the variable name inside the curly braces, which casts the number as a `float` and rounds it to two decimal places. The `:.2f` is a so-called format specifier, and there exists a whole **[format specification mini-language](https://docs.python.org/3/library/string.html#formatspec)** to govern how specifiers work." 
] }, { "cell_type": "code", - "execution_count": 64, + "execution_count": 79, "metadata": { "slideshow": { "slide_type": "slide" @@ -2073,7 +2541,7 @@ }, { "cell_type": "code", - "execution_count": 65, + "execution_count": 80, "metadata": { "slideshow": { "slide_type": "-" @@ -2081,20 +2549,23 @@ }, "outputs": [ { - "name": "stdout", - "output_type": "stream", - "text": [ - "Pi is 3.14\n" - ] + "data": { + "text/plain": [ + "'Pi is 3.14'" + ] + }, + "execution_count": 80, + "metadata": {}, + "output_type": "execute_result" } ], "source": [ - "print(f\"Pi is {pi:.2f}\")" + "f\"Pi is {pi:.2f}\"" ] }, { "cell_type": "code", - "execution_count": 66, + "execution_count": 81, "metadata": { "slideshow": { "slide_type": "-" @@ -2102,15 +2573,18 @@ }, "outputs": [ { - "name": "stdout", - "output_type": "stream", - "text": [ - "Pi is 3.142\n" - ] + "data": { + "text/plain": [ + "'Pi is 3.142'" + ] + }, + "execution_count": 81, + "metadata": {}, + "output_type": "execute_result" } ], "source": [ - "print(f\"Pi is {pi:.3f}\")" + "f\"Pi is {pi:.3f}\"" ] }, { @@ -2121,7 +2595,7 @@ } }, "source": [ - "### format() Method" + "### [format()](https://docs.python.org/3/library/stdtypes.html#str.format) Method" ] }, { @@ -2132,12 +2606,12 @@ } }, "source": [ - "`str` objects also provide a [format()](https://docs.python.org/3/library/stdtypes.html#str.format) method that accepts an arbitrary number of *positional* arguments that are inserted into the `str` object in the same order replacing curly brackets. See the official [documentation](https://docs.python.org/3/library/string.html#formatspec) for a full reference. This is the more traditional way of string interpolation and many code examples on the internet use it. f-strings are the officially recommended way going forward, but the usage of the [format()](https://docs.python.org/3/library/stdtypes.html#str.format) method is most likely not going down any time soon." 
+ "`str` objects also provide a [format()](https://docs.python.org/3/library/stdtypes.html#str.format) method that accepts an arbitrary number of *positional* arguments that are inserted into the `str` object in the same order replacing empty curly brackets. String interpolation with the [format()](https://docs.python.org/3/library/stdtypes.html#str.format) method is a more traditional and probably the most common way as of today. While f-strings are the recommended way going forward, usage of the [format()](https://docs.python.org/3/library/stdtypes.html#str.format) method is likely not declining any time soon." ] }, { "cell_type": "code", - "execution_count": 67, + "execution_count": 82, "metadata": { "slideshow": { "slide_type": "slide" @@ -2145,15 +2619,18 @@ }, "outputs": [ { - "name": "stdout", - "output_type": "stream", - "text": [ - "Hello Alexander! Good morning.\n" - ] + "data": { + "text/plain": [ + "'Hello Alexander! Good morning.'" + ] + }, + "execution_count": 82, + "metadata": {}, + "output_type": "execute_result" } ], "source": [ - "print(\"Hello {}! Good {}.\".format(name, time_of_day))" + "\"Hello {}! Good {}.\".format(name, time_of_day)" ] }, { @@ -2164,12 +2641,12 @@ } }, "source": [ - "Use index numbers if the order is different in the `str` object." + "We may use index numbers inside the curly braces if the order is different in the `str` object." 
] }, { "cell_type": "code", - "execution_count": 68, + "execution_count": 83, "metadata": { "slideshow": { "slide_type": "fragment" @@ -2177,15 +2654,18 @@ }, "outputs": [ { - "name": "stdout", - "output_type": "stream", - "text": [ - "Good morning, Alexander\n" - ] + "data": { + "text/plain": [ + "'Good morning, Alexander'" + ] + }, + "execution_count": 83, + "metadata": {}, + "output_type": "execute_result" } ], "source": [ - "print(\"Good {1}, {0}\".format(name, time_of_day))" + "\"Good {1}, {0}\".format(name, time_of_day)" ] }, { @@ -2196,12 +2676,12 @@ } }, "source": [ - "The [format()](https://docs.python.org/3/library/stdtypes.html#str.format) method may alternatively be used with *keyword* arguments as well. Then, we must put the keyword names within the curly brackets." + "The [format()](https://docs.python.org/3/library/stdtypes.html#str.format) method may alternatively be used with *keyword* arguments as well. Then, we must put the keywords' names within the curly brackets." ] }, { "cell_type": "code", - "execution_count": 69, + "execution_count": 84, "metadata": { "slideshow": { "slide_type": "fragment" @@ -2209,15 +2689,18 @@ }, "outputs": [ { - "name": "stdout", - "output_type": "stream", - "text": [ - "Hello Alexander! Good morning.\n" - ] + "data": { + "text/plain": [ + "'Hello Alexander! Good morning.'" + ] + }, + "execution_count": 84, + "metadata": {}, + "output_type": "execute_result" } ], "source": [ - "print(\"Hello {name}! Good {time}.\".format(name=name, time=time_of_day))" + "\"Hello {name}! Good {time}.\".format(name=name, time=time_of_day)" ] }, { @@ -2228,12 +2711,60 @@ } }, "source": [ - "Numbers are treated as in the f-strings case." + "Format specifiers work as in the f-string case." 
] }, { "cell_type": "code", - "execution_count": 70, + "execution_count": 85, + "metadata": { + "slideshow": { + "slide_type": "fragment" + } + }, + "outputs": [ + { + "data": { + "text/plain": [ + "'Pi is 3.14'" + ] + }, + "execution_count": 85, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "\"Pi is {:.2f}\".format(pi)" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "slideshow": { + "slide_type": "skip" + } + }, + "source": [ + "### `%` Operator" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "slideshow": { + "slide_type": "skip" + } + }, + "source": [ + "The `%` operator that we saw in the context of modulo division in [Chapter 1](https://nbviewer.jupyter.org/github/webartifex/intro-to-python/blob/master/01_elements.ipynb#%28Arithmetic%29-Operators) is overloaded with string interpolation when its first operand is a `str` object. The second operand consists of all expressions to be filled in. Format specifiers work with a `%` instead of curly braces and according to a different set of rules referred to as **[printf-style string formatting](https://docs.python.org/3/library/stdtypes.html#printf-style-string-formatting)**. So, `{:.2f}` becomes `%.2f`.\n", + "\n", + "This way of string interpolation is the oldest and originates from the [C language](https://en.wikipedia.org/wiki/C_%28programming_language%29). It is still widespread, but we should use one of the other two ways instead. We show it here mainly for completeness' sake." 
+ ] + }, + { + "cell_type": "code", + "execution_count": 86, "metadata": { "slideshow": { "slide_type": "skip" @@ -2241,15 +2772,53 @@ }, "outputs": [ { - "name": "stdout", - "output_type": "stream", - "text": [ - "Pi is 3.14\n" - ] + "data": { + "text/plain": [ + "'Pi is 3.14'" + ] + }, + "execution_count": 86, + "metadata": {}, + "output_type": "execute_result" } ], "source": [ - "print(\"Pi is {:.2f}\".format(pi))" + "\"Pi is %.2f\" % pi" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "slideshow": { + "slide_type": "skip" + } + }, + "source": [ + "To insert more than one expression, we must list them in order and between parentheses `(` and `)`. As [Chapter 7](https://nbviewer.jupyter.org/github/webartifex/intro-to-python/blob/master/07_sequences.ipynb#The-tuple-Type) reveals, this literal syntax creates an object of type `tuple`. Also, to format an expression as text, we use the format specifier `%s`." + ] + }, + { + "cell_type": "code", + "execution_count": 87, + "metadata": { + "slideshow": { + "slide_type": "skip" + } + }, + "outputs": [ + { + "data": { + "text/plain": [ + "'Hello Alexander! Good morning.'" + ] + }, + "execution_count": 87, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "\"Hello %s! Good %s.\" % (name, time_of_day)" ] }, { @@ -2271,12 +2840,14 @@ } }, "source": [ - "Some symbols have a special meaning within `str` objects. Popular examples are the newline `\\n` and tab `\\t` \"characters.\" The backslash symbol `\\` is also referred to as an **escape character** in this context, indicating that the following character has a meaning other than its literal meaning." + "Some symbols have a special meaning within `str` objects. 
Popular examples are the newline `\\n` and tab `\\t` \"characters.\" The backslash symbol `\\` is also referred to as an **escape character** in this context, indicating that the following character has a meaning other than its literal meaning.\n", + "\n", + "The built-in [print()](https://docs.python.org/3/library/functions.html#print) function then \"prints\" out these special characters accordingly." ] }, { "cell_type": "code", - "execution_count": 71, + "execution_count": 88, "metadata": { "slideshow": { "slide_type": "slide" @@ -2299,7 +2870,7 @@ }, { "cell_type": "code", - "execution_count": 72, + "execution_count": 89, "metadata": { "slideshow": { "slide_type": "fragment" @@ -2331,7 +2902,7 @@ }, { "cell_type": "code", - "execution_count": 73, + "execution_count": 90, "metadata": { "slideshow": { "slide_type": "fragment" @@ -2350,6 +2921,41 @@ "print(\"\\U0001f604\")" ] }, + { + "cell_type": "markdown", + "metadata": { + "slideshow": { + "slide_type": "skip" + } + }, + "source": [ + "Outside the [print()](https://docs.python.org/3/library/functions.html#print) function, the special characters are not treated any differently from non-special ones." 
+ ] + }, + { + "cell_type": "code", + "execution_count": 91, + "metadata": { + "slideshow": { + "slide_type": "skip" + } + }, + "outputs": [ + { + "data": { + "text/plain": [ + "'This is a sentence\\nthat is printed\\non three lines.'" + ] + }, + "execution_count": 91, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "\"This is a sentence\\nthat is printed\\non three lines.\"" + ] + }, { "cell_type": "markdown", "metadata": { @@ -2376,7 +2982,7 @@ }, { "cell_type": "code", - "execution_count": 74, + "execution_count": 92, "metadata": { "slideshow": { "slide_type": "slide" @@ -2404,12 +3010,12 @@ } }, "source": [ - "Some strings even produce a `SyntaxError` because the `\\U` *cannot* be interpreted as a unicode code point." 
+ "Some `str` objects even produce a `SyntaxError` because the `\\U` *cannot* be interpreted as a unicode code point." ] }, { "cell_type": "code", - "execution_count": 75, + "execution_count": 93, "metadata": { "slideshow": { "slide_type": "fragment" @@ -2418,10 +3024,10 @@ "outputs": [ { "ename": "SyntaxError", - "evalue": "(unicode error) 'unicodeescape' codec can't decode bytes in position 2-3: truncated \\UXXXXXXXX escape (, line 1)", + "evalue": "(unicode error) 'unicodeescape' codec can't decode bytes in position 2-3: truncated \\UXXXXXXXX escape (, line 1)", "output_type": "error", "traceback": [ - "\u001b[0;36m File \u001b[0;32m\"\"\u001b[0;36m, line \u001b[0;32m1\u001b[0m\n\u001b[0;31m print(\"C:\\Users\\Administrator\\Desktop\\Project_Folder\")\u001b[0m\n\u001b[0m ^\u001b[0m\n\u001b[0;31mSyntaxError\u001b[0m\u001b[0;31m:\u001b[0m (unicode error) 'unicodeescape' codec can't decode bytes in position 2-3: truncated \\UXXXXXXXX escape\n" + "\u001b[0;36m File \u001b[0;32m\"\"\u001b[0;36m, line \u001b[0;32m1\u001b[0m\n\u001b[0;31m print(\"C:\\Users\\Administrator\\Desktop\\Project_Folder\")\u001b[0m\n\u001b[0m ^\u001b[0m\n\u001b[0;31mSyntaxError\u001b[0m\u001b[0;31m:\u001b[0m (unicode error) 'unicodeescape' codec can't decode bytes in position 2-3: truncated \\UXXXXXXXX escape\n" ] } ], @@ -2442,7 +3048,7 @@ }, { "cell_type": "code", - "execution_count": 76, + "execution_count": 94, "metadata": { "slideshow": { "slide_type": "slide" @@ -2463,7 +3069,7 @@ }, { "cell_type": "code", - "execution_count": 77, + "execution_count": 95, "metadata": { "slideshow": { "slide_type": "-" @@ -2490,12 +3096,12 @@ } }, "source": [ - "However, this is tedious to remember and type. Luckily, Python allows treating any string literal as a \"raw\" string, and this is indicated in the string literal by a `r` prefix." + "However, this is tedious to remember and type. 
Luckily, Python allows treating any string literal as \"raw,\" and this is indicated in the string literal by the prefix `r`." ] }, { "cell_type": "code", - "execution_count": 78, + "execution_count": 96, "metadata": { "slideshow": { "slide_type": "slide" @@ -2516,7 +3122,7 @@ }, { "cell_type": "code", - "execution_count": 79, + "execution_count": 97, "metadata": { "slideshow": { "slide_type": "-" @@ -2559,7 +3165,7 @@ }, { "cell_type": "code", - "execution_count": 80, + "execution_count": 98, "metadata": { "slideshow": { "slide_type": "slide" @@ -2568,10 +3174,10 @@ "outputs": [ { "ename": "SyntaxError", - "evalue": "EOL while scanning string literal (, line 1)", + "evalue": "EOL while scanning string literal (, line 1)", "output_type": "error", "traceback": [ - "\u001b[0;36m File \u001b[0;32m\"\"\u001b[0;36m, line \u001b[0;32m1\u001b[0m\n\u001b[0;31m \"\u001b[0m\n\u001b[0m ^\u001b[0m\n\u001b[0;31mSyntaxError\u001b[0m\u001b[0;31m:\u001b[0m EOL while scanning string literal\n" + "\u001b[0;36m File \u001b[0;32m\"\"\u001b[0;36m, line \u001b[0;32m1\u001b[0m\n\u001b[0;31m \"\u001b[0m\n\u001b[0m ^\u001b[0m\n\u001b[0;31mSyntaxError\u001b[0m\u001b[0;31m:\u001b[0m EOL while scanning string literal\n" ] } ], @@ -2594,7 +3200,7 @@ }, { "cell_type": "code", - "execution_count": 81, + "execution_count": 99, "metadata": { "slideshow": { "slide_type": "fragment" @@ -2616,12 +3222,12 @@ } }, "source": [ - "Linebreaks are kept and implicitly converted into `\\n` characters." + "Line breaks are kept and implicitly converted into `\\n` characters." 
] }, { "cell_type": "code", - "execution_count": 82, + "execution_count": 100, "metadata": { "slideshow": { "slide_type": "fragment" @@ -2634,7 +3240,7 @@ "'\\nI am a multi-line string\\nconsisting of 4 lines.\\n'" ] }, - "execution_count": 82, + "execution_count": 100, "metadata": {}, "output_type": "execute_result" } @@ -2656,7 +3262,7 @@ }, { "cell_type": "code", - "execution_count": 83, + "execution_count": 101, "metadata": { "slideshow": { "slide_type": "fragment" @@ -2686,12 +3292,12 @@ } }, "source": [ - "Using the [split()](https://docs.python.org/3/library/stdtypes.html#str.split) method with the optional *sep* argument, we confirm that `multi_line` consists of *four* lines with the first and last linebreaks being the first and last characters in the `str` object." + "Using the [split()](https://docs.python.org/3/library/stdtypes.html#str.split) method with the optional *sep* argument, we confirm that `multi_line` consists of *four* lines with the first and last line breaks being the first and last characters in the `str` object." ] }, { "cell_type": "code", - "execution_count": 84, + "execution_count": 102, "metadata": { "slideshow": { "slide_type": "slide" @@ -2714,6 +3320,82 @@ " print(i, line)" ] }, + { + "cell_type": "markdown", + "metadata": { + "slideshow": { + "slide_type": "skip" + } + }, + "source": [ + "The next code cell puts several constructs from this chapter together to create a multi-line `str` object `content`: The `with` statement provides a context that ensures `file` is not left open. Then, the [readlines()](https://docs.python.org/3/library/io.html#io.IOBase.readlines) method returns the contents of `file` as a `list` object holding as many `str` objects as there are lines in the file on disk. Lastly, we concatenate these together with the [join()](https://docs.python.org/3/library/stdtypes.html#str.join) method to obtain `content`. We do so on an empty `str` object `\"\"` as each line already ends with a `\"\\n\"`." 
+ ] + }, + { + "cell_type": "code", + "execution_count": 103, + "metadata": { + "slideshow": { + "slide_type": "slide" + } + }, + "outputs": [], + "source": [ + "with open(\"lorem_ipsum.txt\") as file:\n", + " content = \"\".join(file.readlines())" + ] + }, + { + "cell_type": "code", + "execution_count": 104, + "metadata": { + "slideshow": { + "slide_type": "-" + } + }, + "outputs": [ + { + "data": { + "text/plain": [ + "\"Lorem Ipsum is simply dummy text of the printing and typesetting industry.\\nLorem Ipsum has been the industry's standard dummy text ever since the 1500s\\nwhen an unknown printer took a galley of type and scrambled it to make a type\\nspecimen book. It has survived not only five centuries but also the leap into\\nelectronic typesetting, remaining essentially unchanged. It was popularised in\\nthe 1960s with the release of Letraset sheets.\\n\"" + ] + }, + "execution_count": 104, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "content" + ] + }, + { + "cell_type": "code", + "execution_count": 105, + "metadata": { + "slideshow": { + "slide_type": "-" + } + }, + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "Lorem Ipsum is simply dummy text of the printing and typesetting industry.\n", + "Lorem Ipsum has been the industry's standard dummy text ever since the 1500s\n", + "when an unknown printer took a galley of type and scrambled it to make a type\n", + "specimen book. It has survived not only five centuries but also the leap into\n", + "electronic typesetting, remaining essentially unchanged. 
It was popularised in\n", + "the 1960s with the release of Letraset sheets.\n", + "\n" + ] + } + ], + "source": [ + "print(content)" + ] + }, { "cell_type": "markdown", "metadata": { @@ -2735,7 +3417,7 @@ "source": [ "Textual data is modeled with the **immutable** `str` type.\n", "\n", - "The `str` type supports *four* orthogonal **abstract concepts** that together make up the idea of a **sequence**: Every `str` object is an iterable container of a finite number of ordered characters." + "The `str` type supports *four* orthogonal **abstract concepts** that together constitute the idea of a **sequence**: Every `str` object is an iterable container of a finite number of ordered characters." ] } ], diff --git a/06_text_review_and_exercises.ipynb b/06_text_review_and_exercises.ipynb index 304a91d..40ac8c4 100644 --- a/06_text_review_and_exercises.ipynb +++ b/06_text_review_and_exercises.ipynb @@ -5,7 +5,7 @@ "metadata": {}, "source": [ "\n", - "# Chapter 6: Text" + "# Chapter 6: Bytes & Text" ] }, { @@ -54,7 +54,7 @@ "cell_type": "markdown", "metadata": {}, "source": [ - "**Q2**: What is a direct consequence of the `str` type's property of being **ordered**? What operations could we *not* do with it if it were *unordered*?" + "**Q2**: What is a direct consequence of the `str` type's property of being an **ordered** sequence? What operations could we *not* do with it if it were *unordered*?" ] }, { @@ -87,7 +87,7 @@ "cell_type": "markdown", "metadata": {}, "source": [ - "**Q4**: Describe in your own words what we mean with **string interpolation**!" + "**Q4**: Describe in your own words what we mean by **string interpolation**!" ] }, { @@ -115,7 +115,7 @@ "cell_type": "markdown", "metadata": {}, "source": [ - "**Q5**: **Triple-double** quotes `\"\"\"` and **triple-single** quotes `'''` create a *new* object of type `text` that model so-called **multi-line strings**." 
+ "**Q5**: **Triple-double** quotes `\"\"\"` and **triple-single** quotes `'''` create a *new* object of type `text` that models so-called **multi-line strings**." ] }, { @@ -143,7 +143,7 @@ "cell_type": "markdown", "metadata": {}, "source": [ - "**Q7**: Indexing into a `str` object with a *negative* index **fails silently**: It does *not* raise an error but also does *not* do anything useful." + "**Q7**: Indexing into a `str` object with a *negative* index **fails silently**: It *neither* raises an error *nor* does anything useful." ] }, { diff --git a/lorem_ipsum.txt b/lorem_ipsum.txt new file mode 100644 index 0000000..c0e21ab --- /dev/null +++ b/lorem_ipsum.txt @@ -0,0 +1,6 @@ +Lorem Ipsum is simply dummy text of the printing and typesetting industry. +Lorem Ipsum has been the industry's standard dummy text ever since the 1500s +when an unknown printer took a galley of type and scrambled it to make a type +specimen book. It has survived not only five centuries but also the leap into +electronic typesetting, remaining essentially unchanged. It was popularised in
+the 1960s with the release of Letraset sheets.
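
Reviewer's note on the string interpolation cells in 06_text.ipynb: the three ways this patch describes (f-strings, the format() method, and the `%` operator) can be compared side by side. The sketch below is only an illustration, not part of the patch. It assumes `name`, `time_of_day`, and `pi` hold values like those suggested by the cell outputs shown in the diff; the notebook cells that define them are not visible here.

```python
# Assumed values, chosen to match the outputs visible in the diff.
name = "Alexander"
time_of_day = "morning"
pi = 3.141592653589793

# 1. f-string (formatted string literal, PEP 498)
greeting_f = f"Hello {name}! Good {time_of_day}."

# 2. str.format() with positional arguments
greeting_format = "Hello {}! Good {}.".format(name, time_of_day)

# 3. printf-style interpolation with the % operator and a tuple
greeting_percent = "Hello %s! Good %s." % (name, time_of_day)

# The format specifier .2f rounds to two decimal places in all three styles.
rounded_f = f"Pi is {pi:.2f}"
rounded_format = "Pi is {:.2f}".format(pi)
rounded_percent = "Pi is %.2f" % pi

print(greeting_f)
print(rounded_f)
```

All three styles evaluate to equal `str` objects here, which is why the notebook can show the same output for each variant.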