diff --git a/00_start_up.ipynb b/00_start_up.ipynb index 29175db..9ace23c 100644 --- a/00_start_up.ipynb +++ b/00_start_up.ipynb @@ -19,11 +19,9 @@ } }, "source": [ - "This book is set up to be a *thorough* introduction to programming in [Python](https://www.python.org/).\n", + "This book is a *thorough* introduction to programming in [Python](https://www.python.org/).\n", "\n", - "It teaches the concepts behind and the syntax of the core Python language as defined by the [Python Software Foundation](https://www.python.org/psf/) in the official [language reference](https://docs.python.org/3/reference/index.html) and introduces additions to the language as distributed with the [standard library](https://docs.python.org/3/library/index.html) that come with every installation.\n", - "\n", - "Furthermore, some very popular third-party libraries like [numpy](https://www.numpy.org/), [pandas](https://pandas.pydata.org/), [matplotlib](https://matplotlib.org/), and others are portrayed." + "It teaches the concepts behind and the syntax of the core Python language as defined by the [Python Software Foundation](https://www.python.org/psf/) in the official [language reference](https://docs.python.org/3/reference/index.html). Furthermore, it introduces commonly used functionalities from the [standard library](https://docs.python.org/3/library/index.html) and popular third-party libraries like [numpy](https://www.numpy.org/), [pandas](https://pandas.pydata.org/), [matplotlib](https://matplotlib.org/), and others." ] }, { @@ -56,7 +54,7 @@ } }, "source": [ - "To be suitable for *total beginners*, there are *no* formal prerequisites. It is only expected that the student has:\n", + "This book is suitable for *total beginners*, and there are *no* formal prerequisites. The student only needs to have:\n", "\n", "- a *solid* understanding of the **English language**,\n", "- knowledge of **basic mathematics** from high school,\n", @@ -94,7 +92,7 @@ } }, "source": [ - "This includes but is not limited to topics such as:\n", + "These include but are not limited to topics such as:\n", "- linear algebra\n", "- statistics & econometrics\n", "- data cleaning & wrangling\n", @@ -108,7 +106,7 @@ "- quantitative marketing (e.g., customer segmentation)\n", "- quantitative supply chain management (e.g., forecasting)\n", "- management science & decision models\n", - "- backend / API / web development (to serve data products to clients)" + "- backend/API/web development (to serve data products to clients)" ] }, { @@ -130,9 +128,9 @@ } }, "source": [ - "The term **[data science](https://en.wikipedia.org/wiki/Data_science)** is rather vague and does actually *not* refer to an academic discipline. Instead the term was popularized by the tech industry who also coined non-meaningful job titles such as \"[rockstar](https://www.quora.com/Why-are-engineers-called-rockstars-and-ninjas)\" or \"[ninja developers](https://www.quora.com/Why-are-engineers-called-rockstars-and-ninjas)\". Most *serious* definitions describe the field as being **multi-disciplinary** *integrating* scientific methods, algorithms, and systems thinking to extract knowledge from (structured and unstructured) data *and* also emphasize the importance of **[domain knowledge](https://en.wikipedia.org/wiki/Domain_knowledge)**.\n", + "The term **[data science](https://en.wikipedia.org/wiki/Data_science)** is rather vague and does *not* refer to an academic discipline. Instead, the term was popularized by the tech industry, who also coined non-meaningful job titles such as \"[rockstar](https://www.quora.com/Why-are-engineers-called-rockstars-and-ninjas)\" or \"[ninja developers](https://www.quora.com/Why-are-engineers-called-rockstars-and-ninjas).\" Most *serious* definitions describe the field as being **multi-disciplinary** *integrating* scientific methods, algorithms, and systems thinking to extract knowledge from (structured and unstructured) data *and* also emphasize the importance of **[domain knowledge](https://en.wikipedia.org/wiki/Domain_knowledge)**.\n", "\n", - "Recently, this integration aspect feeds back into the academic world. The [MIT](https://www.mit.edu/), for example, created the new [Stephen A. Schwarzman College of Computing](http://computing.mit.edu) for [artifical intelligence](https://en.wikipedia.org/wiki/Artificial_intelligence) (with a 1 billion dollar initial investment) where students undergo a \"bilingual\" curriculum with half the classes in quantitative and method-centric fields (like the ones mentioned above) and the other half in domains such as biology, business, chemistry, politics, (art) history, or linguistics (cf., the [official Q&As](http://computing.mit.edu/faq/) or this [NYT article](https://www.nytimes.com/2018/10/15/technology/mit-college-artificial-intelligence.html)). Their strategists see a future where programming skills are just as naturally embedded into every students' studies as are nowadays subjects like calculus, statistics, or academic writing. Then, programming literacy is not just another \"nice to have\" skill but a prerequisite (or an enabler) to understanding more advanced topics in the actual domains studied. This could make teaching easier for top-notch researchers who use a lot of programming in their day-to-day lifes answering big questions: The student and the teacher are then more likely to \"speak\" the same language." + "Recently, this integration aspect feeds back into the academic world. The [MIT](https://www.mit.edu/), for example, created the new [Stephen A. Schwarzman College of Computing](http://computing.mit.edu) for [artificial intelligence](https://en.wikipedia.org/wiki/Artificial_intelligence) (with a 1 billion dollar initial investment) where students undergo a \"bilingual\" curriculum with half the classes in quantitative and method-centric fields (like the ones mentioned above) and the other half in domains such as biology, business, chemistry, politics, (art) history, or linguistics (cf., the [official Q&As](http://computing.mit.edu/faq/) or this [NYT article](https://www.nytimes.com/2018/10/15/technology/mit-college-artificial-intelligence.html)). Their strategists see a future where programming skills are just as naturally embedded into every students' curricula as are nowadays subjects like calculus, statistics, or academic writing. Then, programming literacy is not just another \"nice to have\" skill but a prerequisite, or an enabler, to understanding more advanced topics in the actual domains studied. Top-notch researchers who use programming in their day-to-day lives could then teach students more efficiently in their \"language.\"" ] }, { @@ -154,11 +152,11 @@ } }, "source": [ - "To follow this book, a working installation of **Python 3.6** or higher is needed.\n", + "To \"read\" this book in the most meaningful way, a working installation of **Python 3.6** or higher is needed.\n", "\n", - "A popular and beginner friendly way is to install the [Anaconda Distribution](https://www.anaconda.com/distribution/) that not only ships Python and the standard library but comes pre-packaged with a lot of third-party libraries from the so-called \"scientific stack\". Just go to the [download](https://www.anaconda.com/download/) page and install the latest version (i.e., *2019-07* with Python 3.7 at the time of this writing) for your operating system.\n", + "A popular and beginner-friendly way is to install the [Anaconda Distribution](https://www.anaconda.com/distribution/) that not only ships Python and the standard library but comes pre-packaged with a lot of third-party libraries from the so-called \"scientific stack.\" Just go to the [download](https://www.anaconda.com/download/) page and install the latest version (i.e., *2019-07* with Python 3.7 at the time of this writing) for your operating system.\n", "\n", - "Then, among others, you will find an entry \"Jupyter Notebook\" in your start menu like below. Click on it and a new tab in your web browser will open where you can switch between folders as you could in your computer's default file browser." + "Then, among others, you find an entry \"Jupyter Notebook\" in your start menu like below. Click on it to open a new tab in your web browser where you can switch between folders as you could in your computer's default file browser." ] }, { @@ -180,7 +178,7 @@ } }, "source": [ - "To download the materials accompanying this book as a ZIP file, open this [GitHub repository](https://github.com/webartifex/intro-to-python) in a web browser and click on the green \"Clone or download\" button on the right. Then, unpack the ZIP file into a folder of your choosing (ideally somewhere within your personal user folder so that the files show up right away)." + "To download the materials accompanying this book as a ZIP file, open this [GitHub repository](https://github.com/webartifex/intro-to-python) in a web browser and click on the green \"Clone or download\" button on the right. Then, unpack the ZIP file into a folder of your choosing (ideally somewhere within your user folder so that the files show up right away)." ] }, { @@ -202,7 +200,7 @@ } }, "source": [ - "Python can also be installed in a \"pure\" way as obtained from its core development team. However, this is somewhat too \"advanced\" for a beginner who would then also be responsible for setting up all the third-party libraries needed to actually view this document. Plus, many of the involveld steps must be done in a [terminal](https://en.wikipedia.org/wiki/Terminal_emulator) window, which tends to be a bit intimidating for most beginners." + "Python can also be installed in a \"pure\" way as obtained from its core development team. However, this is somewhat too \"advanced\" for a beginner who would then also be responsible for setting up all the third-party libraries needed to view this document. Plus, many of the involved steps are typed in a [terminal](https://en.wikipedia.org/wiki/Terminal_emulator) window, which tends to be a bit intimidating for most beginners." ] }, { @@ -224,7 +222,7 @@ } }, "source": [ - "For more \"couragous\" beginners wanting to learn how to accomplish this, here is a rough sketch of the aspects to know. First, all Python realeases are available for free on the official [download](https://www.python.org/downloads/) page for any supported operating system. Choose the one you want, then download and install it (cf., the [instruction notes](https://wiki.python.org/moin/BeginnersGuide/Download)). As this only includes core Python and the standard library, the beginner then needs to learn about the [pip](https://pip.pypa.io/en/stable/) module, which is to be used inside a terminal. With the command `python -m pip install jupyter`, all necessary third-party libraries can be installed (cf., more background [here](https://jupyter.readthedocs.io/en/latest/install.html)). However, this would be done in a *system-wide* fashion and is not recommended. Instead, the best practice is to create a so-called **virtual environment** with the [venv](https://docs.python.org/3/library/venv.html) module with which the installed third-party packages can be *isolated* on a per-project basis (the command `python -m venv env-name` creates a virtual enviroment called \"env-name\"). This tactic is employed to avoid a situation known as **[dependency hell](https://en.wikipedia.org/wiki/Dependency_hell)** Once created, the virtual environment must then be activated each time before resuming work in each terminal (with the command `source env-name/bin/activate`). While there exist convenience tools that automate parts of this (e.g., [poetry](https://poetry.eustace.io/docs/) or [virtualenvwrapper](https://virtualenvwrapper.readthedocs.io/en/latest/)), it only distracts a beginner from actually studying the Python language. Yet, it is still worthwhile to have heard about these terms and concepts as many online resources often implicitly assume the user to know about them." + "For more \"courageous\" beginners wanting to learn how to accomplish this, here is a rough sketch of the aspects to know. First, all Python releases are available for free on the official [download](https://www.python.org/downloads/) page for any supported operating system. Choose the latest one applicable, then download and install it (cf., the [instruction notes](https://wiki.python.org/moin/BeginnersGuide/Download)). As this only includes core Python and the standard library, the beginner then needs to learn about the [pip](https://pip.pypa.io/en/stable/) module. With the command `python -m pip install jupyter`, all necessary third-party libraries are installed (cf., more background [here](https://jupyter.readthedocs.io/en/latest/install.html)). However, this would be done in a *system-wide* fashion and is not recommended. Instead, the best practice is to create a so-called **virtual environment** with the [venv](https://docs.python.org/3/library/venv.html) module with which the installed third-party packages are *isolated* on a per-project basis (the command `python -m venv env-name` creates a virtual environment called \"env-name\"). This tactic is employed to avoid a situation known as **[dependency hell](https://en.wikipedia.org/wiki/Dependency_hell)**. Once created, the virtual environment must then be activated each time before resuming work in each terminal (with the command `source env-name/bin/activate`). While there exist convenience tools that automate parts of this (e.g., [poetry](https://poetry.eustace.io/docs/) or [virtualenvwrapper](https://virtualenvwrapper.readthedocs.io/en/latest/)), it only distracts a beginner from studying the Python language. Yet, it is still worthwhile to have heard about these terms and concepts as many online resources often implicitly assume the user to know about them." ] }, { @@ -250,9 +248,9 @@ "\n", "\"Jupyter\" is an [acronym](https://en.wikipedia.org/wiki/Acronym) derived from the names of the three major programming languages **[Julia](https://julialang.org/)**, **[Python](https://www.python.org)**, and **[R](https://www.r-project.org/)**, all of which play significant roles in the world of data science. The Jupyter Project's idea is to serve as an integrating platform such that different programming languages and software packages can be used together within the same project easily.\n", "\n", - "Furthermore, Jupyter notebooks have become a de-facto standard for communicating and exchanging results in the data science community (both in academia and business) and often provide a more intuitive alternative to terminal based ways of running Python (e.g., the default [Python interpreter](https://docs.python.org/3/tutorial/interpreter.html) as shown above or a more advanced interactive version like [IPython](https://ipython.org/)) or even a full-fledged [Integrated Development Environment](https://en.wikipedia.org/wiki/Integrated_development_environment) (e.g., the commercial [PyCharm](https://www.jetbrains.com/pycharm/) or the free [Spyder](https://github.com/spyder-ide/spyder)).\n", + "Furthermore, Jupyter notebooks have become a de-facto standard for communicating and exchanging results in the data science community (both in academia and business) and often provide a more intuitive alternative to terminal-based ways of running Python (e.g., the default [Python interpreter](https://docs.python.org/3/tutorial/interpreter.html) as shown above or a more advanced interactive version like [IPython](https://ipython.org/)) or even a full-fledged [Integrated Development Environment](https://en.wikipedia.org/wiki/Integrated_development_environment) (e.g., the commercial [PyCharm](https://www.jetbrains.com/pycharm/) or the free [Spyder](https://github.com/spyder-ide/spyder)).\n", "\n", - "In particular, they allow to mix plain English text with Python code cells. The plain text can be formatted using the [Markdown](https://guides.github.com/features/mastering-markdown/) language and mathematical expressions can be typeset with [LaTeX](https://www.overleaf.com/learn/latex/Free_online_introduction_to_LaTeX_%28part_1%29). Lastly, we can include pictures, plots, and even videos. Because of these features, the notebooks developed for this book come in a self-contained \"tutorial\" style that enables students to learn and review the material on their own." + "In particular, they allow mixing plain English text with Python code cells. The plain text can be formatted using the [Markdown](https://guides.github.com/features/mastering-markdown/) language, and mathematical expressions can be typeset with [LaTeX](https://www.overleaf.com/learn/latex/Free_online_introduction_to_LaTeX_%28part_1%29). Lastly, we can include pictures, plots, and even videos. Because of these features, the notebooks developed for this book come in a self-contained \"tutorial\" style that enables students to learn and review the material on their own." ] }, { @@ -274,15 +272,15 @@ } }, "source": [ - "A Jupyter notebook consists of cells that have a type associated with them. So far, only cells of type \"Markdown\" have been used, which is the default way to present (formatted) text.\n", + "A Jupyter notebook consists of cells that have a type associated with them. So far, only cells of type \"Markdown\" have been used, which is the default way to present formatted text.\n", "\n", - "The next cell below is an example of a \"Code\" cell containing a line of actual Python code: it simply outputs the text \"Hello world\" when executed. To edit an existing code cell, just enter into it with a mouse click. You know that you are \"in\" a code cell when you see the frame of the code cell turn green.\n", + "The next cell below is an example of a \"Code\" cell containing a line of actual Python code: it merely outputs the text \"Hello world\" when executed. To edit an existing code cell, enter into it with a mouse click. You know that you are \"in\" a code cell when you see the frame of the code cell turn green.\n", "\n", - "Besides this **edit mode** there is also a so-called **command mode** that you can reach by hitting the \"Escape\" key after entering a code cell, which turns the frame's color into blue. Using the Enter\" and \"Escape\" keys you can now switch between the two modes.\n", + "Besides this **edit mode**, there is also a so-called **command mode** that you can reach by hitting the \"Escape\" key after entering a code cell, which turns the frame's color blue. Using the \"Enter\" and \"Escape\" keys, you can now switch between the two modes.\n", "\n", - "To *execute* (or *run*) a code cell, hold the \"Control\" key and press \"Enter\". Note how you do not go to the subsequent cell. Alternatively, you can hold the \"Shift\" key and press \"Enter\", which executes the cell and places your focus on the next cell right after.\n", + "To *execute*, or \"*run*,\" a code cell, hold the \"Control\" key and press \"Enter.\" Note how you do not go to the subsequent cell. Alternatively, you can hold the \"Shift\" key and press \"Enter,\" which executes the cell and places your focus on the next cell right after.\n", "\n", - "Similarly, a Markdown cell is also in either the edit or command mode. For example, simply double-click on the text you are just reading, which takes you into edit mode. Now you could change the formatting (e.g., make a word printed in *italics* or **bold** with single or double asterisks) and then \"execute\" the cell to actually render the text as specified.\n", + "Similarly, a Markdown cell is also in either edit or command mode. For example, double-click on the text you are just reading, which takes you into edit mode. Now you could change the formatting (e.g., make a word printed in *italics* or **bold** with single or double asterisks) and then \"execute\" the cell to render the text as specified.\n", "\n", "To change a cell's type, choose either \"Code\" or \"Markdown\" in the navigation bar at the top." ] @@ -316,7 +314,7 @@ } }, "source": [ - "Sometimes a code cell starts with an exclamation mark `!`. Then, the Jupyter notebook behaves as if you just typed the following command right into a terminal. The cell below asks `python` to show its version number. This is actually *not* Python code but a command in the [Shell](https://en.wikipedia.org/wiki/Shell_script) language. The `!` is useful to execute short commands without leaving a Jupyter notebook." + "Sometimes a code cell starts with an exclamation mark `!`. Then, the Jupyter notebook behaves as if the following command were typed directly into a terminal. The cell below asks `python` to show its version number and is *not* Python code but a command in the [Shell](https://en.wikipedia.org/wiki/Shell_script) language. The `!` is useful to execute short commands without leaving a Jupyter notebook." ] }, { @@ -359,15 +357,15 @@ } }, "source": [ - "In this book *programming* is \"defined\" as:\n", + "In this book, *programming* is \"defined\" as:\n", "\n", - "- a **structured** way of **problem solving**\n", - "- by **expressing** the steps of a **computation / process**\n", + "- a **structured** way of **problem-solving**\n", + "- by **expressing** the steps of a **computation/process**\n", "- and thereby **documenting** the process in a formal way\n", "\n", "Programming is always **concrete** and based on a **particular case**.\n", "\n", - "Programming definitely exhibits elements of an **art** or a **craft** as we will hear programmers call code \"beautiful\" or \"ugly\" or talk about the \"expressive\" power of an application." + "It exhibits elements of a form of **art** or a **craft** as we hear programmers call code \"beautiful\" or \"ugly\" or talk about the \"expressive\" power of an application." ] }, { @@ -385,7 +383,7 @@ "- develops and analyses **algorithms** and **data structures**,\n", "- and **proves** the **correctness** of a program\n", "\n", - "In a sense, a computer scientist does not need to know a programming language to work. In fact, many computer scientists only know how to produce \"ugly\" code in the eyes of professional programmers." + "In a sense, a computer scientist does not need to know a programming language to work, and many computer scientists only know how to produce \"ugly\"-looking code in the eyes of professional programmers." ] }, { @@ -396,7 +394,7 @@ } }, "source": [ - "*IT* or *information technology* is a term that has many meanings to many people. Often, it has something to do with hardware and devices both of which are out of scope for programmers and computer scientists. In fact, many people from the two aforementioned fields are more than happy if their printer and internet just work as they do not know a lot more about that than non-technical people." + "*IT* or *information technology* is a term that has many meanings to many people. Often, it has something to do with hardware or physical devices, both of which are out of scope for programmers and computer scientists. Many computer scientists and programmers are more than happy if their printer and internet connection work as they often do not know a lot more about that than non-technical people." ] }, { @@ -434,8 +432,8 @@ "- [Guido van Rossum](https://en.wikipedia.org/wiki/Guido_van_Rossum) (Python’s **[Benevolent Dictator for Life](https://en.wikipedia.org/wiki/Benevolent_dictator_for_life)**) was bored during a week around Christmas 1989 and started Python as a hobby project \"that would keep \\[him\\] occupied\" for some days\n", "- the idea was to create a **general-purpose scripting** language that would allow fast **prototyping** and would **run on every operating system**\n", "- Python grew through the 90s as van Rossum promoted it via his \"Computer Programming for Everybody\" initiative that had the **goal to encourage a basic level of coding literacy** as an equal knowledge alongside English literacy and math skills\n", - "- to become more independent from its creator the next major version **Python 2** (released in 2000; still in heavy use as of today) was **open-sourced** from the get-go which attracted a **large and global community of programmers** that **contributed** their individual knowledge and best practices in their free time to make Python even better\n", - "- **Python 3** resulted from a major overhaul of the language in 2008 taking into account the **learnings from almost two decades**, streamlining the language, and getting ready for the age of **big data**\n", + "- to become more independent from its creator the next major version **Python 2** (released in 2000; still in heavy use as of today) was **open-sourced** from the get-go which attracted a **large and global community of programmers** that **contributed** their expertise and best practices in their free time to make Python even better\n", + "- **Python 3** resulted from a significant overhaul of the language in 2008 taking into account the **learnings from almost two decades**, streamlining the language, and getting ready for the age of **big data**\n", "- the language is named after the sketch comedy group [Monty Python](https://en.wikipedia.org/wiki/Monty_Python)" ] }, @@ -458,7 +456,7 @@ } }, "source": [ - "Python is a **general-purpose** programming language that allows for **fast development**, is **easy to read**, **open-source**, long established, unifies the knowledge of **hundreds of thousands of experts** around the world, runs on basically every machine, and can handle the complexities of applications involving **big data**." + "Python is a **general-purpose** programming language that allows for **fast development**, is **easy to read**, **open-source**, long-established, unifies the knowledge of **hundreds of thousands of experts** around the world, runs on basically every machine, and can handle the complexities of applications involving **big data**." ] }, { @@ -480,9 +478,9 @@ } }, "source": [ - "Couldn't a company like Google, Facebook, or Microsoft come up with a better programming language? The following argument provides hints why this cannot really be the case.\n", + "Couldn't a company like Google, Facebook, or Microsoft come up with a better programming language? The following argument provides hints on why this cannot be the case.\n", "\n", - "Wouldn't it be weird if professors and scholars of English literature and language studies dictated how we'd have to speak in day-to-day casual conversations or how authors of poesy and novels should use language constructs to achieve a certain type of mood? If you agree with that premise, it makes sense to assume that even programming languages should evolve in a \"natural\" way as users just *use* the language over time and in new and unpredictable contexts making up their own new conventions." + "Wouldn't it be weird if professors and scholars of English literature and language studies dictated how we'd have to speak in day-to-day casual conversations or how authors of poesy and novels should use language constructs to achieve a particular type of mood? If you agree with that premise, it makes sense to assume that even programming languages should evolve in a \"natural\" way as users *use* the language over time and in new and unpredictable contexts creating new conventions." ] }, { @@ -493,7 +491,7 @@ } }, "source": [ - "Loose *communities* are the main building block around which open-source software is built. Someone starts a project (like Guido) and makes it free to use for anybody (e.g., on a code-sharing platform like [GitHub](https://github.com/)). People find it useful enough to solve one of their daily problems and start using it. They see how a project could be improved and provide new use cases (via the popularized concept of a \"[pull request](https://help.github.com/articles/about-pull-requests/)\"). The project grows both in lines of code and number of people using it. After a while, people start local user groups to share their same interests and meet on a regular basis (e.g., this is a big market for companies like [Meetup](https://www.meetup.com/) or non-profits like [PyData](https://pydata.org/)). Out of these local and usually monthly meetups grow yearly conferences on the country or even continental level (e.g., the original [PyCon](https://us.pycon.org/) in the US, [EuroPython](https://europython.eu/), or [PyCon.DE](https://de.pycon.org/)). The content presented at these conferences is made publicly available via GitHub and YouTube (e.g., [PyCon 2019](https://www.youtube.com/channel/UCxs2IIVXaEHHA4BtTiWZ2mQ) or [EuroPython](http://europython.tv/)) and serves as references on what people are working on and introductions to the endless number of specialized fields." + "Loose *communities* are the primary building block around which open-source software projects are built. Someone, like Guido, starts a project and makes it free to use for anybody (e.g., on a code-sharing platform like [GitHub](https://github.com/)). People find it useful enough to solve one of their daily problems and start using it. They see how a project could be improved and provide new use cases (via the popularized concept of a \"[pull request](https://help.github.com/articles/about-pull-requests/)\"). The project grows both in lines of code and people using it. After a while, people start local user groups to share their same interests and meet regularly (e.g., this is a big market for companies like [Meetup](https://www.meetup.com/) or non-profits like [PyData](https://pydata.org/)). Out of these local and usually monthly meetups grow yearly conferences on the country or even continental level (e.g., the original [PyCon](https://us.pycon.org/) in the US, [EuroPython](https://europython.eu/), or [PyCon.DE](https://de.pycon.org/)). The content presented at these conferences is made publicly available via GitHub and YouTube (e.g., [PyCon 2019](https://www.youtube.com/channel/UCxs2IIVXaEHHA4BtTiWZ2mQ) or [EuroPython](http://europython.tv/)) and serves as references on what people are working on and introductions to the endless number of specialized fields." ] }, { @@ -504,11 +502,11 @@ } }, "source": [ - "While these communities are rather loose and constantly changing, smaller in-groups, often democratically organized and elected (e.g., the Python Software Foundation), take care of, for example, the development of the \"core\" Python language itself.\n", + "While these communities are somewhat loose and continuously changing, smaller in-groups, often democratically organized and elected (e.g., the [Python Software Foundation](https://www.python.org/psf/)), take care of, for example, the development of the \"core\" Python language itself.\n", "\n", - "Interestingly, Python is just a specification (i.e., a set of rules) as to what is allowed and what not. The current version of Python can always be lookep up in the [Python Language Reference](https://docs.python.org/3/reference/index.html). In order to make changes to that, anyone can make a so-called **[Python Enhancement Proposal](https://www.python.org/dev/peps/)**, or **PEP** for short, where it needs to be specified what exact changes are to be made and argued why that is a good thing to do. These PEPs are reviewed by the [core developers](https://devguide.python.org/coredev/) and anyone interested and are then either accepted, modified, or rejected if, for example, the change introduces internal inconsistencies. This is very similar to the **double blind peer review** established in academia. In fact, many of the contributors held or hold positions in academia, one more indicator of the high quality standards in the Python community. To learn more about PEPs, check out [PEP 1](https://www.python.org/dev/peps/pep-0001/) that describes the entire process.\n", + "Interestingly, Python is just a specification (i.e., a set of rules) as to what is allowed and what not. The current version of Python can always be looked up in the [Python Language Reference](https://docs.python.org/3/reference/index.html). To make changes to that, anyone can make a so-called **[Python Enhancement Proposal](https://www.python.org/dev/peps/)**, or **PEP** for short, where it needs to be specified what exact changes are to be made and argued why that is a good thing to do. These PEPs are reviewed by the [core developers](https://devguide.python.org/coredev/) and interested people and are then either accepted, modified, or rejected if, for example, the change introduces internal inconsistencies. This process is similar to the **double-blind peer review** established in academia. Many of the contributors held or hold positions in academia, one more indicator of the high quality standards in the Python community. To learn more about PEPs, check out [PEP 1](https://www.python.org/dev/peps/pep-0001/) that describes the entire process.\n", "\n", - "In total, no one single entity can control how the language evolves and the users' needs and ideas always feed back to the language specification via a quality controlled and \"democratic\" process." + "In total, no one single entity can control how the language evolves, and the users' needs and ideas always feed back to the language specification via a quality controlled and \"democratic\" process." ] }, { @@ -519,7 +517,7 @@ } }, "source": [ - "Besides being free as in **\"free beer\"**, a major benefit of open-source is that one can always **look up how something works in detail** (that is the literal meaning of *open* source). This is a huge benefit compared to commercial languages (e.g., MATLAB) since a programmer can always continue to **study best practices** or how things are done. Along this way, many **errors are uncovered** as well. Furthermore, if one runs an open-source application, one can be reasonably sure that no bad people built in a \"backdoor\". [Free software](https://en.wikipedia.org/wiki/Free_software) is consequently free of charge but brings many other freedoms with it, most notable the freedom to change the code." + "Besides being free as in **\"free beer**,\" a major benefit of open-source is that one can always **look up how something works in detail**: That is the literal meaning of *open* source and different as compared to commercial languages (e.g., MATLAB) since a programmer can always continue to **study best practices** or find out how things are implemented. Along this way, many **errors are uncovered** as well. Furthermore, if one runs an open-source application, one can be reasonably sure that no bad people built in a \"backdoor.\" [Free software](https://en.wikipedia.org/wiki/Free_software) is consequently free of charge but brings many other freedoms with it, most notably the freedom to change the code." ] }, { @@ -541,13 +539,13 @@ } }, "source": [ - "The \"weird\" thing is that the default Python implementation is actually written in the C language.\n", + "The \"weird\" thing is that the default Python implementation is written in the C language.\n", "\n", - "[C](https://en.wikipedia.org/wiki/C_%28programming_language%29) and [C++](https://en.wikipedia.org/wiki/C%2B%2B) (check this [introduction](https://www.learncpp.com/)) are wide-spread and long established (i.e., since the 1970s) programming languages that are employed in many mission critical softwares (e.g., operating systems themselves, low latency databases and web servers, nuclear reactor control systems, airplanes, ...). They are fast mainly because the programmer not only needs to come up with the **business logic** but also manage the computer's memory \"manually\" (and the knowledge necessary to do that is not easy to learn).\n", + "[C](https://en.wikipedia.org/wiki/C_%28programming_language%29) and [C++](https://en.wikipedia.org/wiki/C%2B%2B) (cf., this [introduction](https://www.learncpp.com/)) are wide-spread and long-established (i.e., since the 1970s) programming languages employed in many mission-critical software systems (e.g., operating systems themselves, low latency databases and web servers, nuclear reactor control systems, airplanes, ...). They are fast, mainly because the programmer not only needs to come up with the **business logic** but also manage the computer's memory \"manually\" (and the knowledge necessary to do that is not easy to learn).\n", "\n", - "In contrast, Python automatically manages the memory for the programmer. This speeds up the development process a lot. So speed here is really a trade-off of application run time vs. engineering / development time and in many applications the program's run time is not that important (e.g., what if C needs 0.001 seconds in a case where Python needs 0.1 seconds to train a linear regression model?). When the requirements change and computing speed becomes an issue, the Python community offers many third-party libraries (usually also written in C) where specific problems can be solved in near-C time.\n", + "In contrast, Python automatically manages the memory for the programmer. So, speed here is a trade-off between application run time and engineering/development time. Often, the program's run time is not that important: For example, what if C needs 0.001 seconds in a case where Python needs 0.1 seconds to train a linear regression model? When the requirements change and computing speed becomes an issue, the Python community offers many third-party libraries (usually also written in C) where specific problems can be solved in near-C time.\n", "\n", - "**In a nutshell**: While it is of course true that C is a lot faster than Python when it comes to **pure computation time**, in many cases this does not really matter as the **significantly shorter development cycles** are the more important cost factor in a rapidly changing business world." + "**In a nutshell**: While it is, of course, true that C is a lot faster than Python when it comes to **pure computation time**, often, this does not matter as the **significantly shorter development cycles** are the more significant cost factor in a rapidly changing business world." ] }, { @@ -580,25 +578,23 @@ } }, "source": [ - "While it is usually not the best argument to quote authorative figures like the pope, we will do this in this section anyways:\n", + "While it is usually not the best argument to quote authoritative figures like the pope, we briefly look at who uses Python here and leave it up to the reader to decide if this is convincing or not:\n", "\n", "- **[Massachusetts Institute of Technology](https://www.mit.edu/)**\n", " - teaches Python in its [introductory course](https://ocw.mit.edu/courses/electrical-engineering-and-computer-science/6-0001-introduction-to-computer-science-and-programming-in-python-fall-2016/) to computer science independent of the student's major\n", - " - replaced the infamous course on the [Scheme](https://groups.csail.mit.edu/mac/projects/scheme/) language ([source](https://news.ycombinator.com/item?id=602307))\n", + " - replaced the infamous course on the [Scheme](https://groups.csail.mit.edu/mac/projects/scheme/) language (cf., [source](https://news.ycombinator.com/item?id=602307))\n", "- **[Google](https://www.google.com/)**\n", - " - used the strategy **\"Python where we can, C++ where we must\"** from its early days on to stay flexible in a rapidly changing environment ([source](https://stackoverflow.com/questions/2560310/heavy-usage-of-python-at-google))\n", - " - the very first web-crawler was written in **Java and so difficult** to maintain that it was **re-written in Python** right away ([source](https://www.amazon.com/Plex-Google-Thinks-Works-Shapes/dp/1416596585/ref=sr_1_1?ie=UTF8&qid=1539101827&sr=8-1&keywords=in+the+plex))\n", + " - used the strategy **\"Python where we can, C++ where we must\"** from its early days on to stay flexible in a rapidly changing environment (cf., [source](https://stackoverflow.com/questions/2560310/heavy-usage-of-python-at-google))\n", + " - the very first web-crawler was written in **Java and so difficult to maintain** that it was **re-written in Python** right away (cf., [source](https://www.amazon.com/Plex-Google-Thinks-Works-Shapes/dp/1416596585/ref=sr_1_1?ie=UTF8&qid=1539101827&sr=8-1&keywords=in+the+plex))\n", " - Python and C++, Java, and Go are the only four server-side languages to be deployed to production\n", " - Guido van Rossom was hired by Google from 2005 to 2012 to advance the language there\n", - "- **[NASA](https://www.nasa.gov/)** open-sources a lot of its projects, many of which are written in Python and regard working with really big data ([source](https://code.nasa.gov/language/python/))\n", + "- **[NASA](https://www.nasa.gov/)** open-sources many of its projects, often written in Python and regarding analyses with big data (cf., [source](https://code.nasa.gov/language/python/))\n", "- **[Facebook](https://facebook.com/)** uses Python besides C++ and its legacy PHP (a language for building websites; the \"cool kid\" from the early 2000s)\n", - "- **[Instagram](https://instagram.com/)** operates the largest installation of the popular **web framework [Django](https://www.djangoproject.com/)** ([source](https://instagram-engineering.com/web-service-efficiency-at-instagram-with-python-4976d078e366))\n", - "- **[Spotify](https://spotify.com/)** bases its data science on Python ([source](https://labs.spotify.com/2013/03/20/how-we-use-python-at-spotify/))\n", - "- **[Netflix](https://netflix.com/)** also runs its predictive models on Python ([source](https://medium.com/netflix-techblog/python-at-netflix-86b6028b3b3e))\n", - "- **[Dropbox](https://dropbox.com/)** \"stole\" Guido van Rossom from Google to help scale the platform ([source](https://medium.com/dropbox-makers/guido-van-rossum-on-finding-his-way-e018e8b5f6b1))\n", - "- **[JPMorgan Chase](https://www.jpmorganchase.com/)** requires new employees to learn Python as part of the onboarding process starting with the 2018 intake ([source](https://www.ft.com/content/4c17d6ce-c8b2-11e8-ba8f-ee390057b8c9?segmentId=a7371401-027d-d8bf-8a7f-2a746e767d56))\n", - "- ...\n", - "- ... this list is intentionally shortened as Python is simply very popular ..." + "- **[Instagram](https://instagram.com/)** operates the largest installation of the popular **web framework [Django](https://www.djangoproject.com/)** (cf., [source](https://instagram-engineering.com/web-service-efficiency-at-instagram-with-python-4976d078e366))\n", + "- **[Spotify](https://spotify.com/)** bases its data science on Python (cf., [source](https://labs.spotify.com/2013/03/20/how-we-use-python-at-spotify/))\n", + "- **[Netflix](https://netflix.com/)** also runs its predictive models on Python (cf., [source](https://medium.com/netflix-techblog/python-at-netflix-86b6028b3b3e))\n", + "- **[Dropbox](https://dropbox.com/)** \"stole\" Guido van Rossom from Google to help scale the platform (cf., [source](https://medium.com/dropbox-makers/guido-van-rossum-on-finding-his-way-e018e8b5f6b1))\n", + "- **[JPMorgan Chase](https://www.jpmorganchase.com/)** requires new employees to learn Python as part of the onboarding process starting with the 2018 intake (cf., [source](https://www.ft.com/content/4c17d6ce-c8b2-11e8-ba8f-ee390057b8c9?segmentId=a7371401-027d-d8bf-8a7f-2a746e767d56))" ] }, { @@ -631,7 +627,7 @@ } }, "source": [ - "As the graph below shows, neither Google's very own language **[Go](https://golang.org/)** nor **[R](https://www.r-project.org/)**, a very popular language in the niche of statistics and data science, can compete with Python's year-to-year growth." + "As the graph below shows, neither Google's very own language **[Go](https://golang.org/)** nor **[R](https://www.r-project.org/)**, a domain-specific language in the niche of statistics, can compete with Python's year-to-year growth." ] }, { @@ -686,13 +682,13 @@ } }, "source": [ - "Simply put, **always be coding**.\n", + "**A**lways **b**e **c**oding.\n", "\n", - "Programming is more than just writing code into a text file. It means reading through parts of documentation, blogs with best practices, and tutorials, or researching some problem on Stack Overflow while trying to implement some feature in the application at hand. Also, it means using command line tools to automate some part of the work, or manage different versions of a software simultaneously using software like **[git](https://git-scm.com/)**. In short, programming involves a lot of \"muscle memory\". This can only be built and kept up through near-daily usage.\n", + "Programming is more than just writing code into a text file. It means reading through parts of the [documentation](https://docs.python.org/), blogs with best practices, and tutorials, or researching problems on Stack Overflow while trying to implement features in the application at hand. Also, it means using command-line tools to automate some part of the work or manage different versions of a program, for example, with **[git](https://git-scm.com/)**. In short, programming involves a lot of \"muscle memory,\" which can only be built and kept up through near-daily usage.\n", "\n", - "Further, many aspects of software architecture and best practices can only be understood after having implemented some requirement for the very first time. Coding also means \"breaking\" things to find out what makes them actually work to begin with.\n", + "Further, many aspects of software architecture and best practices can only be understood after having implemented some requirements for the very first time. Coding also means \"breaking\" things to find out what makes them work in the first place.\n", "\n", - "Coding is learned best by just doing it for some time on a daily or at least regular basis and not right before some task is due, just like learning a \"real\" language." + "Coding is learned best by just doing it for some time on a daily or at least a regular basis and not right before some task is due, just like learning a \"real\" language." ] }, { @@ -716,12 +712,12 @@ "source": [ "[Y Combinator](https://www.ycombinator.com/)'s co-founder [Paul Graham](https://en.wikipedia.org/wiki/Paul_Graham_%28programmer%29) wrote a very popular and often cited [article](http://www.paulgraham.com/makersschedule.html) where he divides every person into belonging to one of two groups:\n", "\n", - "- **Managers**: People that need to organize things and command others (like a \"boss\"). Their schedule is usually organized by the hour or even by 30 minute intervals.\n", + "- **Managers**: People that need to organize things and command others (like a \"boss\"). Their schedule is usually organized by the hour or even 30-minute intervals.\n", "- **Makers**: People that create things (like programmers, artists, or writers). Such people think in half days or full days.\n", "\n", - "Have you ever wondered why so many tech nerds work all the nights and sleep during \"weird\" hours? This is mainly because many programming related tasks require a certain \"flow\" state of one's mind. This is hard to achieve when one can get interupted, even if it is only for one short question. Graham describes that only knowing that one has an appointment in three hours can cause a programmer to not get into a flow state.\n", + "Have you ever wondered why so many tech people work during nights and sleep at \"weird\" times? The reason is that many programming related tasks require a \"flow\" state in one's mind that is hard to achieve when one can get interrupted, even if it is only for one short question. Graham describes that only knowing that one has an appointment in three hours can cause a programmer to not get into a flow state.\n", "\n", - "As a result, do not set aside a certain amount of time for learning something but rather plan in an **entire evening** or a **rainy Sunday** where you can work on a problem in an **open end** setting. And do not be surprised any more to hear a \"I looked at it over the weekend\" from a programmer." + "As a result, do not set aside a certain amount of time for learning something but rather plan in an **entire evening** or a **rainy Sunday** where you can work on a problem in an **open end** setting. And do not be surprised anymore to hear \"I looked at it over the weekend\" from a programmer." ] }, { @@ -743,17 +739,17 @@ } }, "source": [ - "When being asked the above question, most programmers will answer something that can be classified into one of two broader groups.\n", + "When being asked the above question, most programmers answer something that can be classified into one of two broader groups.\n", "\n", - "**1) Toy Problem / Case Study / Prototype**: Pick some problem that you want to write a programatic solution for, break it down into smaller packages, and solve them \"backwards\".\n", + "**1) Toy Problem, Case Study, or Prototype**: Pick some problem, break it down into smaller sub-problems, and solve them with an end in mind.\n", "\n", - "**2) Books / Video Tutorials / Courses**: Research the best book / blog post / video tutorial for something and work it through from start to end.\n", + "**2) Books, Video Tutorials, and Courses**: Research the best book, blog, video, or tutorial for something and work it through from start to end.\n", "\n", "The truth is that you need to iterate between these two phases.\n", "\n", - "Building a prototype will always reveal issues no book or tutorial can think of before. Data is never clean as it comes. Some algorithm from a text book must be adapted to a very specific but important aspect of a case study. It is important to learn to \"ship a product\", i.e., to finish some project to the very end because only then you will have looked at all the aspects.\n", + "Building a prototype always reveals issues no book or tutorial can think of before. Data is never as clean as it should be. An algorithm from a textbook must be adapted to a peculiar aspect of a case study. It is essential to learn to \"ship a product\" because only then will one have looked at all the aspects.\n", "\n", - "The major downside of this approach is that you likely learn bad \"patterns\" as whatever you learn is overfitted to the case and you do not get the big picture or mental concepts behind a solution. This is a gap well written books can fill in (e.g., check the Python / Programming books by [Packt](https://www.packtpub.com/packt/offers/free-learning/) or [O’Reilly](https://www.oreilly.com/))." + "The major downside of this approach is that one likely learns bad \"patterns\" overfitted to the case at hand, and one does not get the big picture or mental concepts behind a solution. This gap can be filled in by well-written books: For example, check the Python/programming books offered by [Packt](https://www.packtpub.com/packt/offers/free-learning/) or [O’Reilly](https://www.oreilly.com/)." ] }, { @@ -778,11 +774,11 @@ "**Part 1: Expressing Logic**\n", "\n", "- What is a programming language? What kind of words exist?\n", - " 1. Elements of a Program\n", - " 2. Functions & Modularization\n", + " 1. [Elements of a Program](https://nbviewer.jupyter.org/github/webartifex/intro-to-python/blob/master/01_elements.ipynb)\n", + " 2. [Functions & Modularization](https://nbviewer.jupyter.org/github/webartifex/intro-to-python/blob/master/02_functions.ipynb)\n", "- What is the flow of execution? How can we form sentences from words?\n", - " 3. Conditionals & Exceptions\n", - " 4. Recursion & Looping" + " 3. [Conditionals & Exceptions](https://nbviewer.jupyter.org/github/webartifex/intro-to-python/blob/master/03_conditionals.ipynb)\n", + " 4. [Recursion & Looping](https://nbviewer.jupyter.org/github/webartifex/intro-to-python/blob/master/04_iteration.ipynb)" ] }, { @@ -796,12 +792,12 @@ "**Part 2: Managing Data and Memory**\n", "\n", "- How is data stored in memory?\n", - " 5. Numbers\n", + " 5. [Numbers](https://nbviewer.jupyter.org/github/webartifex/intro-to-python/blob/master/05_numbers.ipynb)\n", " 6. Text\n", " 7. Sequences\n", " 8. Mappings & Sets\n", " 9. Arrays\n", - "- How can we create our own data types?\n", + "- How can we create custom data types?\n", " 10. Object-Orientation" ] }, diff --git a/00_start_up_review_and_exercises.ipynb b/00_start_up_review_and_exercises.ipynb index 2f4c334..109a2fc 100644 --- a/00_start_up_review_and_exercises.ipynb +++ b/00_start_up_review_and_exercises.ipynb @@ -53,7 +53,7 @@ "cell_type": "markdown", "metadata": {}, "source": [ - "**Q2**: Explain what is a *pull request* and elaborate how this concept fits to a *distributed* organization of work!" + "**Q2**: Explain what is a *pull request* and elaborate on how this concept fits a *distributed* organization of work!" ] }, { @@ -81,7 +81,7 @@ "cell_type": "markdown", "metadata": {}, "source": [ - "**Q4**: What is a major advantage of a \"slow\" programming language like Python over a faster one like C?" + "**Q4**: What is a significant advantage of a \"slow\" programming language like Python over a faster one like C?" ] }, { @@ -109,7 +109,7 @@ "cell_type": "markdown", "metadata": {}, "source": [ - "**Q5**: Python has been the fastest growing *major* programming language in recent years." + "**Q5**: Python has been the fastest-growing *major* programming language in recent years." ] }, { @@ -137,7 +137,7 @@ "cell_type": "markdown", "metadata": {}, "source": [ - "**Q7**: Python was originally designed for highly intensive numerical computing, in particular for use cases from physics and astronomy." + "**Q7**: Python was initially designed for highly intensive numerical computing, in particular for use cases from physics and astronomy." ] }, { @@ -151,7 +151,7 @@ "cell_type": "markdown", "metadata": {}, "source": [ - "**Q8**: JavaScript is a special subset of the Java language." + "**Q8**: JavaScript is a subset of the Java language." ] }, { @@ -165,7 +165,7 @@ "cell_type": "markdown", "metadata": {}, "source": [ - "**Q9**: Python is *free software*. That means it will never cost anything." + "**Q9**: Python is *free software*. That means it does not cost anything." ] }, { @@ -179,7 +179,7 @@ "cell_type": "markdown", "metadata": {}, "source": [ - "**Q10**: The main purpose of PEPs is to regulate how code should be documented and/or styled." + "**Q10**: The primary purpose of PEPs is to regulate how code should be documented and styled." ] }, { @@ -228,7 +228,7 @@ "cell_type": "markdown", "metadata": {}, "source": [ - "**Q12**: The quote \"Education is what remains after one has forgotten what one has learned in school\" is attributed to Albert Einstein. Use a special Markdown syntax to display the author and his quote appropriately!" + "**Q12**: The quote \"Education is what remains after one has forgotten what one has learned in school\" is attributed to Albert Einstein. Display the author and his quote appropriately!" ] }, { diff --git a/01_elements.ipynb b/01_elements.ipynb index 382d757..c941e43 100644 --- a/01_elements.ipynb +++ b/01_elements.ipynb @@ -20,11 +20,11 @@ } }, "source": [ - "Do you remember how you first learned to speak in your mother tounge? Probably not. No one's memory goes back that far. Your earliest memory as a child should probably be around the age of three or four years old when you could already say simple things and interact with your environment. Although you did not know any grammar rules yet, other people just understood what you said. Well, most of the time.\n", + "Do you remember how you first learned to speak in your mother tongue? Probably not. No one's memory goes back that far. Your earliest memory as a child should probably be around the age of three or four years old when you could already say simple things and interact with your environment. Although you did not know any grammar rules yet, other people just understood what you said. Well, most of the time.\n", "\n", - "It is intuitively best to take the very mindset of a small child when learing a foreign language and we do so for learning the Python language as well. This first chapter introduces simplistic examples and we better just accept them as they are without knowing any of the \"grammar\" rules yet. Then, we analyze them in parts and slowly build up our understanding.\n", + "It is intuitively best to take the very mindset of a small child when learning a foreign language. This first chapter introduces simplistic examples, and we accept them as they are *without* knowing any of the \"grammar\" rules yet. Then, we analyze them in parts and slowly build up our understanding.\n", "\n", - "Consequently, if parts of this chapter do not make sense right away, let's not worry too much. Besides introducing some basics that we need to understand, it also serves as an outlook for what is to come. So, many terms and concepts used here will be covered in great detail in following chapters." + "Consequently, if parts of this chapter do not make sense right away, let's not worry too much. Besides introducing the basic elements, it also serves as an outlook for what is to come. So, many terms and concepts used here are deconstructed in great detail in the following chapters." ] }, { @@ -75,7 +75,7 @@ } }, "source": [ - "To verify that something happened in our computer's memory, we simply **reference** `numbers` and observe that Python indeed knows about it." + "To verify that something happened in our computer's memory, we **reference** `numbers`." ] }, { @@ -116,11 +116,11 @@ "\n", "The `if number % 2 == 0` may look disturbing at first sight. Both the `%` and `==` must have an unintuitive meaning here. Luckily, the **comment** in the same line after the `#` symbol has the answer: The program only does something if the current `number` is even.\n", "\n", - "In particular, it increases `count` by $1$ and adds the current `number` onto the [running](https://en.wikipedia.org/wiki/Running_total) `total`. Both `count` and `number` were initially set to $0$ and the single `=` symbol reads as \"... is *set* equal to ...\". It could not indicate a mathematical equation as, for example, `count` is generally not equal to `count + 1`.\n", + "In particular, it increases `count` by $1$ and adds the current `number` onto the [running](https://en.wikipedia.org/wiki/Running_total) `total`. Both `count` and `number` are initially set to $0$ and the single `=` symbol reads as \"... is *set* equal to ...\". It could not indicate a mathematical equation as, for example, `count` is generally not equal to `count + 1`.\n", "\n", - "Lastly, the `average` is calculated as the ratio of the final **values** of `total` and `count`. Overall, we divide the sum of all even numbers by the count of all even numbers, which is exactly what we are looking for.\n", + "Lastly, the `average` is calculated as the ratio of the final **values** of `total` and `count`. Overall, we divide the sum of all even numbers by their count: This is nothing but the definition of an average.\n", "\n", - "We also observe how the lines of code \"within\" the `for` and `if` **statements** are **indented** and *aligned* with multiples of **four spaces**: This shows immediately how the lines relate to each other." + "The lines of code \"within\" the `for` and `if` **statements** are **indented** and **aligned** with multiples of **four spaces**: This shows immediately how the lines relate to each other." ] }, { @@ -133,8 +133,8 @@ }, "outputs": [], "source": [ - "count = 0 # initialize variables to keep track of the sum\n", - "total = 0 # so far and the count of the even numbers\n", + "count = 0 # initialize variables to keep track of the\n", + "total = 0 # running total and the count of even numbers\n", "\n", "for number in numbers:\n", " if number % 2 == 0: # only look at even numbers\n", @@ -152,7 +152,7 @@ } }, "source": [ - "We do not see any **output** yet but can obtain the value of `average` by simply referencing it again." + "We do not see any **output** yet but obtain the value of `average` by referencing it again." ] }, { @@ -249,7 +249,7 @@ } }, "source": [ - "To visualize something before the end of the cell, we use the [print()](https://docs.python.org/3/library/functions.html#print) built-in **function**." + "To visualize something before the end of the cell, we use the built-in [print()](https://docs.python.org/3/library/functions.html#print) **function**. Here, the parentheses `()` indicate that we execute code defined somewhere else." ] }, { @@ -283,7 +283,7 @@ } }, "source": [ - "Outside Jupyter notebooks, the semicolon `;` is actually used as a **seperator** between several statements that would otherwise have to be on a line on their own. However, it as *not* considered good practice to use it as it makes code less readable." + "Outside Jupyter notebooks, the semicolon `;` is used as a **separator** between statements that must otherwise be on a line on their own. However, it is *not* considered good practice to use it as it makes code less readable." ] }, { @@ -327,11 +327,11 @@ } }, "source": [ - "Python comes with many **[operators](https://docs.python.org/3/reference/lexical_analysis.html#operators)** built in: They are **tokens** (i.e., \"symbols\") that have a special meaning to the Python interpreter.\n", + "Python comes with many built-in **[operators](https://docs.python.org/3/reference/lexical_analysis.html#operators)**: They are **tokens** (i.e., \"symbols\") that have a special meaning to the Python interpreter.\n", "\n", - "The arithmetic operators either \"operate\" with the number immediately following them (= **unary** operators; e.g., negation) or \"process\" the two numbers \"around\" them (= **binary** operators; e.g., addition). But we will see many exceptions from that as well.\n", + "The arithmetic operators either \"operate\" with the number immediately following them (= **unary** operators; e.g., negation) or \"process\" the two numbers \"around\" them (= **binary** operators; e.g., addition).\n", "\n", - "By definition, operators have **no** permanent **side effects** in the computer's memory. Although the code cells in this section do indeed create *new* numbers in memory (e.g., `77 + 13` creates `90`), they are immediately \"forgotten\" as they are not stored in a **variable** like `numbers` or `average` above. We will continue this thought further below when we compare **expressions** with **statements**.\n", + "By definition, operators have **no** permanent **side effects** in the computer's memory. Although the code cells in this section do indeed create *new* numbers in memory (e.g., `77 + 13` creates `90`), they are immediately \"forgotten\" as they are not stored in a **variable** like `numbers` or `average` above. We continue this thought further below when we compare **expressions** with **statements**.\n", "\n", "Let's see some examples of operators. We start with the binary `+` and the `-` operators for addition and subtraction. Binary operators are designed to resemble what mathematicians call [infix notation](https://en.wikipedia.org/wiki/Infix_notation) and have the expected meaning." ] @@ -392,7 +392,7 @@ } }, "source": [ - "The `-` operator can be used as a unary operator as well. Then it just flips the sign of a number." + "The `-` operator is used as a unary operator as well. Then it just flips the sign of a number." ] }, { @@ -427,7 +427,7 @@ } }, "source": [ - "When we compare the output of the `*` and `/` operators for multiplication and division, we note the subtle *difference* between the $42$ and the $42.0$. This is a first illustration of the concept of a **data type**." + "When we compare the output of the `*` and `/` operators for multiplication and division, we note the subtle *difference* between the $42$ and the $42.0$: They are the *same* number represented as a *different* **data type**." ] }, { @@ -486,7 +486,7 @@ } }, "source": [ - "The so-called **floor division operator** `//` always rounds down to the next integer and is thus also called **integer division operator**. This is a first example of an operator we commonly do not know from high school mathematics." + "The so-called **floor division operator** `//` always \"rounds\" to an integer and is thus also called **integer division operator**. It is an example of an arithmetic operator we commonly do not know from high school mathematics." ] }, { @@ -545,12 +545,47 @@ } }, "source": [ - "To obtain the remainder of a division, we use the **modulo operator** `%`." + "Even though it appears that the `//` operator **truncates** (i.e., \"cuts off\") the decimals so as to effectively \"rounding\" down (i.e., the `42.5` became `42` in the previous code cell), this is *not* the case: The result is always \"rounded\" towards minus infinity!" ] }, { "cell_type": "code", "execution_count": 16, + "metadata": { + "slideshow": { + "slide_type": "fragment" + } + }, + "outputs": [ + { + "data": { + "text/plain": [ + "-43" + ] + }, + "execution_count": 16, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "-85 // 2" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "slideshow": { + "slide_type": "skip" + } + }, + "source": [ + "To obtain the remainder of a division, we use the **modulo operator** `%`." + ] + }, + { + "cell_type": "code", + "execution_count": 17, "metadata": { "slideshow": { "slide_type": "slide" @@ -563,7 +598,7 @@ "1" ] }, - "execution_count": 16, + "execution_count": 17, "metadata": {}, "output_type": "execute_result" } @@ -580,12 +615,14 @@ } }, "source": [ - "Note that the remainder is $0$ if a number is divisable by another." + "The remainder is $0$ *only if* a number is *divisible* by another.\n", + "\n", + "A popular convention in both, computer science and mathematics, is to abbreviate \"only if\" as **iff**, which is short for \"**[if and only if](https://en.wikipedia.org/wiki/If_and_only_if)**.\" The iff means that a remainder of $0$ implies that a number is divisible by another but also that a number divisible by another implies a remainder of $0$. The implication goes in *both* directions!" ] }, { "cell_type": "code", - "execution_count": 17, + "execution_count": 18, "metadata": { "slideshow": { "slide_type": "fragment" @@ -598,7 +635,7 @@ "0" ] }, - "execution_count": 17, + "execution_count": 18, "metadata": {}, "output_type": "execute_result" } @@ -620,7 +657,7 @@ }, { "cell_type": "code", - "execution_count": 18, + "execution_count": 19, "metadata": { "slideshow": { "slide_type": "fragment" @@ -633,7 +670,7 @@ "3" ] }, - "execution_count": 18, + "execution_count": 19, "metadata": {}, "output_type": "execute_result" } @@ -644,7 +681,7 @@ }, { "cell_type": "code", - "execution_count": 19, + "execution_count": 20, "metadata": { "slideshow": { "slide_type": "-" @@ -657,7 +694,7 @@ "23" ] }, - "execution_count": 19, + "execution_count": 20, "metadata": {}, "output_type": "execute_result" } @@ -674,12 +711,12 @@ } }, "source": [ - "The [divmod()](https://docs.python.org/3/library/functions.html#divmod) built-in function combines the integer and modulo divisions into one operation. However, this is not an operator but a function. Also observe that [divmod()](https://docs.python.org/3/library/functions.html#divmod) returns a \"pair\" of integers." + "The [divmod()](https://docs.python.org/3/library/functions.html#divmod) built-in function combines the integer and modulo divisions into one operation. However, this is not an operator but a function. Also, [divmod()](https://docs.python.org/3/library/functions.html#divmod) returns a \"pair\" of integers and not just one object." ] }, { "cell_type": "code", - "execution_count": 20, + "execution_count": 21, "metadata": { "slideshow": { "slide_type": "fragment" @@ -692,7 +729,7 @@ "(4, 2)" ] }, - "execution_count": 20, + "execution_count": 21, "metadata": {}, "output_type": "execute_result" } @@ -709,12 +746,12 @@ } }, "source": [ - "Raising a number to a power is performed with the **exponentiation operator** `**`. This is different from the `^` operator many other programming languages use and that also exists in Python with a *different* meaning." + "Raising a number to a power is performed with the **exponentiation operator** `**`. It is different from the `^` operator other programming languages may use and that also exists in Python with a *different* meaning." ] }, { "cell_type": "code", - "execution_count": 21, + "execution_count": 22, "metadata": { "slideshow": { "slide_type": "slide" @@ -727,7 +764,7 @@ "8" ] }, - "execution_count": 21, + "execution_count": 22, "metadata": {}, "output_type": "execute_result" } @@ -744,12 +781,12 @@ } }, "source": [ - "The normal [order of precedence](https://docs.python.org/3/reference/expressions.html#operator-precedence) from mathematics applies (i.e., \"PEMDAS\" rule)." + "The standard [order of precedence](https://docs.python.org/3/reference/expressions.html#operator-precedence) from mathematics applies (i.e., \"PEMDAS\" rule)." ] }, { "cell_type": "code", - "execution_count": 22, + "execution_count": 23, "metadata": { "slideshow": { "slide_type": "fragment" @@ -762,7 +799,7 @@ "18" ] }, - "execution_count": 22, + "execution_count": 23, "metadata": {}, "output_type": "execute_result" } @@ -784,7 +821,7 @@ }, { "cell_type": "code", - "execution_count": 23, + "execution_count": 24, "metadata": { "slideshow": { "slide_type": "-" @@ -797,7 +834,7 @@ "18" ] }, - "execution_count": 23, + "execution_count": 24, "metadata": {}, "output_type": "execute_result" } @@ -808,7 +845,7 @@ }, { "cell_type": "code", - "execution_count": 24, + "execution_count": 25, "metadata": { "slideshow": { "slide_type": "-" @@ -821,7 +858,7 @@ "81" ] }, - "execution_count": 24, + "execution_count": 25, "metadata": {}, "output_type": "execute_result" } @@ -838,12 +875,12 @@ } }, "source": [ - "Some programmers also use \"style\" conventions. For example, we can play with the **whitespace**, which is an umbrella term that refers to any non-printable sign like spaces, tabs, or the like. However, parentheses convey a much clearer picture." + "Some programmers also use \"style\" conventions. For example, we might play with the **whitespace**, which is an umbrella term that refers to any non-printable sign like spaces, tabs, or the like. However, parentheses convey a much clearer picture." ] }, { "cell_type": "code", - "execution_count": 25, + "execution_count": 26, "metadata": { "slideshow": { "slide_type": "skip" @@ -856,7 +893,7 @@ "18" ] }, - "execution_count": 25, + "execution_count": 26, "metadata": {}, "output_type": "execute_result" } @@ -873,7 +910,7 @@ } }, "source": [ - "There are many more non-mathematical operators that are introduced throughout this book together with the concepts they implement. Some of these are already shown in the next section." + "There exist many non-mathematical operators that are introduced throughout this book, together with the concepts they implement. They often come in a form different from the unary and binary mentioned above." ] }, { @@ -897,14 +934,14 @@ "source": [ "Python is a so-called **object-oriented** language, which is a paradigm of organizing a program's memory.\n", "\n", - "An **object** can be viewed as a \"bag\" of $0$s and $1$s in a distinct memory location that not only portrayes the idea of a certain **value** but also has some associated rules as to how this value is treated and may be worked with.\n", + "An **object** may be viewed as a \"bag\" of $0$s and $1$s in a distinct memory location. The concrete $0$s and $1$s in a bag portray the idea of the object's **value**, and there exist different **types** of bags that each come with associated rules as to how the $0$s and $1$s are interpreted and may be worked with.\n", "\n", - "An object *always* has **three** main characteristics. Let's look at the following examples and work them out." + "So, an object *always* has **three** main characteristics. Let's look at the following examples and work them out." ] }, { "cell_type": "code", - "execution_count": 26, + "execution_count": 27, "metadata": { "slideshow": { "slide_type": "slide" @@ -941,7 +978,7 @@ }, { "cell_type": "code", - "execution_count": 27, + "execution_count": 28, "metadata": { "slideshow": { "slide_type": "slide" @@ -951,31 +988,7 @@ { "data": { "text/plain": [ - "140658972730512" - ] - }, - "execution_count": 27, - "metadata": {}, - "output_type": "execute_result" - } - ], - "source": [ - "id(a)" - ] - }, - { - "cell_type": "code", - "execution_count": 28, - "metadata": { - "slideshow": { - "slide_type": "-" - } - }, - "outputs": [ - { - "data": { - "text/plain": [ - "140658972907392" + "139829040614096" ] }, "execution_count": 28, @@ -984,7 +997,7 @@ } ], "source": [ - "id(b)" + "id(a)" ] }, { @@ -999,7 +1012,7 @@ { "data": { "text/plain": [ - "140658972586992" + "139829040789352" ] }, "execution_count": 29, @@ -1007,6 +1020,30 @@ "output_type": "execute_result" } ], + "source": [ + "id(b)" + ] + }, + { + "cell_type": "code", + "execution_count": 30, + "metadata": { + "slideshow": { + "slide_type": "-" + } + }, + "outputs": [ + { + "data": { + "text/plain": [ + "139829040474736" + ] + }, + "execution_count": 30, + "metadata": {}, + "output_type": "execute_result" + } + ], "source": [ "id(c)" ] @@ -1019,12 +1056,12 @@ } }, "source": [ - "These addresses are really not that meaningful for anything other than checking if two variables actually **point** at the same object. This may be helpful as, for example, different objects in memory may of course have the same value." + "These addresses are *not* meaningful for anything other than checking if two variables **point** at the same object. Let's create a second variable `d` and also set it to `789`." ] }, { "cell_type": "code", - "execution_count": 30, + "execution_count": 31, "metadata": { "slideshow": { "slide_type": "slide" @@ -1043,12 +1080,12 @@ } }, "source": [ - "`a` and `d` indeed have the same value as can be checked with the **equality operator** `==`. The resulting `True` (and the `False` below) is yet another data type, a so-called **boolean**. We will look into that closely in [Chapter 3](https://nbviewer.jupyter.org/github/webartifex/intro-to-python/blob/master/03_conditionals.ipynb)." + "`a` and `d` indeed have the same value as is checked with the **equality operator** `==`. The resulting `True` (and the `False` further below) is yet another data type, a so-called **boolean**. We look into them closely in [Chapter 3](https://nbviewer.jupyter.org/github/webartifex/intro-to-python/blob/master/03_conditionals.ipynb)." ] }, { "cell_type": "code", - "execution_count": 31, + "execution_count": 32, "metadata": { "slideshow": { "slide_type": "-" @@ -1061,7 +1098,7 @@ "True" ] }, - "execution_count": 31, + "execution_count": 32, "metadata": {}, "output_type": "execute_result" } @@ -1078,12 +1115,12 @@ } }, "source": [ - "On the contrary, `a` and `d` are different objects as can be seen with the **identity operator** `is`: they are stored at seperate addresses in the memory." + "On the contrary, `a` and `d` are different objects as the **identity operator** `is` shows: they are stored at separate addresses in the memory." ] }, { "cell_type": "code", - "execution_count": 32, + "execution_count": 33, "metadata": { "slideshow": { "slide_type": "-" @@ -1096,7 +1133,7 @@ "False" ] }, - "execution_count": 32, + "execution_count": 33, "metadata": {}, "output_type": "execute_result" } @@ -1129,7 +1166,7 @@ }, { "cell_type": "code", - "execution_count": 33, + "execution_count": 34, "metadata": { "slideshow": { "slide_type": "slide" @@ -1142,7 +1179,7 @@ "int" ] }, - "execution_count": 33, + "execution_count": 34, "metadata": {}, "output_type": "execute_result" } @@ -1153,7 +1190,7 @@ }, { "cell_type": "code", - "execution_count": 34, + "execution_count": 35, "metadata": { "slideshow": { "slide_type": "-" @@ -1166,7 +1203,7 @@ "float" ] }, - "execution_count": 34, + "execution_count": 35, "metadata": {}, "output_type": "execute_result" } @@ -1185,14 +1222,12 @@ "source": [ "Different types imply different behaviors for the objects. The `b` object, for example, can be \"asked\" if it could also be interpreted as an `int` with the [is_integer()](https://docs.python.org/3/library/stdtypes.html#float.is_integer) \"functionality\" that comes with every `float` object.\n", "\n", - "Formally, we call such type-specific functionalities **methods** (to differentiate them from functions) and we will eventually fully introduce them when we talk about object-orientation in Chapter 10. For now, it suffices to know that we can access them using the **dot operator** `.`. Of course `b` could be converted into an `int`, which the boolean value `True` tells us.\n", - "\n", - "Also note how the `.` operator is neiter a unary nor a binary operator as specified above." + "Formally, we call such type-specific functionalities **methods** (to differentiate them from functions) and we formally introduce them in Chapter 10. For now, it suffices to know that we access them using the **dot operator** `.`. Of course, `b` could be converted into an `int`, which the boolean value `True` tells us." ] }, { "cell_type": "code", - "execution_count": 35, + "execution_count": 36, "metadata": { "slideshow": { "slide_type": "fragment" @@ -1205,7 +1240,7 @@ "True" ] }, - "execution_count": 35, + "execution_count": 36, "metadata": {}, "output_type": "execute_result" } @@ -1222,12 +1257,12 @@ } }, "source": [ - "For an `int` object this [is_integer()](https://docs.python.org/3/library/stdtypes.html#float.is_integer) check does not make sense as we know it is an `int` to begin with. This is why we see the `AttributeError` below as `a` does not even know what `is_integer()` means." + "For an `int` object, this [is_integer()](https://docs.python.org/3/library/stdtypes.html#float.is_integer) check does *not* make sense as we already know it is an `int`: We see the `AttributeError` below as `a` does not even know what `is_integer()` means." ] }, { "cell_type": "code", - "execution_count": 36, + "execution_count": 37, "metadata": { "slideshow": { "slide_type": "-" @@ -1241,7 +1276,7 @@ "traceback": [ "\u001b[0;31m---------------------------------------------------------------------------\u001b[0m", "\u001b[0;31mAttributeError\u001b[0m Traceback (most recent call last)", - "\u001b[0;32m\u001b[0m in \u001b[0;36m\u001b[0;34m\u001b[0m\n\u001b[0;32m----> 1\u001b[0;31m \u001b[0ma\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0mis_integer\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0m", + "\u001b[0;32m\u001b[0m in \u001b[0;36m\u001b[0;34m\u001b[0m\n\u001b[0;32m----> 1\u001b[0;31m \u001b[0ma\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0mis_integer\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0m", "\u001b[0;31mAttributeError\u001b[0m: 'int' object has no attribute 'is_integer'" ] } @@ -1258,12 +1293,12 @@ } }, "source": [ - "The `c` object is a so-called **string** type (i.e., `str`), which we can view as Python's way of representing \"text\". Strings also come with their own behaviors, for example, to convert a text to lower or upper case." + "The `c` object is a so-called **string** type (i.e., `str`), which is Python's way of representing \"text.\" Strings also come with peculiar behaviors, for example, to convert a text to lower or upper case." ] }, { "cell_type": "code", - "execution_count": 37, + "execution_count": 38, "metadata": { "slideshow": { "slide_type": "slide" @@ -1276,7 +1311,7 @@ "str" ] }, - "execution_count": 37, + "execution_count": 38, "metadata": {}, "output_type": "execute_result" } @@ -1287,7 +1322,7 @@ }, { "cell_type": "code", - "execution_count": 38, + "execution_count": 39, "metadata": { "slideshow": { "slide_type": "fragment" @@ -1300,7 +1335,7 @@ "'python rocks'" ] }, - "execution_count": 38, + "execution_count": 39, "metadata": {}, "output_type": "execute_result" } @@ -1311,7 +1346,7 @@ }, { "cell_type": "code", - "execution_count": 39, + "execution_count": 40, "metadata": { "slideshow": { "slide_type": "-" @@ -1324,7 +1359,7 @@ "'PYTHON ROCKS'" ] }, - "execution_count": 39, + "execution_count": 40, "metadata": {}, "output_type": "execute_result" } @@ -1335,7 +1370,7 @@ }, { "cell_type": "code", - "execution_count": 40, + "execution_count": 41, "metadata": { "slideshow": { "slide_type": "skip" @@ -1348,7 +1383,7 @@ "'Python Rocks'" ] }, - "execution_count": 40, + "execution_count": 41, "metadata": {}, "output_type": "execute_result" } @@ -1376,14 +1411,14 @@ } }, "source": [ - "Almost trivially, every object also has a value to which it \"evaluates\" when referenced. We think of the value as the **conceptual idea** of what the $0$s and $1$s in memory mean to us humans. A machine does not really see beyond the $0$s and $1$s. At least, not yet.\n", + "Almost trivially, every object also has a value to which it \"evaluates\" when referenced. We think of the value as the **conceptual idea** of what the $0$s and $1$s in memory mean to humans as machines cannot see beyond $0$s and $1$s.\n", "\n", - "For built-in data types, Python prints out the object's value in a so-called **[literal](https://docs.python.org/3/reference/lexical_analysis.html#literals)** notation. This basically means that we can just copy & paste the output back into a code cell to create a *new* object with the *same* value." + "For built-in data types, Python prints out the object's value as a so-called **[literal](https://docs.python.org/3/reference/lexical_analysis.html#literals)**: This means that we can copy and paste the output back into a code cell to create a *new* object with the *same* value." ] }, { "cell_type": "code", - "execution_count": 41, + "execution_count": 42, "metadata": { "slideshow": { "slide_type": "slide" @@ -1396,7 +1431,7 @@ "789" ] }, - "execution_count": 41, + "execution_count": 42, "metadata": {}, "output_type": "execute_result" } @@ -1407,7 +1442,7 @@ }, { "cell_type": "code", - "execution_count": 42, + "execution_count": 43, "metadata": { "slideshow": { "slide_type": "-" @@ -1420,7 +1455,7 @@ "42.0" ] }, - "execution_count": 42, + "execution_count": 43, "metadata": {}, "output_type": "execute_result" } @@ -1437,12 +1472,12 @@ } }, "source": [ - "In this book, we follow the convention of creating strings with **double quotes** `\"` instead of the **single quotes** `'` to which Python defaults in its literal output for `str` objects. Both types of quotes can be used interchangebly." + "In this book, we follow the convention of creating strings with **double quotes** `\"` instead of the **single quotes** `'` to which Python defaults in its literal output for `str` objects. Both types of quotes may be used interchangeably." ] }, { "cell_type": "code", - "execution_count": 43, + "execution_count": 44, "metadata": { "slideshow": { "slide_type": "-" @@ -1455,7 +1490,7 @@ "'Python rocks'" ] }, - "execution_count": 43, + "execution_count": 44, "metadata": {}, "output_type": "execute_result" } @@ -1509,12 +1544,12 @@ "source": [ "If we do not follow the rules, the code cannot be **parsed** correctly, i.e., the program does not even start to run but **raises** a **syntax error** indicated as `SyntaxError` in the output. Computers are very dumb in the sense that the slightest syntax error leads to the machine not understanding our code.\n", "\n", - "For example, if we wanted to write an accounting program that adds up currencies, we would have to model dollar prices as `float` objects as the dollar symbol cannot be read by Python." + "For example, if we wanted to write an accounting program that adds up currencies, we would have to model dollar prices as `float` objects as the dollar symbol cannot be understood by Python." ] }, { "cell_type": "code", - "execution_count": 44, + "execution_count": 45, "metadata": { "slideshow": { "slide_type": "slide" @@ -1523,10 +1558,10 @@ "outputs": [ { "ename": "SyntaxError", - "evalue": "invalid syntax (, line 1)", + "evalue": "invalid syntax (, line 1)", "output_type": "error", "traceback": [ - "\u001b[0;36m File \u001b[0;32m\"\"\u001b[0;36m, line \u001b[0;32m1\u001b[0m\n\u001b[0;31m 3.99 $ + 10.40 $\u001b[0m\n\u001b[0m ^\u001b[0m\n\u001b[0;31mSyntaxError\u001b[0m\u001b[0;31m:\u001b[0m invalid syntax\n" + "\u001b[0;36m File \u001b[0;32m\"\"\u001b[0;36m, line \u001b[0;32m1\u001b[0m\n\u001b[0;31m 3.99 $ + 10.40 $\u001b[0m\n\u001b[0m ^\u001b[0m\n\u001b[0;31mSyntaxError\u001b[0m\u001b[0;31m:\u001b[0m invalid syntax\n" ] } ], @@ -1542,12 +1577,12 @@ } }, "source": [ - "Python requires certain symbols at certain places (e.g., a `:` is missing here) ..." + "Python requires certain symbols at certain places (e.g., a `:` is missing here)." ] }, { "cell_type": "code", - "execution_count": 45, + "execution_count": 46, "metadata": { "slideshow": { "slide_type": "slide" @@ -1556,10 +1591,10 @@ "outputs": [ { "ename": "SyntaxError", - "evalue": "invalid syntax (, line 1)", + "evalue": "invalid syntax (, line 1)", "output_type": "error", "traceback": [ - "\u001b[0;36m File \u001b[0;32m\"\"\u001b[0;36m, line \u001b[0;32m1\u001b[0m\n\u001b[0;31m for number in numbers\u001b[0m\n\u001b[0m ^\u001b[0m\n\u001b[0;31mSyntaxError\u001b[0m\u001b[0;31m:\u001b[0m invalid syntax\n" + "\u001b[0;36m File \u001b[0;32m\"\"\u001b[0;36m, line \u001b[0;32m1\u001b[0m\n\u001b[0;31m for number in numbers\u001b[0m\n\u001b[0m ^\u001b[0m\n\u001b[0;31mSyntaxError\u001b[0m\u001b[0;31m:\u001b[0m invalid syntax\n" ] } ], @@ -1576,12 +1611,12 @@ } }, "source": [ - "... and relies on whitespace (i.e., indentation) unlike many other programming languages. An `IndentationError` is just a special type of a `SyntaxError`." + "Furthermore, it relies on whitespace (i.e., indentation), unlike many other programming languages. The `IndentationError` below is just a particular type of a `SyntaxError`." ] }, { "cell_type": "code", - "execution_count": 46, + "execution_count": 47, "metadata": { "slideshow": { "slide_type": "slide" @@ -1590,10 +1625,10 @@ "outputs": [ { "ename": "IndentationError", - "evalue": "expected an indented block (, line 2)", + "evalue": "expected an indented block (, line 2)", "output_type": "error", "traceback": [ - "\u001b[0;36m File \u001b[0;32m\"\"\u001b[0;36m, line \u001b[0;32m2\u001b[0m\n\u001b[0;31m print(number)\u001b[0m\n\u001b[0m ^\u001b[0m\n\u001b[0;31mIndentationError\u001b[0m\u001b[0;31m:\u001b[0m expected an indented block\n" + "\u001b[0;36m File \u001b[0;32m\"\"\u001b[0;36m, line \u001b[0;32m2\u001b[0m\n\u001b[0;31m print(number)\u001b[0m\n\u001b[0m ^\u001b[0m\n\u001b[0;31mIndentationError\u001b[0m\u001b[0;31m:\u001b[0m expected an indented block\n" ] } ], @@ -1621,7 +1656,7 @@ } }, "source": [ - "Syntax errors as above are easy to find as the code will not even run to begin with.\n", + "Syntax errors are easy to find as the code does not even run in the first place.\n", "\n", "However, there are also so-called **runtime errors**, often called **exceptions**, that occur whenever otherwise (i.e., syntactically) correct code does not run because of invalid input.\n", "\n", @@ -1630,7 +1665,7 @@ }, { "cell_type": "code", - "execution_count": 47, + "execution_count": 48, "metadata": { "slideshow": { "slide_type": "slide" @@ -1644,7 +1679,7 @@ "traceback": [ "\u001b[0;31m---------------------------------------------------------------------------\u001b[0m", "\u001b[0;31mZeroDivisionError\u001b[0m Traceback (most recent call last)", - "\u001b[0;32m\u001b[0m in \u001b[0;36m\u001b[0;34m\u001b[0m\n\u001b[0;32m----> 1\u001b[0;31m \u001b[0;36m1\u001b[0m \u001b[0;34m/\u001b[0m \u001b[0;36m0\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0m", + "\u001b[0;32m\u001b[0m in \u001b[0;36m\u001b[0;34m\u001b[0m\n\u001b[0;32m----> 1\u001b[0;31m \u001b[0;36m1\u001b[0m \u001b[0;34m/\u001b[0m \u001b[0;36m0\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0m", "\u001b[0;31mZeroDivisionError\u001b[0m: division by zero" ] } @@ -1672,14 +1707,14 @@ } }, "source": [ - "So-called **semantic errors**, on the contrary, can be very hard to spot as they do *not* crash the program. The only way to find such errors is to run a program with test input for which we know the answer already and can thus check the output. However, testing software is a whole discipline on its own and often very hard to do in practice.\n", + "So-called **semantic errors**, on the contrary, are hard to spot as they do *not* crash the program. The only way to find such errors is to run a program with test input for which we can predict the output. However, testing software is a whole discipline on its own and often very hard to do in practice.\n", "\n", - "The cell below copies our introductory example from above with a \"tiny\" error. How fast could you have spotted it without the comment?" + "The cell below copies our first example from above with a \"tiny\" error. How fast could you have spotted it without the comment?" ] }, { "cell_type": "code", - "execution_count": 48, + "execution_count": 49, "metadata": { "code_folding": [], "slideshow": { @@ -1701,7 +1736,7 @@ }, { "cell_type": "code", - "execution_count": 49, + "execution_count": 50, "metadata": { "slideshow": { "slide_type": "-" @@ -1714,7 +1749,7 @@ "3.0" ] }, - "execution_count": 49, + "execution_count": 50, "metadata": {}, "output_type": "execute_result" } @@ -1731,7 +1766,7 @@ } }, "source": [ - "Finding errors in a systematic way is called **debugging**. For the history of the term, see this [article](https://en.wikipedia.org/wiki/Debugging)." + "Systematically finding errors is called **debugging**. For the history of the term, see this [article](https://en.wikipedia.org/wiki/Debugging)." ] }, { @@ -1753,14 +1788,14 @@ } }, "source": [ - "Adhering to just syntax rules is therefore *never* enough. Over time, **best practices** and common **style guides** were created to make it less likely for a developer to mess up a program and also to allow \"onboarding\" him as a contributor to an established code base, often called **legacy code**, faster. These rules are not enforced by Python itself: Badly styled and un-readable code will still run. At the very least, Python programs should be styled according to [PEP 8](https://www.python.org/dev/peps/pep-0008/) and documented \"inline\" (i.e., in the code itself) according to [PEP 257](https://www.python.org/dev/peps/pep-0257/).\n", + "Thus, adhering to just syntax rules is *never* enough. Over time, **best practices** and **style guides** were created to make it less likely for a developer to mess up a program and also to allow \"onboarding\" him as a contributor to an established code base, often called **legacy code**, faster. These rules are not enforced by Python itself: Badly styled code still runs. At the very least, Python programs should be styled according to [PEP 8](https://www.python.org/dev/peps/pep-0008/) and documented \"inline\" (i.e., in the code itself) according to [PEP 257](https://www.python.org/dev/peps/pep-0257/).\n", "\n", - "An easier to read version of PEP 8 can be found [here](https://pep8.org/). The video below features a well known \"[Pythonista](https://en.wiktionary.org/wiki/Pythonista)\" talking about the importance of code style." + "An easier to read version of PEP 8 is [here](https://pep8.org/). The video below features a well known \"[Pythonista](https://en.wiktionary.org/wiki/Pythonista)\" talking about the importance of code style." ] }, { "cell_type": "code", - "execution_count": 50, + "execution_count": 51, "metadata": { "slideshow": { "slide_type": "skip" @@ -1782,10 +1817,10 @@ " " ], "text/plain": [ - "" + "" ] }, - "execution_count": 50, + "execution_count": 51, "metadata": {}, "output_type": "execute_result" } @@ -1803,12 +1838,12 @@ } }, "source": [ - "For example, while the above code to calculate the average of the even numbers from 1 through 10 is correct, a Pythonista would re-write it in a more \"Pythonic\" way and use the [sum()](https://docs.python.org/3/library/functions.html#sum) and [len()](https://docs.python.org/3/library/functions.html#len) (= \"length\") built-in functions (cf., [Chapter 2](https://nbviewer.jupyter.org/github/webartifex/intro-to-python/blob/master/02_functions.ipynb)) as well as a so-called **list comprehension** (cf., Chapter 7). Pythonic code runs faster in many cases and is less error prone." + "For example, while the above code to calculate the average of the even numbers from 1 through 10 is correct, a Pythonista would re-write it in a more \"Pythonic\" way and use the [sum()](https://docs.python.org/3/library/functions.html#sum) and [len()](https://docs.python.org/3/library/functions.html#len) (= \"length\") built-in functions (cf., [Chapter 2](https://nbviewer.jupyter.org/github/webartifex/intro-to-python/blob/master/02_functions.ipynb)) as well as a so-called **list comprehension** (cf., Chapter 7). Pythonic code runs faster in many cases and is less error-prone." ] }, { "cell_type": "code", - "execution_count": 51, + "execution_count": 52, "metadata": { "slideshow": { "slide_type": "slide" @@ -1821,7 +1856,7 @@ }, { "cell_type": "code", - "execution_count": 52, + "execution_count": 53, "metadata": { "slideshow": { "slide_type": "-" @@ -1834,7 +1869,7 @@ }, { "cell_type": "code", - "execution_count": 53, + "execution_count": 54, "metadata": { "slideshow": { "slide_type": "-" @@ -1847,7 +1882,7 @@ "[2, 4, 6, 8, 10]" ] }, - "execution_count": 53, + "execution_count": 54, "metadata": {}, "output_type": "execute_result" } @@ -1858,7 +1893,7 @@ }, { "cell_type": "code", - "execution_count": 54, + "execution_count": 55, "metadata": { "slideshow": { "slide_type": "-" @@ -1871,7 +1906,7 @@ }, { "cell_type": "code", - "execution_count": 55, + "execution_count": 56, "metadata": { "slideshow": { "slide_type": "-" @@ -1884,7 +1919,7 @@ "6.0" ] }, - "execution_count": 55, + "execution_count": 56, "metadata": {}, "output_type": "execute_result" } @@ -1901,12 +1936,12 @@ } }, "source": [ - "To get a rough overview of the mindsets of a typical Python programmer, check these rules by an early Python core developer that are deemed so important that they are actually included in every Python installation." + "To get a rough overview of the mindsets of a typical Python programmer, see these rules by an early Python core developer deemed so important that they are included in every Python installation." ] }, { "cell_type": "code", - "execution_count": 56, + "execution_count": 57, "metadata": { "slideshow": { "slide_type": "slide" @@ -1977,7 +2012,7 @@ "source": [ "Observe that you can run the code cells in a Jupyter notebook in any arbitrary order.\n", "\n", - "That means, for example, that a variable defined towards the bottom could accidently be referenced at the top of the notebook. This happens easily if we iteratively built a program and go back and forth between cells.\n", + "That means, for example, that a variable defined towards the bottom could accidentally be referenced at the top of the notebook. This happens quickly when we iteratively built a program and go back and forth between cells.\n", "\n", "As a good practice, it is recommended to click on \"Kernel\" > \"Restart & Run All\" in the navigation bar once a notebook is finished. That restarts the Python process forgetting any **state** (i.e., all variables) and ensures that the notebook runs top to bottom without any errors the next time it is opened." ] @@ -2001,11 +2036,11 @@ } }, "source": [ - "While this book is built with Jupyter notebooks, it is important to understand that \"real\" programs are almost always just \"linear\" (= top to bottom) sequences of instructions but instead can take many different **flows of execution**.\n", + "While this book is built with Jupyter notebooks, it is crucial to understand that \"real\" programs are almost always just \"linear\" (= top to bottom) sequences of instructions but instead may take many different **flows of execution**.\n", "\n", - "At the same time, for a beginner's course it is often easier to just code in a linear fashion.\n", + "At the same time, for a beginner's course, it is often easier to code linearly.\n", "\n", - "In real data science projects one would probably employ a mixed approach and put re-usable code into so-called Python modules (i.e., *.py* files; cf., [Chapter 2](https://nbviewer.jupyter.org/github/webartifex/intro-to-python/blob/master/02_functions.ipynb)) and then use Jupyter notebooks to built up a linear report or story line for a business argument to be made." + "In real data science projects, one would probably employ a mixed approach and put re-usable code into so-called Python modules (i.e., *.py* files; cf., [Chapter 2](https://nbviewer.jupyter.org/github/webartifex/intro-to-python/blob/master/02_functions.ipynb)) and then use Jupyter notebooks to build up a linear report or storyline for a business argument to be made." ] }, { @@ -2029,12 +2064,12 @@ "source": [ "**Variables** are created with the **[assignment statement](https://docs.python.org/3/reference/simple_stmts.html#assignment-statements)** `=`, which is *not* an operator, mainly because of its side effect of making a **[name](https://docs.python.org/3/reference/lexical_analysis.html#identifiers)** point to an object in memory.\n", "\n", - "We will read the terms **variable**, **name**, and **identifier** used interchangebly in many Python related texts. In this book, we adopt the following convention: First, we treat *name* and *identifier* as perfect synonyms but only use the term *name* in the text for clarity. Second, whereas *name* only refers to a string of letters, numbers, and some other symbols, a *variable* refers to the combination of a *name* and a *pointer* to some object in memory." + "We read the terms **variable**, **name**, and **identifier** used interchangebly in many Python-related texts. In this book, we adopt the following convention: First, we treat *name* and *identifier* as perfect synonyms but only use the term *name* in the text for clarity. Second, whereas *name* only refers to a string of letters, numbers, and some other symbols, a *variable* refers to the combination of a *name* and a *pointer* to some object in memory." ] }, { "cell_type": "code", - "execution_count": 57, + "execution_count": 58, "metadata": { "slideshow": { "slide_type": "slide" @@ -2054,12 +2089,12 @@ } }, "source": [ - "When referenced, a variable evaluates to the value of the object it points to. Colloquially, we could say that `a` evaluates to `20.0` here but this would not be a full description of what is really going on in memory. We will see some more colloquial jargons in this section but should always relate this to what Python actually does in memory." + "When referenced, a variable evaluates to the value of the object it points to. Colloquially, we could say that `a` evaluates to `20.0`, but this would not be an accurate description of what is going on in memory. We see some more colloquialisms in this section but should always relate this to what Python actually does in memory." ] }, { "cell_type": "code", - "execution_count": 58, + "execution_count": 59, "metadata": { "slideshow": { "slide_type": "fragment" @@ -2072,7 +2107,7 @@ "20.0" ] }, - "execution_count": 58, + "execution_count": 59, "metadata": {}, "output_type": "execute_result" } @@ -2089,12 +2124,12 @@ } }, "source": [ - "A variable can be **re-assigned** as often as we wish. Thereby, we could also assign an object of a *different* type. Because this is allowed, Python is said to be a **dynamically typed** language. On the contrary, a **statically typed** language like C also allows re-assignment but only with objects of the *same* type. This subtle distinction is one reason why Python is slower at execution than C: As it runs a program, it needs to figure out an object's type each time it is referenced. But as mentioned before, this can be mitigated with third-party libraries." + "A variable may be **re-assigned** as often as we wish. Thereby, we could also assign an object of a *different* type. Because this is allowed, Python is said to be a **dynamically typed** language. On the contrary, a **statically typed** language like C also allows re-assignment but only with objects of the *same* type. This subtle distinction is one reason why Python is slower at execution than C: As it runs a program, it needs to figure out an object's type each time it is referenced. But as mentioned before, this is mitigated with third-party libraries." ] }, { "cell_type": "code", - "execution_count": 59, + "execution_count": 60, "metadata": { "slideshow": { "slide_type": "fragment" @@ -2107,7 +2142,7 @@ }, { "cell_type": "code", - "execution_count": 60, + "execution_count": 61, "metadata": { "slideshow": { "slide_type": "-" @@ -2120,7 +2155,7 @@ "20" ] }, - "execution_count": 60, + "execution_count": 61, "metadata": {}, "output_type": "execute_result" } @@ -2137,12 +2172,12 @@ } }, "source": [ - "If we want to re-assign a variable while referencing its \"old\" (i.e., current) object, we can also **update** it using a so-called **[augmented assignment statement](https://docs.python.org/3/reference/simple_stmts.html#augmented-assignment-statements)** (*not* operator), originally introduced with [PEP 203](https://www.python.org/dev/peps/pep-0203/). This implicitly inserts the currently mapped object as the first operand on the right-hand side." + "If we want to re-assign a variable while referencing its \"old\" (i.e., current) object, we may also **update** it using a so-called **[augmented assignment statement](https://docs.python.org/3/reference/simple_stmts.html#augmented-assignment-statements)** (*not* operator), introduced with [PEP 203](https://www.python.org/dev/peps/pep-0203/): The currently mapped object is implicitly inserted as the first operand on the right-hand side." ] }, { "cell_type": "code", - "execution_count": 61, + "execution_count": 62, "metadata": { "slideshow": { "slide_type": "slide" @@ -2155,7 +2190,7 @@ }, { "cell_type": "code", - "execution_count": 62, + "execution_count": 63, "metadata": { "slideshow": { "slide_type": "-" @@ -2168,7 +2203,7 @@ }, { "cell_type": "code", - "execution_count": 63, + "execution_count": 64, "metadata": { "slideshow": { "slide_type": "-" @@ -2181,7 +2216,7 @@ }, { "cell_type": "code", - "execution_count": 64, + "execution_count": 65, "metadata": { "slideshow": { "slide_type": "-" @@ -2194,7 +2229,7 @@ "42" ] }, - "execution_count": 64, + "execution_count": 65, "metadata": {}, "output_type": "execute_result" } @@ -2211,12 +2246,12 @@ } }, "source": [ - "Variables can be **[de-referenced](https://docs.python.org/3/reference/simple_stmts.html#the-del-statement)** (i.e., \"deleted\") with the `del` statement. This does *not* delete the object to which a variable points to. It merely removes the variable's name from the \"global list of all names\"." + "Variables are **[de-referenced](https://docs.python.org/3/reference/simple_stmts.html#the-del-statement)** (i.e., \"deleted\") with the `del` statement. It does *not* delete the object to which a variable points to but merely removes the variable's name from the \"global list of all names.\"" ] }, { "cell_type": "code", - "execution_count": 65, + "execution_count": 66, "metadata": { "slideshow": { "slide_type": "slide" @@ -2229,7 +2264,7 @@ "789" ] }, - "execution_count": 65, + "execution_count": 66, "metadata": {}, "output_type": "execute_result" } @@ -2240,7 +2275,7 @@ }, { "cell_type": "code", - "execution_count": 66, + "execution_count": 67, "metadata": { "slideshow": { "slide_type": "-" @@ -2259,12 +2294,12 @@ } }, "source": [ - "If we refer to an unknown name, a *runtime* error occurs, namely a `NameError`. The `Name` in `NameError` gives a hint as to why we prefer the term *name* over *identifier*: Python just uses it more often in its error messages." + "If we refer to an unknown name, a *runtime* error occurs, namely a `NameError`. The `Name` in `NameError` gives a hint as to why we prefer the term *name* over *identifier*: Python uses it more often in its error messages." ] }, { "cell_type": "code", - "execution_count": 67, + "execution_count": 68, "metadata": { "slideshow": { "slide_type": "-" @@ -2278,7 +2313,7 @@ "traceback": [ "\u001b[0;31m---------------------------------------------------------------------------\u001b[0m", "\u001b[0;31mNameError\u001b[0m Traceback (most recent call last)", - "\u001b[0;32m\u001b[0m in \u001b[0;36m\u001b[0;34m\u001b[0m\n\u001b[0;32m----> 1\u001b[0;31m \u001b[0mb\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0m", + "\u001b[0;32m\u001b[0m in \u001b[0;36m\u001b[0;34m\u001b[0m\n\u001b[0;32m----> 1\u001b[0;31m \u001b[0mb\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0m", "\u001b[0;31mNameError\u001b[0m: name 'b' is not defined" ] } @@ -2295,12 +2330,12 @@ } }, "source": [ - "Some variables magically exist when we start a Python process or are added by Jupyter. We can safely ignore the former until Chapter 10 and the latter for good." + "Some variables magically exist when we start a Python process or are added by Jupyter. We may safely ignore the former until Chapter 10 and the latter for good." ] }, { "cell_type": "code", - "execution_count": 68, + "execution_count": 69, "metadata": { "slideshow": { "slide_type": "skip" @@ -2313,7 +2348,7 @@ "'__main__'" ] }, - "execution_count": 68, + "execution_count": 69, "metadata": {}, "output_type": "execute_result" } @@ -2335,7 +2370,7 @@ }, { "cell_type": "code", - "execution_count": 69, + "execution_count": 70, "metadata": { "slideshow": { "slide_type": "slide" @@ -2366,15 +2401,15 @@ " '_23',\n", " '_24',\n", " '_25',\n", - " '_27',\n", + " '_26',\n", " '_28',\n", " '_29',\n", - " '_31',\n", + " '_30',\n", " '_32',\n", " '_33',\n", " '_34',\n", " '_35',\n", - " '_37',\n", + " '_36',\n", " '_38',\n", " '_39',\n", " '_4',\n", @@ -2382,16 +2417,17 @@ " '_41',\n", " '_42',\n", " '_43',\n", - " '_49',\n", + " '_44',\n", " '_5',\n", " '_50',\n", - " '_53',\n", - " '_55',\n", - " '_58',\n", - " '_60',\n", - " '_64',\n", + " '_51',\n", + " '_54',\n", + " '_56',\n", + " '_59',\n", + " '_61',\n", " '_65',\n", - " '_68',\n", + " '_66',\n", + " '_69',\n", " '_9',\n", " '__',\n", " '___',\n", @@ -2471,6 +2507,7 @@ " '_i68',\n", " '_i69',\n", " '_i7',\n", + " '_i70',\n", " '_i8',\n", " '_i9',\n", " '_ih',\n", @@ -2492,7 +2529,7 @@ " 'total']" ] }, - "execution_count": 69, + "execution_count": 70, "metadata": {}, "output_type": "execute_result" } @@ -2520,14 +2557,14 @@ } }, "source": [ - "It is important to understand that *several* variables can point to the *same* object in memory. This can be counter-intuitive in the beginning and lead to many hard to track down bugs.\n", + "It is *crucial* to understand that *several* variables may point to the *same* object in memory. Not having this in mind may lead to many hard to track down bugs.\n", "\n", - "This makes `b` point to whatever object `a` is pointing to." + "Make `b` point to whatever object `a` is pointing to." ] }, { "cell_type": "code", - "execution_count": 70, + "execution_count": 71, "metadata": { "slideshow": { "slide_type": "slide" @@ -2538,30 +2575,6 @@ "b = a" ] }, - { - "cell_type": "code", - "execution_count": 71, - "metadata": { - "slideshow": { - "slide_type": "-" - } - }, - "outputs": [ - { - "data": { - "text/plain": [ - "42" - ] - }, - "execution_count": 71, - "metadata": {}, - "output_type": "execute_result" - } - ], - "source": [ - "a" - ] - }, { "cell_type": "code", "execution_count": 72, @@ -2582,6 +2595,30 @@ "output_type": "execute_result" } ], + "source": [ + "a" + ] + }, + { + "cell_type": "code", + "execution_count": 73, + "metadata": { + "slideshow": { + "slide_type": "-" + } + }, + "outputs": [ + { + "data": { + "text/plain": [ + "42" + ] + }, + "execution_count": 73, + "metadata": {}, + "output_type": "execute_result" + } + ], "source": [ "b" ] @@ -2594,14 +2631,14 @@ } }, "source": [ - "For \"simple\" types like `int` or `float` this will never cause troubles.\n", + "For \"simple\" types like `int` or `float` this never causes troubles.\n", "\n", - "Let's \"change the value\" of `a`. Really, let's create a *new* `123` object and make `a` point to it." + "Let's \"change the value\" of `a`. To be precise, let's create a *new* `123` object and make `a` point to it." ] }, { "cell_type": "code", - "execution_count": 73, + "execution_count": 74, "metadata": { "slideshow": { "slide_type": "fragment" @@ -2614,7 +2651,7 @@ }, { "cell_type": "code", - "execution_count": 74, + "execution_count": 75, "metadata": { "slideshow": { "slide_type": "-" @@ -2627,7 +2664,7 @@ "123" ] }, - "execution_count": 74, + "execution_count": 75, "metadata": {}, "output_type": "execute_result" } @@ -2644,12 +2681,12 @@ } }, "source": [ - "`b` \"is still the same\" as before. Really, `b` still points to the same object as before." + "`b` \"is still the same\" as before. To be precise, `b` still points to the *same object* as before." ] }, { "cell_type": "code", - "execution_count": 75, + "execution_count": 76, "metadata": { "slideshow": { "slide_type": "-" @@ -2662,7 +2699,7 @@ "42" ] }, - "execution_count": 75, + "execution_count": 76, "metadata": {}, "output_type": "execute_result" } @@ -2679,12 +2716,12 @@ } }, "source": [ - "However, if a variable points to an object of a more \"complex\" type (e.g., `list`), \"weird\" things can happen." + "However, if a variable points to an object of a more \"complex\" type (e.g., `list`), \"weird\" things happen." ] }, { "cell_type": "code", - "execution_count": 76, + "execution_count": 77, "metadata": { "slideshow": { "slide_type": "slide" @@ -2697,7 +2734,7 @@ }, { "cell_type": "code", - "execution_count": 77, + "execution_count": 78, "metadata": { "slideshow": { "slide_type": "skip" @@ -2710,7 +2747,7 @@ "list" ] }, - "execution_count": 77, + "execution_count": 78, "metadata": {}, "output_type": "execute_result" } @@ -2721,7 +2758,7 @@ }, { "cell_type": "code", - "execution_count": 78, + "execution_count": 79, "metadata": { "slideshow": { "slide_type": "-" @@ -2742,14 +2779,14 @@ "source": [ "Let's change the first element of `x`.\n", "\n", - "Chapter 7 discusses lists in more depth. For now, let's just view a `list` object as some sort of **container** that holds an arbitrary number of pointers to other objects and treat the brackets `[]` attached to it as just another operator, called the **indexing operator**. `x[0]` instructs Python to first follow the pointer from the global list of all names to the `x` object. Then, it follows the first pointer it finds there to the `1` object. The indexing operator must be an operator as we merely read the first element and do not change anything in memory.\n", + "Chapter 7 discusses lists in more depth. For now, let's view a `list` object as some sort of **container** that holds an arbitrary number of pointers to other objects and treat the brackets `[]` attached to it as just another operator, called the **indexing operator**. `x[0]` instructs Python to first follow the pointer from the global list of all names to the `x` object. Then, it follows the first pointer it finds there to the `1` object. The indexing operator must be an operator as we merely read the first element and do not change anything in memory.\n", "\n", "Note how Python **begins counting at 0**. This is not the case for many other languages, for example, [MATLAB](https://en.wikipedia.org/wiki/MATLAB), [R](https://en.wikipedia.org/wiki/R_%28programming_language%29), or [Stata](https://en.wikipedia.org/wiki/Stata). To understand why this makes sense, see this short [note](https://www.cs.utexas.edu/users/EWD/transcriptions/EWD08xx/EWD831.html) by one of the all-time greats in computer science, the late [Edsger Dijkstra](https://en.wikipedia.org/wiki/Edsger_W._Dijkstra)." ] }, { "cell_type": "code", - "execution_count": 79, + "execution_count": 80, "metadata": { "slideshow": { "slide_type": "fragment" @@ -2762,7 +2799,7 @@ "1" ] }, - "execution_count": 79, + "execution_count": 80, "metadata": {}, "output_type": "execute_result" } @@ -2779,12 +2816,12 @@ } }, "source": [ - "To change the first entry in the list, we use the assignment statement `=` again. Here, this does actually *not* create a *new* variable (or overwrite an existing one) but only changes the object to which the first pointer in the `x` list points to. As we only change parts of the `x` object, we say that we **mutate** (i.e., \"change\") its **state**. To use the bag analogy from above, we keep the same bag but \"flip\" some of the $0$s into $1$s and some of the $1$s into $0$s." + "To change the first entry in the list, we use the assignment statement `=` again. Here, this does *not* create a *new* variable (or overwrite an existing one) but only changes the object to which the first pointer in the `x` list points to. As we only change parts of the `x` object, we say that we **mutate** (i.e., \"change\") its **state**. To use the bag analogy from above, we keep the same bag but \"flip\" some of the $0$s into $1$s and some of the $1$s into $0$s." ] }, { "cell_type": "code", - "execution_count": 80, + "execution_count": 81, "metadata": { "slideshow": { "slide_type": "slide" @@ -2797,7 +2834,7 @@ }, { "cell_type": "code", - "execution_count": 81, + "execution_count": 82, "metadata": { "slideshow": { "slide_type": "-" @@ -2810,7 +2847,7 @@ "[99, 2, 3]" ] }, - "execution_count": 81, + "execution_count": 82, "metadata": {}, "output_type": "execute_result" } @@ -2827,12 +2864,12 @@ } }, "source": [ - "The changes made to the object `x` is pointing to can also be seen through the `y` variable!" + "The changes made to the object `x` is pointing to are also seen through the `y` variable!" ] }, { "cell_type": "code", - "execution_count": 82, + "execution_count": 83, "metadata": { "slideshow": { "slide_type": "fragment" @@ -2845,7 +2882,7 @@ "[99, 2, 3]" ] }, - "execution_count": 82, + "execution_count": 83, "metadata": {}, "output_type": "execute_result" } @@ -2864,11 +2901,11 @@ "source": [ "The illustrated difference in behavior has to do with the fact that integers and floats are **immutable** types while lists are **mutable**.\n", "\n", - "In the first case, an object cannot be changed \"in place\" once it is created in memory. When we assigned `123` to the already existing `a`, we actually did not change the $0$s and $1$s in the object `a` pointed to before the assignment but created a new integer object and made `a` point to it while the `b` variable is *not* affected.\n", + "In the first case, an object cannot be changed \"in place\" once it is created in memory. When we assigned `123` to the already existing `a`, we did not change the $0$s and $1$s in the object `a` pointed to before the assignment but created a new integer object and made `a` point to it while the `b` variable is *not* affected.\n", "\n", "In the second case, `x[0] = 99` creates a *new* integer object `99` and merely changes the first pointer in the `x` list.\n", "\n", - "In general, the assignment statement creates (or overwrites) a variable and makes it point to whatever object is on the right-hand side *only if* the left-hand side is a *pure* name (i.e., it contains no operators like the indexing operator in the example). Otherwise, it mutates some already existing object. And we always have to expect that the latter might have more than one variable pointing at it.\n", + "In general, the assignment statement creates a new name and makes it point to whatever object is on the right-hand side *iff* the left-hand side is a *pure* name (i.e., it contains no operators like the indexing operator in the example). Otherwise, it *mutates* an already existing object. And, we always must expect that the latter might have more than one variable pointing to it.\n", "\n", "Visualizing what is going on in the memory with a tool like [PythonTutor](http://pythontutor.com/visualize.html#code=x%20%3D%20%5B1,%202,%203%5D%0Ay%20%3D%20x%0Ax%5B0%5D%20%3D%2099%0Aprint%28y%5B0%5D%29&cumulative=false&curInstr=0&heapPrimitives=nevernest&mode=display&origin=opt-frontend.js&py=3&rawInputLstJSON=%5B%5D&textReferences=false) might be helpful for a beginner." ] @@ -2905,9 +2942,9 @@ } }, "source": [ - "Variable names may contain upper and lower case letters, numbers, and underscores (\"\\_\") and be as long as we want them to be. However, they must not begin with a number. Also, they must not be any of Python's built-in **[keywords](https://docs.python.org/3/reference/lexical_analysis.html#keywords)**.\n", + "Variable names may contain upper and lower case letters, numbers, and underscores (i.e., `_`) and be as long as we want them to be. However, they must not begin with a number. Also, they must not be any of Python's built-in **[keywords](https://docs.python.org/3/reference/lexical_analysis.html#keywords)**.\n", "\n", - "Variable names are usually chosen such that they do not need any more documentation and are self-explanatory. A very common convention is to use so-called **[snake\\_case](https://en.wikipedia.org/wiki/Snake_case)**: Keep everything lowercase and use underscores to seperate words.\n", + "Variable names should be chosen such that they do not need any more documentation and are self-explanatory. A widespread convention is to use so-called **[snake\\_case](https://en.wikipedia.org/wiki/Snake_case)**: Keep everything lowercase and use underscores to separate words.\n", "\n", "See this [link](https://en.wikipedia.org/wiki/Naming_convention_%28programming%29#Python_and_Ruby) for a comparison of different naming conventions." ] @@ -2925,7 +2962,7 @@ }, { "cell_type": "code", - "execution_count": 83, + "execution_count": 84, "metadata": { "slideshow": { "slide_type": "slide" @@ -2938,7 +2975,7 @@ }, { "cell_type": "code", - "execution_count": 84, + "execution_count": 85, "metadata": { "slideshow": { "slide_type": "-" @@ -2951,7 +2988,7 @@ }, { "cell_type": "code", - "execution_count": 85, + "execution_count": 86, "metadata": { "slideshow": { "slide_type": "-" @@ -2964,7 +3001,7 @@ }, { "cell_type": "code", - "execution_count": 86, + "execution_count": 87, "metadata": { "slideshow": { "slide_type": "-" @@ -2988,7 +3025,7 @@ }, { "cell_type": "code", - "execution_count": 87, + "execution_count": 88, "metadata": { "slideshow": { "slide_type": "skip" @@ -3001,7 +3038,7 @@ }, { "cell_type": "code", - "execution_count": 88, + "execution_count": 89, "metadata": { "slideshow": { "slide_type": "skip" @@ -3014,7 +3051,7 @@ }, { "cell_type": "code", - "execution_count": 89, + "execution_count": 90, "metadata": { "slideshow": { "slide_type": "skip" @@ -3027,7 +3064,7 @@ }, { "cell_type": "code", - "execution_count": 90, + "execution_count": 91, "metadata": { "slideshow": { "slide_type": "skip" @@ -3036,10 +3073,10 @@ "outputs": [ { "ename": "SyntaxError", - "evalue": "can't assign to operator (, line 1)", + "evalue": "can't assign to operator (, line 1)", "output_type": "error", "traceback": [ - "\u001b[0;36m File \u001b[0;32m\"\"\u001b[0;36m, line \u001b[0;32m1\u001b[0m\n\u001b[0;31m address@work = \"Burgplatz 2, Vallendar\"\u001b[0m\n\u001b[0m ^\u001b[0m\n\u001b[0;31mSyntaxError\u001b[0m\u001b[0;31m:\u001b[0m can't assign to operator\n" + "\u001b[0;36m File \u001b[0;32m\"\"\u001b[0;36m, line \u001b[0;32m1\u001b[0m\n\u001b[0;31m address@work = \"Burgplatz 2, Vallendar\"\u001b[0m\n\u001b[0m ^\u001b[0m\n\u001b[0;31mSyntaxError\u001b[0m\u001b[0;31m:\u001b[0m can't assign to operator\n" ] } ], @@ -3055,12 +3092,12 @@ } }, "source": [ - "If a variable name collides with a built-in name, just add a trailing underscore." + "If a variable name collides with a built-in name, we add a trailing underscore." ] }, { "cell_type": "code", - "execution_count": 91, + "execution_count": 92, "metadata": { "slideshow": { "slide_type": "skip" @@ -3079,12 +3116,12 @@ } }, "source": [ - "Variables with leading and trailing double underscores, referred to as **dunder** in Python jargon, are used for important built-in functionalities. Do *not* use this style for custom variables unless you know exactly what you are doing!" + "Variables with leading and trailing double underscores, referred to as **dunder** in Python jargon, are used for built-in functionalities. Do *not* use this style for custom variables unless you exactly know what you are doing!" ] }, { "cell_type": "code", - "execution_count": 92, + "execution_count": 93, "metadata": { "slideshow": { "slide_type": "skip" @@ -3097,7 +3134,7 @@ "'__main__'" ] }, - "execution_count": 92, + "execution_count": 93, "metadata": {}, "output_type": "execute_result" } @@ -3125,12 +3162,12 @@ } }, "source": [ - "This PyCon talk by [Ned Batchelder](https://nedbatchelder.com/), a well-known Pythonista and the organizer of the [Python User Group](https://www.meetup.com/bostonpython/) in Boston, summarizes all situations where some sort of variable assignment is done in Python. The content is intermediate and therefore it is ok if you do not understand everything at this point. However, the contents should be known by everyone claiming to be proficient in Python." + "This PyCon talk by [Ned Batchelder](https://nedbatchelder.com/), a well-known Pythonista and the organizer of the [Python User Group](https://www.meetup.com/bostonpython/) in Boston, summarizes all situations where some sort of assignment is done in Python. The content is intermediate, and, thus, it might be worthwhile to come back to this talk at a later point in time. However, the contents should be known by everyone claiming to be proficient in Python." ] }, { "cell_type": "code", - "execution_count": 93, + "execution_count": 94, "metadata": { "slideshow": { "slide_type": "skip" @@ -3152,10 +3189,10 @@ " " ], "text/plain": [ - "" + "" ] }, - "execution_count": 93, + "execution_count": 94, "metadata": {}, "output_type": "execute_result" } @@ -3186,16 +3223,16 @@ "source": [ "An **[expression](https://docs.python.org/3/reference/expressions.html)** is any syntactically correct **combination** of **variables** and **literals** with **operators**.\n", "\n", - "In simple words, anything that can be used on the right-hand side of an assignment statement without creating a `SyntaxError` is an expression.\n", + "In simple words, anything that may be used on the right-hand side of an assignment statement without creating a `SyntaxError` is an expression.\n", "\n", - "What we said about individual operators above, namely that they have *no* side effects, should have been put here to begin with. The examples in the section on operators above were actually all expressions!\n", + "What we said about individual operators above, namely that they have *no* side effects, should have been put here, to begin with. The code cells in the section on operators above are all expressions!\n", "\n", "The simplest possible expressions contain only one variable or literal." ] }, { "cell_type": "code", - "execution_count": 94, + "execution_count": 95, "metadata": { "slideshow": { "slide_type": "slide" @@ -3208,7 +3245,7 @@ "123" ] }, - "execution_count": 94, + "execution_count": 95, "metadata": {}, "output_type": "execute_result" } @@ -3219,7 +3256,7 @@ }, { "cell_type": "code", - "execution_count": 95, + "execution_count": 96, "metadata": {}, "outputs": [ { @@ -3228,7 +3265,7 @@ "42" ] }, - "execution_count": 95, + "execution_count": 96, "metadata": {}, "output_type": "execute_result" } @@ -3246,7 +3283,7 @@ }, { "cell_type": "code", - "execution_count": 96, + "execution_count": 97, "metadata": { "slideshow": { "slide_type": "-" @@ -3259,7 +3296,7 @@ "165" ] }, - "execution_count": 96, + "execution_count": 97, "metadata": {}, "output_type": "execute_result" } @@ -3276,12 +3313,12 @@ } }, "source": [ - "The definition of an expression is **recursive**. So, here the sub-expression `a + b` is combined with the literal `3` by the operator `**` to form the full expression." + "The definition of an expression is **recursive**. So, the sub-expression `a + b` is combined with the literal `3` by the operator `**` to form the full expression." ] }, { "cell_type": "code", - "execution_count": 97, + "execution_count": 98, "metadata": { "slideshow": { "slide_type": "-" @@ -3294,7 +3331,7 @@ "4492125" ] }, - "execution_count": 97, + "execution_count": 98, "metadata": {}, "output_type": "execute_result" } @@ -3316,7 +3353,7 @@ }, { "cell_type": "code", - "execution_count": 98, + "execution_count": 99, "metadata": { "slideshow": { "slide_type": "-" @@ -3329,7 +3366,7 @@ "3" ] }, - "execution_count": 98, + "execution_count": 99, "metadata": {}, "output_type": "execute_result" } @@ -3346,12 +3383,12 @@ } }, "source": [ - "When not used as a delimiter, parentheses also constitute an operator, namely the **call operator** `()`. We have seen this syntax above when we \"called\" (i.e., executed) built-in functions and methods." + "When not used as a delimiter, parentheses also constitute an operator, namely the **call operator** `()`. We saw this syntax above when we \"called\" (i.e., executed) built-in functions and methods." ] }, { "cell_type": "code", - "execution_count": 99, + "execution_count": 100, "metadata": { "slideshow": { "slide_type": "-" @@ -3364,7 +3401,7 @@ "104" ] }, - "execution_count": 99, + "execution_count": 100, "metadata": {}, "output_type": "execute_result" } @@ -3392,12 +3429,12 @@ } }, "source": [ - "Python **overloads** certain operators. For example you can not only add numbers but also strings. This is called **string concatenation**." + "Python **overloads** certain operators. For example, you may not only \"add\" numbers but also strings: This is called **string concatenation**." ] }, { "cell_type": "code", - "execution_count": 100, + "execution_count": 101, "metadata": { "slideshow": { "slide_type": "slide" @@ -3411,7 +3448,7 @@ }, { "cell_type": "code", - "execution_count": 101, + "execution_count": 102, "metadata": { "slideshow": { "slide_type": "fragment" @@ -3424,7 +3461,7 @@ "'Hi class'" ] }, - "execution_count": 101, + "execution_count": 102, "metadata": {}, "output_type": "execute_result" } @@ -3446,7 +3483,7 @@ }, { "cell_type": "code", - "execution_count": 102, + "execution_count": 103, "metadata": { "slideshow": { "slide_type": "fragment" @@ -3459,7 +3496,7 @@ "'Hi Hi Hi Hi Hi Hi Hi Hi Hi Hi Hi Hi Hi Hi Hi Hi Hi Hi Hi Hi Hi Hi Hi Hi Hi Hi Hi Hi Hi Hi Hi Hi Hi Hi Hi Hi Hi Hi Hi Hi Hi Hi '" ] }, - "execution_count": 102, + "execution_count": 103, "metadata": {}, "output_type": "execute_result" } @@ -3487,14 +3524,14 @@ } }, "source": [ - "A **[statement](https://docs.python.org/3/reference/simple_stmts.html)** is anything that changes the state of a program or has some other side effect. Statements do not just evaluate to a value like expressions; instead, they create or change values.\n", + "A **[statement](https://docs.python.org/3/reference/simple_stmts.html)** is anything that *changes* the *state of a program* or has another *side effect*. Statements, unlike expressions, do not just evaluate to a value; instead, they create or change values.\n", "\n", - "Most notably of course are the `=` and `del` statements." + "Most notably, of course, are the `=` and `del` statements." ] }, { "cell_type": "code", - "execution_count": 103, + "execution_count": 104, "metadata": { "slideshow": { "slide_type": "slide" @@ -3507,7 +3544,7 @@ }, { "cell_type": "code", - "execution_count": 104, + "execution_count": 105, "metadata": { "slideshow": { "slide_type": "-" @@ -3526,12 +3563,12 @@ } }, "source": [ - "The built-in [print()](https://docs.python.org/3/library/functions.html#print) function is regarded a \"statement\" as well. In fact, it used to be a real statement in Python 2 and has all the necessary properties. It is a bit of a corner case as expressions are also \"printed\" in a Jupyter notebook when evaluated last in a code cell." + "The built-in [print()](https://docs.python.org/3/library/functions.html#print) function is regarded as a \"statement\" as well. It used to be an actual statement in Python 2 and has all the necessary properties. It is a bit of a corner case as expressions are also \"printed\" in a Jupyter notebook when evaluated last in a code cell." ] }, { "cell_type": "code", - "execution_count": 105, + "execution_count": 106, "metadata": { "slideshow": { "slide_type": "skip" @@ -3569,16 +3606,16 @@ } }, "source": [ - "We can use the `#` symbol to write comments in plain English right into the code.\n", + "We use the `#` symbol to write comments in plain English right into the code.\n", "\n", - "As a good practice, comments should not describe what happens (this should be evident by reading the code, otherwise it is most likely badly written code) but why something happens.\n", + "As a good practice, comments should not describe *what* happens (this should be evident by reading the code; otherwise, it is most likely badly written code) but *why* something happens.\n", "\n", - "Comments can be either added at the end of a line of code (by convention seperated with two spaces) or be on a line on their own." + "Comments may be added either at the end of a line of code, by convention separated with two spaces, or on a line on their own." ] }, { "cell_type": "code", - "execution_count": 106, + "execution_count": 107, "metadata": { "slideshow": { "slide_type": "slide" @@ -3600,13 +3637,13 @@ } }, "source": [ - "But let's think wisely if we really need to use a comment.\n", - "The second cell is a lot more \"Pythonic\"." + "But let's think wisely if we need to use a comment.\n", + "The second cell is a lot more Pythonic." ] }, { "cell_type": "code", - "execution_count": 107, + "execution_count": 108, "metadata": { "slideshow": { "slide_type": "fragment" @@ -3619,7 +3656,7 @@ }, { "cell_type": "code", - "execution_count": 108, + "execution_count": 109, "metadata": { "slideshow": { "slide_type": "-" @@ -3649,7 +3686,7 @@ } }, "source": [ - "We end each chapter with a summary of the main points. The essence in this first chapter is that just like a sentence in a real language like English can be decomposed into its parts (subject, predicate, objects, ...) the same can be done with programming languages." + "We end each chapter with a summary of the main points. The essence in this first chapter is that just like a sentence in a real language like English may be decomposed into its parts (e.g., subject, predicate, and objects), the same may be done with programming languages." ] }, { @@ -3670,18 +3707,18 @@ " - (numeric) data from a CSV file\n", " - text entered on a command line\n", " - (relational) data obtained from a database\n", - " - ...\n", + " - etc.\n", "\n", "\n", "- output (examples)\n", " - result of a computation (e.g., statistical summary of a sample dataset)\n", " - a \"side effect\" (e.g., a transformation of raw input data into cleaned data)\n", " - a physical \"behavior\" (e.g., a robot moving or a document printed)\n", - " - ...\n", + " - etc.\n", "\n", "\n", "- objects\n", - " - distinct and well-contained areas / parts of the memory that hold the actual data\n", + " - distinct and well-contained areas/parts of the memory that hold the actual data\n", " - the concept by which Python manages the memory for us\n", " - can be classified into objects of the same **type** (i.e., same abstract \"structure\" but different concrete data)\n", " - built-in objects (incl. **literals**) vs. user-defined objects (cf., Chapter 8)\n", @@ -3696,13 +3733,13 @@ "\n", "- operators\n", " - special built-in symbols that perform operations with objects in memory\n", - " - usually operate with one or two objects\n", + " - usually, operate with one or two objects\n", " - e.g., addition `+`, subtraction `-`, multiplication `*`, and division `/` all take two objects whereas the negation `-` only takes one\n", "\n", "\n", "- expressions\n", " - **combinations** of **variables** (incl. **literals**) and **operators**\n", - " - do *not* change the involved objects / state of the program\n", + " - do *not* change the involved objects/state of the program\n", " - evaluate to a **value** (i.e., the \"result\" of the expression, usually a new object)\n", " - e.g., `x + 2` evaluates to the (new) object `3` and `1 - 1.0` to `0.0`\n", "\n", @@ -3710,7 +3747,7 @@ "- statements\n", " - instructions that **\"do\" something** and **have side effects** in memory\n", " - re-map names to different objects and *change* the state of the program\n", - " - usually work with expressions\n", + " - usually, work with expressions\n", " - e.g., the assignment statement `=` makes a name point to an object\n", "\n", "\n", @@ -3727,8 +3764,8 @@ "\n", "- flow control (cf., [Chapter 3](https://nbviewer.jupyter.org/github/webartifex/intro-to-python/blob/master/03_conditionals.ipynb))\n", " - expression of **business logic** or an **algorithm**\n", - " - conditional execution of a small **branch** within a program (i.e., `if` statements)\n", - " - repetitive execution of parts of a program (i.e., `for`-loops and `while`-loops)" + " - conditional execution of a small **branch** within a program (e.g., `if` statements)\n", + " - repetitive execution of parts of a program (e.g., `for`-loops)" ] } ], diff --git a/01_elements_review_and_exercises.ipynb b/01_elements_review_and_exercises.ipynb index 8940f24..965b8be 100644 --- a/01_elements_review_and_exercises.ipynb +++ b/01_elements_review_and_exercises.ipynb @@ -40,7 +40,7 @@ "cell_type": "markdown", "metadata": {}, "source": [ - "**Q1**: Elaborate on how **modulo division** might be a very useful operation to know!" + "**Q1**: Elaborate on how **modulo division** might be a handy operation to know!" ] }, { diff --git a/02_functions.ipynb b/02_functions.ipynb index 681838a..941c71b 100644 --- a/02_functions.ipynb +++ b/02_functions.ipynb @@ -19,11 +19,11 @@ } }, "source": [ - "In [Chapter 1](https://nbviewer.jupyter.org/github/webartifex/intro-to-python/blob/master/01_elements.ipynb) we typed the **business logic** of our little program to calculate the mean of a subset of a list of numbers right into the code cells. Then, we executed them one after another. We had no way of **re-using** the code except for either re-executing the cells or copying and pasting their contents into other cells. And whenever we find ourselves doing repetitive manual work, we can be sure that there must be a way of automating what we are doing.\n", + "In [Chapter 1](https://nbviewer.jupyter.org/github/webartifex/intro-to-python/blob/master/01_elements.ipynb), we typed the **business logic** of our little program to calculate the mean of a subset of a list of numbers right into the code cells. Then, we executed them one after another. We had no way of **re-using** the code except for either re-executing the cells or copying and pasting their contents into other cells. And whenever we find ourselves doing repetitive manual work, we can be sure that there must be a way of automating what we are doing.\n", "\n", "At the same time, we executed built-in functions (e.g., [print()](https://docs.python.org/3/library/functions.html#print), [sum()](https://docs.python.org/3/library/functions.html#sum), [len()](https://docs.python.org/3/library/functions.html#len), [id()](https://docs.python.org/3/library/functions.html#id), or [type()](https://docs.python.org/3/library/functions.html#type)) that obviously must be re-using the same parts inside core Python every time we use them.\n", "\n", - "This chapter shows how Python offers language constructs that let us **define** our own functions that we can then **call** just like the built-in ones." + "This chapter shows how Python offers language constructs that let us **define** functions ourselves that we may then **call** just like the built-in ones." ] }, { @@ -45,19 +45,19 @@ } }, "source": [ - "So-called **[user-defined functions](https://docs.python.org/3/reference/compound_stmts.html#function-definitions)** can be created with the `def` statement. To extend an already familiar example, we re-use the introductory example from [Chapter 1](https://nbviewer.jupyter.org/github/webartifex/intro-to-python/blob/master/01_elements.ipynb) in its final Pythonic version and transform it into the function `average_evens()` below. \n", + "So-called **[user-defined functions](https://docs.python.org/3/reference/compound_stmts.html#function-definitions)** may be created with the `def` statement. To extend an already familiar example, we re-use the introductory example from [Chapter 1](https://nbviewer.jupyter.org/github/webartifex/intro-to-python/blob/master/01_elements.ipynb) in its final Pythonic version and transform it into the function `average_evens()` below. \n", "\n", - "A function's **name** must be chosen according to the same naming rules as for ordinary variables. In fact, Python manages function names just like variables. In this book, we further adopt the convention of ending function names with parentheses \"`()`\" in text cells for faster comprehension when reading (i.e., `average_evens()` vs. `average_evens`). These are not actually part of the name but must always be written out in the `def` statement for syntactic reasons.\n", + "A function's **name** must be chosen according to the same naming rules as ordinary variables as Python manages function names like variables. In this book, we further adopt the convention of ending function names with parentheses `()` in text cells for faster comprehension when reading (i.e., `average_evens()` vs. `average_evens`). These are *not* part of the name but must always be written out in the `def` statement for syntactic reasons.\n", "\n", - "Functions may define an arbitrary number of **parameters** as inputs that can then be referenced within the indented **code block**: They are simply listed within the parentheses in the `def` statement (i.e., `numbers` below). \n", + "Functions may define an arbitrary number of **parameters** as inputs that can then be referenced within the indented **code block**: They are listed within the parentheses in the `def` statement (i.e., `numbers` below). \n", "\n", - "The code block is often also called a function's **body** while the first line with the `def` in it is the **header** and must end with a colon.\n", + "The code block is also called a function's **body**, while the first line starting with `def` and ending with a colon is the **header**.\n", "\n", "Together, the name and the list of parameters are also referred to as the function's **[signature](https://en.wikipedia.org/wiki/Type_signature)** (i.e., `average_evens(numbers)` below).\n", "\n", "A function may come with an *explicit* **[return value](https://docs.python.org/3/reference/simple_stmts.html#the-return-statement)** (i.e., \"result\" or \"output\") specified with the `return` statement: Functions that have one are considered **fruitful**; otherwise, they are **void**. Functions of the latter kind are still useful because of their **side effects** (e.g., the [print()](https://docs.python.org/3/library/functions.html#print) built-in). Strictly speaking, they also have an *implicit* return value of `None` that is different from the `False` we saw in Chapter 1.\n", "\n", - "To maintain good coding practices, a function should define a **docstring** that describes what it does in a short subject line, what parameters it expects (i.e., their types), and what it returns (if anything). A docstring is a syntactically valid multi-line string (i.e., type `str`) defined with **triple-double quotes** (strings are covered in depth in Chapter 6). Good standards as to how to format a docstring are [PEP 257](https://www.python.org/dev/peps/pep-0257/) and section 3.8 in [Google's Python Style Guide](https://github.com/google/styleguide/blob/gh-pages/pyguide.md)." + "A function should define a **docstring** that describes what it does in a short subject line, what parameters it expects (i.e., their types), and what it returns (if anything). A docstring is a syntactically valid multi-line string (i.e., type `str`) defined within **triple-double quotes** `\"\"\"`. Strings are covered in depth in [Chapter 6](https://nbviewer.jupyter.org/github/webartifex/intro-to-python/blob/master/06_text.ipynb). Widely adopted standards as to how to format a docstring are [PEP 257](https://www.python.org/dev/peps/pep-0257/) and section 3.8 in [Google's Python Style Guide](https://github.com/google/styleguide/blob/gh-pages/pyguide.md)." ] }, { @@ -92,7 +92,7 @@ } }, "source": [ - "Once defined, a function can be referenced just like any other variable by its name (i.e., *without* the parenthesis). Its value might seem awkward at first: It consists of the location where we defined the function (i.e., `__main__`, which is Python's way of saying \"in this notebook\") and the signature." + "Once defined, a function may be referenced just like any other variable by its name (i.e., *without* the parenthesis). Its value might seem awkward at first: It consists of the location where we defined the function (i.e., `__main__`, which is Python's way of saying \"in this notebook\") and the signature." ] }, { @@ -142,7 +142,7 @@ { "data": { "text/plain": [ - "140693945143776" + "140519730762208" ] }, "execution_count": 3, @@ -254,7 +254,7 @@ } }, "source": [ - "Two questions marks even show a function's source code." + "Two question marks even show a function's source code." ] }, { @@ -289,7 +289,7 @@ } }, "source": [ - "We can **call** (i.e., \"execute\") a function with the **call operator** `()` as often as we wish. The formal parameters are filled in by passing variables or expressions as **arguments** to the function within the parentheses." + "We **call** (i.e., \"execute\") a function with the **call operator** `()` as often as we wish. The formal parameters are filled in by passing variables or expressions as **arguments** to the function within the parentheses." ] }, { @@ -337,7 +337,7 @@ } }, "source": [ - "The return value is usually assigned to a new variable for later reference. Otherwise we would loose access to it in memory right away." + "The return value is commonly assigned to a new variable for later reference. Otherwise, we would lose access to it in memory right away." ] }, { @@ -493,7 +493,7 @@ } }, "source": [ - "[PythonTutor](http://pythontutor.com/visualize.html#code=nums%20%3D%20%5B1,%202,%203,%204,%205,%206,%207,%208,%209,%2010%5D%0A%0Adef%20average_evens%28numbers%29%3A%0A%20%20%20%20evens%20%3D%20%5Bn%20for%20n%20in%20numbers%20if%20n%20%25%202%20%3D%3D%200%5D%0A%20%20%20%20average%20%3D%20sum%28evens%29%20/%20len%28evens%29%0A%20%20%20%20return%20average%0A%0Arv%20%3D%20average_evens%28nums%29&cumulative=false&curInstr=0&heapPrimitives=nevernest&mode=display&origin=opt-frontend.js&py=3&rawInputLstJSON=%5B%5D&textReferences=false) visualizes what happens in memory: To be precise, in the exact moment when the function call is initiated and `nums` is passed in as the `numbers` argument, there are *two* pointers to the *same* `list` object (cf., steps 4-5 in the visualization). We also see how Python creates a *new* **frame** that holds the function's local scope (i.e., \"internal names\") in addition to the **global** frame. Frames are nothing but [namespaces](https://en.wikipedia.org/wiki/Namespace) to *isolate* the names of different **scopes** from each other. The list comprehension `[n for n in numbers if n % 2 == 0]` constitutes yet another frame that is in scope as the `list` object assigned to `evens` is *being* created (cf., steps 6-18). When the function returns, only the global frame is left (cf., steps 21-22)." + "[PythonTutor](http://pythontutor.com/visualize.html#code=nums%20%3D%20%5B1,%202,%203,%204,%205,%206,%207,%208,%209,%2010%5D%0A%0Adef%20average_evens%28numbers%29%3A%0A%20%20%20%20evens%20%3D%20%5Bn%20for%20n%20in%20numbers%20if%20n%20%25%202%20%3D%3D%200%5D%0A%20%20%20%20average%20%3D%20sum%28evens%29%20/%20len%28evens%29%0A%20%20%20%20return%20average%0A%0Arv%20%3D%20average_evens%28nums%29&cumulative=false&curInstr=0&heapPrimitives=nevernest&mode=display&origin=opt-frontend.js&py=3&rawInputLstJSON=%5B%5D&textReferences=false) visualizes what happens in memory: To be precise, in the exact moment when the function call is initiated and `nums` passed in as the `numbers` argument, there are *two* pointers to the *same* `list` object (cf., steps 4-5 in the visualization). We also see how Python creates a *new* **frame** that holds the function's local scope (i.e., \"internal names\") in addition to the **global** frame. Frames are nothing but [namespaces](https://en.wikipedia.org/wiki/Namespace) to *isolate* the names of different **scopes** from each other. The list comprehension `[n for n in numbers if n % 2 == 0]` constitutes yet another frame that is in scope as the `list` object assigned to `evens` is *being* created (cf., steps 6-18). When the function returns, only the global frame is left (cf., steps 21-22)." ] }, { @@ -515,7 +515,7 @@ } }, "source": [ - "On the contrary, while a function is being executed, it can reference the variables of **enclosing scopes** (i.e., \"outside\" of it). This is a common source of *semantic* errors. Consider the following stylized (and incorrect) example `average_wrong()`. The error is hard to spot with eyes: The function never references the `numbers` parameter but the `nums` variable in the **global scope** instead." + "On the contrary, while a function is *being* executed, it can reference the variables of **enclosing scopes** (i.e., \"outside\" of it). This is a common source of *semantic* errors. Consider the following stylized (and incorrect) example `average_wrong()`. The error is hard to spot with eyes: The function never references the `numbers` parameter but the `nums` variable in the **global scope** instead." ] }, { @@ -550,7 +550,7 @@ } }, "source": [ - "`nums` in the global scope is of course unchanged." + "`nums` in the global scope is, of course, unchanged." ] }, { @@ -657,7 +657,7 @@ "source": [ "[PythonTutor](http://pythontutor.com/visualize.html#code=nums%20%3D%20%5B1,%202,%203,%204,%205,%206,%207,%208,%209,%2010%5D%0A%0Adef%20average_wrong%28numbers%29%3A%0A%20%20%20%20evens%20%3D%20%5Bn%20for%20n%20in%20nums%20if%20n%20%25%202%20%3D%3D%200%5D%0A%20%20%20%20average%20%3D%20sum%28evens%29%20/%20len%28evens%29%0A%20%20%20%20return%20average%0A%0Arv%20%3D%20average_wrong%28%5B123,%20456,%20789%5D%29&cumulative=false&curInstr=0&heapPrimitives=nevernest&mode=display&origin=opt-frontend.js&py=3&rawInputLstJSON=%5B%5D&textReferences=false) is again helpful at visualizing the error interactively: Creating the `list` object `evens` eventually points to takes *12* computational steps, namely one for setting up an empty `list` object, *ten* for filling it with elements derived from `nums` in the global scope, and one to make `evens` point at it (cf., steps 6-18).\n", "\n", - "The frames logic shown by PythonTutor is the mechanism by which Python not only manages the names inside one function call but for potentially many calls occuring simultaneously as we will see in [Chapter 4](https://nbviewer.jupyter.org/github/webartifex/intro-to-python/blob/master/04_iteration.ipynb). It is the reason why we may re-use the same names for the parameters and variables inside both `average_evens()` and `average_wrong()` without Python mixing them up. So, as we already read in the [Zen of Python](https://www.python.org/dev/peps/pep-0020/), \"namespaces are one honking great idea\" (cf., `import this`) and a frame is just a special kind of namespace." + "The frames logic shown by PythonTutor is the mechanism by which Python not only manages the names inside *one* function call but also for *many* potentially simultaneous* calls, as revealed in [Chapter 4](https://nbviewer.jupyter.org/github/webartifex/intro-to-python/blob/master/04_iteration.ipynb). It is the reason why we may re-use the same names for the parameters and variables inside both `average_evens()` and `average_wrong()` without Python mixing them up. So, as we already read in the [Zen of Python](https://www.python.org/dev/peps/pep-0020/), \"namespaces are one honking great idea\" (cf., `import this`), and a frame is just a special kind of namespace." ] }, { @@ -752,7 +752,7 @@ } }, "source": [ - "As good practice, let's first use inputs for which we can calculate the answer in our heads to \"verify\" that `average_odds()` is \"correct\"." + "As good practice, let's first use inputs for which we can calculate the answer in our heads to \"verify\" that `average_odds()` is \"correct.\"" ] }, { @@ -787,7 +787,7 @@ } }, "source": [ - "To make the confusion even bigger, let's also pass the global `nums` as an argument to `average_odds()`." + "To add to the confusion, let's also pass the global `nums` as an argument to `average_odds()`." ] }, { @@ -857,13 +857,13 @@ } }, "source": [ - "The reason why everything just works is that *every time* we (re-)assign an object to a variable *inside* a function with the `=` statement, this is done in the *local* scope by default. There are ways to change variables existing in an outer scope from within a function but this is a rather advanced topic on its own.\n", + "The reason why everything works is that *every time* we (re-)assign an object to a variable *inside* a function with the `=` statement, this is done in the *local* scope by default. There are ways to change variables existing in an outer scope from within a function, but this is a rather advanced topic on its own.\n", "\n", "[PythonTutor](http://pythontutor.com/visualize.html#code=nums%20%3D%20%5B1,%202,%203,%204,%205,%206,%207,%208,%209,%2010%5D%0A%0Adef%20average_odds%28numbers%29%3A%0A%20%20%20%20nums%20%3D%20%5Bint%28n%29%20for%20n%20in%20numbers%5D%0A%20%20%20%20odds%20%3D%20%5Bn%20for%20n%20in%20nums%20if%20n%20%25%202%20!%3D%200%5D%0A%20%20%20%20average%20%3D%20sum%28odds%29%20/%20len%28odds%29%0A%20%20%20%20return%20average%0A%0Arv%20%3D%20average_odds%28%5B1.0,%2010.0,%203.0,%2010.0,%205.0%5D%29&cumulative=false&curInstr=0&heapPrimitives=nevernest&mode=display&origin=opt-frontend.js&py=3&rawInputLstJSON=%5B%5D&textReferences=false) shows how *two* `nums` variables exist in *different* scopes pointing to *different* objects (cf., steps 14-25) when we execute `average_odds([1.0, 10.0, 3.0, 10.0, 5.0])`.\n", "\n", "Variables whose names collide with the ones of variables in enclosing scopes - and the global scope is just the most enclosing scope - are said to **shadow** them.\n", "\n", - "While this is not a problem for Python as we have observed, it may lead to less readable code for us humans and should be avoided if possible. But, as we have also heard, \"[naming things](https://skeptics.stackexchange.com/questions/19836/has-phil-karlton-ever-said-there-are-only-two-hard-things-in-computer-science)\" is often considered hard as well and we have to be prepared to encounter shadowing variables." + "While this is not a problem for Python, it may lead to less readable code for us humans and should be avoided if possible. But, as we have also heard, \"[naming things](https://skeptics.stackexchange.com/questions/19836/has-phil-karlton-ever-said-there-are-only-two-hard-things-in-computer-science)\" is often considered hard as well, and we have to be prepared to encounter shadowing variables." ] }, { @@ -946,7 +946,7 @@ } }, "source": [ - "We can cast certain objects as a different type. For example, to \"convert\" a float or a text into an integer, we use the [int()](https://docs.python.org/3/library/functions.html#int) built-in. This actually creates a *new* object of type `int` from the provided `avg` or `\"6\"` objects who continue to exist in memory unchanged." + "We may cast objects as a different type. For example, to \"convert\" a float or a text into an integer, we use the [int()](https://docs.python.org/3/library/functions.html#int) built-in. It creates a *new* object of type `int` from the provided `avg` or `\"6\"` objects that continue to exist in memory unchanged." ] }, { @@ -1088,7 +1088,7 @@ } }, "source": [ - "Not all conversions are valid and *runtime* errors can occur as the `ValueError` shows." + "Not all conversions are valid and *runtime* errors may occur as the `ValueError` shows." ] }, { @@ -1124,7 +1124,7 @@ } }, "source": [ - "We can also go in the other direction with the [float()](https://docs.python.org/3/library/functions.html#float) built-in function." + "We may also go in the other direction with the [float()](https://docs.python.org/3/library/functions.html#float) built-in function." ] }, { @@ -1170,7 +1170,7 @@ } }, "source": [ - "So far we have only specified one parameter in each of our user-defined functions. In [Chapter 1](https://nbviewer.jupyter.org/github/webartifex/intro-to-python/blob/master/01_elements.ipynb), however, we saw the built-in function [divmod()](https://docs.python.org/3/library/functions.html#divmod) taking two arguments. Obviously, the order of the numbers passed in mattered. Whenever we call a function and list its arguments in a comma seperated manner, we say that we pass in the arguments by position or refer to them as **positional arguments**." + "So far, we have only specified one parameter in each of our user-defined functions. In [Chapter 1](https://nbviewer.jupyter.org/github/webartifex/intro-to-python/blob/master/01_elements.ipynb), however, we saw the built-in function [divmod()](https://docs.python.org/3/library/functions.html#divmod) take two arguments. And, the order of the numbers passed in mattered! Whenever we call a function and list its arguments in a comma separated manner, we say that we pass in the arguments by position or refer to them as **positional arguments**." ] }, { @@ -1229,7 +1229,7 @@ } }, "source": [ - "For many functions there is a natural order to the arguments. But what if this is not the case? For example, let's create a close relative of the above `average_evens()` function that also scales the resulting average by a factor. What is more natural? Passing in `numbers` first? Or `scalar`? There is no obvious way and we continue with the first alternative for no real reason." + "For many functions, there is a natural order to the arguments. But what if this is not the case? For example, let's create a close relative of the above `average_evens()` function that also scales the resulting average by a factor. What is more natural? Passing in `numbers` first? Or `scalar`? There is no obvious way and we continue with the first alternative for no concrete reason." ] }, { @@ -1266,7 +1266,7 @@ } }, "source": [ - "As with [divmod()](https://docs.python.org/3/library/functions.html#divmod), we can pass in the arguments by position." + "As with [divmod()](https://docs.python.org/3/library/functions.html#divmod), we pass in the arguments by position." ] }, { @@ -1301,9 +1301,9 @@ } }, "source": [ - "However, now the function call is a bit harder to comprehend as we need to always remember what the `2` means. This becomes even harder the more parameters we specify.\n", + "However, now the function call is a bit harder to comprehend as we always need to remember what the `2` means. This becomes harder the more parameters we specify.\n", "\n", - "Luckily, we can also reference the formal parameter names as **keyword arguments**. We can even combine positional and keyword arguments in the same function call. Each of the following does the exact same thing." + "Luckily, we may also reference the formal parameter names as **keyword arguments**. We can even combine positional and keyword arguments in the same function call. Each of the following does the *same*." ] }, { @@ -1430,11 +1430,11 @@ } }, "source": [ - "Defining both `average_evens()` and `scaled_average_evens()` is also kind of repetitive as most of their code is the same. Such a redundancy will make a code base harder to maintain in the long run as whenever we change the logic in one function we must *not* forget to do so for the other function as well.\n", + "Defining both `average_evens()` and `scaled_average_evens()` is also kind of repetitive as most of their code is the same. Such a redundancy makes a codebase harder to maintain in the long run as whenever we change the logic in one function, we must *not* forget to do so for the other function as well.\n", "\n", "A better way is to design related functions in a **modular** fashion such that they re-use each other's logic.\n", "\n", - "For example, as not scaling an average is just a special case of scaling it with `1`, we could re-define the two functions like below: In this setting, the function resembling the *special* case (i.e., `average_evens()`) simply **forwards** the call to the more *general* function (i.e., `scaled_average_evens()`) using a `scalar=1` argument." + "For example, as not scaling an average is just a special case of scaling it with `1`, we could re-define the two functions like below: In this setting, the function resembling the *special* case (i.e., `average_evens()`) **forwards** the call to the more *general* function (i.e., `scaled_average_evens()`) using a `scalar=1` argument." ] }, { @@ -1483,7 +1483,7 @@ } }, "source": [ - "The outcome of `average_evens(nums)` is of course still `6.0`." + "The outcome of `average_evens(nums)` is, of course, still `6.0`." ] }, { @@ -1555,9 +1555,9 @@ } }, "source": [ - "Now we can call the function either with or without the `scalar` argument.\n", + "Now we call the function either with or without the `scalar` argument.\n", "\n", - "If `scalar` is passed in, this can be done as either a positional or a keyword argument. Which of the two versions where `scalar` is `2` is easier to comprehend in a large program?" + "If `scalar` is to be passed in, this may be done as either a positional or a keyword argument. Which of the two versions where `scalar` is `2` is more \"natural\" and faster to comprehend in a larger program?" ] }, { @@ -1651,9 +1651,9 @@ } }, "source": [ - "Since we *assumed* that scaling will occur *rarely*, we'd prefer that our new version of `average_evens()` be called with a *keyword argument* whenever `scalar` is passed in explicitly. Then, the second argument is never ambiguous as we could always read its name.\n", + "Since we *assumed* that scaling occurs *rarely*, we'd prefer that our new version of `average_evens()` be called with a *keyword argument* whenever `scalar` is to be passed in explicitly. Then, the second argument is never ambiguous as we could always read its name.\n", "\n", - "Luckily, Python offers a **keyword-only** syntax, where all we need to do is to place the arguments for which we require *explicit* keyword use after an asterix `*`." + "Luckily, Python offers a **keyword-only** syntax, where all we need to do is to place the arguments for which we require *explicit* keyword use after an asterisk `*`." ] }, { @@ -1798,13 +1798,13 @@ "source": [ "The `def` statement is a statement because of its side effect of creating a *new* name that points to a *new* `function` object in memory.\n", "\n", - "We can thus think of it as doing *two* things at once (i.e., either both of them happen or none). First, a `function` object is created that contains the concrete $0$s and $1$s that resemble the instructions we put into the function's body. In the context of a function, these $0$s and $1$s are also called **[byte code](https://en.wikipedia.org/wiki/Bytecode)**. Then, a name is created pointing at the new `function` object.\n", + "We can thus think of it as doing *two* things at once (i.e., either both of them happen or none). First, a `function` object is created that contains the concrete $0$s and $1$s that resemble the instructions we put into the function's body. In the context of a function, these $0$s and $1$s are also called **[byte code](https://en.wikipedia.org/wiki/Bytecode)**. Then, a name pointing at the new `function` object is created.\n", "\n", - "Only this second aspect makes `def` a statement: Merely creating a new object in memory without making it accessible for later reference does *not* constitute a side effect. This is because the state the program is *not* changed. After all, if we cannot reference an object, how do we know that it is actually existing?\n", + "Only this second aspect makes `def` a statement: Merely creating a new object in memory without making it accessible for later reference does *not* constitute a side effect because the state the program is *not* changed. After all, if we cannot reference an object, how do we know it exists in the first place?\n", "\n", "Python provides a so-called **[lambda expression](https://docs.python.org/3/reference/expressions.html#lambda)** syntax that allows us to *only* create a `function` object in memory *without* making a name point to it.\n", "\n", - "It starts with the keyword `lambda` followed by an optional comma seperated enumeration of parameters, a mandatory colon, and *one* expression that also is the resulting `function` object's return value.\n", + "It starts with the keyword `lambda` followed by an optional comma separated enumeration of parameters, a mandatory colon, and *one* expression that also is the resulting `function` object's return value.\n", "\n", "Because it does not create a name pointing to the object, we effectively create \"anonymous\" functions with it. In the example, we create a `function` object that adds `3` to the only argument passed in as the parameter `x` and returns that sum." ] @@ -1845,7 +1845,7 @@ "\n", "We created a `function` object, dit *not* call it, and Python immediately forgot about it. So what's the point?\n", "\n", - "Just to prove that the `lambda` expression really creates a callable `function` object, we use the simple `=` statement to assign it to the variable `add_three`, which is really `add_three()` as per our convention from above." + "To prove that a `lambda` expression creates a callable `function` object, we use the simple `=` statement to assign it to the variable `add_three`, which is really `add_three()` as per our convention from above." ] }, { @@ -1869,7 +1869,7 @@ } }, "source": [ - "Now we can call `add_three()` as if we defined it with the `def` statement to begin with." + "Now we call `add_three()` as if we defined it with the `def` statement in the first place." ] }, { @@ -1904,7 +1904,7 @@ } }, "source": [ - "Alternatively, we could call any anonymous `function` object created with an `lambda` expression right away (i.e. without assigning it to a variable), which looks really weird for now as we need *two* pairs of parentheses: The first is just a delimiter whereas the second the call operator." + "Alternatively, we could call an anonymous `function` object created with a `lambda` expression right away (i.e., without assigning it to a variable), which looks quite weird for now as we need *two* pairs of parentheses: The first is just a delimiter whereas the second the call operator." ] }, { @@ -1939,9 +1939,9 @@ } }, "source": [ - "The main point of having functions without a name is to use them in a situation where we know ahead of time that we will use the function only once.\n", + "The main point of having functions without a name is to use them in a situation where we know ahead of time that we use the function *once* only.\n", "\n", - "Very popular contexts where we will apply lambda expressions are with the **map-filter-reduce** paradigm in Chapter 7 or when we do \"number crunching\" with **arrays** and **data frames** in Chapter 9." + "Popular applications of lambda expressions are with the **map-filter-reduce** paradigm in Chapter 7 or when we do \"number crunching\" with **arrays** and **data frames** in Chapter 9." ] }, { @@ -1963,13 +1963,13 @@ } }, "source": [ - "So far, we have only used what we refer to as **core** Python in this book. By this, we mean all the syntactical rules as specified in the [language reference](https://docs.python.org/3/reference/) and a minimal set of about 50 built-in [functions](https://docs.python.org/3/library/functions.html). With this we could already implement any algorithm or business logic we can think of!\n", + "So far, we have only used what we refer to as **core** Python in this book. By this, we mean all the syntactical rules as specified in the [language reference](https://docs.python.org/3/reference/) and a minimal set of about 50 built-in [functions](https://docs.python.org/3/library/functions.html). With this, we could already implement any algorithm or business logic we can think of!\n", "\n", "However, after our first couple of programs, we would already start seeing recurring patterns in the code we write. In other words, we would constantly be \"re-inventing the wheel\" in each new project.\n", "\n", "Would it not be smarter to pull out the re-usable components from our programs and put them into some project independent **library** of generically useful functionalities? Then we would only need a way of including these **utilities** in our projects.\n", "\n", - "As all programmers across all languages face this very same issue, most programming languages come with a so-called **[standard library](https://en.wikipedia.org/wiki/Standard_library)** that provides utilities to accomplish common tasks without a lot of code. Examples are making an HTTP request to some website, open and read popular file types (e.g., CSV or Excel files), do something on a computer's file system, and many more." + "As all programmers across all languages face this very same issue, most programming languages come with a so-called **[standard library](https://en.wikipedia.org/wiki/Standard_library)** that provides utilities to accomplish everyday tasks without much code. Examples are making an HTTP request to some website, open and read popular file types (e.g., CSV or Excel files), do something on a computer's file system, and many more." ] }, { @@ -1991,13 +1991,13 @@ } }, "source": [ - "Python also comes with its own [standard library](https://docs.python.org/3/library/index.html) that is structured into coherent modules and packages for given topics: A **module** is just a plain text file with the file extension *.py* that contains Python code while a **package** is a folder that groups several related modules.\n", + "Python also comes with a [standard library](https://docs.python.org/3/library/index.html) that is structured into coherent modules and packages for given topics: A **module** is just a plain text file with the file extension *.py* that contains Python code while a **package** is a folder that groups several related modules.\n", "\n", - "The code in the [standard library](https://docs.python.org/3/library/index.html) is contributed and maintained by many volunteers around the world. In contrast to so-called \"third-party\" packages (cf., the next section below), the Python core development team closely monitors and tests the code in the [standard library](https://docs.python.org/3/library/index.html). Consequently, we can be reasonably sure that anything provided by it works correctly independent of our computer's operating system and will most likely also be there in the next Python versions. Parts in the [standard library](https://docs.python.org/3/library/index.html) that are computationally expensive are often re-written in C and therefore much faster than anything we could code in Python ourselves. So, whenever we can solve a problem with the help of the [standard library](https://docs.python.org/3/library/index.html), it is almost always the best way to do so as well.\n", + "The code in the [standard library](https://docs.python.org/3/library/index.html) is contributed and maintained by many volunteers around the world. In contrast to so-called \"third-party\" packages (cf., the next section below), the Python core development team closely monitors and tests the code in the [standard library](https://docs.python.org/3/library/index.html). Consequently, we can be reasonably sure that anything provided by it works correctly independent of our computer's operating system and will most likely also be there in the next Python versions. Parts in the [standard library](https://docs.python.org/3/library/index.html) that are computationally expensive are often re-written in C and, therefore, much faster than anything we could write in Python ourselves. So, whenever we can solve a problem with the help of the [standard library](https://docs.python.org/3/library/index.html), it is almost always the best way to do so as well.\n", "\n", - "The [standard library](https://docs.python.org/3/library/index.html) has grown very big over the years and we refer to the website [PYMOTW](https://pymotw.com/3/index.html) (i.e., \"Python Module of the Week\") that features well written introductory tutorials and how-to guides to most parts of the library. The same author also published a [book](https://www.amazon.com/Python-Standard-Library-Example-Developers/dp/0134291050/ref=as_li_ss_tl?ie=UTF8&qid=1493563121&sr=8-1&keywords=python+3+standard+library+by+example) that many Pythonistas keep on their shelf for reference. Knowing what is in the [standard library](https://docs.python.org/3/library/index.html) is quite valuable for solving real world tasks quickly.\n", + "The [standard library](https://docs.python.org/3/library/index.html) has grown very big over the years, and we refer to the website [PYMOTW](https://pymotw.com/3/index.html) (i.e., \"Python Module of the Week\") that features well written introductory tutorials and how-to guides to most parts of the library. The same author also published a [book](https://www.amazon.com/Python-Standard-Library-Example-Developers/dp/0134291050/ref=as_li_ss_tl?ie=UTF8&qid=1493563121&sr=8-1&keywords=python+3+standard+library+by+example) that many Pythonistas keep on their shelf for reference. Knowing what is in the [standard library](https://docs.python.org/3/library/index.html) is quite valuable for solving real-world tasks quickly.\n", "\n", - "Throughout this book we will look at many modules and packages from the [standard library](https://docs.python.org/3/library/index.html) in more depth, starting with the [math](https://docs.python.org/3/library/math.html) and [random](https://docs.python.org/3/library/random.html) modules in this chapter." + "Throughout this book, we look at many modules and packages from the [standard library](https://docs.python.org/3/library/index.html) in more depth, starting with the [math](https://docs.python.org/3/library/math.html) and [random](https://docs.python.org/3/library/random.html) modules in this chapter." ] }, { @@ -2084,7 +2084,7 @@ { "data": { "text/plain": [ - "140694050824664" + "140519828111832" ] }, "execution_count": 58, @@ -2132,7 +2132,7 @@ "\n", "Let's see what we can do with the `math` module.\n", "\n", - "The [dir()](https://docs.python.org/3/library/functions.html#dir) built-in function can also be used with an argument passed in. Ignoring the dunder-style names, `math` offers quite a lot of ... names. As we cannot know at this point in time if a listed name refers to a function or an ordinary variable, we use the more generic term **attribute** to mean either one of them." + "The [dir()](https://docs.python.org/3/library/functions.html#dir) built-in function may also be used with an argument passed in. Ignoring the dunder-style names, `math` offers quite a lot of names. As we cannot know at this point if a listed name refers to a function or an ordinary variable, we use the more generic term **attribute** to mean either one of them." ] }, { @@ -2354,9 +2354,9 @@ } }, "source": [ - "Observe how the arguments passed to functions do not need to be just variables or simple literals. Instead, we can pass in any *expression* that evaluates to a *new* object of the type the function expects.\n", + "Observe how the arguments passed to functions do not need to be just variables or simple literals. Instead, we may pass in any *expression* that evaluates to a *new* object of the type the function expects.\n", "\n", - "So just as a reminder from the expression vs. statement discussion in [Chapter 1](https://nbviewer.jupyter.org/github/webartifex/intro-to-python/blob/master/01_elements.ipynb): An expression is *any* syntactically correct combination of variables and literals with operators. And the call operator `()` is just ... well another operator. So both of the next two code cells are just expressions! They have no permanent side effect in memory. We can execute them as often as we want *without* changing the state of the program (i.e., this Jupyter notebook).\n", + "So just as a reminder from the expression vs. statement discussion in [Chapter 1](https://nbviewer.jupyter.org/github/webartifex/intro-to-python/blob/master/01_elements.ipynb): An expression is *any* syntactically correct combination of variables and literals with operators. And the call operator `()` is yet another operator. So both of the next two code cells are just expressions! They have no permanent side effects in memory. We may execute them as often as we want *without* changing the state of the program (i.e., this Jupyter notebook).\n", "\n", "So, regarding the very next cell in particular: Although the `2 ** 2` creates a *new* object `4` in memory that is then immediately passed into the [math.sqrt()](https://docs.python.org/3/library/math.html#math.sqrt) function, once that function call returns, \"all is lost\" and the newly created `4` object is forgotten again, as well as the return value of [math.sqrt()](https://docs.python.org/3/library/math.html#math.sqrt)." ] @@ -2428,7 +2428,7 @@ } }, "source": [ - "If we only need one particular function from a module, we can also use the alternative `from ... import ...` syntax.\n", + "If we only need one particular function from a module, we may also use the alternative `from ... import ...` syntax.\n", "\n", "This does *not* create a module object but only makes a variable in our current location point to an object defined inside a module directly." ] @@ -2489,7 +2489,7 @@ } }, "source": [ - "Often times, we need a random variable, for example, when we want to build a simulation program. The [random](https://docs.python.org/3/library/random.html) module in the [standard library](https://docs.python.org/3/library/index.html) often suffices for that." + "Often, we need a random variable, for example, when we want to build a simulation program. The [random](https://docs.python.org/3/library/random.html) module in the [standard library](https://docs.python.org/3/library/index.html) often suffices for that." ] }, { @@ -2537,7 +2537,7 @@ } }, "source": [ - "Besides the usual dunder-style attributes, the [dir()](https://docs.python.org/3/library/functions.html#dir) built-in function lists some attributes in an upper case naming convention and many others starting with a single underscore \"\\_\". To understand the former, we have to wait until Chapter 10 while the latter are explained further below." + "Besides the usual dunder-style attributes, the built-in [dir()](https://docs.python.org/3/library/functions.html#dir) function lists some attributes in an upper case naming convention and many others starting with a *single* underscore `_`. To understand the former, we must wait until Chapter 10, while the latter is explained further below." ] }, { @@ -2697,7 +2697,7 @@ { "data": { "text/plain": [ - "0.15268128055183228" + "0.5291407120147841" ] }, "execution_count": 75, @@ -2717,7 +2717,7 @@ } }, "source": [ - "While we could build some conditional logic with an `if` statement to map the number generated by [random.random()](https://docs.python.org/3/library/random.html#random.random) to a finite set of elements manually, the [random.choice()](https://docs.python.org/3/library/random.html#random.choice) function provides a lot more **convenience** for us. We simply call it with, for example, the `nums` list and it draws one element out of it with equal chance." + "While we could build some conditional logic with an `if` statement to map the number generated by [random.random()](https://docs.python.org/3/library/random.html#random.random) to a finite set of elements manually, the [random.choice()](https://docs.python.org/3/library/random.html#random.choice) function provides a lot more **convenience** for us. We call it with, for example, the `nums` list and it draws one element out of it with equal chance." ] }, { @@ -2732,7 +2732,7 @@ { "data": { "text/plain": [ - ">" + ">" ] }, "execution_count": 76, @@ -2781,7 +2781,7 @@ { "data": { "text/plain": [ - "2" + "4" ] }, "execution_count": 78, @@ -2801,7 +2801,7 @@ } }, "source": [ - "In order to re-produce the same random numbers in a simulation each time we run it, we can set the **[random seed](https://en.wikipedia.org/wiki/Random_seed)**. It is good practice to do this at the beginning of a program or notebook. Then every time we re-start the program, we will get the exact same random numbers again. This becomes very important, for example, when we employ certain machine learning algorithms that rely on randomization, like the infamous [Random Forest](https://en.wikipedia.org/wiki/Random_forest), and want to obtain **re-producable** results.\n", + "To reproduce the same random numbers in a simulation each time we run it, we set the **[random seed](https://en.wikipedia.org/wiki/Random_seed)**. It is good practice to do this at the beginning of a program or notebook. Then every time we re-start the program, we get the *same* random numbers again. This becomes very important, for example, when we employ machine learning algorithms that rely on randomization, like the infamous [Random Forest](https://en.wikipedia.org/wiki/Random_forest), and want to obtain **reproducable** results.\n", "\n", "The [random](https://docs.python.org/3/library/random.html) module provides the [random.seed()](https://docs.python.org/3/library/random.html#random.seed) function to do that." ] @@ -2899,11 +2899,11 @@ } }, "source": [ - "As the Python community is based around open source, many developers publish their code, for example, on the Python Package Index [PyPI](https://pypi.org) from where anyone can download and install it for free using command line based tools like [pip](https://pip.pypa.io/en/stable/) or [conda](https://conda.io/en/latest/). This way, we can always customize our Python installation even more. Managing many such packages is actually quite a deep topic on its own, sometimes fearfully called **[dependency hell](https://en.wikipedia.org/wiki/Dependency_hell)**.\n", + "As the Python community is based around open source, many developers publish their code, for example, on the Python Package Index [PyPI](https://pypi.org) from where anyone may download and install it for free using command-line based tools like [pip](https://pip.pypa.io/en/stable/) or [conda](https://conda.io/en/latest/). This way, we can always customize our Python installation even more. Managing many such packages is quite a deep topic on its own, sometimes fearfully called **[dependency hell](https://en.wikipedia.org/wiki/Dependency_hell)**.\n", "\n", - "The difference between the [standard library](https://docs.python.org/3/library/index.html) and such **third-party** packages is that in the first case the code goes through a much more formalized review process and is officially endorsed by the Python core developers. Yet, many third-party projects also offer the highest quality standards and a lot of such software is actually also relied on by many businesses and researchers.\n", + "The difference between the [standard library](https://docs.python.org/3/library/index.html) and such **third-party** packages is that in the first case, the code goes through a much more formalized review process and is officially endorsed by the Python core developers. Yet, many third-party projects also offer the highest quality standards and are also relied on by many businesses and researchers.\n", "\n", - "Throughout this book, we will look at many third-party libraries, mostly from Python's [scientific stack](https://scipy.org/about.html), a tightly coupled set of third-party libraries for storing **big data** efficiently ([numpy](http://www.numpy.org/)), \"wrangling\" ([pandas](https://pandas.pydata.org/)) and visualizing them ([matplotlib](https://matplotlib.org/) and [seaborn](https://seaborn.pydata.org/)), fitting classical statistical models ([statsmodels](http://www.statsmodels.org/)), training machine learning models ([sklearn](http://scikit-learn.org/)), and much more.\n", + "Throughout this book, we will look at many third-party libraries, mostly from Python's [scientific stack](https://scipy.org/about.html), a tightly coupled set of third-party libraries for storing **big data** efficiently (e.g., [numpy](http://www.numpy.org/)), \"wrangling\" (e.g., [pandas](https://pandas.pydata.org/)) and visualizing them (e.g., [matplotlib](https://matplotlib.org/) or [seaborn](https://seaborn.pydata.org/)), fitting classical statistical models (e.g., [statsmodels](http://www.statsmodels.org/)), training machine learning models (e.g., [sklearn](http://scikit-learn.org/)), and much more.\n", "\n", "Below, we briefly show how to install third-party libraries." ] @@ -2927,11 +2927,11 @@ } }, "source": [ - "[numpy](http://www.numpy.org/) is the de-facto standard in the Python world for handling **array-like** data. That is a fancy word for data that can be put into a matrix or vector format. We will look at it in depth in Chapter 9.\n", + "[numpy](http://www.numpy.org/) is the de-facto standard in the Python world for handling **array-like** data. That is a fancy word for data that can be put into a matrix or vector format. We look at it in depth in Chapter 9.\n", "\n", - "As [numpy](http://www.numpy.org/) is *not* in the [standard library](https://docs.python.org/3/library/index.html), it must be *manually* installed, for example, with the [pip](https://pip.pypa.io/en/stable/) tool. As mentioned in [Chapter 0](https://nbviewer.jupyter.org/github/webartifex/intro-to-python/blob/master/00_start_up.ipynb), to execute terminal commands from within a Jupyter notebook, we just need to start a code cell with an exclamation mark.\n", + "As [numpy](http://www.numpy.org/) is *not* in the [standard library](https://docs.python.org/3/library/index.html), it must be *manually* installed, for example, with the [pip](https://pip.pypa.io/en/stable/) tool. As mentioned in [Chapter 0](https://nbviewer.jupyter.org/github/webartifex/intro-to-python/blob/master/00_start_up.ipynb), to execute terminal commands from within a Jupyter notebook, we start a code cell with an exclamation mark.\n", "\n", - "If you are running this notebook with an installation of the [Anaconda Distribution](https://www.anaconda.com/distribution/), then [numpy](http://www.numpy.org/) is probably already installed. Running the cell below, will just confirm that." + "If you are running this notebook with an installation of the [Anaconda Distribution](https://www.anaconda.com/distribution/), then [numpy](http://www.numpy.org/) is probably already installed. Running the cell below confirms that." ] }, { @@ -2963,7 +2963,7 @@ } }, "source": [ - "[numpy](http://www.numpy.org/) is conventionally imported with the shorter **idiomatic** name `np`. The `as` in the import statement just changes the resulting variable name. It is a shortcut for the three lines `import numpy`, `np = numpy`, and `del numpy`." + "[numpy](http://www.numpy.org/) is conventionally imported with the shorter **idiomatic** name `np`. The `as` in the import statement changes the resulting variable name. It is a shortcut for the three lines `import numpy`, `np = numpy`, and `del numpy`." ] }, { @@ -2987,7 +2987,7 @@ } }, "source": [ - "`np` can be used in the same way as `math` or `random` above." + "`np` is used in the same way as `math` or `random` above." ] }, { @@ -3094,9 +3094,9 @@ } }, "source": [ - "[numpy](http://www.numpy.org/) somehow magically adds new behavior to Python's built-in arithmetic operators. For example, we can now [scalar-multiply](https://en.wikipedia.org/wiki/Scalar_multiplication) `vec`.\n", + "[numpy](http://www.numpy.org/) somehow magically adds new behavior to Python's built-in arithmetic operators. For example, we may now [scalar-multiply](https://en.wikipedia.org/wiki/Scalar_multiplication) `vec`.\n", "\n", - "[numpy](http://www.numpy.org/)'s functions are implemented in highly optimized C code and therefore fast, especially when it comes to big data." + "[numpy](http://www.numpy.org/)'s functions are implemented in highly optimized C code and, therefore, are fast, especially when dealing with bigger amounts of data." ] }, { @@ -3236,13 +3236,13 @@ } }, "source": [ - "For sure, we can create our own modules and packages. In the repository's main directory, there is a [*sample_module.py*](https://github.com/webartifex/intro-to-python/blob/master/sample_module.py) file that contains, among others, a function equivalent to the final version of `average_evens()`. To be realistic, this sample module is structured in a modular manner with several functions building on each other. It is best to skim over it *now* before reading on.\n", + "For sure, we can create local modules and packages. In the repository's main directory, there is a [*sample_module.py*](https://github.com/webartifex/intro-to-python/blob/master/sample_module.py) file that contains, among others, a function equivalent to the final version of `average_evens()`. To be realistic, this sample module is structured in a modular manner with several functions building on each other. It is best to skim over it *now* before reading on.\n", "\n", "To make code we put into a *.py* file available in our program, we import it as a module just as we did above with modules in the [standard library](https://docs.python.org/3/library/index.html) or third-party packages.\n", "\n", - "The *name* to be imported is the file's name except for the *.py* part. In order for this to work, the file's name *must* adhere to the *same* rules as hold for [variable names](https://docs.python.org/3/reference/lexical_analysis.html#identifiers) in general.\n", + "The *name* to be imported is the file's name except for the *.py* part. For this to work, the file's name *must* adhere to the *same* rules as hold for [variable names](https://docs.python.org/3/reference/lexical_analysis.html#identifiers) in general.\n", "\n", - "What happens during an import is conceptually as follows. When Python sees the `import sample_module` part, it first creates a *new* object of type `module` in memory. This is effectively an *empty* namespace. Then, it executes the imported file's code from top to bottom. Whatever variables are still defined at the end of this, are put into the module's namespace. Only if the file's code does *not* raise an error, will Python make a variable in our current location (i.e., `mod` here) point to the created `module` object. Otherwise, it is discarded. In essence, it is as if we copied and pasted the file's code in place of the import statement. If we import an already imported module again, Python is smart enough to avoid doing all this work all over and does nothing." + "What happens during an import is as follows. When Python sees the `import sample_module` part, it first creates a *new* object of type `module` in memory. This is effectively an *empty* namespace. Then, it executes the imported file's code from top to bottom. Whatever variables are still defined at the end of this, are put into the module's namespace. Only if the file's code does *not* raise an error, will Python make a variable in our current location (i.e., `mod` here) point to the created `module` object. Otherwise, it is discarded. In essence, it is as if we copied and pasted the file's code in place of the import statement. If we import an already imported module again, Python is smart enough to avoid doing all this work all over and does nothing." ] }, { @@ -3292,7 +3292,7 @@ "source": [ "Disregarding the dunder-style attributes, `mod` defines the five attributes `_default_scalar`, `_scaled_average`, `average`, `average_evens`, and `average_odds`, which are exactly the ones we would expect from reading the [*sample_module.py*](https://github.com/webartifex/intro-to-python/blob/master/sample_module.py) file.\n", "\n", - "An important convention when working with imported code is to *disregard* any attributes starting with an underscore \"\\_\". These are considered **private** and constitute **implementation details** the author of the imported code might change in a future version of his software. We *must* not rely on them in any way.\n", + "A convention when working with imported code is to *disregard* any attributes starting with an underscore `_`. These are considered **private** and constitute **implementation details** the author of the imported code might change in a future version of his software. We *must* not rely on them in any way.\n", "\n", "In contrast, the three remaining **public** attributes are the functions `average()`, `average_evens()`, and `average_odds()` that we may use after the import." ] @@ -3341,7 +3341,7 @@ } }, "source": [ - "We can use the imported `mod.average_evens()` just like `average_evens()` defined above. The advantage we get from **modularization** with *.py* files is that we can now easily re-use functions across different Jupyter notebooks without re-defining them again and again. Also, we can \"source out\" code that distracts from the storyline told in a notebook." + "We use the imported `mod.average_evens()` just like `average_evens()` defined above. The advantage we get from **modularization** with *.py* files is that we can now easily re-use functions across different Jupyter notebooks without re-defining them again and again. Also, we can \"source out\" code that distracts from the storyline told in a notebook." ] }, { @@ -3433,9 +3433,9 @@ } }, "source": [ - "Packages are a generalization of modules and we will look at one in Chapter 10 in detail. You can, however, already look at a [sample package](https://github.com/webartifex/intro-to-python/tree/master/sample_package) in the repository, which is nothing but a folder with *.py* files in it.\n", + "Packages are a generalization of modules, and we look at one in detail in Chapter 10. You may, however, already look at a [sample package](https://github.com/webartifex/intro-to-python/tree/master/sample_package) in the repository, which is nothing but a folder with *.py* files in it.\n", "\n", - "As a further references on modules, we refer to the [official tutorial](https://docs.python.org/3/tutorial/modules.html)." + "As a further reading on modules and packages, we refer to the [official tutorial](https://docs.python.org/3/tutorial/modules.html)." ] }, { @@ -3461,14 +3461,14 @@ "\n", "Functions provide benefits as they:\n", "\n", - "- make programs easier to comprehend and debug for humans as they give names to the smaller parts of a larger program (i.e., they **modularize** a code base), and\n", + "- make programs easier to comprehend and debug for humans as they give names to the smaller parts of a larger program (i.e., they **modularize** a codebase), and\n", "- eliminate redundancies by allowing **re-use of code**.\n", "\n", - "Functions are **defined** once with the `def` statement. Then, they can be **called** many times with the call operator `()`.\n", + "Functions are **defined** once with the `def` statement. Then, they may be **called** many times with the call operator `()`.\n", "\n", "They may process **parameterized** inputs, **passed** in as **arguments**, and output a **return value**.\n", "\n", - "Arguments can be passed in by **position** or **keyword**. Some functions may even require **keyword-only** arguments.\n", + "Arguments may be passed in by **position** or **keyword**. Some functions may even require **keyword-only** arguments.\n", "\n", "**Lambda expressions** create anonymous functions.\n", "\n", diff --git a/02_functions_review_and_exercises.ipynb b/02_functions_review_and_exercises.ipynb index dcd0a9c..d0935c6 100644 --- a/02_functions_review_and_exercises.ipynb +++ b/02_functions_review_and_exercises.ipynb @@ -324,9 +324,9 @@ "cell_type": "markdown", "metadata": {}, "source": [ - "**Q11.5**: Using the [range()](https://docs.python.org/3/library/functions.html#func-range) built-in, write a `for`-loop and calculate the volume of a sphere with `radius = 42.0` for all `digits` from `1` through `20`. Print out each volume on a seperate line.\n", + "**Q11.5**: Using the [range()](https://docs.python.org/3/library/functions.html#func-range) built-in, write a `for`-loop and calculate the volume of a sphere with `radius = 42.0` for all `digits` from `1` through `20`. Print out each volume on a separate line.\n", "\n", - "Note: This is the first task where you actually need to use the [print()](https://docs.python.org/3/library/functions.html#print) built-in function." + "Note: This is the first task where you need to use the built-in [print()](https://docs.python.org/3/library/functions.html#print) function." ] }, { @@ -349,7 +349,7 @@ "cell_type": "markdown", "metadata": {}, "source": [ - "**Q11.6**: What important lesson did you learn about the `float` type?" + "**Q11.6**: What lesson did you learn about the `float` type?" ] }, { diff --git a/03_conditionals.ipynb b/03_conditionals.ipynb index 6d70f07..d0ed61d 100644 --- a/03_conditionals.ipynb +++ b/03_conditionals.ipynb @@ -19,9 +19,9 @@ } }, "source": [ - "We analyzed every aspect of the `average_evens()` function in [Chapter 2](https://nbviewer.jupyter.org/github/webartifex/intro-to-python/blob/master/02_functions.ipynb) except for the `if` part. While it seems to intuitively do what we expect it to, there is a whole lot more to be learned from taking it apart. In particular, the `if` can occur within both a **statement** as in our introductory example in [Chapter 1](https://nbviewer.jupyter.org/github/webartifex/intro-to-python/blob/master/01_elements.ipynb) but also an **expression** as in `average_evens()`. This is analogous as to how a noun in a natural language is *either* the subject of *or* an object in a sentence. What is common to both versions of the `if` is that it leads to code being executed for *parts* of the input only. It is our first way of **controlling** the **flow of execution** in a program.\n", + "We analyzed every aspect of the `average_evens()` function in [Chapter 2](https://nbviewer.jupyter.org/github/webartifex/intro-to-python/blob/master/02_functions.ipynb) except for the `if` related parts. While it seems to do what we expect it to, there is a whole lot more we learn from taking it apart. In particular, the `if` may occur within both a **statement** or an **expression**, analogous as to how a noun in a natural language is *either* the subject of *or* an object in a sentence. What is common to both usages is that it leads to code being executed for *parts* of the input only. It is our first way of **controlling** the **flow of execution** in a program.\n", "\n", - "After deconstructing `if` in the first part of this chapter, we take a close look at a similar concept, namely handling and raising **exceptions**." + "After deconstructing `if` in the first part of this chapter, we take a close look at a similar concept, namely handling **exceptions**." ] }, { @@ -43,7 +43,7 @@ } }, "source": [ - "Any expression that is either true or not is called a **boolean expression**. If you think such expressions are boring or just not so useful, read a bit on [propositional logic](https://en.wikipedia.org/wiki/Propositional_calculus) and you will quickly realize how mathematicians and originally philosophers base their rules of how to prove or disprove a conclusion on simple true-or-false \"statements\" about the world. It is the underlying principle of all of reasoning.\n", + "Any expression that is either true or not is called a **boolean expression**. It is such simple true-or-false \"statements\" about the world on which mathematicians, and originally philosophers, base their rules of reasoning: They are studied formally in the field of [propositional logic](https://en.wikipedia.org/wiki/Propositional_calculus).\n", "\n", "A trivial example involves the equality operator `==` that evaluates to either `True` or `False` depending on its operands \"comparing equal\" or not." ] @@ -104,7 +104,7 @@ } }, "source": [ - "Observe how `==` can handle objects of *different* type. This shows how it implements a notion of equality in line with how we humans think of things being equal or not. After all, `42` and `42.0` are totally different $0$s and $1$s for a computer and many programming languages would actually say `False` here! Technically, this is yet another example of operator overloading." + "The `==` operator handles objects of *different* types: It implements a notion of equality in line with how humans think of things being equal or not. After all, `42` and `42.0` are different $0$s and $1$s for a computer and other programming languages might say `False` here! Technically, this is yet another example of operator overloading." ] }, { @@ -139,7 +139,7 @@ } }, "source": [ - "There are, however, cases where even well-behaved Python does not make us happy. Chapter 5 will provide more insights on this \"bug\"." + "There are, however, cases where even well-behaved Python does not make us happy. [Chapter 5](https://nbviewer.jupyter.org/github/webartifex/intro-to-python/blob/master/05_numbers.ipynb) provides more insights into this \"bug.\"" ] }, { @@ -174,7 +174,7 @@ } }, "source": [ - "`True` and `False` are special built-in *objects* of type `bool`." + "`True` and `False` are built-in *objects* of type `bool`." ] }, { @@ -189,7 +189,7 @@ { "data": { "text/plain": [ - "94163040564192" + "94906834637792" ] }, "execution_count": 5, @@ -213,7 +213,7 @@ { "data": { "text/plain": [ - "94163040564160" + "94906834637760" ] }, "execution_count": 6, @@ -281,11 +281,11 @@ } }, "source": [ - "Let's not confuse the boolean `False` with `None`, another special built-in object! We saw the latter before in [Chapter 2](https://nbviewer.jupyter.org/github/webartifex/intro-to-python/blob/master/02_functions.ipynb) as the *implicit* return value of a function without a `return` statement.\n", + "Let's not confuse the boolean `False` with `None`, another built-in object! We saw the latter before in [Chapter 2](https://nbviewer.jupyter.org/github/webartifex/intro-to-python/blob/master/02_functions.ipynb) as the *implicit* return value of a function without a `return` statement.\n", "\n", - "We might think of `None` in a boolean context indicating a \"maybe\" or even an \"unknown\" answer; however, for Python, there are no \"maybe\" or \"unknown\" objects as we will see further below!\n", + "We might think of `None` in a boolean context indicating a \"maybe\" or even an \"unknown\" answer; however, for Python, there are no \"maybe\" or \"unknown\" objects, as we see further below!\n", "\n", - "Whereas `False` is of type `bool`, `None` is of type `NoneType`. So, they are totally unrelated. On the contrary, as both `True` and `False` are of the same type, we could call them \"siblings\"." + "Whereas `False` is of type `bool`, `None` is of type `NoneType`. So, they are unrelated. On the contrary, as both `True` and `False` are of the same type, we could call them \"siblings.\"" ] }, { @@ -313,7 +313,7 @@ { "data": { "text/plain": [ - "94163040551152" + "94906834624752" ] }, "execution_count": 10, @@ -357,9 +357,9 @@ } }, "source": [ - "`True`, `False`, and `None` have the property that they each exist in memory only *once*. Objects designed this way are so-called **singletons**. This **[design pattern](https://en.wikipedia.org/wiki/Design_Patterns)** was originally developed to keep a program's memory usage at a minimum. It may only be employed in situations where we know that an object will *not* mutate its value (i.e., to re-use the bag analogy from [Chapter 1](https://nbviewer.jupyter.org/github/webartifex/intro-to-python/blob/master/01_elements.ipynb), no flipping of $0$s and $1$s in the bag is allowed). In languages \"closer\" to the memory like C we would have to code this singleton logic ourselves but Python has this already built in for *some* types.\n", + "`True`, `False`, and `None` have the property that they each exist in memory only *once*. Objects designed this way are so-called **singletons**. This **[design pattern](https://en.wikipedia.org/wiki/Design_Patterns)** was originally developed to keep a program's memory usage at a minimum. It may only be employed in situations where we know that an object does *not* mutate its value (i.e., to re-use the bag analogy from [Chapter 1](https://nbviewer.jupyter.org/github/webartifex/intro-to-python/blob/master/01_elements.ipynb), no flipping of $0$s and $1$s in the bag is allowed). In languages \"closer\" to the memory like C, we would have to code this singleton logic ourselves, but Python has this built in for *some* types.\n", "\n", - "We can verify this with either the `is` operator or by comparing memory addresses." + "We verify this with either the `is` operator or by comparing memory addresses." ] }, { @@ -523,7 +523,7 @@ } }, "source": [ - "The \"less than\" `<` or \"greater than\" `>` operators on their own mean \"strictly less than\" or \"strictly greater than\" but can be combined with the equality operator into just `<=` and `>=`. This is a shortcut for using the logical `or` operator as described in the next section." + "The \"less than\" `<` or \"greater than\" `>` operators mean \"strictly less than\" or \"strictly greater than\" but may be combined with the equality operator into just `<=` and `>=`. This is a shortcut for using the logical `or` operator as described in the next section." ] }, { @@ -641,9 +641,9 @@ } }, "source": [ - "Boolean expressions can be combined or negated with the **logical operators** `and`, `or`, and `not` to form new boolean expressions. Of course, this may be done *recursively* as well to obtain boolean expressions of arbitrary complexity.\n", + "Boolean expressions may be combined or negated with the **logical operators** `and`, `or`, and `not` to form new boolean expressions. Of course, this may be done *recursively* as well to obtain boolean expressions of arbitrary complexity.\n", "\n", - "Their usage is similar to how the equivalent words are used in plain English:\n", + "Their usage is similar to how the equivalent words are used in everyday English:\n", "\n", "- `and` evaluates to `True` if *both* sub-expressions evaluate to `True` and `False` otherwise,\n", "- `or` evaluates to `True` if either one *or* both sub-expressions evaluate to `True` and `False` otherwise, and\n", @@ -672,7 +672,7 @@ } }, "source": [ - "Relational operators have a **[higher precedence](https://docs.python.org/3/reference/expressions.html#operator-precedence)** over logical operators. So the following expression means what we intuitively think it does." + "Relational operators have **[higher precedence](https://docs.python.org/3/reference/expressions.html#operator-precedence)** over logical operators. So the following expression means what we intuitively think it does." ] }, { @@ -707,7 +707,7 @@ } }, "source": [ - "However, sometimes it is good to use *parentheses* around each sub-expression for clarity." + "However, sometimes, it is good to use *parentheses* around each sub-expression for clarity." ] }, { @@ -742,7 +742,7 @@ } }, "source": [ - "This is especially useful when several logical operators are combined." + "This is especially the case when several logical operators are combined." ] }, { @@ -825,9 +825,9 @@ } }, "source": [ - "For even better readability, [some practitioners](https://llewellynfalco.blogspot.com/2016/02/dont-use-greater-than-sign-in.html) suggest to *never* use the `>` and `>=` operators (note that the included example is written in [Java](https://en.wikipedia.org/wiki/Java_%28programming_language%29) and `&&` means `and` and `||` means `or`).\n", + "For even better readability, some practitioners suggest to *never* use the `>` and `>=` operators (cf., [source](https://llewellynfalco.blogspot.com/2016/02/dont-use-greater-than-sign-in.html); note that the included example is written in [Java](https://en.wikipedia.org/wiki/Java_%28programming_language%29) where `&&` means `and` and `||` means `or`).\n", "\n", - "Python allows **chaining** relational operators that are combined with the `and` operator. For example, the following two cells implement the same logic where the second is a lot easier to read." + "Python allows **chaining** relational operators that are combined with the `and` operator. For example, the following two cells implement the same logic, where the second is a lot easier to read." ] }, { @@ -897,9 +897,9 @@ } }, "source": [ - "The operands of the logical operators do not actually have to be *boolean* expressions as defined above but may be *any* kind of expression. If a sub-expression does *not* evaluate to an object of type `bool`, Python automatically casts the resulting object as such.\n", + "The operands of the logical operators do not need to be *boolean* expressions as defined above but may be *any* expression. If a sub-expression does *not* evaluate to an object of type `bool`, Python automatically casts it as such.\n", "\n", - "For example, any non-zero numeric object becomes `True`. While this behavior allows writing conciser and thus more \"beautiful\" code, it is also a common source of confusion." + "For example, any non-zero numeric object becomes `True`. While this behavior allows writing more concise and thus more \"beautiful\" code, it is also a common source of confusion. `(x - 9)` is cast as `True` and then the overall expression evaluates to `True` as well." ] }, { @@ -923,7 +923,7 @@ } ], "source": [ - "(x - 9) and (y < 100) # = 33 and (y < 100)" + "(x - 9) and (y < 100) # = 33 and (y < 100) = 33 and True" ] }, { @@ -934,7 +934,7 @@ } }, "source": [ - "Whenever we are unsure as to how Python will evaluate a non-boolean expression in a boolean context, the [bool()](https://docs.python.org/3/library/functions.html#bool) built-in allows us to check it ourselves." + "Whenever we are unsure as to how Python evaluates a non-boolean expression in a boolean context, the [bool()](https://docs.python.org/3/library/functions.html#bool) built-in allows us to check it ourselves." ] }, { @@ -1028,7 +1028,7 @@ } }, "source": [ - "In a boolean context, `None` is casted as `False`! So, `None` is really *not* a \"maybe\" answer but a \"no\"." + "In a boolean context, `None` is cast as `False`! So, `None` is *not* a \"maybe\" answer but a \"no.\"" ] }, { @@ -1063,7 +1063,7 @@ } }, "source": [ - "Another good rule to know is that container types (e.g., `list`) evaluate to `True` whenever they are not empty and `False` otherwise." + "Another good rule to know is that container types (e.g., `list`) evaluate to `True` whenever they are *not* empty and `False` if they are." ] }, { @@ -1133,7 +1133,7 @@ } }, "source": [ - "## Conditional Statements" + "### Short-Circuiting" ] }, { @@ -1144,24 +1144,406 @@ } }, "source": [ - "In order to write useful programs, we need to control the flow of execution, for example, to react to user input. The logic by which a program does that is referred to as **business logic**.\n", + "When evaluating boolean expressions with logical operators in it, Python follows the **[short-circuiting](https://en.wikipedia.org/wiki/Short-circuit_evaluation)** strategy: First, the inner-most sub-expressions are evaluated. Second, with identical **[operator precedence](https://docs.python.org/3/reference/expressions.html#operator-precedence)**, evaluation goes from left to right. Once it is clear what the overall truth value is, no more sub-expressions are evaluated, and the result is *immediately* returned.\n", "\n", - "One major language construct to do so is the **[conditional statement](https://docs.python.org/3/reference/compound_stmts.html#the-if-statement)** or `if` statement. It consists of:\n", + "In summary, data science practitioners must know *how* the following two generic expressions are evaluated:\n", + "\n", + "- `x or y`: The `y` expression is evaluated *only if* `x` evaluates to `False`, in which case `y` is returned; otherwise, `x` is returned *without* even looking at `y`.\n", + "- `x and y`: The `y` expression is evaluated *only if* `x` evaluates to `True`. Then, if `y` also evaluates to `True`, it is returned; otherwise, `x` is returned.\n", + "\n", + "Let's look at a couple of examples." + ] + }, + { + "cell_type": "code", + "execution_count": 36, + "metadata": { + "slideshow": { + "slide_type": "slide" + } + }, + "outputs": [], + "source": [ + "x = 0\n", + "y = 1\n", + "z = 2" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "slideshow": { + "slide_type": "skip" + } + }, + "source": [ + "We define a helper function `expr()` that prints out the only argument it is passed before returning it. With `expr()`, we can see if a sub-expression is evaluated or not." + ] + }, + { + "cell_type": "code", + "execution_count": 37, + "metadata": { + "slideshow": { + "slide_type": "fragment" + } + }, + "outputs": [], + "source": [ + "def expr(arg):\n", + " \"\"\"Print and return the only argument.\"\"\"\n", + " print(\"Arg:\", arg)\n", + " return arg" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "slideshow": { + "slide_type": "skip" + } + }, + "source": [ + "With the `or` operator, the first sub-expression that evaluates to `True` is returned." + ] + }, + { + "cell_type": "code", + "execution_count": 38, + "metadata": { + "slideshow": { + "slide_type": "slide" + } + }, + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "Arg: 0\n", + "Arg: 1\n" + ] + }, + { + "data": { + "text/plain": [ + "1" + ] + }, + "execution_count": 38, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "expr(x) or expr(y)" + ] + }, + { + "cell_type": "code", + "execution_count": 39, + "metadata": { + "slideshow": { + "slide_type": "-" + } + }, + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "Arg: 1\n" + ] + }, + { + "data": { + "text/plain": [ + "1" + ] + }, + "execution_count": 39, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "expr(y) or expr(z)" + ] + }, + { + "cell_type": "code", + "execution_count": 40, + "metadata": { + "slideshow": { + "slide_type": "-" + } + }, + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "Arg: 0\n", + "Arg: 1\n" + ] + }, + { + "data": { + "text/plain": [ + "1" + ] + }, + "execution_count": 40, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "expr(x) or expr(y) or expr(z)" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "slideshow": { + "slide_type": "skip" + } + }, + "source": [ + "If all sub-expressions evaluate to `False`, the last one is the result." + ] + }, + { + "cell_type": "code", + "execution_count": 41, + "metadata": { + "slideshow": { + "slide_type": "fragment" + } + }, + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "Arg: False\n", + "Arg: []\n", + "Arg: 0\n" + ] + }, + { + "data": { + "text/plain": [ + "0" + ] + }, + "execution_count": 41, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "expr(False) or expr([]) or expr(x)" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "slideshow": { + "slide_type": "skip" + } + }, + "source": [ + "With the `and` operator, the first sub-expression that evaluates to `False` is returned." + ] + }, + { + "cell_type": "code", + "execution_count": 42, + "metadata": { + "slideshow": { + "slide_type": "slide" + } + }, + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "Arg: 0\n" + ] + }, + { + "data": { + "text/plain": [ + "0" + ] + }, + "execution_count": 42, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "expr(x) and expr(y)" + ] + }, + { + "cell_type": "code", + "execution_count": 43, + "metadata": { + "slideshow": { + "slide_type": "-" + } + }, + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "Arg: 1\n", + "Arg: 0\n" + ] + }, + { + "data": { + "text/plain": [ + "0" + ] + }, + "execution_count": 43, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "expr(y) and expr(x)" + ] + }, + { + "cell_type": "code", + "execution_count": 44, + "metadata": { + "slideshow": { + "slide_type": "-" + } + }, + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "Arg: 2\n", + "Arg: 1\n", + "Arg: 0\n" + ] + }, + { + "data": { + "text/plain": [ + "0" + ] + }, + "execution_count": 44, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "expr(z) and expr(y) and expr(x)" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "slideshow": { + "slide_type": "skip" + } + }, + "source": [ + "If all sub-expressions evaluate to `True`, the last one is returned." + ] + }, + { + "cell_type": "code", + "execution_count": 45, + "metadata": { + "slideshow": { + "slide_type": "fragment" + } + }, + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "Arg: 1\n", + "Arg: 2\n" + ] + }, + { + "data": { + "text/plain": [ + "2" + ] + }, + "execution_count": 45, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "expr(y) and expr(z)" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "slideshow": { + "slide_type": "skip" + } + }, + "source": [ + "The crucial takeaway is that Python does *not* necessarily evaluate *all* sub-expressions and, therefore, our code should never rely on that assumption." + ] + }, + { + "cell_type": "markdown", + "metadata": { + "slideshow": { + "slide_type": "slide" + } + }, + "source": [ + "## The `if` Statement" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "slideshow": { + "slide_type": "skip" + } + }, + "source": [ + "To write useful programs, we need to control the flow of execution, for example, to react to user input. The logic by which a program does that is referred to as **business logic**.\n", + "\n", + "One language construct to do so is the **[conditional statement](https://docs.python.org/3/reference/compound_stmts.html#the-if-statement)**, or `if` statement for short. It consists of:\n", "\n", "- *one* mandatory `if`-clause,\n", "- an *arbitrary* number of `elif`-clauses (i.e. \"else if\"), and\n", "- an *optional* `else`-clause.\n", "\n", - "The `if`- and `elif`-clauses each specify one *boolean* expression, also called **condition**, while the `else`-clause serves as a \"catch everything else\" case.\n", + "The `if`- and `elif`-clauses each specify one *boolean* expression, also called **condition**, while the `else`-clause serves as the \"catch everything else\" case.\n", "\n", - "In terms of syntax, the header lines end with a colon and the code blocks are indented.\n", + "In terms of syntax, the header lines end with a colon, and the code blocks are indented.\n", "\n", "In contrast to our intuitive interpretation in natural languages, only the code in *one* of the alternatives, also called **branches**, is executed. To be precise, it is always the code in the first branch whose condition evaluates to `True`." ] }, { "cell_type": "code", - "execution_count": 36, + "execution_count": 46, "metadata": { "code_folding": [], "slideshow": { @@ -1175,7 +1557,7 @@ }, { "cell_type": "code", - "execution_count": 37, + "execution_count": 47, "metadata": { "code_folding": [], "slideshow": { @@ -1212,12 +1594,12 @@ "source": [ "In many situations, we only need a reduced form of the `if` statement.\n", "\n", - "We could **inject** code only at random to, for example, implement some sort of [A/B testing](https://en.wikipedia.org/wiki/A/B_testing)." + "We could **inject** code only at random, for example, to implement an [A/B testing](https://en.wikipedia.org/wiki/A/B_testing) strategy." ] }, { "cell_type": "code", - "execution_count": 38, + "execution_count": 48, "metadata": { "slideshow": { "slide_type": "slide" @@ -1230,24 +1612,16 @@ }, { "cell_type": "code", - "execution_count": 39, + "execution_count": 49, "metadata": { "slideshow": { "slide_type": "-" } }, - "outputs": [ - { - "name": "stdout", - "output_type": "stream", - "text": [ - "You will read this just as often as you see heads when tossing a coin\n" - ] - } - ], + "outputs": [], "source": [ "if random.random() > 0.5:\n", - " print(\"You will read this just as often as you see heads when tossing a coin\")" + " print(\"You read this just as often as you see heads when tossing a coin\")" ] }, { @@ -1258,12 +1632,12 @@ } }, "source": [ - "More often than not, we might model a binary choice." + "More often than not, we model a **binary choice**." ] }, { "cell_type": "code", - "execution_count": 40, + "execution_count": 50, "metadata": { "slideshow": { "slide_type": "skip" @@ -1293,12 +1667,14 @@ } }, "source": [ - "We may **nest** `if` statements to control the flow of execution in a more granular way. Every additional layer, however, makes the code less readable, in particular, if we have more than one line per code block." + "We may **nest** `if` statements to control the flow of execution in a more granular way. Every additional layer, however, makes the code *less* readable, in particular, if we have more than one line per code block.\n", + "\n", + "The code cell below *either* checks if a number is even or odd *or* if it is positive or negative." ] }, { "cell_type": "code", - "execution_count": 41, + "execution_count": 51, "metadata": { "slideshow": { "slide_type": "slide" @@ -1309,7 +1685,7 @@ "name": "stdout", "output_type": "stream", "text": [ - "z is odd\n" + "z is positive\n" ] } ], @@ -1334,12 +1710,14 @@ } }, "source": [ - "A good way to make this code more readable is to introduce **temporary variables** *in combination* with using the `and` operator to **flatten** the branching logic. The `if` statement then reads almost like plain English. In contrast to many other languages, creating variables is a computationally *cheap* operation in Python and also helps to document the code *inline*. Without temporary variables, the `and` flattening could actually lead to more sub-expressions in the conditions be evaluated than necessary. Do you see why?" + "A way to make this code more readable is to introduce **temporary variables** *in combination* with the `and` operator to **flatten** the branching logic. The `if` statement then reads almost like plain English. In contrast to many other languages, creating variables is a computationally *cheap* operation in Python and also helps to document the code *inline* with meaningful variable names.\n", + "\n", + "Flattening the logic *without* temporary variables could lead to *more* sub-expressions in the conditions be evaluated than necessary. Do you see why?" ] }, { "cell_type": "code", - "execution_count": 42, + "execution_count": 52, "metadata": { "slideshow": { "slide_type": "slide" @@ -1377,7 +1755,7 @@ } }, "source": [ - "## Conditional Expressions" + "## The `if` Expression" ] }, { @@ -1388,7 +1766,7 @@ } }, "source": [ - "When all we do with an `if` statement is to assign an object to a variable with respect to a single true-or-false condition (cf., binary choice above), there is a shortcut for that: We could simply assign the result of a so-called **conditional expression** or `if` expression to the variable.\n", + "When all we do with an `if` statement is to assign an object to a variable according to a single true-or-false condition (i.e., a binary choice), there is a shortcut: We assign the variable the result of a so-called **conditional expression**, or `if` expression for short, instead.\n", "\n", "Think of a situation where we evaluate a piece-wise functional relationship $y = f(x)$ at a given $x$, for example:" ] @@ -1412,7 +1790,7 @@ }, { "cell_type": "code", - "execution_count": 43, + "execution_count": 53, "metadata": { "slideshow": { "slide_type": "slide" @@ -1436,7 +1814,7 @@ }, { "cell_type": "code", - "execution_count": 44, + "execution_count": 54, "metadata": { "slideshow": { "slide_type": "-" @@ -1452,7 +1830,7 @@ }, { "cell_type": "code", - "execution_count": 45, + "execution_count": 55, "metadata": { "slideshow": { "slide_type": "-" @@ -1465,7 +1843,7 @@ "9" ] }, - "execution_count": 45, + "execution_count": 55, "metadata": {}, "output_type": "execute_result" } @@ -1487,7 +1865,7 @@ }, { "cell_type": "code", - "execution_count": 46, + "execution_count": 56, "metadata": { "slideshow": { "slide_type": "fragment" @@ -1500,7 +1878,7 @@ }, { "cell_type": "code", - "execution_count": 47, + "execution_count": 57, "metadata": { "slideshow": { "slide_type": "skip" @@ -1513,7 +1891,7 @@ "9" ] }, - "execution_count": 47, + "execution_count": 57, "metadata": {}, "output_type": "execute_result" } @@ -1530,12 +1908,12 @@ } }, "source": [ - "In this concrete example, however, the most elegant solution would be to use the built-in [max()](https://docs.python.org/3/library/functions.html#max) function." + "In this example, however, the most elegant solution would be to use the built-in [max()](https://docs.python.org/3/library/functions.html#max) function." ] }, { "cell_type": "code", - "execution_count": 48, + "execution_count": 58, "metadata": { "slideshow": { "slide_type": "fragment" @@ -1548,7 +1926,7 @@ }, { "cell_type": "code", - "execution_count": 49, + "execution_count": 59, "metadata": { "slideshow": { "slide_type": "skip" @@ -1561,7 +1939,7 @@ "9" ] }, - "execution_count": 49, + "execution_count": 59, "metadata": {}, "output_type": "execute_result" } @@ -1578,7 +1956,7 @@ } }, "source": [ - "Conditional expressions may not only be used in the way described in this section. We already saw them as part of a list comprehension that is introduced in Chapter 7." + "Conditional expressions may not only be used in the way described in this section. We already saw them as part of a *list comprehension* in [Chapter 1](https://nbviewer.jupyter.org/github/webartifex/intro-to-python/blob/master/01_elements.ipynb) and [Chapter 2](https://nbviewer.jupyter.org/github/webartifex/intro-to-python/blob/master/02_functions.ipynb) and revisit this in Chapter 7 in greater detail." ] }, { @@ -1589,7 +1967,7 @@ } }, "source": [ - "## Exceptions" + "## The `try` Statement" ] }, { @@ -1600,16 +1978,16 @@ } }, "source": [ - "In the previous two chapters we already encountered a couple of *runtime* errors. A natural urge we might have after reading about conditional statements is to write code that somehow reacts to the occurence of such exceptions. All we need for that is a way to formulate a condition for that.\n", + "In the previous two chapters, we encountered a couple of *runtime* errors. A natural urge we might have after reading about conditional statements is to write code that somehow reacts to the occurrence of such exceptions. All we need is a way to formulate a condition for that.\n", "\n", - "For sure, this is such a common thing to do that Python provides its own language construct for it, namely the `try` [statement](https://docs.python.org/3/reference/compound_stmts.html#the-try-statement).\n", + "For sure, this is such a common thing to do that Python provides a language construct for it, namely the `try` [statement](https://docs.python.org/3/reference/compound_stmts.html#the-try-statement).\n", "\n", - "In its simplest form, it comes with just two branches: `try` and `except`. The following basically tells Python to execute the code in the `try`-branch and if *anything* goes wrong, continue in the `except`-branch instead of **raising** an error to us. Of course, if nothing goes wrong, the `except`-branch is *not* executed." + "In its simplest form, it comes with just two branches: `try` and `except`. The following tells Python to execute the code in the `try`-branch, and if *anything* goes wrong, continue in the `except`-branch instead of **raising** an error to us. Of course, if nothing goes wrong, the `except`-branch is *not* executed." ] }, { "cell_type": "code", - "execution_count": 50, + "execution_count": 60, "metadata": { "slideshow": { "slide_type": "slide" @@ -1622,7 +2000,7 @@ }, { "cell_type": "code", - "execution_count": 51, + "execution_count": 61, "metadata": { "slideshow": { "slide_type": "-" @@ -1652,16 +2030,16 @@ } }, "source": [ - "However, it is good practise to *not* **handle** *any* possible exception but only the ones we may expect from the code in the `try`-branch. The reasoning why this is done is a bit involved. We only remark here that the code base becomes easier to understand as we clearly communicate to any human reader what could go wrong during execution. Python comes with a lot of [built-in exceptions](https://docs.python.org/3/library/exceptions.html#concrete-exceptions) that we should familiarize ourselves with.\n", + "However, it is good practice *not* to **handle** *any* possible exception but only the ones we may *expect* from the code in the `try`-branch. The reasoning why this is done is a bit involved. We only remark that the codebase becomes easier to understand as we communicate to any human reader what could go wrong during execution in an *explicit* way. Python comes with a lot of [built-in exceptions](https://docs.python.org/3/library/exceptions.html#concrete-exceptions) that we should familiarize ourselves with.\n", "\n", - "Another good practise is to always keep the code in the `try`-branch short so as to not accidently handle an exception we do not want to handle.\n", + "Another good practice is to always keep the code in the `try`-branch short to not *accidentally* handle an exception we do *not* want to handle.\n", "\n", - "In the example, we are dividing numbers and may therefore expect a `ZeroDivisionError`." + "In the example, we are dividing numbers and may expect a `ZeroDivisionError`." ] }, { "cell_type": "code", - "execution_count": 52, + "execution_count": 62, "metadata": { "slideshow": { "slide_type": "fragment" @@ -1691,16 +2069,16 @@ } }, "source": [ - "Often, we must have some code run independent of an exception occuring (e.g., to close a connection to a database). To achieve that, we can add an optional `finally`-branch to the `try` statement.\n", + "Often, we may have to run some code *independent* of an exception occurring, for example, to close a connection to a database. To achieve that, we add a `finally`-branch to the `try` statement.\n", "\n", - "Similarly, we might have some code that must be run exactly when no exception occurs but we do not want to put it in the `try`-branch as per the good practice mentioned. To achieve that, we can add an optional `else`-branch to the `try` statement.\n", + "Similarly, we may have to run some code *only if* no exception occurs, but we do not want to put it in the `try`-branch as per the good practice mentioned above. To achieve that, we add an `else`-branch to the `try` statement.\n", "\n", - "To showcase everything together, we look at one last example. To spice it up a bit, we randomize the input. So run the cell several times and see for yourself. It's actually quite easy." + "To showcase everything together, we look at one last example. To spice it up a bit, we randomize the input. So run the cell several times and see for yourself." ] }, { "cell_type": "code", - "execution_count": 53, + "execution_count": 63, "metadata": { "slideshow": { "slide_type": "slide" @@ -1711,7 +2089,7 @@ "name": "stdout", "output_type": "stream", "text": [ - "Oops. Division by 0. How does that work?\n", + "Yes, division worked smoothly.\n", "I am always printed\n" ] } @@ -1751,9 +2129,9 @@ "- **boolean expressions** evaluate to either `True` or `False`\n", "- **relational operators** compare operands according to \"human\" interpretations\n", "- **logical operators** combine boolean sub-expressions to more \"complex\" expressions\n", - "- the **conditional statement** is a *major* concept to **control** the **flow of execution** depending on some **conditions**\n", + "- the **conditional statement** allows **controlling** the **flow of execution** depending on some **conditions**\n", "- a **conditional expression** is a short form of a conditional statement\n", - "- **exception handling** is also a common way of **controlling** the **flow of execution**, in particular if we have to be prepared for bad input data" + "- **exception handling** is also a common way of **controlling** the **flow of execution**, in particular, if we have to be prepared for bad input data" ] } ], diff --git a/03_conditionals_review_and_exercises.ipynb b/03_conditionals_review_and_exercises.ipynb index d39c551..89d7009 100644 --- a/03_conditionals_review_and_exercises.ipynb +++ b/03_conditionals_review_and_exercises.ipynb @@ -19,7 +19,7 @@ "cell_type": "markdown", "metadata": {}, "source": [ - "Read [Chapter 3](https://nbviewer.jupyter.org/github/webartifex/intro-to-python/blob/master/03_conditionals.ipynb) of the book. Then work through the seven review questions." + "Read [Chapter 3](https://nbviewer.jupyter.org/github/webartifex/intro-to-python/blob/master/03_conditionals.ipynb) of the book. Then work through the eight review questions." ] }, { @@ -68,7 +68,7 @@ "cell_type": "markdown", "metadata": {}, "source": [ - "**Q3**: Explain how the conceptual difference between a **statement** and an **expression** relates to the difference between a **conditional statement** and a **conditional expression**." + "**Q3**: Describe in your own words the concept of **short-circuiting**! What does it imply for an individual sub-expression in a boolean expression?" ] }, { @@ -82,7 +82,7 @@ "cell_type": "markdown", "metadata": {}, "source": [ - "**Q4**: Why is the use of **temporary variables** encouraged in Python?" + "**Q4**: Explain how the conceptual difference between a **statement** and an **expression** relates to the difference between a **conditional statement** and a **conditional expression**." ] }, { @@ -96,7 +96,21 @@ "cell_type": "markdown", "metadata": {}, "source": [ - "**Q5**: What does the `finally`-branch enforce in this code snippet? How can a `try` statement be useful *without* an `except`-branch?" + "**Q5**: Why is the use of **temporary variables** encouraged in Python?" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + " " + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "**Q6**: What does the `finally`-branch enforce in this code snippet? How can a `try` statement be useful *without* an `except`-branch?" ] }, { @@ -136,7 +150,7 @@ "cell_type": "markdown", "metadata": {}, "source": [ - "**Q6**: The objects `True`, `False`, and `None` represent the idea of \"yes\", \"no\", and \"maybe\" answers in a natural language.\n", + "**Q7**: The objects `True`, `False`, and `None` represent the idea of *yes*, *no*, and *maybe* answers in a natural language.\n", "\n", "Hint: you also respond with a code cell." ] @@ -152,7 +166,7 @@ "cell_type": "markdown", "metadata": {}, "source": [ - "**Q7**: The `try` statement is useful for handling **syntax** errors." + "**Q8**: The `try` statement is useful for handling **syntax** errors." ] }, { @@ -180,7 +194,7 @@ "cell_type": "markdown", "metadata": {}, "source": [ - "**Q8.1**: Write a function `discounted_price()` that takes the positional arguments `unit_price` (of type `float`) and `quantity` (of type `int`) and implements a discount scheme for a line item in a customer order as follows:\n", + "**Q9.1**: Write a function `discounted_price()` that takes the positional arguments `unit_price` (of type `float`) and `quantity` (of type `int`) and implements a discount scheme for a line item in a customer order as follows:\n", "\n", "- if the unit price is over 100 dollars, grant 10% relative discount\n", "- if a customer orders more than 10 items, one in every five items is for free\n", @@ -218,7 +232,7 @@ "cell_type": "markdown", "metadata": {}, "source": [ - "**Q8.2**: Calculate the final price for the following line items of an order:\n", + "**Q9.2**: Calculate the final price for the following line items of an order:\n", "- $7$ smartphones @ $99.00$ USD\n", "- $3$ workstations @ $999.00$ USD\n", "- $19$ GPUs @ $879.95$ USD\n", @@ -265,7 +279,7 @@ "cell_type": "markdown", "metadata": {}, "source": [ - "**Q8.3**: Re-calculate the last two line items with order quantities of $20$ and $15$. What do you observe?" + "**Q9.3**: Re-calculate the last two line items with order quantities of $20$ and $15$. What do you observe?" ] }, { @@ -297,7 +311,7 @@ "cell_type": "markdown", "metadata": {}, "source": [ - "**Q8.4**: Looking at the `if`-`else`-logic in the function, why do you think the four example line items in **Q8.2** were chosen as they were?" + "**Q9.4**: Looking at the `if`-`else`-logic in the function, why do you think the four example line items in **Q9.2** were chosen as they were?" ] }, { @@ -318,16 +332,16 @@ "cell_type": "markdown", "metadata": {}, "source": [ - "The kids game [Fizz Buzz](https://en.wikipedia.org/wiki/Fizz_buzz) is said to be often used in job interviews for entry level positions. However, opinions vary as to how good of a test it actually is ([source](https://news.ycombinator.com/item?id=16446774)).\n", + "The kids game [Fizz Buzz](https://en.wikipedia.org/wiki/Fizz_buzz) is said to be often used in job interviews for entry-level positions. However, opinions vary as to how good of a test it is (cf., [source](https://news.ycombinator.com/item?id=16446774)).\n", "\n", - "In its simplest form, a group of people start counting upwards in an alternating fashion. Whenever a number is divisible by $3$, the person must say \"Fizz\" instead of the number. The same holds for numbers divisible by $5$ when the person must say \"Buzz\". If a number is divisible by both numbers, one must say \"FizzBuzz\". Probably, this game would also make a good drinking game with the \"right\" beverages." + "In its simplest form, a group of people starts counting upwards in an alternating fashion. Whenever a number is divisible by $3$, the person must say \"Fizz\" instead of the number. The same holds for numbers divisible by $5$ when the person must say \"Buzz.\" If a number is divisible by both numbers, one must say \"FizzBuzz.\" Probably, this game would also make a good drinking game with the \"right\" beverages." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ - "**Q9.1**: First, create a list `numbers` with the numbers from 1 through 100. You could type all numbers manually but there is of course a smarter way. The built-in [range()](https://docs.python.org/3/library/functions.html#func-range) may be useful here. Read how it works in the documentation. To make the output of [range()](https://docs.python.org/3/library/functions.html#func-range) a `list` object, you have to wrap it with the [list()](https://docs.python.org/3/library/functions.html#func-list) built-in (i.e., `list(range(...))`)." + "**Q10.1**: First, create a list `numbers` with the numbers from 1 through 100. You could type all numbers manually, but there is, of course, a smarter way. The built-in [range()](https://docs.python.org/3/library/functions.html#func-range) may be useful here. Read how it works in the documentation. To make the output of [range()](https://docs.python.org/3/library/functions.html#func-range) a `list` object, you have to wrap it with the [list()](https://docs.python.org/3/library/functions.html#func-list) built-in (i.e., `list(range(...))`)." ] }, { @@ -343,11 +357,11 @@ "cell_type": "markdown", "metadata": {}, "source": [ - "**Q9.2**: Loop over the `numbers` list and replace numbers for which one of the two (or both) conditions apply with text strings `\"Fizz\"`, `\"Buzz\"`, or `\"FizzBuzz\"` using the indexing operator `[]` and the assignment statement `=`.\n", + "**Q10.2**: Loop over the `numbers` list and replace numbers for which one of the two (or both) conditions apply with text strings `\"Fizz\"`, `\"Buzz\"`, or `\"FizzBuzz\"` using the indexing operator `[]` and the assignment statement `=`.\n", "\n", - "In Chapter 1 we saw that Python starts indexing with `0` as the first element. Keep that in mind.\n", + "In [Chapter 1](https://nbviewer.jupyter.org/github/webartifex/intro-to-python/blob/master/01_elements.ipynb), we saw that Python starts indexing with `0` as the first element. Keep that in mind.\n", "\n", - "So in each iteration of the `for`-loop you have to determine an `index` variable as well as checking the actual `number` for its divisors.\n", + "So in each iteration of the `for`-loop, you have to determine an `index` variable as well as check the actual `number` for its divisors.\n", "\n", "Hint: the order of the conditions is important!" ] @@ -372,7 +386,7 @@ "cell_type": "markdown", "metadata": {}, "source": [ - "**Q9.3**: Create a loop that prints out either the number or any of the Fizz Buzz substitutes. Do it in such a way that we do not end up with 100 lines of output here." + "**Q10.3**: Create a loop that prints out either the number or any of the Fizz Buzz substitutes. Do it in such a way that we do not end up with 100 lines of output here." ] }, { diff --git a/04_iteration.ipynb b/04_iteration.ipynb index 755b4a5..b760a25 100644 --- a/04_iteration.ipynb +++ b/04_iteration.ipynb @@ -19,11 +19,11 @@ } }, "source": [ - "While controlling the flow of execution with an `if` statement is definitely a must-have building block in any programming language, it alone does not allow us to run a block of code repetitively and we need to be able to do exactly that in order to write useful software.\n", + "While controlling the flow of execution with an `if` statement is a must-have building block in any programming language, it alone does not allow us to run a block of code repetitively, and we need to be able to do precisely that to write useful software.\n", "\n", - "You might think that the `for` statement shown in some examples before is the missing piece in the puzzle. However, we can actually live without it and postpone its official introduction until the second half of this chapter.\n", + "The `for` statement shown in some examples before might be the missing piece in the puzzle. However, we can live without it and postpone its official introduction until the second half of this chapter.\n", "\n", - "Instead, we dive into the big idea of **iteration** by studying the concept of **recursion** first. This is quite the opposite of many introductory books on programming that only treat the latter as a nice-to-have artifact, if at all. Yet, understanding recursion sharpens one's mind and contributes to seeing problems from a different angle." + "Instead, we dive into the big idea of **iteration** by studying the concept of **recursion** first. This order is opposite to many other introductory books that only treat the latter as a nice-to-have artifact, if at all. Yet, understanding recursion sharpens one's mind and contributes to seeing problems from a different angle." ] }, { @@ -45,11 +45,11 @@ } }, "source": [ - "A function that calls itself is **recursive** and the process of executing such a function it is called **recursion**.\n", + "A function that calls itself is **recursive**, and the process of executing such a function is called **recursion**.\n", "\n", "Recursive functions contain some form of a conditional check (e.g., `if` statement) to identify a **base case** that ends the recursion *when* reached. Otherwise, the function would keep calling itself forever.\n", "\n", - "The meaning of the word *recursive* is similar to *circular*. However, a truly circular definition is not very helpful and we think of a recursive function as kind of a \"circular function with a way out at the end\"." + "The meaning of the word *recursive* is similar to *circular*. However, a truly circular definition is not very helpful, and we think of a recursive function as kind of a \"circular function with a way out at the end.\"" ] }, { @@ -71,7 +71,7 @@ } }, "source": [ - "A rather trivial toy example illustrates the important aspects concretely: If called with any positive integer as its `n` argument, `countdown()` just prints that number and calls itself with the *new* `n` being the *old* `n` minus `1`. This continues until `countdown()` is called with `n=0`. Then, the flow of execution hits the base case and the function calls stop." + "A rather trivial toy example illustrates the important aspects concretely: If called with any positive integer as its `n` argument, `countdown()` just prints that number and calls itself with the *new* `n` being the *old* `n` minus `1`. This continues until `countdown()` is called with `n=0`. Then, the flow of execution hits the base case, and the function calls stop." ] }, { @@ -92,7 +92,7 @@ " n (int): seconds until the party begins\n", " \"\"\"\n", " if n == 0: # base case\n", - " print(\"Happy new Year!\")\n", + " print(\"Happy New Year!\")\n", " else:\n", " print(n)\n", " countdown(n - 1)" @@ -114,7 +114,7 @@ "3\n", "2\n", "1\n", - "Happy new Year!\n" + "Happy New Year!\n" ] } ], @@ -130,7 +130,7 @@ } }, "source": [ - "As trivial as this seems at first sight, a lot of complexity is hidden in this implementation. In particular, the order in which objects are created and de-referenced in memory might not be obvious right away as [PythonTutor](http://pythontutor.com/visualize.html#code=def%20countdown%28n%29%3A%0A%20%20%20%20if%20n%20%3D%3D%200%3A%0A%20%20%20%20%20%20%20%20print%28%22Happy%20new%20Year!%22%29%0A%20%20%20%20else%3A%0A%20%20%20%20%20%20%20%20print%28n%29%0A%20%20%20%20%20%20%20%20countdown%28n%20-%201%29%0A%0Acountdown%283%29&cumulative=false&curInstr=0&heapPrimitives=nevernest&mode=display&origin=opt-frontend.js&py=3&rawInputLstJSON=%5B%5D&textReferences=false) shows: Each time `countdown()` is called, Python creates a *new* frame in the part of the memory where it manages all the names. This way Python *isolates* all the different `n` variables from each other. As new frames are created until we reach the base case after which the frames are destroyed in the *reversed* order, this is called a **[stack](https://en.wikipedia.org/wiki/Stack_(abstract_data_type))** of frames in computer science terminology. In simple words, a stack is a last-in-first-out (LIFO) task queue. Each frame has a single parent frame, namely the one whose recursive function call created it." + "As trivial as this seems, a lot of complexity is hidden in this implementation. In particular, the order in which objects are created and de-referenced in memory might not be apparent right away as [PythonTutor](http://pythontutor.com/visualize.html#code=def%20countdown%28n%29%3A%0A%20%20%20%20if%20n%20%3D%3D%200%3A%0A%20%20%20%20%20%20%20%20print%28%22Happy%20new%20Year!%22%29%0A%20%20%20%20else%3A%0A%20%20%20%20%20%20%20%20print%28n%29%0A%20%20%20%20%20%20%20%20countdown%28n%20-%201%29%0A%0Acountdown%283%29&cumulative=false&curInstr=0&heapPrimitives=nevernest&mode=display&origin=opt-frontend.js&py=3&rawInputLstJSON=%5B%5D&textReferences=false) shows: Each time `countdown()` is called, Python creates a *new* frame in the part of the memory where it manages all the names. This way, Python *isolates* all the different `n` variables from each other. As new frames are created until we reach the base case, after which the frames are destroyed in the *reversed* order, this is called a **[stack](https://en.wikipedia.org/wiki/Stack_(abstract_data_type))** of frames in computer science terminology. In simple words, a stack is a last-in-first-out (LIFO) task queue. Each frame has a single parent frame, namely the one whose recursive function call created it." ] }, { @@ -152,7 +152,7 @@ } }, "source": [ - "Recursion plays an important role in mathematics as well and we likely know about it from some introductory course, for example, in [combinatorics](https://en.wikipedia.org/wiki/Combinatorics)." + "Recursion plays a vital role in mathematics as well, and we likely know about it from some introductory course, for example, in [combinatorics](https://en.wikipedia.org/wiki/Combinatorics)." ] }, { @@ -174,7 +174,7 @@ } }, "source": [ - "The factorial function, denoted with the symbol $!$, is defined as follows for all non-negative integers:" + "The factorial function, denoted with the $!$ symbol, is defined as follows for all non-negative integers:" ] }, { @@ -197,9 +197,9 @@ } }, "source": [ - "Whenever we can find a recursive way of formulating an idea, we can immediately translate it into Python in a *naive* way (i.e., we create a *correct* program that may *not* be an *efficient* implementation yet).\n", + "Whenever we find a recursive way of formulating an idea, we can immediately translate it into Python in a *naive* way (i.e., we create a *correct* program that may *not* be an *efficient* implementation yet).\n", "\n", - "Below is a first version of `factorial()`: The `return` statement does not have to be a function's last code line and we can certainly have several `return` statements as well." + "Below is a first version of `factorial()`: The `return` statement does not have to be a function's last code line, and we may indeed have several `return` statements as well." ] }, { @@ -238,9 +238,9 @@ } }, "source": [ - "When we read such code, it is often easier not to follow every function call (i.e., `factorial(n - 1)` here) in one's mind but assume we receive a return value as specified in the documentation. Some call this approach **[leap of faith](http://greenteapress.com/thinkpython2/html/thinkpython2007.html#sec75)**. In fact, we practice this anyways whenever we call built-in functions (e.g., [print()](https://docs.python.org/3/library/functions.html#print) or [len()](https://docs.python.org/3/library/functions.html#len)) where we would have to read C code in many cases.\n", + "When we read such code, it is often easier not to follow every function call (i.e., `factorial(n - 1)` here) in one's mind but assume we receive a return value as specified in the documentation. Some call this approach a **[leap of faith](http://greenteapress.com/thinkpython2/html/thinkpython2007.html#sec75)**. We practice this already whenever we call built-in functions (e.g., [print()](https://docs.python.org/3/library/functions.html#print) or [len()](https://docs.python.org/3/library/functions.html#len)) where we would have to read C code in many cases.\n", "\n", - "To visualize *all* the computational steps of the examplary `factorial(3)`, we use [PythonTutor](http://pythontutor.com/visualize.html#code=def%20factorial%28n%29%3A%0A%20%20%20%20if%20n%20%3D%3D%200%3A%0A%20%20%20%20%20%20%20%20return%201%0A%20%20%20%20else%3A%0A%20%20%20%20%20%20%20%20recurse%20%3D%20factorial%28n%20-%201%29%0A%20%20%20%20%20%20%20%20result%20%3D%20n%20*%20recurse%0A%20%20%20%20%20%20%20%20return%20result%0A%0Asolution%20%3D%20factorial%283%29&cumulative=false&curInstr=0&heapPrimitives=nevernest&mode=display&origin=opt-frontend.js&py=3&rawInputLstJSON=%5B%5D&textReferences=false): The recursion again creates a stack of frames in memory. In contrast to the previous trivial example, each frame leaves a return value in memory after it is destroyed. This return value is then assigned to the `recurse` variable within the parent frame and used to compute `result`." + "To visualize *all* the computational steps of the exemplary `factorial(3)`, we use [PythonTutor](http://pythontutor.com/visualize.html#code=def%20factorial%28n%29%3A%0A%20%20%20%20if%20n%20%3D%3D%200%3A%0A%20%20%20%20%20%20%20%20return%201%0A%20%20%20%20else%3A%0A%20%20%20%20%20%20%20%20recurse%20%3D%20factorial%28n%20-%201%29%0A%20%20%20%20%20%20%20%20result%20%3D%20n%20*%20recurse%0A%20%20%20%20%20%20%20%20return%20result%0A%0Asolution%20%3D%20factorial%283%29&cumulative=false&curInstr=0&heapPrimitives=nevernest&mode=display&origin=opt-frontend.js&py=3&rawInputLstJSON=%5B%5D&textReferences=false): The recursion again creates a stack of frames in memory. In contrast to the previous trivial example, each frame leaves a return value in memory after it is destroyed. This return value is then assigned to the `recurse` variable within the parent frame and used to compute `result`." ] }, { @@ -299,7 +299,7 @@ } }, "source": [ - "A Pythonista would formulate `factorial()` in a conciser way using the so-called **early exit** pattern: No `else`-clause is needed as reaching a `return` statement immediately ends a function call. Furthermore, we do not really need the temporary variables `recurse` and `result`.\n", + "A Pythonista would formulate `factorial()` in a more concise way using the so-called **early exit** pattern: No `else`-clause is needed as reaching a `return` statement ends a function call *immediately*. Furthermore, we do not need the temporary variables `recurse` and `result`.\n", "\n", "As [PythonTutor](http://pythontutor.com/visualize.html#code=def%20factorial%28n%29%3A%0A%20%20%20%20if%20n%20%3D%3D%200%3A%0A%20%20%20%20%20%20%20%20return%201%0A%20%20%20%20return%20n%20*%20factorial%28n%20-%201%29%0A%0Asolution%20%3D%20factorial%283%29&cumulative=false&curInstr=0&heapPrimitives=nevernest&mode=display&origin=opt-frontend.js&py=3&rawInputLstJSON=%5B%5D&textReferences=false) shows, this implementation is more efficient as it only requires 18 computational steps instead of 24 to calculate `factorial(3)`, an improvement of 25 percent! " ] @@ -384,7 +384,7 @@ } }, "source": [ - "Note that the [math](https://docs.python.org/3/library/math.html) module in the standard library provides a [factorial()](https://docs.python.org/3/library/math.html#math.factorial) function as well and we should therefore *never* implement it ourselves in a real code base." + "Note that the [math](https://docs.python.org/3/library/math.html) module in the [standard library](https://docs.python.org/3/library/index.html) provides a [factorial()](https://docs.python.org/3/library/math.html#math.factorial) function as well, and we should, therefore, *never* implement it ourselves in a real codebase." ] }, { @@ -554,7 +554,7 @@ } }, "source": [ - "Euclid's algorithm is stunningly fast, even for large numbers. Its speed comes from the use of the modulo operation `%`. However, this is *not* true for recusion in general, which can result in very slow programs if not applied correctly." + "Euclid's algorithm is stunningly fast, even for large numbers. Its speed comes from the use of the modulo operator `%`. However, this is *not* true for recursion in general, which may result in slow programs if not applied correctly." ] }, { @@ -624,7 +624,7 @@ } }, "source": [ - "The [math](https://docs.python.org/3/library/math.html) module in the standard library provides a [gcd()](https://docs.python.org/3/library/math.html#math.gcd) function as well and therefore we should again *never* implement it on our own." + "The [math](https://docs.python.org/3/library/math.html) module in the [standard library](https://docs.python.org/3/library/index.html) provides a [gcd()](https://docs.python.org/3/library/math.html#math.gcd) function as well, and, therefore, we should again *never* implement it on our own." ] }, { @@ -765,7 +765,7 @@ } }, "source": [ - "Let's write a function `fibonacci()` that calculates the $i$th Fibonacci number where $0$ will be the $0$th number. Looking at the numbers in a **backwards** fashion (i.e., from right to left), we realize that the return value for `fibonacci(i)` can be reduced to the sum of the return values for `fibonacci(i - 1)` and `fibonacci(i - 2)` disregarding the *two* base cases." + "Let's write a function `fibonacci()` that calculates the $i$th Fibonacci number where $0$ is the $0$th number. Looking at the numbers in a **backward** fashion (i.e., from right to left), we realize that the return value for `fibonacci(i)` can be reduced to the sum of the return values for `fibonacci(i - 1)` and `fibonacci(i - 2)` disregarding the *two* base cases." ] }, { @@ -837,11 +837,11 @@ } }, "source": [ - "This implementation is *highly* **inefficient** as small Fibonacci numbers can already take a very long time to compute. The reason for this is **exponential growth** in the number of function calls. As [PythonTutor](http://pythontutor.com/visualize.html#code=def%20fibonacci%28i%29%3A%0A%20%20%20%20if%20i%20%3D%3D%200%3A%0A%20%20%20%20%20%20%20%20return%200%0A%20%20%20%20elif%20i%20%3D%3D%201%3A%0A%20%20%20%20%20%20%20%20return%201%0A%20%20%20%20return%20fibonacci%28i%20-%201%29%20%2B%20fibonacci%28i%20-%202%29%0A%0Arv%20%3D%20fibonacci%285%29&cumulative=false&curInstr=0&heapPrimitives=nevernest&mode=display&origin=opt-frontend.js&py=3&rawInputLstJSON=%5B%5D&textReferences=false) shows, `fibonacci()` is called again and again with the same `i` arguments.\n", + "This implementation is *highly* **inefficient** as small Fibonacci numbers already take a very long time to compute. The reason for this is **exponential growth** in the number of function calls. As [PythonTutor](http://pythontutor.com/visualize.html#code=def%20fibonacci%28i%29%3A%0A%20%20%20%20if%20i%20%3D%3D%200%3A%0A%20%20%20%20%20%20%20%20return%200%0A%20%20%20%20elif%20i%20%3D%3D%201%3A%0A%20%20%20%20%20%20%20%20return%201%0A%20%20%20%20return%20fibonacci%28i%20-%201%29%20%2B%20fibonacci%28i%20-%202%29%0A%0Arv%20%3D%20fibonacci%285%29&cumulative=false&curInstr=0&heapPrimitives=nevernest&mode=display&origin=opt-frontend.js&py=3&rawInputLstJSON=%5B%5D&textReferences=false) shows, `fibonacci()` is called again and again with the same `i` arguments.\n", "\n", "To understand this in detail, we would have to study algorithms and data structures (e.g., with [this book](https://www.amazon.de/Introduction-Algorithms-Press-Thomas-Cormen/dp/0262033844/ref=sr_1_1?__mk_de_DE=%C3%85M%C3%85%C5%BD%C3%95%C3%91&crid=1JNE8U0VZGU0O&qid=1569837169&s=gateway&sprefix=algorithms+an%2Caps%2C180&sr=8-1)), a discipline within computer science, and dive into the analysis of **[time complexity of algorithms](https://en.wikipedia.org/wiki/Time_complexity)**.\n", "\n", - "Luckily, in the Fibonacci case the inefficiency can be resolved with a **caching** (i.e., \"re-use\") strategy from the field of **[dynamic programming](https://en.wikipedia.org/wiki/Dynamic_programming)**, namely **[memoization](https://en.wikipedia.org/wiki/Memoization)**. We will do so in Chapter 8 after introducing the [dictionaries](https://docs.python.org/3/library/stdtypes.html#dict) data type.\n", + "Luckily, in the Fibonacci case, the inefficiency can be resolved with a **caching** (i.e., \"re-use\") strategy from the field of **[dynamic programming](https://en.wikipedia.org/wiki/Dynamic_programming)**, namely **[memoization](https://en.wikipedia.org/wiki/Memoization)**. We do so in Chapter 8, after introducing the [dictionaries](https://docs.python.org/3/library/stdtypes.html#dict) data type.\n", "\n", "Let's measure the average run times for `fibonacci()` and varying `i` arguments with the `%%timeit` [cell magic](https://ipython.readthedocs.io/en/stable/interactive/magics.html#magic-timeit) that comes with Jupyter." ] @@ -859,8 +859,7 @@ "name": "stdout", "output_type": "stream", "text": [ - "The slowest run took 5.01 times longer than the fastest. This could mean that an intermediate result is being cached.\n", - "55 µs ± 44.1 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)\n" + "39.9 µs ± 5.5 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)\n" ] } ], @@ -882,7 +881,7 @@ "name": "stdout", "output_type": "stream", "text": [ - "1.63 ms ± 68 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)\n" + "1.61 ms ± 8.84 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)\n" ] } ], @@ -904,7 +903,7 @@ "name": "stdout", "output_type": "stream", "text": [ - "192 ms ± 0 ns per loop (mean ± std. dev. of 1 run, 1 loop each)\n" + "199 ms ± 0 ns per loop (mean ± std. dev. of 1 run, 1 loop each)\n" ] } ], @@ -948,7 +947,7 @@ "name": "stdout", "output_type": "stream", "text": [ - "5.62 s ± 0 ns per loop (mean ± std. dev. of 1 run, 1 loop each)\n" + "5.8 s ± 0 ns per loop (mean ± std. dev. of 1 run, 1 loop each)\n" ] } ], @@ -976,7 +975,7 @@ } }, "source": [ - "If a recursion does not reach its base case, it will theoretically run forever. Luckily, Python detects that and saves the computer from crashing by raising a `RecursionError`.\n", + "If a recursion does not reach its base case, it theoretically runs forever. Luckily, Python detects that and saves the computer from crashing by raising a `RecursionError`.\n", "\n", "The simplest possible infinite recursion is generated like so." ] @@ -1032,7 +1031,7 @@ } }, "source": [ - "However, even the trivial `countdown()` function from above is not immune to an infinite recursion. Let's call it with `3.1` instead of `3`. What goes wrong here?" + "However, even the trivial `countdown()` function from above is not immune to infinite recursion. Let's call it with `3.1` instead of `3`. What goes wrong here?" ] }, { @@ -4012,9 +4011,9 @@ "\u001b[0;31m---------------------------------------------------------------------------\u001b[0m", "\u001b[0;31mRecursionError\u001b[0m Traceback (most recent call last)", "\u001b[0;32m\u001b[0m in \u001b[0;36m\u001b[0;34m\u001b[0m\n\u001b[0;32m----> 1\u001b[0;31m \u001b[0mcountdown\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0;36m3.1\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0m", - "\u001b[0;32m\u001b[0m in \u001b[0;36mcountdown\u001b[0;34m(n)\u001b[0m\n\u001b[1;32m 9\u001b[0m \u001b[0;32melse\u001b[0m\u001b[0;34m:\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[1;32m 10\u001b[0m \u001b[0mprint\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0mn\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0;32m---> 11\u001b[0;31m \u001b[0mcountdown\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0mn\u001b[0m \u001b[0;34m-\u001b[0m \u001b[0;36m1\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0m", + "\u001b[0;32m\u001b[0m in \u001b[0;36mcountdown\u001b[0;34m(n)\u001b[0m\n\u001b[1;32m 9\u001b[0m \u001b[0;32melse\u001b[0m\u001b[0;34m:\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[1;32m 10\u001b[0m \u001b[0mprint\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0mn\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0;32m---> 11\u001b[0;31m \u001b[0mcountdown\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0mn\u001b[0m \u001b[0;34m-\u001b[0m \u001b[0;36m1\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0m", "... last 1 frames repeated, from the frame below ...\n", - "\u001b[0;32m\u001b[0m in \u001b[0;36mcountdown\u001b[0;34m(n)\u001b[0m\n\u001b[1;32m 9\u001b[0m \u001b[0;32melse\u001b[0m\u001b[0;34m:\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[1;32m 10\u001b[0m \u001b[0mprint\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0mn\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0;32m---> 11\u001b[0;31m \u001b[0mcountdown\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0mn\u001b[0m \u001b[0;34m-\u001b[0m \u001b[0;36m1\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0m", + "\u001b[0;32m\u001b[0m in \u001b[0;36mcountdown\u001b[0;34m(n)\u001b[0m\n\u001b[1;32m 9\u001b[0m \u001b[0;32melse\u001b[0m\u001b[0;34m:\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[1;32m 10\u001b[0m \u001b[0mprint\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0mn\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0;32m---> 11\u001b[0;31m \u001b[0mcountdown\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0mn\u001b[0m \u001b[0;34m-\u001b[0m \u001b[0;36m1\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0m", "\u001b[0;31mRecursionError\u001b[0m: maximum recursion depth exceeded while calling a Python object" ] } @@ -4070,19 +4069,21 @@ } }, "source": [ - "The infinite recursions could easily be avoided by replacing `n == 0` with `n <= 0` in both functions and thereby **generalizing** them. But even then, calling either `countdown()` or `factorial()` with a non-integer number is *semantically* wrong and therefore we better leave the base cases unchanged.\n", + "The infinite recursions could easily be avoided by replacing `n == 0` with `n <= 0` in both functions and thereby **generalizing** them. But even then, calling either `countdown()` or `factorial()` with a non-integer number is *semantically* wrong, and, therefore, we better leave the base cases unchanged.\n", "\n", - "Errors as above are a symptom of missing **type checking**: By design, Python allows us to pass in not only integers but objects of any type as arguments to the `countdown()` and `factorial()` functions. As long as the arguments \"behave\" like integers, we will not encounter any *runtime* errors. This is the case here as the two example functions only use the `-` and `*` operators internally and in this context a `float` object behaves exactly like an `int` object. So, the functions keep calling themselves until Python decides with a built-in heuristic that the recursion is likely not going to end and aborts the computations with a `RecursionError`. Stricly speaking, this is of course a *runtime* error as well. The missing type checking is 100% intentional and considered a feature of rather than a bug in Python.\n", + "Errors as above are a symptom of missing **type checking**: By design, Python allows us to pass in not only integers but objects of any type as arguments to the `countdown()` and `factorial()` functions. As long as the arguments \"behave\" like integers, we do not encounter any *runtime* errors. This is the case here as the two example functions only use the `-` and `*` operators internally, and, in the context of arithmetic, a `float` object behaves like an `int` object. So, the functions keep calling themselves until Python decides with a built-in heuristic that the recursion is likely not going to end and aborts the computations with a `RecursionError`. Strictly speaking, a `RecursionError` is, of course, a *runtime* error as well.\n", "\n", - "Pythonistas often use the term **[duck typing](https://en.wikipedia.org/wiki/Duck_typing)** when refering to the same idea and the colloquial saying goes \"If it walks like a duck and it quacks like a duck, it must be a duck\". For example, we could call `factorial()` with the `float` object `3.0` and the recursion works out fine. So, as long as the `3.0` \"walks\" like a `3` and \"quacks\" like a `3`, it \"must be\" a `3`.\n", + "The missing type checking is *100% intentional* and considered a **[feature of rather than a bug](https://www.urbandictionary.com/define.php?term=It%27s%20not%20a%20bug%2C%20it%27s%20a%20feature)** in Python!\n", "\n", - "We see a similar behavior when we mix objects of types `int` and `float` with arithmetic operators. For example, `1 + 2.0` works because Python implicitly views the `1` as a `1.0` at runtime and then knows how to do floating-point arithmetic: Here, the `int` \"walks\" and \"quacks\" like a `float`. Strictly speaking, this is yet another example of operator overloading whereas duck typing refers to the same behavior when passing arguments to function calls.\n", + "Pythonistas often use the colloquial term **[duck typing](https://en.wikipedia.org/wiki/Duck_typing)** when referring to the same idea, and the saying goes, \"If it walks like a duck and it quacks like a duck, it must be a duck.\" For example, we could call `factorial()` with the `float` object `3.0`, and the recursion works out fine. So, as long as the `3.0` \"walks\" like a `3` and \"quacks\" like a `3`, it \"must be\" a `3`.\n", + "\n", + "We see similar behavior when we mix objects of types `int` and `float` with arithmetic operators. For example, `1 + 2.0` works because Python implicitly views the `1` as a `1.0` at runtime and then knows how to do floating-point arithmetic: Here, the `int` \"walks\" and \"quacks\" like a `float`. Strictly speaking, this is yet another example of operator overloading, whereas duck typing refers to the same behavior when passing arguments to function calls.\n", "\n", "The important lesson is that we must expect our functions to be called with objects of *any* type at runtime, as opposed to the one type we had in mind when we defined the function.\n", "\n", - "Duck typing is possible because Python is a dynamically typed language. On the contrary, in statically typed languages like C we have to declare (i.e., \"specify\") the data type of every parameter in a function definition. Then, a `RecursionError` as for `countdown(3.1)` or `factorial(3.1)` above could not occur. For example, if we declared the `countdown()` and `factorial()` functions to only accept `int` objects, calling the functions with a `float` argument would immediately fail *syntactically*. As a downside, we would then lose the ability to call `factorial()` with `3.0`, which is *semantically* correct nevertheless.\n", + "Duck typing is possible because Python is a dynamically typed language. On the contrary, in statically typed languages like C, we *must* declare (i.e., \"specify\") the data type of every parameter in a function definition. Then, a `RecursionError` as for `countdown(3.1)` or `factorial(3.1)` above could not occur. For example, if we declared the `countdown()` and `factorial()` functions to only accept `int` objects, calling the functions with a `float` argument would immediately fail *syntactically*. As a downside, we would then lose the ability to call `factorial()` with `3.0`, which is *semantically* correct nevertheless.\n", "\n", - "So, there is no black or white answer as to which of the two language designs is better. Yet, most professional programmers have very strong opinions with respect to duck typing reaching from \"love\" to \"hate\". This is another example as to how programming is a subjective art and not an \"objective\" science. Probably, Python is regarded more beginner friendly as `3` and `3.0` should intuitively be interchangeable." + "So, there is no black or white answer as to which of the two language designs is better. Yet, most professional programmers have strong opinions concerning duck typing, reaching from \"love\" to \"hate.\" This is another example of how programming is a subjective art rather than \"objective\" science. Python's design is probably more appealing to beginners who intuitively regard `3` and `3.0` as interchangeable." ] }, { @@ -4104,15 +4105,15 @@ } }, "source": [ - "We can use the built-in [isinstance()](https://docs.python.org/3/library/functions.html#isinstance) function to make sure `factorial()` is called with an `int` object passed in. We further **validate the input** by verifying that the integer is non-negative.\n", + "We use the built-in [isinstance()](https://docs.python.org/3/library/functions.html#isinstance) function to make sure `factorial()` is called with an `int` object as the argument. We further **validate the input** by verifying that the integer is non-negative.\n", "\n", - "Meanwhile, we also see how we can manually raise exceptions with the `raise` [statement](https://docs.python.org/3/reference/simple_stmts.html#the-raise-statement), another way of controlling the flow of execution.\n", + "Meanwhile, we also see how we manually raise exceptions with the `raise` [statement](https://docs.python.org/3/reference/simple_stmts.html#the-raise-statement), another way of controlling the flow of execution.\n", "\n", - "The first two branches in the revised `factorial()` function act as **guardians** ensuring that the code does not produce *unexpected* runtime errors: Errors can certainly be expected when mentioned in the docstring.\n", + "The first two branches in the revised `factorial()` function act as **guardians** ensuring that the code does not produce *unexpected* runtime errors: Errors may be expected when mentioned in the docstring.\n", "\n", - "Forcing `n` to be an `int` is a very puritan way of handling the issues discussed above. A more relaxed approach could be to also accept a `float` and use its [is_integer()](https://docs.python.org/3/library/stdtypes.html#float.is_integer) method to check if `n` could be casted as an `int`. After all, by being too puritan we can not take advantage of duck typing.\n", + "Forcing `n` to be an `int` is a very puritan way of handling the issues discussed above. A more relaxed approach could be to also accept a `float` and use its [is_integer()](https://docs.python.org/3/library/stdtypes.html#float.is_integer) method to check if `n` could be cast as an `int`. After all, being too puritan, we cannot take advantage of duck typing.\n", "\n", - "So in essence, we are doing *two* things here. Besides checking for the correct type, we are also enforcing **domain-specific** (i.e., mathematical) rules with respect to the non-negativity of `n`." + "So, in essence, we are doing *two* things here: Besides checking the type, we also enforce **domain-specific** (i.e., mathematical here) rules concerning the non-negativity of `n`." ] }, { @@ -4214,7 +4215,7 @@ } }, "source": [ - "Instead of running into an infinite recursion, we now receive very specifc error messages." + "Instead of running into a situation of infinite recursion, we now receive specific error messages." ] }, { @@ -4288,7 +4289,7 @@ } }, "source": [ - "With everything *officially* introduced so far (i.e., without the introductary example in [Chapter 1](https://nbviewer.jupyter.org/github/webartifex/intro-to-python/blob/master/01_elements.ipynb) that only served as an overview), Python would be what is called **[Turing complete](https://en.wikipedia.org/wiki/Turing_completeness)**. That means that anything that could be formulated as an algorithm could be expressed with all the language features we have seen. Note that, in particular, we have *not* yet formally *introduced* the `for` and `while` statements!" + "With everything *officially* introduced so far (i.e., without the introductory example in [Chapter 1](https://nbviewer.jupyter.org/github/webartifex/intro-to-python/blob/master/01_elements.ipynb) that only served as an overview), Python would be what is called **[Turing complete](https://en.wikipedia.org/wiki/Turing_completeness)**. That means that anything that could be formulated as an algorithm could be expressed with all the language features we have seen. Note that, in particular, we have *not* yet formally *introduced* the `for` and `while` statements!" ] }, { @@ -4321,11 +4322,11 @@ } }, "source": [ - "Whereas functions combined with `if` statements suffice to model any type of iteration with a recursion, Python comes with a `while` [statement](https://docs.python.org/3/reference/compound_stmts.html#the-while-statement) that often makes it easier to implement iterative ideas.\n", + "Whereas functions combined with `if` statements suffice to model any repetitive logic, Python comes with a `while` [statement](https://docs.python.org/3/reference/compound_stmts.html#the-while-statement) that often makes it easier to implement iterative ideas.\n", "\n", - "It consists of a header line with a boolean expression followed by an indented code block. Before the first and after every execution of the code block, the boolean expression is evaluated and if it is (still) equal to `True`, the code block runs (again). Eventually, some variable referenced in the boolean expression is changed in the code block such that the condition becomes `False`.\n", + "It consists of a header line with a boolean expression followed by an indented code block. Before the first and after every execution of the code block, the boolean expression is evaluated, and if it is (still) equal to `True`, the code block runs (again). Eventually, some variable referenced in the boolean expression is changed in the code block such that the condition becomes `False`.\n", "\n", - "If the condition is `False` before the first iteration, the entire code block is *never* executed. As the flow of control keeps **looping** (or **iterating**) back to the beginning of the code block, this concept is also called a `while`-loop and each pass through the loop an **iteration**." + "If the condition is `False` before the first iteration, the entire code block is *never* executed. As the flow of control keeps \"**looping**\" (i.e., more formally, **iterating**) back to the beginning of the code block, this concept is also called a `while`-loop and each pass through the loop an **iteration**." ] }, { @@ -4347,7 +4348,7 @@ } }, "source": [ - "Let's rewrite the `countdown()` example in an iterative style. We also build in **input validation** by allowing the function to only be called with strictly positive integers. As any positive integer will hit $0$ at some point in time when iteratively decremented by $1$, `countdown()` is guaranteed to **terminate**. Note also that the base case is now handled at the end of the function, which commonly happens with iterative solutions to problems." + "Let's rewrite the `countdown()` example in an iterative style. We also build in **input validation** by allowing the function only to be called with strictly positive integers. As any positive integer hits $0$ at some point when iteratively decremented by $1$, `countdown()` is guaranteed to **terminate**. Also, the base case is now handled at the end of the function, which commonly happens with iterative solutions to problems." ] }, { @@ -4368,7 +4369,7 @@ " n (int): seconds until the party begins; must be positive\n", "\n", " Raises:\n", - " TypeError: if n is not of an integer\n", + " TypeError: if n is not an integer\n", " ValueError: if n is not positive\n", " \"\"\"\n", " if not isinstance(n, int):\n", @@ -4415,7 +4416,7 @@ } }, "source": [ - "As [PythonTutor](http://pythontutor.com/visualize.html#code=def%20countdown%28n%29%3A%0A%20%20%20%20if%20not%20isinstance%28n,%20int%29%3A%0A%20%20%20%20%20%20%20%20raise%20TypeError%28%22...%22%29%0A%20%20%20%20elif%20n%20%3C%3D%200%3A%0A%20%20%20%20%20%20%20%20raise%20ValueError%28%22...%22%29%0A%0A%20%20%20%20while%20n%20!%3D%200%3A%0A%20%20%20%20%20%20%20%20print%28n%29%0A%20%20%20%20%20%20%20%20n%20-%3D%201%0A%0A%20%20%20%20print%28%22Happy%20new%20Year!%22%29%0A%0Acountdown%283%29&cumulative=false&curInstr=0&heapPrimitives=nevernest&mode=display&origin=opt-frontend.js&py=3&rawInputLstJSON=%5B%5D&textReferences=false) shows, there is a subtle but important difference in the way a `while` statement is treated in memory: In short, `while` statements can *not* run into a `RecursionError` as only *one* frame is needed to manage the names. After all, there is only *one* function call to be made. For common day-to-day applications this difference is, however, not important." + "As [PythonTutor](http://pythontutor.com/visualize.html#code=def%20countdown%28n%29%3A%0A%20%20%20%20if%20not%20isinstance%28n,%20int%29%3A%0A%20%20%20%20%20%20%20%20raise%20TypeError%28%22...%22%29%0A%20%20%20%20elif%20n%20%3C%3D%200%3A%0A%20%20%20%20%20%20%20%20raise%20ValueError%28%22...%22%29%0A%0A%20%20%20%20while%20n%20!%3D%200%3A%0A%20%20%20%20%20%20%20%20print%28n%29%0A%20%20%20%20%20%20%20%20n%20-%3D%201%0A%0A%20%20%20%20print%28%22Happy%20new%20Year!%22%29%0A%0Acountdown%283%29&cumulative=false&curInstr=0&heapPrimitives=nevernest&mode=display&origin=opt-frontend.js&py=3&rawInputLstJSON=%5B%5D&textReferences=false) shows, there is a subtle but essential difference in the way a `while` statement is treated in memory: In short, `while` statements can *not* run into a `RecursionError` as only *one* frame is needed to manage the names. After all, there is only *one* function call to be made. For typical day-to-day applications, this difference is, however, not so important *unless* a problem instance becomes so big that a large (i.e., $> 3.000$) number of recursive calls must be made." ] }, { @@ -4437,9 +4438,9 @@ } }, "source": [ - "Finding the greatest common divisor of two numbers is still not so obvious when using a `while`-loop instead of a recursion.\n", + "Finding the greatest common divisor of two numbers is still not so obvious when using a `while`-loop instead of a recursive formulation.\n", "\n", - "The iterative implementation of `gcd()` below accepts any two strictly positive integers. As in any iteration through the loop the smaller number is subtracted from the larger one, the two decremented values of `a` and `b` will eventually be equal. Thus, this algorithm is also guaranteed to terminate. If one of the two numbers were negative or $0$ to begin with, `gcd()` would run forever and not even Python could detect this. Try this out by removing the input validation and running the function with negative arguments!" + "The iterative implementation of `gcd()` below accepts any two strictly positive integers. As in any iteration through the loop, the smaller number is subtracted from the larger one, the two decremented values of `a` and `b` eventually become equal. Thus, this algorithm is also guaranteed to terminate. If one of the two numbers were negative or $0$ in the first place, `gcd()` would run forever, and not even Python could detect this. Try this out by removing the input validation and running the function with negative arguments!" ] }, { @@ -4464,7 +4465,7 @@ " gcd (int)\n", "\n", " Raises:\n", - " TypeError: if a or b are not of an integer type\n", + " TypeError: if a or b are not integers\n", " ValueError: if a or b are not positive\n", " \"\"\"\n", " if not isinstance(a, int) or not isinstance(b, int):\n", @@ -4548,7 +4549,7 @@ } }, "source": [ - "We can also see that this implementation is a lot *less* efficient than its recursive counterpart which solves `gcd()` for the same two numbers $112233445566778899$ and $987654321$ within microseconds." + "We also see that this implementation is a lot *less* efficient than its recursive counterpart which solves `gcd()` for the same two numbers $112233445566778899$ and $987654321$ within microseconds." ] }, { @@ -4564,7 +4565,7 @@ "name": "stdout", "output_type": "stream", "text": [ - "4.76 s ± 0 ns per loop (mean ± std. dev. of 1 run, 1 loop each)\n" + "4.69 s ± 0 ns per loop (mean ± std. dev. of 1 run, 1 loop each)\n" ] } ], @@ -4592,7 +4593,7 @@ } }, "source": [ - "As with recursion, we must ensure that the iteration ends. For the above `countdown()` and `gcd()` examples we could \"prove\" (i.e., at least argue in favor) that some pre-defined **termination criterion** will be reached eventually. However, this cannot be done in all cases as the following example shows." + "As with recursion, we must ensure that the iteration ends. For the above `countdown()` and `gcd()` examples, we could \"prove\" (i.e., at least argue in favor) that some pre-defined **termination criterion** is reached eventually. However, this cannot be done in all cases, as the following example shows." ] }, { @@ -4615,10 +4616,10 @@ }, "source": [ "Let's play the following game:\n", - "- think of any positive integer $n$\n", - "- if $n$ is even, the next $n$ is half the old $n$\n", - "- if $n$ is odd, multiply the old $n$ by $3$ and add $1$ to obtain the next $n$\n", - "- repeat these steps until you reach $1$\n", + "- Think of any positive integer $n$.\n", + "- If $n$ is even, the next $n$ is half the old $n$.\n", + "- If $n$ is odd, multiply the old $n$ by $3$ and add $1$ to obtain the next $n$.\n", + "- Repeat these steps until you reach $1$.\n", "\n", "**Do we always reach the final $1$?**" ] @@ -4631,7 +4632,7 @@ } }, "source": [ - "The function below implements this game. Does it always reach $1$? No one has proven it so far! We include some input validation as before because `collatz()` would definitely not terminate if we called it with a negative number. Further, the Collatz sequence also works for real numbers but then we would have to study fractals (cf., [this](https://en.wikipedia.org/wiki/Collatz_conjecture#Iterating_on_real_or_complex_numbers)). So we restrict our example to integers only." + "The function below implements this game. Does it always reach $1$? No one has proven it so far! We include some input validation as before because `collatz()` would for sure not terminate if we called it with a negative number. Further, the Collatz sequence also works for real numbers, but then we would have to study fractals (cf., [this](https://en.wikipedia.org/wiki/Collatz_conjecture#Iterating_on_real_or_complex_numbers)). So we restrict our example to integers only." ] }, { @@ -4657,7 +4658,7 @@ " n (int): a positive number to start the Collatz sequence at\n", "\n", " Raises:\n", - " TypeError: if n is not of an integer\n", + " TypeError: if n is not an integer\n", " ValueError: if n is not positive\n", " \"\"\"\n", " if not isinstance(n, int):\n", @@ -4668,7 +4669,7 @@ " while n != 1:\n", " print(n, end=\" \")\n", " if n % 2 == 0:\n", - " n //= 2 # //= so that n remains an int\n", + " n //= 2 # //= instead of /= so that n remains an int\n", " else:\n", " n = 3 * n + 1\n", "\n", @@ -4789,11 +4790,11 @@ } }, "source": [ - "Recursion and the `while` statement are two sides of the same coin. Disregarding that in the case of recursion Python internally faces some additional burden for managing the stack of frames in memory, both approaches lead to the *same* computational steps in memory. More importantly, we can re-formulate a recursive implementation in an iterative way and vice verca despite one of the two ways often \"feeling\" a lot more natural given a particular problem.\n", + "Recursion and the `while` statement are two sides of the same coin. Disregarding that in the case of recursion Python internally faces some additional burden for managing the stack of frames in memory, both approaches lead to the *same* computational steps in memory. More importantly, we can re-formulate a recursive implementation in an iterative way and vice versa despite one of the two ways often \"feeling\" a lot more natural given a particular problem.\n", "\n", - "So how does the `for` [statement](https://docs.python.org/3/reference/compound_stmts.html#the-for-statement) we saw in the very first example in this book fit into this picture? It is really a *redundant* language construct to provide a *shorter* and more *convenient* syntax for common applications of the `while` statement. In programming, such additions to a language are called **syntactic sugar**. Sugar makes a cup of tea taste better but we can drink tea without sugar too.\n", + "So how does the `for` [statement](https://docs.python.org/3/reference/compound_stmts.html#the-for-statement) we saw in the very first example in this book fit into this picture? It is a *redundant* language construct to provide a *shorter* and more *convenient* syntax for common applications of the `while` statement. In programming, such additions to a language are called **syntactic sugar**. A cup of tea tastes better with sugar, but we may drink tea without sugar too.\n", "\n", - "Consider the following `numbers` list. Without the `for` statement, we would have to keep track of a temporary **index variable** `i` to loop over all its elements and also obtain the individual elements with the `[]` operator in each iteration of the loop." + "Consider the following `numbers` list. Without the `for` statement, we have to manage a temporary **index variable** `i` to loop over all the elements and also obtain the individual elements with the `[]` operator in each iteration of the loop." ] }, { @@ -4806,7 +4807,7 @@ }, "outputs": [], "source": [ - "numbers = [5, 6, 7, 8, 9]" + "numbers = [0, 1, 2, 3, 4]" ] }, { @@ -4822,7 +4823,7 @@ "name": "stdout", "output_type": "stream", "text": [ - "5 6 7 8 9 " + "0 1 2 3 4 " ] } ], @@ -4831,7 +4832,8 @@ "while i < len(numbers):\n", " number = numbers[i]\n", " print(number, end=\" \")\n", - " i += 1" + " i += 1\n", + "del i" ] }, { @@ -4842,7 +4844,7 @@ } }, "source": [ - "The `for` statement, on the contrary, makes the actual business logic more apparent by stripping all the boilerplate code away." + "The `for` statement, on the contrary, makes the actual business logic more apparent by stripping all the **[boilerplate code](https://en.wikipedia.org/wiki/Boilerplate_code)** away." ] }, { @@ -4858,7 +4860,7 @@ "name": "stdout", "output_type": "stream", "text": [ - "5 6 7 8 9 " + "0 1 2 3 4 " ] } ], @@ -4875,34 +4877,12 @@ } }, "source": [ - "For sequences of integers the [range()](https://docs.python.org/3/library/functions.html#func-range) built-in makes the `for` statement even more convenient: It creates a list-like object of type `range` that generates integers \"on the fly\" and we will look closely at the underlying data types in Chapter 7." + "For sequences of integers, the [range()](https://docs.python.org/3/library/functions.html#func-range) built-in makes the `for` statement even more convenient: It creates a list-like object of type `range` that generates integers \"on the fly,\" and we look closely at the underlying effects in memory in Chapter 7." ] }, { "cell_type": "code", "execution_count": 51, - "metadata": { - "slideshow": { - "slide_type": "slide" - } - }, - "outputs": [ - { - "name": "stdout", - "output_type": "stream", - "text": [ - "0 1 2 3 4 " - ] - } - ], - "source": [ - "for number in [0, 1, 2, 3, 4]:\n", - " print(number, end=\" \")" - ] - }, - { - "cell_type": "code", - "execution_count": 52, "metadata": { "slideshow": { "slide_type": "fragment" @@ -4922,20 +4902,9 @@ " print(number, end=\" \")" ] }, - { - "cell_type": "markdown", - "metadata": { - "slideshow": { - "slide_type": "skip" - } - }, - "source": [ - "Let's quickly verify that `range(5)` creates an object of type `range`." - ] - }, { "cell_type": "code", - "execution_count": 53, + "execution_count": 52, "metadata": { "slideshow": { "slide_type": "fragment" @@ -4948,7 +4917,7 @@ "range" ] }, - "execution_count": 53, + "execution_count": 52, "metadata": {}, "output_type": "execute_result" } @@ -4965,12 +4934,12 @@ } }, "source": [ - "[range()](https://docs.python.org/3/library/functions.html#func-range) takes optional `start` and `step` arguments that we can use to customize the sequence of integers even more." + "[range()](https://docs.python.org/3/library/functions.html#func-range) takes optional `start` and `step` arguments that we use to customize the sequence of integers even more." ] }, { "cell_type": "code", - "execution_count": 54, + "execution_count": 53, "metadata": { "slideshow": { "slide_type": "slide" @@ -4992,7 +4961,7 @@ }, { "cell_type": "code", - "execution_count": 55, + "execution_count": 54, "metadata": { "slideshow": { "slide_type": "fragment" @@ -5031,26 +5000,26 @@ } }, "source": [ - "The important difference between the `list` objects (i.e., `[0, 1, 2, 3, 4]` and `[1, 3, 5, 7, 9]`) and the `range` objects (i.e., `range(5)` and `range(1, 10, 2)`) is that in the former case *six* objects are created in memory, one `list` holding pointers to *five* `int` objects, whereas in the latter case only *one* `range` object exists in memory that **generates** `int` objects as we ask for it.\n", + "The essential difference between the above `list` objects, `[0, 1, 2, 3, 4]` and `[1, 3, 5, 7, 9]`, and the `range` objects, `range(5)` and `range(1, 10, 2)`, is that in the former case *six* objects are created in memory *before* the `for` statement starts running, *one* `list` holding pointers to *five* `int` objects, whereas in the latter case only *one* `range` object is created that **generates** `int` objects one at a time *while* the `for`-loop runs.\n", "\n", - "However, we can iterate over both of them. So a natural question to ask is why Python treats objects of *different* types in the *same* way when used with a `for` statement.\n", + "However, we can loop over both of them. So a natural question to ask is why Python treats objects of *different* types in the *same* way when used with a `for` statement.\n", "\n", - "So far, the overarching storyline in this book goes like this: In Python, *everything* is an object. Besides its *identity* and *value*, every object is characterized by belonging to *one* data type that determines how the object behaves and what we can do with it.\n", + "So far, the overarching storyline in this book goes like this: In Python, *everything* is an object. Besides its *identity* and *value*, every object is characterized by belonging to *one data type* that determines how the object behaves and what we may do with it.\n", "\n", - "Now, just as we classify objects by their types, we also classify these **concrete data types** (e.g., `int`, `float`, or `str`) into **abstract concepts**.\n", + "Now, just as we classify objects by their types, we also classify these **concrete data types** (e.g., `int`, `float`, `str`, or `list`) into **abstract concepts**.\n", "\n", - "We have actually done this in [Chapter 1](https://nbviewer.jupyter.org/github/webartifex/intro-to-python/blob/master/01_elements.ipynb) already when we described a `list` object as \"some sort of container that holds [...] pointers to other objects\". So, abstractly speaking, **containers** are any objects that are \"composed\" of other objects and also \"manage\" how these objects are organized. `list` objects, for example, have the property that they model an inherent order associated with its elements. There exist, however, many other container types, many of which do *not* come with an order. So, containers primarily \"contain\" other objects and have *nothing* to do with looping.\n", + "We did this already in [Chapter 1](https://nbviewer.jupyter.org/github/webartifex/intro-to-python/blob/master/01_elements.ipynb) when we described a `list` object as \"some sort of container that holds [...] pointers to other objects\". So, abstractly speaking, **containers** are any objects that are \"composed\" of other objects and also \"manage\" how these objects are organized. `list` objects, for example, have the property that they model an internal order associated with its elements. There exist, however, other container types, many of which do *not* come with an order. So, containers primarily \"contain\" other objects and have *nothing* to do with looping.\n", "\n", - "On the contrary, the abstract concept of **iterables** is all about looping: Any object that we can loop over is by definition an iterable. So, `range` objects, for example, are iterables taht do *not* contain any other objects. Moreover, looping does *not* have to occur in any particular order although this is the case for both `list` and `range` objects.\n", + "On the contrary, the abstract concept of **iterables** is all about looping: Any object that we can loop over is, by definition, an iterable. So, `range` objects, for example, are iterables that do *not* contain other objects. Moreover, looping does *not* have to occur in a *predictable* order, although this is the case for both `list` and `range` objects.\n", "\n", - "Typically, containers are iterable and iterables are containers. Yet, only because these two concepts coincide often, we must not think of them as the same. Chapter 10 will finally give an explanation as to how abstract concepts are implemented and play together.\n", + "Typically, containers are iterables, and iterables are containers. Yet, only because these two concepts coincide often, we must not think of them as the same. Chapter 10 finally gives an explanation as to how abstract concepts are implemented and play together.\n", "\n", - "`list` objects like `first_names` below are iterable containers. They actually implement even more abstract concepts as we will see in Chapter 7." + "So, `list` objects like `first_names` below are iterable containers. They implement even more abstract concepts, as Chapter 7 reveals." ] }, { "cell_type": "code", - "execution_count": 56, + "execution_count": 55, "metadata": { "slideshow": { "slide_type": "slide" @@ -5069,12 +5038,12 @@ } }, "source": [ - "The characteristic operator associated with a container type is the `in` operator which checks if a given object evaluates equal to any of the objects in the container. Colloquially, it checks if an object is \"contained\" in the container. This operation is also called **membership testing**." + "The characteristic operator associated with a container type is the `in` operator: It checks if a given object evaluates equal to at least one of the objects in the container. Colloquially, it checks if an object is \"contained\" in the container. Formally, this operation is called **membership testing**." ] }, { "cell_type": "code", - "execution_count": 57, + "execution_count": 56, "metadata": { "slideshow": { "slide_type": "fragment" @@ -5087,7 +5056,7 @@ "True" ] }, - "execution_count": 57, + "execution_count": 56, "metadata": {}, "output_type": "execute_result" } @@ -5098,7 +5067,7 @@ }, { "cell_type": "code", - "execution_count": 58, + "execution_count": 57, "metadata": { "slideshow": { "slide_type": "-" @@ -5111,7 +5080,7 @@ "False" ] }, - "execution_count": 58, + "execution_count": 57, "metadata": {}, "output_type": "execute_result" } @@ -5128,12 +5097,12 @@ } }, "source": [ - "This shows the exact workings of the `in` operator: Although `7.0` is *not* in `numbers`, `7.0` evaluates equal to the `7` that is in it, which is why the following expression evaluates to `True`. So, while we could colloquially say that `numbers` \"contains\" `7.0`, it actually does not." + "The cell below shows the *exact* workings of the `in` operator: Although `3.0` is *not* contained in `numbers`, it evaluates equal to the `3` that is, which is why the following expression evaluates to `True`. So, while we could colloquially say that `numbers` \"contains\" `3.0`, it actually does not." ] }, { "cell_type": "code", - "execution_count": 59, + "execution_count": 58, "metadata": { "slideshow": { "slide_type": "skip" @@ -5146,13 +5115,13 @@ "True" ] }, - "execution_count": 59, + "execution_count": 58, "metadata": {}, "output_type": "execute_result" } ], "source": [ - "7.0 in numbers" + "3.0 in numbers" ] }, { @@ -5163,12 +5132,12 @@ } }, "source": [ - "Similarly, the characteristic operation of an iterable type is that it supports iteration, for example, with a `for`-loop." + "Similarly, the characteristic operation of an iterable type is that it supports being looped over, for example, with the `for` statement." ] }, { "cell_type": "code", - "execution_count": 60, + "execution_count": 59, "metadata": { "slideshow": { "slide_type": "slide" @@ -5196,12 +5165,12 @@ } }, "source": [ - "If we need to have an index variable in the loop's body, we use the [enumerate()](https://docs.python.org/3/library/functions.html#enumerate) built-in that takes an iterable as its argument and then generates a stream of \"pairs\" consisting of an index variable and an object provided by the iterable. There is *no* need to ever revert back to the `while` statement to loop over an iterable object." + "If we must have an index variable in the loop's body, we use the [enumerate()](https://docs.python.org/3/library/functions.html#enumerate) built-in that takes an *iterable* as its argument and then generates a \"stream\" of \"pairs\" consisting of an index variable and an object provided by the iterable. There is *no* need to ever revert to the `while` statement to loop over an iterable object." ] }, { "cell_type": "code", - "execution_count": 61, + "execution_count": 60, "metadata": { "slideshow": { "slide_type": "fragment" @@ -5234,7 +5203,7 @@ }, { "cell_type": "code", - "execution_count": 62, + "execution_count": 61, "metadata": { "slideshow": { "slide_type": "skip" @@ -5267,7 +5236,7 @@ }, { "cell_type": "code", - "execution_count": 63, + "execution_count": 62, "metadata": { "slideshow": { "slide_type": "skip" @@ -5280,7 +5249,7 @@ }, { "cell_type": "code", - "execution_count": 64, + "execution_count": 63, "metadata": { "slideshow": { "slide_type": "skip" @@ -5291,17 +5260,13 @@ "name": "stdout", "output_type": "stream", "text": [ - "Achim Müller\n", - "Berthold Meyer\n", - "Carl Mayer\n", - "Diedrich Schmitt\n", - "Eckardt Schmidt\n" + "Achim Müller Berthold Meyer Carl Mayer Diedrich Schmitt Eckardt Schmidt " ] } ], "source": [ "for first_name, last_name in zip(first_names, last_names):\n", - " print(first_name, last_name)" + " print(first_name, last_name, end=\" \")" ] }, { @@ -5325,14 +5290,14 @@ "source": [ "In contrast to its recursive counterpart, the iterative `fibonacci()` function below is somewhat harder to read. For example, it is not so obvious as to how many iterations through the `for`-loop we need to make when implementing it. There is an increased risk of making an *off-by-one* error. Moreover, we need to track a `temp` variable along, at least until we have worked through Chapter 7. Do you understand what `temp` does?\n", "\n", - "However, one advantage of calculating Fibonacci numbers in a **forwards** fashion with a `for` statement is that we could list the entire sequence in ascending order as we calculate the desired number. To show this, we added `print()` statements in `fibonacci()` below.\n", + "However, one advantage of calculating Fibonacci numbers in a **forward** fashion with a `for` statement is that we could list the entire sequence in ascending order as we calculate the desired number. To show this, we added `print()` statements in `fibonacci()` below.\n", "\n", - "We do *not* need to store the index variable in the `for`-loop's header line: That is what the underscore \"\\_\" indicates; we \"throw it away\". Also, we do not need to explicitly check if `i` is of type `int` as the [range()](https://docs.python.org/3/library/functions.html#func-range) built-in raises a `TypeError` if used with anything other than an `int`." + "We do *not* need to store the index variable in the `for`-loop's header line: That is what the underscore variable `_` indicates; we \"throw it away.\"" ] }, { "cell_type": "code", - "execution_count": 65, + "execution_count": 64, "metadata": { "code_folding": [], "slideshow": { @@ -5351,11 +5316,12 @@ " ith_fibonacci (int)\n", "\n", " Raises:\n", - " TypeError: if i is not of an integer type\n", + " TypeError: if i is not an integer\n", " ValueError: if i is not positive\n", " \"\"\"\n", - " # no need to check if i is an integer as range() does that\n", - " if i < 0:\n", + " if not isinstance(i, int):\n", + " raise TypeError(\"i must be an integer\")\n", + " elif i < 0:\n", " raise ValueError(\"i must be non-negative\")\n", "\n", " a = 0\n", @@ -5372,7 +5338,7 @@ }, { "cell_type": "code", - "execution_count": 66, + "execution_count": 65, "metadata": { "slideshow": { "slide_type": "slide" @@ -5392,7 +5358,7 @@ "144" ] }, - "execution_count": 66, + "execution_count": 65, "metadata": {}, "output_type": "execute_result" } @@ -5420,12 +5386,12 @@ } }, "source": [ - "Another more important advantage is that now we can calculate even big Fibonacci numbers *efficiently*." + "Another more important advantage is that now we may calculate even big Fibonacci numbers *efficiently*." ] }, { "cell_type": "code", - "execution_count": 67, + "execution_count": 66, "metadata": { "slideshow": { "slide_type": "slide" @@ -5445,7 +5411,7 @@ "218922995834555169026" ] }, - "execution_count": 67, + "execution_count": 66, "metadata": {}, "output_type": "execute_result" } @@ -5473,12 +5439,12 @@ } }, "source": [ - "The iterative `factorial()` function is comparable to its recursive counterpart when it comes to readability. One advantage of calculating the factorial in a forwards fashion is that we could track the intermediate `product` as it grows." + "The iterative `factorial()` implementation is comparable to its recursive counterpart when it comes to readability. One advantage of calculating the factorial in a forward fashion is that we could track the intermediate `product` as it grows." ] }, { "cell_type": "code", - "execution_count": 68, + "execution_count": 67, "metadata": { "slideshow": { "slide_type": "slide" @@ -5514,7 +5480,7 @@ }, { "cell_type": "code", - "execution_count": 69, + "execution_count": 68, "metadata": { "slideshow": { "slide_type": "slide" @@ -5534,7 +5500,7 @@ "3628800" ] }, - "execution_count": 69, + "execution_count": 68, "metadata": {}, "output_type": "execute_result" } @@ -5562,7 +5528,7 @@ } }, "source": [ - "The remainder of this chapter introduces more syntactic sugar. None of the language constructs introduced below are actually needed but contribute to making Python the expressive and easy to read language that it is." + "The remainder of this chapter introduces more syntactic sugar: None of the language constructs introduced below are needed but contribute to making Python the expressive and easy to read language that it is." ] }, { @@ -5584,12 +5550,12 @@ } }, "source": [ - "Let's say we have a list of numbers and we want to check if the square of at least one of its numbers is above a `threshold` of `100`." + "Let's say we have a `numbers` list and want to check if the square of at least one of its elements is above a `threshold` of `100`." ] }, { "cell_type": "code", - "execution_count": 70, + "execution_count": 69, "metadata": { "slideshow": { "slide_type": "slide" @@ -5602,7 +5568,7 @@ }, { "cell_type": "code", - "execution_count": 71, + "execution_count": 70, "metadata": { "slideshow": { "slide_type": "-" @@ -5621,12 +5587,12 @@ } }, "source": [ - "A first naive implementation could look like this: We loop over *every* element in `numbers` and set an **indicator variable** `is_above` to `True` once we encounter an element satisfying our search condition." + "A first naive implementation could look like this: We loop over *every* element in `numbers` and set an **indicator variable** `is_above`, initialized as `False`, to `True` once we encounter an element satisfying the search condition." ] }, { "cell_type": "code", - "execution_count": 72, + "execution_count": 71, "metadata": { "slideshow": { "slide_type": "fragment" @@ -5663,16 +5629,16 @@ } }, "source": [ - "This implementation is *inefficient* as even if the *first* number in `numbers` has a square greater than `100`, we loop until the last element, which could take a long time for a big list.\n", + "This implementation is *inefficient* as even if the *first* element in `numbers` has a square greater than `100`, we loop until the last element: This could take a long time for a big list.\n", "\n", - "Moreover, we must initialize `is_above` before the `for`-loop and write an `if`-`else`-logic seperate from it to check for the final result. The actual business logic is not clear right away.\n", + "Moreover, we must initialize `is_above` *before* the `for`-loop and write an `if`-`else`-logic *after* it to check for the result. The actual business logic is *not* clear right away.\n", "\n", - "Luckily, Python provides a `break` [statement](https://docs.python.org/3/reference/simple_stmts.html#the-break-statement) that let's us stop the `for`-loop in any iteration of the loop. Conceptually, it is yet another means of controlling the flow of execution." + "Luckily, Python provides a `break` [statement](https://docs.python.org/3/reference/simple_stmts.html#the-break-statement) that lets us stop the `for`-loop in any iteration of the loop. Conceptually, it is yet another means of controlling the flow of execution." ] }, { "cell_type": "code", - "execution_count": 73, + "execution_count": 72, "metadata": { "slideshow": { "slide_type": "slide" @@ -5710,7 +5676,7 @@ } }, "source": [ - "This is a computational improvement. However, the code can still be split into *three* groups: initialization, the `for`-loop, and some finalizing logic. We would prefer to convey the program's idea in *one* compound statement." + "This is a computational improvement. However, the code still consists of *three* logical sections: Some initialization *before* the `for`-loop, the loop itself, and some finalizing logic. We prefer to convey the program's idea in *one* compound statement instead." ] }, { @@ -5732,9 +5698,9 @@ } }, "source": [ - "To express the logic in a prettier way, we add an `else`-clause at the end of the `for`-loop. The `else`-branch is only executed if the body in the `for`-branch is *not* stopped with a `break` statement before reaching the last iteration in the loop. The word \"else\" implies a rather unintuitive meaning and it had better been named a `then`-clause. In most scenarios, however, the `else`-clause logically goes together with some `if` statement within the `for`-loop's body.\n", + "To express the logic in a prettier way, we add an `else`-clause at the end of the `for`-loop. The `else`-branch is executed *iff* the `for`-loop is *not* stopped with a `break` statement **prematurely** (i.e., *before* reaching the *last* iteration in the loop). The word \"else\" implies a somewhat unintuitive meaning and had better been named a `then`-clause. In most scenarios, however, the `else`-clause logically goes together with some `if` statement in the `for`-loop's body.\n", "\n", - "Overall, the expressive power of our code increases. Not many programming languages support a `for`-`else`-branching, which turns out to be very useful in practice." + "Overall, the code's expressive power increases. Not many programming languages support a `for`-`else`-branching, which turns out to be very useful in practice." ] }, { @@ -5750,7 +5716,7 @@ }, { "cell_type": "code", - "execution_count": 74, + "execution_count": 73, "metadata": { "slideshow": { "slide_type": "slide" @@ -5788,12 +5754,12 @@ } }, "source": [ - "Lastly, we incorporate the `if`-`else` logic at the end into the `for`-loop and avoid the `is_above` variable alltogether." + "Lastly, we incorporate the finalizing `if`-`else` logic into the `for`-loop, avoiding the `is_above` variable altogether." ] }, { "cell_type": "code", - "execution_count": 75, + "execution_count": 74, "metadata": { "slideshow": { "slide_type": "slide" @@ -5826,12 +5792,12 @@ } }, "source": [ - "Of course, if we set the `threshold` a number's square has to pass higher, for example to `200`, we have to loop through the entire `numbers` list. There is no way to optimize this **[linear search](https://en.wikipedia.org/wiki/Linear_search)**, at least as long as we model the list of numbers with a `list` object. More advanced data types, however, exist that mitigate that downside." + "Of course, if we set the `threshold` an element's square has to pass higher, for example, to `200`, we have to loop through the entire `numbers` list. There is *no way* to optimize this **[linear search](https://en.wikipedia.org/wiki/Linear_search)** further, at least as long as we model `numbers` as a `list` object. More advanced data types, however, exist that mitigate that downside." ] }, { "cell_type": "code", - "execution_count": 76, + "execution_count": 75, "metadata": { "slideshow": { "slide_type": "skip" @@ -5877,17 +5843,17 @@ } }, "source": [ - "Often times, we process some iterable with numeric data, for example, a list of numbers as in the introductory example in [Chapter 1](https://nbviewer.jupyter.org/github/webartifex/intro-to-python/blob/master/01_elements.ipynb) or, more realistically, data from a CSV file with many rows and columns.\n", + "Often, we process some iterable with numeric data, for example, a list of numbers as in the introductory example in [Chapter 1](https://nbviewer.jupyter.org/github/webartifex/intro-to-python/blob/master/01_elements.ipynb) or, more realistically, data from a CSV file with many rows and columns.\n", "\n", - "Processing numeric data usually comes down to operations that can be grouped into one of the following three categories:\n", + "Processing numeric data usually comes down to operations that may be grouped into one of the following three categories:\n", "\n", - "- **mapping**: transform an observation according to some functional relationship $y = f(x)$\n", - "- **filtering**: throw away individual observations (e.g., statistical outliers)\n", - "- **reducing**: collect individual observations into summary statistics\n", + "- **mapping**: transform a sample according to some functional relationship $y = f(x)$\n", + "- **filtering**: throw away individual samples (e.g., statistical outliers)\n", + "- **reducing**: collect individual samples into summary statistics\n", "\n", - "We will study this **map-filter-reduce** paradigm extensively in Chapter 7 after introducing more advanced data types that are needed to work with \"big\" data.\n", + "We study this **map-filter-reduce** paradigm extensively in Chapter 7 after introducing more advanced data types that are needed to work with \"big\" data.\n", "\n", - "In this section, we focus on *filtering out* some samples within a `for`-loop." + "In the remainder of this section, we focus on *filtering out* some samples within a `for`-loop." ] }, { @@ -5919,7 +5885,7 @@ }, { "cell_type": "code", - "execution_count": 77, + "execution_count": 76, "metadata": { "slideshow": { "slide_type": "slide" @@ -5932,7 +5898,7 @@ }, { "cell_type": "code", - "execution_count": 78, + "execution_count": 77, "metadata": { "slideshow": { "slide_type": "-" @@ -5952,7 +5918,7 @@ "370" ] }, - "execution_count": 78, + "execution_count": 77, "metadata": {}, "output_type": "execute_result" } @@ -5977,11 +5943,11 @@ } }, "source": [ - "The above code is still rather easy to read as it only involves two levels of indentation.\n", + "The above code is easy to read as it involves only two levels of indentation.\n", "\n", - "In general, code gets harder to comprehend the more **horizontal space** it occupies. It is commonly considered good practice to grow a program **vertically** rather than horizontally. Code complient with [PEP 8](https://www.python.org/dev/peps/pep-0008/#maximum-line-length) requires us to use *at most* 79 characters in a line!\n", + "In general, code gets harder to comprehend the more **horizontal space** it occupies. It is commonly considered good practice to grow a program **vertically** rather than horizontally. Code compliant with [PEP 8](https://www.python.org/dev/peps/pep-0008/#maximum-line-length) requires us to use *at most* 79 characters in a line!\n", "\n", - "Consider the next example, whose implementation in code already starts to look \"unbalanced\"." + "Consider the next example, whose implementation in code already starts to look unbalanced." ] }, { @@ -6014,7 +5980,7 @@ }, { "cell_type": "code", - "execution_count": 79, + "execution_count": 78, "metadata": { "slideshow": { "slide_type": "slide" @@ -6027,7 +5993,7 @@ "[1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12]" ] }, - "execution_count": 79, + "execution_count": 78, "metadata": {}, "output_type": "execute_result" } @@ -6038,7 +6004,7 @@ }, { "cell_type": "code", - "execution_count": 80, + "execution_count": 79, "metadata": { "slideshow": { "slide_type": "-" @@ -6058,7 +6024,7 @@ "182" ] }, - "execution_count": 80, + "execution_count": 79, "metadata": {}, "output_type": "execute_result" } @@ -6084,11 +6050,11 @@ } }, "source": [ - "With already three levels of indentation, less horizontal space is available for the actual code. Of course, one could combine the two `if` statements with the logical `and` operator as shown in [Chapter 3](https://nbviewer.jupyter.org/github/webartifex/intro-to-python/blob/master/03_conditionals.ipynb). However, then we trade off horizontal space against a more \"complex\" `if` logic and this is not a real improvement.\n", + "With already three levels of indentation, less horizontal space is available for the actual code block. Of course, one could flatten the two `if` statements with the logical `and` operator, as shown in [Chapter 3](https://nbviewer.jupyter.org/github/webartifex/intro-to-python/blob/master/03_conditionals.ipynb#The-if-Statement). Then, however, we trade off horizontal space against a more \"complex\" `if` logic, and this is *not* a real improvement.\n", "\n", - "A Pythonista would instead make use of the `continue` [statement](https://docs.python.org/3/reference/simple_stmts.html#the-continue-statement) that causes a loop to jump right into the next iteration skipping the rest of the code block.\n", + "A Pythonista would instead make use of the `continue` [statement](https://docs.python.org/3/reference/simple_stmts.html#the-continue-statement) that causes a loop to jump into the next iteration skipping the rest of the code block.\n", "\n", - "The revised code fragement below occupies more vertical space and less horizontal space. A good trade-off." + "The revised code fragment below occupies more vertical space and less horizontal space: A *good* trade-off." ] }, { @@ -6104,7 +6070,7 @@ }, { "cell_type": "code", - "execution_count": 81, + "execution_count": 80, "metadata": { "slideshow": { "slide_type": "slide" @@ -6124,7 +6090,7 @@ "182" ] }, - "execution_count": 81, + "execution_count": 80, "metadata": {}, "output_type": "execute_result" } @@ -6153,13 +6119,11 @@ } }, "source": [ - "This is yet another illustration of why programming is an art. The two preceding code fragments do *exactly* the *same* with *identical* time complexity. However, arguably the latter it a lot easier to read for a human, at least when the business logic grows beyond more than just two nested filters.\n", + "This is yet another illustration of why programming is an art. The two preceding code cells do the *same* with *identical* time complexity. However, the latter is arguably easier to read for a human, even more so when the business logic grows beyond two nested filters.\n", "\n", - "The idea behind the `continue` statement is conceptually similar to the early exit pattern we saw in the context of function definitions in [Chapter 2](https://nbviewer.jupyter.org/github/webartifex/intro-to-python/blob/master/02_functions.ipynb).\n", + "The idea behind the `continue` statement is conceptually similar to the early exit pattern we saw above.\n", "\n", - "The two examples can be modeled in an even better way as we will see in Chapter 7.\n", - "\n", - "Both the `break` and `continue` statements as well as the optional `else`-clause are not only supported within `for`-loops but also `while`-loops." + "Both the `break` and `continue` statements, as well as the optional `else`-clause, are not only supported with `for`-loops but also `while`-loops." ] }, { @@ -6181,7 +6145,7 @@ } }, "source": [ - "Sometimes we find ourselves in situations where we can *not* know ahead of time how often or until which point in time a code block is to be executed." + "Sometimes we find ourselves in situations where we *cannot* know ahead of time how often or until which point in time a code block is to be executed." ] }, { @@ -6205,16 +6169,16 @@ "source": [ "Let's consider a game where we randomly choose a variable to be either \"Heads\" or \"Tails\" and the user of our program has to guess it.\n", "\n", - "Python provides the built-in [input()](https://docs.python.org/3/library/functions.html#input) function that prints a message to the user, called the **prompt**, and reads in what the user typed in response as a `str` object. We use it to process the user's \"unpredictable\" input to our program (i.e., a user might type in some invalid response). Further, we use the [random()](https://docs.python.org/3/library/random.html#random.random) function in the [random](https://docs.python.org/3/library/random.html) module to model the coin toss.\n", + "Python provides the built-in [input()](https://docs.python.org/3/library/functions.html#input) function that prints a message to the user, called the **prompt**, and reads in what was typed in response as a `str` object. We use it to process a user's \"unreliable\" input to our program (i.e., a user might type in some invalid response). Further, we use the [random()](https://docs.python.org/3/library/random.html#random.random) function in the [random](https://docs.python.org/3/library/random.html) module to model the coin toss.\n", "\n", - "A popular pattern to approach such **indefinite loops** is to go with a `while True` statement which on its own would cause Python to enter into an infinite loop. Then, once a certain event occurs, we `break` out of the loop.\n", + "A popular pattern to approach such **indefinite loops** is to go with a `while True` statement, which on its own would cause Python to enter into an infinite loop. Then, once a particular event occurs, we `break` out of the loop.\n", "\n", - "Let's look at a first naive implementation." + "Let's look at a first and naive implementation." ] }, { "cell_type": "code", - "execution_count": 82, + "execution_count": 81, "metadata": { "slideshow": { "slide_type": "slide" @@ -6227,7 +6191,7 @@ }, { "cell_type": "code", - "execution_count": 83, + "execution_count": 82, "metadata": { "slideshow": { "slide_type": "-" @@ -6240,7 +6204,7 @@ }, { "cell_type": "code", - "execution_count": 84, + "execution_count": 83, "metadata": { "code_folding": [], "slideshow": { @@ -6285,10 +6249,10 @@ } }, "source": [ - "This version has two *severe* aspects where we should improve on:\n", + "This version exhibits two *severe* issues where we should improve on:\n", "\n", - "1. If a user enters something other than \"heads\" or \"tails\", for example, \"Heads\" or \"Tails\", the program keeps running without the user knowing about the mistake!\n", - "2. It intermingles the coin tossing with the comparison against the user's input: Mixing unrelated business logic in the same code fragment makes a program harder to read and maintain in the long run." + "1. If a user enters *anything* other than `\"heads\"` or `\"tails\"`, for example, `\"Heads\"` or `\"Tails\"`, the program keeps running *without* the user knowing about the mistake!\n", + "2. The code *intermingles* the coin tossing with the processing of the user's input: Mixing *unrelated* business logic in the *same* code block makes a program harder to read and, more importantly, maintain in the long run." ] }, { @@ -6310,16 +6274,16 @@ } }, "source": [ - "Let's refactor the code and make it *modular* and *comprehendable*.\n", + "Let's refactor the code and make it *modular*.\n", "\n", - "First, we divide the logic into two functions `get_guess()` and `toss_coin()` that are controlled from within a `while`-loop.\n", + "First, we divide the business logic into two functions `get_guess()` and `toss_coin()` that are controlled from within a `while`-loop.\n", "\n", - "`get_guess()` not only reads in the user's input but also implements a simple input validation pattern in that the [strip()](https://docs.python.org/3/library/stdtypes.html?highlight=__contains__#str.strip) and [lower()](https://docs.python.org/3/library/stdtypes.html?highlight=__contains__#str.lower) methods remove preceeding and trailing whitespace and lower case the input ensuring that the user can spell the input in any possible way (e.g. all upper or lower case). Also, `get_guess()` checks if the user entered one of the two valid options. If so, it returns either `\"heads\"` or `\"tails\"`; if not, it returns `None`." + "`get_guess()` not only reads in the user's input but also implements a simple input validation pattern in that the [strip()](https://docs.python.org/3/library/stdtypes.html?highlight=__contains__#str.strip) and [lower()](https://docs.python.org/3/library/stdtypes.html?highlight=__contains__#str.lower) methods remove preceding and trailing whitespace and lower case the input ensuring that the user may spell the input in any possible way (e.g., all upper or lower case). Also, `get_guess()` checks if the user entered one of the two valid options. If so, it returns either `\"heads\"` or `\"tails\"`; if not, it returns `None`." ] }, { "cell_type": "code", - "execution_count": 85, + "execution_count": 84, "metadata": { "slideshow": { "slide_type": "slide" @@ -6351,12 +6315,12 @@ } }, "source": [ - "`toss_coin()` is a simple function that models a fair coin toss when called with default arguments." + "`toss_coin()` models a fair coin toss when called with default arguments." ] }, { "cell_type": "code", - "execution_count": 86, + "execution_count": 85, "metadata": { "slideshow": { "slide_type": "slide" @@ -6387,14 +6351,14 @@ } }, "source": [ - "Second, we rewrite the `if`-`else`-logic to explictly handle the case where `get_guess()` returns `None`: Whenever the user enters something invalid, a warning is shown, and another try is granted. Observe how we use the `is` operator and not the `==` operator as `None` is a singleton object.\n", + "Second, we rewrite the `if`-`else`-logic to handle the case where `get_guess()` returns `None` explicitly: Whenever the user enters something invalid, a warning is shown, and another try is granted. We use the `is` operator and not the `==` operator as `None` is a singleton object.\n", "\n", - "The `while` statement itself takes on the role of **glue code** that only manages how other parts of the program interact with each other." + "The `while`-loop takes on the role of **glue code** that manages how other parts of the program interact with each other." ] }, { "cell_type": "code", - "execution_count": 87, + "execution_count": 86, "metadata": { "slideshow": { "slide_type": "skip" @@ -6407,7 +6371,7 @@ }, { "cell_type": "code", - "execution_count": 88, + "execution_count": 87, "metadata": { "slideshow": { "slide_type": "slide" @@ -6447,7 +6411,7 @@ } }, "source": [ - "Now, our little program's business logic is a lot easier to comprehend. More importantly, we can now easily make changes to the program. For example, we could make the `toss_coin()` function base the tossing on a probability distribution other than the uniform (i.e., replace the [random.random()](https://docs.python.org/3/library/random.html#random.random) function with another one). In general, a modular architecture leads to improved software maintenance." + "Now, the program's business logic is expressed in a clearer way. More importantly, we can now change it more easily. For example, we could make the `toss_coin()` function base the tossing on a probability distribution other than the uniform (i.e., replace the [random.random()](https://docs.python.org/3/library/random.html#random.random) function with another one). In general, modular architecture leads to improved software maintenance." ] }, { @@ -6471,13 +6435,13 @@ "source": [ "**Iteration** is about **running blocks of code repeatedly**.\n", "\n", - "There are two redundant approaches of achieving that.\n", + "There are two redundant approaches to achieving that.\n", "\n", - "First, we can combine functions that call themselves with conditional statements. This concept is known as **recursion** and suffices to control the flow of execution in *every* way we desire. For a beginner, this approach of **backwards** reasoning might not be intuitive but it turns out to be a very useful tool to have in one's toolbox.\n", + "First, we combine functions that call themselves with conditional statements. This concept is known as **recursion** and suffices to control the flow of execution in *every* way we desire. For a beginner, this approach of **backward** reasoning might not be intuitive, but it turns out to be a handy tool to have in one's toolbox.\n", "\n", - "Second, the `while` and `for` statements are alternative and potentially more intuitive ways to express iteration as they correspond to a **forwards** reasoning. The `for` statement is **syntactic sugar** that allows to rewrite common occurences of the `while` statement in a concise way. Python provides the `break` and `continue` statements as well as an optional `else`-clause that make working with the `for` and `while` statements even more convenient.\n", + "Second, the `while` and `for` statements are alternative and potentially more intuitive ways to express iteration as they correspond to a **forward** reasoning. The `for` statement is **syntactic sugar** that allows rewriting common occurrences of the `while` statement concisely. Python provides the `break` and `continue` statements as well as an optional `else`-clause that make working with the `for` and `while` statements even more convenient.\n", "\n", - "**Iterables** are any **concrete data types** that support being looped over, for example, with the `for` statement. The idea behind iterables is an **abstract concept** that may or may not be implemented by any given concrete data type. For example, both `list` and `range` objects can be looped over. The `list` type is also a **container** as any given `list` objects \"contains\" pointers to other objects in memory. On the contrary, the `range` type does not point to any other objects but rather creates new `int` objects \"on the fly\"." + "**Iterables** are any **concrete data types** that support being looped over, for example, with the `for` statement. The idea behind iterables is an **abstract concept** that may or may not be implemented by any given concrete data type. For example, both `list` and `range` objects can be looped over. The `list` type is also a **container** as any given `list` objects \"contains\" pointers to other objects in memory. On the contrary, the `range` type does not point to any other objects but instead creates *new* `int` objects \"on the fly\" (i.e., when being looped over)." ] } ], diff --git a/04_iteration_review_and_exercises.ipynb b/04_iteration_review_and_exercises.ipynb index 0e937b7..a953796 100644 --- a/04_iteration_review_and_exercises.ipynb +++ b/04_iteration_review_and_exercises.ipynb @@ -54,7 +54,7 @@ "cell_type": "markdown", "metadata": {}, "source": [ - "**Q2**: Solving a problem with a **recursion** is not only popular in computer science and math. Name some examples from the fields of business or economics where problems are also solved in a **backwards** fashion!" + "**Q2**: Solving a problem by **recursion** is not only popular in computer science and math. Name some examples from the fields of business or economics where problems are also solved in a **backward** fashion!" ] }, { @@ -68,7 +68,7 @@ "cell_type": "markdown", "metadata": {}, "source": [ - "**Q3**: Explain what **duck typing** means! Why can it cause problems? Why it is [not a bug but a feature](https://www.urbandictionary.com/define.php?term=It%27s%20not%20a%20bug%2C%20it%27s%20a%20feature)?" + "**Q3**: Explain what **duck typing** means! Why can it cause problems? Why is it [not a bug but a feature](https://www.urbandictionary.com/define.php?term=It%27s%20not%20a%20bug%2C%20it%27s%20a%20feature)?" ] }, { @@ -208,7 +208,7 @@ "cell_type": "markdown", "metadata": {}, "source": [ - "**Q12**: Whereas a **recursion** may accidently result in a **never ending** program, `while`-loops and `for`-loops are guaranteed to **terminate**." + "**Q12**: Whereas a **recursion** may accidentally result in a **never-ending** program, `while`-loops and `for`-loops are guaranteed to **terminate**." ] }, { @@ -275,7 +275,7 @@ "cell_type": "markdown", "metadata": {}, "source": [ - "A popular example for a problem that is solved with a recursion art the **[Towers of Hanoi](https://en.wikipedia.org/wiki/Tower_of_Hanoi)**.\n", + "A popular example of a problem that is solved by recursion art the **[Towers of Hanoi](https://en.wikipedia.org/wiki/Tower_of_Hanoi)**.\n", "\n", "In its basic version, a tower consisting of, for example, four disks with increasing radii, is placed on the left-most of **three** adjacent spots. In the following, we refer to the number of disks as $n$, so here $n = 4$.\n", "\n", @@ -304,9 +304,9 @@ "source": [ "Watch the following video by [MIT](https://www.mit.edu/)'s professor [Richard Larson](https://idss.mit.edu/staff/richard-larson/) for a comprehensive introduction.\n", "\n", - "The [MIT Blossoms Initative](https://blossoms.mit.edu/) is primarily aimed at high school students and does not have any prerequisites.\n", + "The [MIT Blossoms Initiative](https://blossoms.mit.edu/) is primarily aimed at high school students and does not have any prerequisites.\n", "\n", - "The video consists of three segments the last of which is *not* necessary to have watched in order to solve the tasks below. So, watch the video until 37:55." + "The video consists of three segments, the last of which is *not* necessary to have watched to solve the tasks below. So, watch the video until 37:55." ] }, { @@ -393,7 +393,7 @@ "cell_type": "markdown", "metadata": {}, "source": [ - "As most likely the first couple of tries will result in *semantic* errors, it is advisable to have some sort of **visualization tool** for the progam's output: For example, an online version of the game can be found **[here](https://www.mathsisfun.com/games/towerofhanoi.html)**." + "As most likely the first couple of tries will result in *semantic* errors, it is advisable to have some sort of **visualization tool** for the program's output: For example, an online version of the game can be found **[here](https://www.mathsisfun.com/games/towerofhanoi.html)**." ] }, { @@ -402,7 +402,7 @@ "source": [ "Let's first **generalize** the mathematical relationship from above.\n", "\n", - "While the first number of $Sol(\\cdot)$ is the number of `disks` $n$, the second and third \"numbers\" are actually the **labels** for the three spots. Instead of spots `1`, `2`, and `3` we could also call them `\"left\"`, `\"center\"`, and `\"right\"` in our Python implementation. When \"passed\" to the $Sol(\\cdot)$ \"function\" they take on the role of an `origin` (= $o$) and `destination` (= $d$) pair.\n", + "While the first number of $Sol(\\cdot)$ is the number of `disks` $n$, the second and third \"numbers\" are the **labels** for the three spots. Instead of spots `1`, `2`, and `3`, we could also call them `\"left\"`, `\"center\"`, and `\"right\"` in our Python implementation. When \"passed\" to the $Sol(\\cdot)$ \"function\" they take on the role of an `origin` (= $o$) and `destination` (= $d$) pair.\n", "\n", "So, the expression $Sol(4, 1, 3)$ is the same as $Sol(4, \\text{\"left\"}, \\text{\"right\"})$ and describes the problem of moving a tower consisting of $n = 4$ disks from `origin` `1` / `\"left\"` to `destination` `3` / `right`. As we have seen in the video, we need some `intermediate` (= $i$) spot." ] @@ -420,9 +420,9 @@ "cell_type": "markdown", "metadata": {}, "source": [ - "In words, this means that in order to move a tower consisting of $n$ disks from an `origin` $o$ to a `destination` $d$, we three steps must be executed:\n", + "In words, this means that to move a tower consisting of $n$ disks from an `origin` $o$ to a `destination` $d$, three steps must be executed:\n", "\n", - "1. Move the top most $n - 1$ disks of the tower temporarily from $o$ to $i$ (= sub-problem 1)\n", + "1. Move the topmost $n - 1$ disks of the tower temporarily from $o$ to $i$ (= sub-problem 1)\n", "2. Move the remaining and largest disk from $o$ to $d$\n", "3. Move the the $n - 1$ disks from the temporary spot $i$ to $d$ (= sub-problem 2)\n", "\n", @@ -435,9 +435,9 @@ "source": [ "$Sol(\\cdot)$ can be written in Python as a function `sol()` that takes three arguments `disks`, `origin`, and `destination` that mirror $n$, $o$, and $d$.\n", "\n", - "Assume that all arguments to `sol()` will be `int` objects!\n", + "Assume that all arguments to `sol()` are `int` objects!\n", "\n", - "Once completed, `sol()` should print out all the moves in the correct order. With **printing a move**, we simply mean a line like \"1 -> 3\", short for \"Move the top-most disk from spot 1 to spot 3\".\n", + "Once completed, `sol()` should print out all the moves in the correct order. With **printing a move**, we mean a line like \"1 -> 3\", short for \"Move the top-most disk from spot 1 to spot 3\".\n", "\n", "Write your answers to **Q15.5** to **Q15.7** into the subsequent code cell and finalize `sol()`! No need to write a docstring or validate the input here." ] @@ -550,9 +550,9 @@ "cell_type": "markdown", "metadata": {}, "source": [ - "The previous `sol()` implementation does the job but the conditional statement needed in unnecessarily tedious. \n", + "The previous `sol()` implementation does the job, but the conditional statement needed in unnecessarily tedious. \n", "\n", - "Let's create a more concise `hanoi()` function that in addition to a positional `disks` argument takes three keyword-only arguments `origin`, `intermediate`, and `destination` with default values `\"left\"`, `\"center\"`, and `\"right\"`.\n", + "Let's create a more concise `hanoi()` function that, in addition to a positional `disks` argument, takes three keyword-only arguments `origin`, `intermediate`, and `destination` with default values `\"left\"`, `\"center\"`, and `\"right\"`.\n", "\n", "Write your answers to **Q15.9** and **Q15.10** into the subsequent code cell and finalize `hanoi()`! No need to write a docstring or validate the input here." ] @@ -577,7 +577,7 @@ "cell_type": "markdown", "metadata": {}, "source": [ - "**Q15.9**: Copy the base case from `sol()`." + "**Q15.9**: Copy the base case from `sol()`!" ] }, { @@ -588,7 +588,7 @@ "\n", "Figure out how the arguments are passed on in the two recursive `hanoi()` calls!\n", "\n", - "Also, write a `print()` statement analogous to the one in `sol()` in between the two recursive function calls. Is it ok to just copy and paste it?" + "Also, write a `print()` statement analogous to the one in `sol()` in between the two recursive function calls. Is it ok to copy and paste it?" ] }, { @@ -638,7 +638,7 @@ "cell_type": "markdown", "metadata": {}, "source": [ - "We could of course also use **numeric labels** for the three steps like so." + "We could, of course, also use **numeric labels** for the three steps like so." ] }, { @@ -661,13 +661,13 @@ "cell_type": "markdown", "metadata": {}, "source": [ - "Let's say, we did not know about the **analytical formula** for the number of **minimal moves** given $n$.\n", + "Let's say we did not know about the **analytical formula** for the number of **minimal moves** given $n$.\n", "\n", "In such cases, we could modify a recursive function to return a count value to be passed up the recursion tree.\n", "\n", - "In fact, this is similar to what we do in the recursive versions of `factorial()` and `fibonacci()` in [Chapter 4](https://github.com/webartifex/intro-to-python/blob/master/04_iteration.ipynb) where we pass up an intermediate result.\n", + "This is similar to what we do in the recursive versions of `factorial()` and `fibonacci()` in [Chapter 4](https://github.com/webartifex/intro-to-python/blob/master/04_iteration.ipynb), where we pass up an intermediate result.\n", "\n", - "Let's create a `hanoi_moves()` function that follows the same internal logic as `hanoi()` but instead of printing out the moves returns the number of steps done so far in the recursion.\n", + "Let's create a `hanoi_moves()` function that follows the same internal logic as `hanoi()`, but, instead of printing out the moves, returns the number of steps done so far in the recursion.\n", "\n", "Write your answers to **Q15.12** to **Q15.14** into the subsequent code cell and finalize `hanoi_moves()`! No need to write a docstring or validate the input here." ] @@ -742,7 +742,7 @@ "source": [ "Observe how quickly the `hanoi_moves()` function slows down for increasing `disks` arguments.\n", "\n", - "With `disks` in the range from `24` through `26` the computation time roughly doubles for each increase of `disks` by 1.\n", + "With `disks` in the range from `24` through `26`, the computation time roughly doubles for each increase of `disks` by 1.\n", "\n", "**Q15.16**: Execute the code cells below and see for yourself!" ] @@ -788,7 +788,7 @@ "cell_type": "markdown", "metadata": {}, "source": [ - "The above `hanoi()` prints the optimal solution's moves in the correct order but fails to label each move with an order number. This can be build in by passing on one more argument `offset` down the recursion tree. As the logic gets a bit \"involved\", `hanoi_ordered()` below is almost finished.\n", + "The above `hanoi()` prints the optimal solution's moves in the correct order but fails to label each move with an order number. This can be built in by passing on one more argument `offset` down the recursion tree. As the logic gets a bit \"involved,\" `hanoi_ordered()` below is almost finished.\n", "\n", "Write your answers to **Q15.17** and **Q15.18** into the subsequent code cell and finalize `hanoi_ordered()`! No need to write a docstring or validate the input here." ] @@ -879,7 +879,7 @@ "cell_type": "markdown", "metadata": {}, "source": [ - "Lastly, it is to be mentioned that for problem instances with a small `disks` argument it is easier to collect all the moves first in a list and then add the order number with the [enumerate()](https://docs.python.org/3/library/functions.html#enumerate) built-in." + "Lastly, it is to be mentioned that for problem instances with a small `disks` argument, it is easier to collect all the moves first in a list and then add the order number with the [enumerate()](https://docs.python.org/3/library/functions.html#enumerate) built-in." ] }, { diff --git a/05_numbers.ipynb b/05_numbers.ipynb new file mode 100644 index 0000000..73daebb --- /dev/null +++ b/05_numbers.ipynb @@ -0,0 +1,7068 @@ +{ + "cells": [ + { + "cell_type": "markdown", + "metadata": { + "slideshow": { + "slide_type": "slide" + } + }, + "source": [ + "# Chapter 5: Numbers" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "slideshow": { + "slide_type": "skip" + } + }, + "source": [ + "After learning about the basic building blocks of expressing and structuring the business logic in programs, we focus our attention on the **data types** Python offers us, both built-in and available via the [standard library](https://docs.python.org/3/library/index.html) or third-party packages.\n", + "\n", + "We start with the \"simple\" ones: Numeric types in this chapter and textual data in [Chapter 6](https://nbviewer.jupyter.org/github/webartifex/intro-to-python/blob/master/06_text.ipynb). An important fact that holds for all objects of these types is that they are **immutable**. To re-use the bag analogy from [Chapter 1](https://nbviewer.jupyter.org/github/webartifex/intro-to-python/blob/master/01_elements.ipynb#Objects-vs.-Types-vs.-Values), this means that the $0$s and $1$s making up an object's *value* cannot be changed once the bag is created in memory, implying that any operation with or method on the object creates a *new* object in a *different* memory location.\n", + "\n", + "Chapters 7, 8, and 9 then cover the more \"complex\" data types, including, for example, the `list` type. Finally, Chapter 10 completes the picture by introducing language constructs to create custom types.\n", + "\n", + "We have already seen many hints indicating that numbers are not as trivial to work with as it seems at first sight:\n", + "\n", + "- [Chapter 1](https://nbviewer.jupyter.org/github/webartifex/intro-to-python/blob/master/01_elements.ipynb#%28Data%29-Type-%2F-%22Behavior%22) reveals that numbers may come in *different* data types (i.e., `int` vs. `float` so far),\n", + "- [Chapter 3](https://nbviewer.jupyter.org/github/webartifex/intro-to-python/blob/master/03_conditionals.ipynb#Boolean-Expressions) raises questions regarding the **limited precision** of `float` numbers (e.g., `42 == 42.000000000000001` evaluates to `True`), and\n", + "- [Chapter 4](https://nbviewer.jupyter.org/github/webartifex/intro-to-python/blob/master/04_iteration.ipynb#Infinite-Recursion) shows that sometimes a `float` \"walks\" and \"quacks\" like an `int`, whereas the reverse is true in other cases.\n", + "\n", + "This chapter introduces all the [built-in numeric types](https://docs.python.org/3/library/stdtypes.html#numeric-types-int-float-complex): `int`, `float`, and `complex`. To mitigate the limited precision of floating-point numbers, we also look at two replacements for the `float` type in the [standard library](https://docs.python.org/3/library/index.html), namely the `Decimal` type in the [decimals](https://docs.python.org/3/library/decimal.html#decimal.Decimal) and the `Fraction` type in the [fractions](https://docs.python.org/3/library/fractions.html#fractions.Fraction) module." + ] + }, + { + "cell_type": "markdown", + "metadata": { + "slideshow": { + "slide_type": "slide" + } + }, + "source": [ + "## The `int` Type" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "slideshow": { + "slide_type": "skip" + } + }, + "source": [ + "The simplest numeric type is the `int` type: It behaves like an [integer in ordinary math](https://en.wikipedia.org/wiki/Integer) (i.e., the set $\\mathbb{Z}$) and supports operators in the way we saw in the section on arithmetic operators in [Chapter 1](https://nbviewer.jupyter.org/github/webartifex/intro-to-python/blob/master/01_elements.ipynb#%28Arithmetic%29-Operators)." + ] + }, + { + "cell_type": "code", + "execution_count": 1, + "metadata": { + "slideshow": { + "slide_type": "slide" + } + }, + "outputs": [], + "source": [ + "i = 789" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "slideshow": { + "slide_type": "skip" + } + }, + "source": [ + "Just like any other object, `789` has an identity, a type, and a value." + ] + }, + { + "cell_type": "code", + "execution_count": 2, + "metadata": { + "slideshow": { + "slide_type": "fragment" + } + }, + "outputs": [ + { + "data": { + "text/plain": [ + "139928698315984" + ] + }, + "execution_count": 2, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "id(i)" + ] + }, + { + "cell_type": "code", + "execution_count": 3, + "metadata": { + "slideshow": { + "slide_type": "-" + } + }, + "outputs": [ + { + "data": { + "text/plain": [ + "int" + ] + }, + "execution_count": 3, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "type(i)" + ] + }, + { + "cell_type": "code", + "execution_count": 4, + "metadata": { + "slideshow": { + "slide_type": "-" + } + }, + "outputs": [ + { + "data": { + "text/plain": [ + "789" + ] + }, + "execution_count": 4, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "i" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "slideshow": { + "slide_type": "skip" + } + }, + "source": [ + "A nice feature is using underscores `_` as (thousands) separator in numeric literals. For example, `1_000_000` evaluates to `1000000` in memory." + ] + }, + { + "cell_type": "code", + "execution_count": 5, + "metadata": { + "slideshow": { + "slide_type": "slide" + } + }, + "outputs": [ + { + "data": { + "text/plain": [ + "1000000" + ] + }, + "execution_count": 5, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "1_000_000" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "slideshow": { + "slide_type": "skip" + } + }, + "source": [ + "Whereas mathematicians argue what the term $0^0$ means (cf., this [article](https://en.wikipedia.org/wiki/Zero_to_the_power_of_zero)), programmers are pragmatic about this and define $0^0 = 1$." + ] + }, + { + "cell_type": "code", + "execution_count": 6, + "metadata": { + "slideshow": { + "slide_type": "fragment" + } + }, + "outputs": [ + { + "data": { + "text/plain": [ + "1" + ] + }, + "execution_count": 6, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "0 ** 0" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "slideshow": { + "slide_type": "slide" + } + }, + "source": [ + "### Binary Representations" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "slideshow": { + "slide_type": "skip" + } + }, + "source": [ + "As computers can only store $0$s and $1$s, `int` objects are nothing but that in memory as well. Consequently, computer scientists and engineers developed conventions as to how $0$s and $1$s are \"translated\" into integers, and one such convention is the **[binary representation](https://en.wikipedia.org/wiki/Binary_number)** of **non-negative integers**. Consider the integers from $0$ through $255$ that are encoded into $0$s and $1$s with the help of this table:" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "slideshow": { + "slide_type": "slide" + } + }, + "source": [ + "|Bit $i$| 7 | 6 | 5 | 4 | 3 | 2 | 1 | 0 |\n", + "|-------|-----|-----|-----|-----|-----|-----|-----|-----|\n", + "| Digit |$2^7$|$2^6$|$2^5$|$2^4$|$2^3$|$2^2$|$2^1$|$2^0$|\n", + "| $=$ |$128$| $64$| $32$| $16$| $8$ | $4$ | $2$ | $1$ |" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "slideshow": { + "slide_type": "skip" + } + }, + "source": [ + "A number consists of exactly eight $0$s and $1$s that are read from right to left and referred to as the **bits** of the number. Each bit represents a distinct multiple of $2$, the **digit**. For sure, we start counting at $0$ again.\n", + "\n", + "To encode the integer $3$, for example, we need to find a combination of $0$s and $1$s such that the sum of digits marked with a $1$ is equal to the number we want to encode. In the example, we set all bits to $0$ except for the first ($i=0$) and second ($i=1$) as $2^0 + 2^1 = 1 + 2 = 3$. So the binary representation of $3$ is $00~00~00~11$. To borrow some terminology from linear algebra, the $3$ is a linear combination of the digits where the coefficients are either $0$ or $1$: $3 = 0*128 + 0*64 + 0*32 + 0*16 + 0*8 + 0*4 + 1*2 + 1*1$. It is *guaranteed* that there is exactly *one* such combination for each number between $0$ and $255$.\n", + "\n", + "As each bit in the binary representation is one of two values, we say that this representation has a base of $2$. Often, the base is indicated with a subscript to avoid confusion. For example, we write $3_{10} = 00000011_2$ or $3_{10} = 11_2$ for short omitting leading $0$s. A subscript of $10$ implies a decimal number as we know it from grade school.\n", + "\n", + "We use the built-in [bin()](https://docs.python.org/3/library/functions.html#bin) function to obtain an `int` object's binary representation: It returns a `str` object starting with `\"0b\"` indicating the binary format and as many $0$s and $1$s as are necessary to encode the integer omitting leading $0$s." + ] + }, + { + "cell_type": "code", + "execution_count": 7, + "metadata": { + "slideshow": { + "slide_type": "fragment" + } + }, + "outputs": [ + { + "data": { + "text/plain": [ + "'0b11'" + ] + }, + "execution_count": 7, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "bin(3)" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "slideshow": { + "slide_type": "skip" + } + }, + "source": [ + "We may pass a `str` object formatted this way as the argument to the [int()](https://docs.python.org/3/library/functions.html#int) built-in, together with `base=2`, to (re-)create an `int` object, for example, with the value of `3`." + ] + }, + { + "cell_type": "code", + "execution_count": 8, + "metadata": { + "slideshow": { + "slide_type": "fragment" + } + }, + "outputs": [ + { + "data": { + "text/plain": [ + "3" + ] + }, + "execution_count": 8, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "int(\"0b11\", base=2)" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "slideshow": { + "slide_type": "skip" + } + }, + "source": [ + "Moreover, we may also use the contents of the returned `str` object as a **literal** instead: Just like we type, for example, `3` without quotes (i.e., \"literally\") into a code cell to create the `int` object `3`, we may type `0b11` to obtain an `int` object with the same value." + ] + }, + { + "cell_type": "code", + "execution_count": 9, + "metadata": { + "slideshow": { + "slide_type": "fragment" + } + }, + "outputs": [ + { + "data": { + "text/plain": [ + "3" + ] + }, + "execution_count": 9, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "0b11" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "slideshow": { + "slide_type": "skip" + } + }, + "source": [ + "Another example is the integer `177` that is the sum of $128 + 32 + 16 + 1$: Thus, its binary representation is the sequence of bits $10~11~00~01$, or to use our new notation, $177_{10} = 10110001_2$." + ] + }, + { + "cell_type": "code", + "execution_count": 10, + "metadata": { + "slideshow": { + "slide_type": "slide" + } + }, + "outputs": [ + { + "data": { + "text/plain": [ + "'0b10110001'" + ] + }, + "execution_count": 10, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "bin(177)" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "slideshow": { + "slide_type": "skip" + } + }, + "source": [ + "Analogous to typing `177` into a code cell, we may write `0b10110001`, or `0b_1011_0001` to make use of the underscores, and create an `int` object with the value `177`." + ] + }, + { + "cell_type": "code", + "execution_count": 11, + "metadata": { + "slideshow": { + "slide_type": "-" + } + }, + "outputs": [ + { + "data": { + "text/plain": [ + "177" + ] + }, + "execution_count": 11, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "0b_1011_0001" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "slideshow": { + "slide_type": "skip" + } + }, + "source": [ + "`0` and `255` are the edge cases where we set all the bits to either $0$ or $1$." + ] + }, + { + "cell_type": "code", + "execution_count": 12, + "metadata": { + "slideshow": { + "slide_type": "slide" + } + }, + "outputs": [ + { + "data": { + "text/plain": [ + "'0b0'" + ] + }, + "execution_count": 12, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "bin(0)" + ] + }, + { + "cell_type": "code", + "execution_count": 13, + "metadata": { + "slideshow": { + "slide_type": "-" + } + }, + "outputs": [ + { + "data": { + "text/plain": [ + "'0b1'" + ] + }, + "execution_count": 13, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "bin(1)" + ] + }, + { + "cell_type": "code", + "execution_count": 14, + "metadata": { + "slideshow": { + "slide_type": "-" + } + }, + "outputs": [ + { + "data": { + "text/plain": [ + "'0b10'" + ] + }, + "execution_count": 14, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "bin(2)" + ] + }, + { + "cell_type": "code", + "execution_count": 15, + "metadata": { + "slideshow": { + "slide_type": "-" + } + }, + "outputs": [ + { + "data": { + "text/plain": [ + "'0b11111111'" + ] + }, + "execution_count": 15, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "bin(255)" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "slideshow": { + "slide_type": "skip" + } + }, + "source": [ + "Groups of eight bits are also called a **byte**. As a byte can only represent non-negative integers up to $255$, the table above is extended conceptually with greater digits to the left to model integers beyond $255$. The memory management needed to implement this is built into Python, and we do not need to worry about it.\n", + "\n", + "For example, the `789` from above is encoded with ten bits and $789_{10} = 1100010101_2$." + ] + }, + { + "cell_type": "code", + "execution_count": 16, + "metadata": { + "slideshow": { + "slide_type": "fragment" + } + }, + "outputs": [ + { + "data": { + "text/plain": [ + "'0b1100010101'" + ] + }, + "execution_count": 16, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "bin(i) # = bin(789)" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "slideshow": { + "slide_type": "skip" + } + }, + "source": [ + "To contrast this bits encoding with the familiar decimal system, we show an equivalent table with powers of $10$ as the digits:\n", + "\n", + "|Decimal| 3 | 2 | 1 | 0 |\n", + "|-------|------|------|------|------|\n", + "| Digit |$10^3$|$10^2$|$10^1$|$10^0$|\n", + "| $=$ |$1000$| $100$| $10$ | $1$ |\n", + "\n", + "Now, an integer is a linear combination of the digits where the coefficients are one of *ten* values, and the base is now $10$. For example, the number $123$ can be expressed as $0*1000 + 1*100 + 2*10 + 3*1$. So, the binary representation follows the same logic as the decimal system taught in grade school. The decimal system is intuitive to us humans, mostly as we learn to count with our *ten* fingers. The $0$s and $1$s in a computer's memory are therefore no rocket science; they only feel unintuitive for a beginner." + ] + }, + { + "cell_type": "markdown", + "metadata": { + "slideshow": { + "slide_type": "slide" + } + }, + "source": [ + "### Hexadecimal Representations" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "slideshow": { + "slide_type": "skip" + } + }, + "source": [ + "While in the binary and decimal systems there are two and ten distinct coefficients per digit, another convenient representation, the **hexadecimal representation**, uses a base of $16$. It is convenient as one digit stores the same amount of information as *four* bits and the binary representation quickly becomes unreadable for larger numbers. The letters \"a\" through \"f\" are used as digits \"10\" through \"15\".\n", + "\n", + "The following table summarizes the relationship between the three systems:" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "slideshow": { + "slide_type": "slide" + } + }, + "source": [ + "|Decimal|Hexadecimal|Binary|$~~~~~~$|Decimal|Hexadecimal|Binary|$~~~~~~$|Decimal|Hexadecimal|Binary|$~~~~~~$|...|\n", + "|-------|-----------|------|--------|-------|-----------|------|--------|-------|-----------|------|--------|---|\n", + "| 0 | 0 | 0000 |$~~~~~~$| 16 | 10 | 10000|$~~~~~~$| 32 | 20 |100000|$~~~~~~$|...|\n", + "| 1 | 1 | 0001 |$~~~~~~$| 17 | 11 | 10001|$~~~~~~$| 33 | 21 |100001|$~~~~~~$|...|\n", + "| 2 | 2 | 0010 |$~~~~~~$| 18 | 12 | 10010|$~~~~~~$| 34 | 22 |100010|$~~~~~~$|...|\n", + "| 3 | 3 | 0011 |$~~~~~~$| 19 | 13 | 10011|$~~~~~~$| 35 | 23 |100011|$~~~~~~$|...|\n", + "| 4 | 4 | 0100 |$~~~~~~$| 20 | 14 | 10100|$~~~~~~$| 36 | 24 |100100|$~~~~~~$|...|\n", + "| 5 | 5 | 0101 |$~~~~~~$| 21 | 15 | 10101|$~~~~~~$| 37 | 25 |100101|$~~~~~~$|...|\n", + "| 6 | 6 | 0110 |$~~~~~~$| 22 | 16 | 10110|$~~~~~~$| 38 | 26 |100110|$~~~~~~$|...|\n", + "| 7 | 7 | 0111 |$~~~~~~$| 23 | 17 | 10111|$~~~~~~$| 39 | 27 |100111|$~~~~~~$|...|\n", + "| 8 | 8 | 1000 |$~~~~~~$| 24 | 18 | 11000|$~~~~~~$| 40 | 28 |101000|$~~~~~~$|...|\n", + "| 9 | 9 | 1001 |$~~~~~~$| 25 | 19 | 11001|$~~~~~~$| 41 | 29 |101001|$~~~~~~$|...|\n", + "| 10 | a | 1010 |$~~~~~~$| 26 | 1a | 11010|$~~~~~~$| 42 | 2a |101010|$~~~~~~$|...|\n", + "| 11 | b | 1011 |$~~~~~~$| 27 | 1b | 11011|$~~~~~~$| 43 | 2b |101011|$~~~~~~$|...|\n", + "| 12 | c | 1100 |$~~~~~~$| 28 | 1c | 11100|$~~~~~~$| 44 | 2c |101100|$~~~~~~$|...|\n", + "| 13 | d | 1101 |$~~~~~~$| 29 | 1d | 11101|$~~~~~~$| 45 | 2d |101101|$~~~~~~$|...|\n", + "| 14 | e | 1110 |$~~~~~~$| 30 | 1e | 11110|$~~~~~~$| 46 | 2e |101110|$~~~~~~$|...|\n", + "| 15 | f | 1111 |$~~~~~~$| 31 | 1f | 11111|$~~~~~~$| 47 | 2f |101111|$~~~~~~$|...|" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "slideshow": { + "slide_type": "skip" + } + }, + "source": [ + "To show more examples of the above subscript convention, we pick three random entries from the table:\n", + "\n", + "$11_{10} = \\text{b}_{16} = 1011_2$\n", + "\n", + "$25_{10} = 19_{16} = 11001_2$\n", + "\n", + "$46_{10} = 2\\text{e}_{16} = 101110_2$\n", + "\n", + "The built-in [hex()](https://docs.python.org/3/library/functions.html#hex) function creates a `str` object starting with `\"0x\"` representing an `int` object's hexadecimal representation. The length depends on how many groups of four bits are implied by the corresponding binary representation.\n", + "\n", + "For `0` and `1`, the hexadecimal representation is similar to the binary one." + ] + }, + { + "cell_type": "code", + "execution_count": 17, + "metadata": { + "slideshow": { + "slide_type": "slide" + } + }, + "outputs": [ + { + "data": { + "text/plain": [ + "'0x0'" + ] + }, + "execution_count": 17, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "hex(0)" + ] + }, + { + "cell_type": "code", + "execution_count": 18, + "metadata": { + "slideshow": { + "slide_type": "-" + } + }, + "outputs": [ + { + "data": { + "text/plain": [ + "'0x1'" + ] + }, + "execution_count": 18, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "hex(1)" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "slideshow": { + "slide_type": "skip" + } + }, + "source": [ + "Whereas `bin(3)` already requires two digits, one is enough for `hex(3)`." + ] + }, + { + "cell_type": "code", + "execution_count": 19, + "metadata": { + "slideshow": { + "slide_type": "fragment" + } + }, + "outputs": [ + { + "data": { + "text/plain": [ + "'0x3'" + ] + }, + "execution_count": 19, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "hex(3) # bin(3) => \"0b11\"; two digits needed" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "slideshow": { + "slide_type": "skip" + } + }, + "source": [ + "For `10` and `15`, we see the letter digits for the first time." + ] + }, + { + "cell_type": "code", + "execution_count": 20, + "metadata": { + "slideshow": { + "slide_type": "fragment" + } + }, + "outputs": [ + { + "data": { + "text/plain": [ + "'0xa'" + ] + }, + "execution_count": 20, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "hex(10)" + ] + }, + { + "cell_type": "code", + "execution_count": 21, + "metadata": { + "slideshow": { + "slide_type": "-" + } + }, + "outputs": [ + { + "data": { + "text/plain": [ + "'0xf'" + ] + }, + "execution_count": 21, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "hex(15)" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "slideshow": { + "slide_type": "skip" + } + }, + "source": [ + "The binary representation of `177`, `\"0b10110001\"`, can be viewed as *two* groups of four bits, $1011$ and $0001$, that are encoded as $\\text{b}$ and $1$ in hexadecimal (cf., table above)." + ] + }, + { + "cell_type": "code", + "execution_count": 22, + "metadata": { + "slideshow": { + "slide_type": "slide" + } + }, + "outputs": [ + { + "data": { + "text/plain": [ + "'0b10110001'" + ] + }, + "execution_count": 22, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "bin(177)" + ] + }, + { + "cell_type": "code", + "execution_count": 23, + "metadata": { + "slideshow": { + "slide_type": "-" + } + }, + "outputs": [ + { + "data": { + "text/plain": [ + "'0xb1'" + ] + }, + "execution_count": 23, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "hex(177)" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "slideshow": { + "slide_type": "skip" + } + }, + "source": [ + "To (re-)create an `int` object with the value `177`, we call the [int()](https://docs.python.org/3/library/functions.html#int) built-in with a properly formatted `str` object and `base=16` as arguments." + ] + }, + { + "cell_type": "code", + "execution_count": 24, + "metadata": { + "slideshow": { + "slide_type": "fragment" + } + }, + "outputs": [ + { + "data": { + "text/plain": [ + "177" + ] + }, + "execution_count": 24, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "int(\"0xb1\", base=16)" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "slideshow": { + "slide_type": "skip" + } + }, + "source": [ + "Alternatively, we could use a literal notation instead." + ] + }, + { + "cell_type": "code", + "execution_count": 25, + "metadata": { + "slideshow": { + "slide_type": "fragment" + } + }, + "outputs": [ + { + "data": { + "text/plain": [ + "177" + ] + }, + "execution_count": 25, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "0xb1" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "slideshow": { + "slide_type": "skip" + } + }, + "source": [ + "Hexadecimals between $00_{16}$ and $\\text{ff}_{16}$ (i.e., $0_{10}$ and $255_{10}$) are commonly used to describe colors, for example, in web development but also graphics editors. See this [online tool](https://www.w3schools.com/colors/colors_hexadecimal.asp) for some more background." + ] + }, + { + "cell_type": "code", + "execution_count": 26, + "metadata": { + "slideshow": { + "slide_type": "skip" + } + }, + "outputs": [ + { + "data": { + "text/plain": [ + "'0xff'" + ] + }, + "execution_count": 26, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "hex(255)" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "slideshow": { + "slide_type": "skip" + } + }, + "source": [ + "Just like binary representations, the hexadecimals extend to the left for larger numbers like `789`." + ] + }, + { + "cell_type": "code", + "execution_count": 27, + "metadata": { + "slideshow": { + "slide_type": "skip" + } + }, + "outputs": [ + { + "data": { + "text/plain": [ + "'0x315'" + ] + }, + "execution_count": 27, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "hex(i) # = hex(789)" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "slideshow": { + "slide_type": "skip" + } + }, + "source": [ + "For completeness sake, we mention that there is also the [oct()](https://docs.python.org/3/library/functions.html#oct) built-in to obtain an integer's **octal representation**. The logic is the same as for the hexadecimal representation, and we use *eight* instead of *sixteen* digits. That is the equivalent of viewing the binary representations in groups of three bits. As of today, octal representations have become less important, and the data science practitioner may probably live without them quite well." + ] + }, + { + "cell_type": "markdown", + "metadata": { + "slideshow": { + "slide_type": "skip" + } + }, + "source": [ + "### Negative Values" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "slideshow": { + "slide_type": "skip" + } + }, + "source": [ + "While there are conventions that model negative integers with $0$s and $1$s in memory (cf., [Two's Complement](https://en.wikipedia.org/wiki/Two%27s_complement)), Python manages that for us, and we do not look into the theory here for brevity. We have learned all we need to know about how integers are modeled in a computer.\n", + "\n", + "The binary and hexadecimal representations of negative integers are identical to their positive counterparts except that they start with a minus sign `-`." + ] + }, + { + "cell_type": "code", + "execution_count": 28, + "metadata": { + "slideshow": { + "slide_type": "skip" + } + }, + "outputs": [ + { + "data": { + "text/plain": [ + "'-0b11'" + ] + }, + "execution_count": 28, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "bin(-3)" + ] + }, + { + "cell_type": "code", + "execution_count": 29, + "metadata": { + "slideshow": { + "slide_type": "skip" + } + }, + "outputs": [ + { + "data": { + "text/plain": [ + "'-0x3'" + ] + }, + "execution_count": 29, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "hex(-3)" + ] + }, + { + "cell_type": "code", + "execution_count": 30, + "metadata": { + "slideshow": { + "slide_type": "skip" + } + }, + "outputs": [ + { + "data": { + "text/plain": [ + "'-0b11111111'" + ] + }, + "execution_count": 30, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "bin(-255)" + ] + }, + { + "cell_type": "code", + "execution_count": 31, + "metadata": { + "slideshow": { + "slide_type": "skip" + } + }, + "outputs": [ + { + "data": { + "text/plain": [ + "'-0xff'" + ] + }, + "execution_count": 31, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "hex(-255)" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "slideshow": { + "slide_type": "slide" + } + }, + "source": [ + "### The `bool` Type" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "slideshow": { + "slide_type": "skip" + } + }, + "source": [ + "Whereas the boolean literals `True` and `False` are commonly *not* regarded as numeric types, they behave like `1` and `0` in an arithmetic context." + ] + }, + { + "cell_type": "code", + "execution_count": 32, + "metadata": { + "slideshow": { + "slide_type": "slide" + } + }, + "outputs": [ + { + "data": { + "text/plain": [ + "1" + ] + }, + "execution_count": 32, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "True + False" + ] + }, + { + "cell_type": "code", + "execution_count": 33, + "metadata": { + "slideshow": { + "slide_type": "-" + } + }, + "outputs": [ + { + "data": { + "text/plain": [ + "42" + ] + }, + "execution_count": 33, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "41 + True" + ] + }, + { + "cell_type": "code", + "execution_count": 34, + "metadata": { + "slideshow": { + "slide_type": "-" + } + }, + "outputs": [ + { + "data": { + "text/plain": [ + "0.0" + ] + }, + "execution_count": 34, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "42.0 * False" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "slideshow": { + "slide_type": "skip" + } + }, + "source": [ + "We may explicitly cast `bool` objects as integers ourselves with the [int()](https://docs.python.org/3/library/functions.html#int) built-in." + ] + }, + { + "cell_type": "code", + "execution_count": 35, + "metadata": { + "slideshow": { + "slide_type": "slide" + } + }, + "outputs": [ + { + "data": { + "text/plain": [ + "1" + ] + }, + "execution_count": 35, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "int(True)" + ] + }, + { + "cell_type": "code", + "execution_count": 36, + "metadata": { + "slideshow": { + "slide_type": "-" + } + }, + "outputs": [ + { + "data": { + "text/plain": [ + "0" + ] + }, + "execution_count": 36, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "int(False)" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "slideshow": { + "slide_type": "skip" + } + }, + "source": [ + "Of course, their binary representations only need *one* bit of information." + ] + }, + { + "cell_type": "code", + "execution_count": 37, + "metadata": { + "slideshow": { + "slide_type": "slide" + } + }, + "outputs": [ + { + "data": { + "text/plain": [ + "'0b1'" + ] + }, + "execution_count": 37, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "bin(True)" + ] + }, + { + "cell_type": "code", + "execution_count": 38, + "metadata": { + "slideshow": { + "slide_type": "-" + } + }, + "outputs": [ + { + "data": { + "text/plain": [ + "'0b0'" + ] + }, + "execution_count": 38, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "bin(False)" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "slideshow": { + "slide_type": "skip" + } + }, + "source": [ + "Their hexadecimal representations occupy *four* bits in memory while only *one* bit is needed. This is because \"1\" and \"0\" are just two of the sixteen possible digits." + ] + }, + { + "cell_type": "code", + "execution_count": 39, + "metadata": { + "slideshow": { + "slide_type": "slide" + } + }, + "outputs": [ + { + "data": { + "text/plain": [ + "'0x1'" + ] + }, + "execution_count": 39, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "hex(True)" + ] + }, + { + "cell_type": "code", + "execution_count": 40, + "metadata": { + "slideshow": { + "slide_type": "-" + } + }, + "outputs": [ + { + "data": { + "text/plain": [ + "'0x0'" + ] + }, + "execution_count": 40, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "hex(False)" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "slideshow": { + "slide_type": "skip" + } + }, + "source": [ + "As a reminder, the `None` literal is a type on its own, namely the `NoneType`, and different from `False`. It *cannot* be cast as an integer as the `TypeError` indicates." + ] + }, + { + "cell_type": "code", + "execution_count": 41, + "metadata": { + "slideshow": { + "slide_type": "slide" + } + }, + "outputs": [ + { + "ename": "TypeError", + "evalue": "int() argument must be a string, a bytes-like object or a number, not 'NoneType'", + "output_type": "error", + "traceback": [ + "\u001b[0;31m---------------------------------------------------------------------------\u001b[0m", + "\u001b[0;31mTypeError\u001b[0m Traceback (most recent call last)", + "\u001b[0;32m\u001b[0m in \u001b[0;36m\u001b[0;34m\u001b[0m\n\u001b[0;32m----> 1\u001b[0;31m \u001b[0mint\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0;32mNone\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0m", + "\u001b[0;31mTypeError\u001b[0m: int() argument must be a string, a bytes-like object or a number, not 'NoneType'" + ] + } + ], + "source": [ + "int(None)" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "slideshow": { + "slide_type": "slide" + } + }, + "source": [ + "### Bitwise Operators" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "slideshow": { + "slide_type": "skip" + } + }, + "source": [ + "Now that we know how integers are represented with $0$s and $1$s, we look at ways of working with the individual bits, in particular with the so-called **[bitwise operators](https://wiki.python.org/moin/BitwiseOperators)**: As the name suggests, the operators perform some operation on a bit by bit basis. They only work with and always return `int` objects.\n", + "\n", + "We keep this overview rather short as such \"low-level\" operations are not needed by the data science practitioner regularly. Yet, it is worthwhile to have heard about them as they form the basis of all of arithmetic in computers.\n", + "\n", + "The first operator is the **bitwise AND** operator `&`: It looks at the bits of its two operands, `11` and `13` in the example, in a pairwise fashion and *iff* both operands have a $1$ in the same position, will the resulting integer have a $1$ in this position as well. The binary representations of `11` and `13` have $1$s in their respective first and fourth bits, which is why `bin(11 & 13)` evaluates to `\"Ob1001\"`." + ] + }, + { + "cell_type": "code", + "execution_count": 42, + "metadata": { + "slideshow": { + "slide_type": "slide" + } + }, + "outputs": [ + { + "data": { + "text/plain": [ + "'0b1011 & 0b1101'" + ] + }, + "execution_count": 42, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "bin(11) + \" & \" + bin(13) # to show the operands' binary representations" + ] + }, + { + "cell_type": "code", + "execution_count": 43, + "metadata": { + "slideshow": { + "slide_type": "-" + } + }, + "outputs": [ + { + "data": { + "text/plain": [ + "'0b1001'" + ] + }, + "execution_count": 43, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "bin(11 & 13)" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "slideshow": { + "slide_type": "skip" + } + }, + "source": [ + "`0b1001` is the binary representation of `9`." + ] + }, + { + "cell_type": "code", + "execution_count": 44, + "metadata": { + "slideshow": { + "slide_type": "fragment" + } + }, + "outputs": [ + { + "data": { + "text/plain": [ + "9" + ] + }, + "execution_count": 44, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "0b1001" + ] + }, + { + "cell_type": "code", + "execution_count": 45, + "metadata": { + "slideshow": { + "slide_type": "-" + } + }, + "outputs": [ + { + "data": { + "text/plain": [ + "9" + ] + }, + "execution_count": 45, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "11 & 13" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "slideshow": { + "slide_type": "skip" + } + }, + "source": [ + "The **bitwise OR** operator `|` evaluates to an `int` object whose bits are set to $1$ if the corresponding bits of either *one* or *both* operands are $1$. So in the example `9 | 13` only the second bit is $0$ for both operands, which is why the expression evaluates to `\"0b1101\"`." + ] + }, + { + "cell_type": "code", + "execution_count": 46, + "metadata": { + "slideshow": { + "slide_type": "slide" + } + }, + "outputs": [ + { + "data": { + "text/plain": [ + "'0b1001 | 0b1101'" + ] + }, + "execution_count": 46, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "bin(9) + \" | \" + bin(13) # to show the operands' binary representations" + ] + }, + { + "cell_type": "code", + "execution_count": 47, + "metadata": { + "slideshow": { + "slide_type": "-" + } + }, + "outputs": [ + { + "data": { + "text/plain": [ + "'0b1101'" + ] + }, + "execution_count": 47, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "bin(9 | 13)" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "slideshow": { + "slide_type": "skip" + } + }, + "source": [ + "`0b1101` evaluates to the `int` object `13`." + ] + }, + { + "cell_type": "code", + "execution_count": 48, + "metadata": { + "slideshow": { + "slide_type": "fragment" + } + }, + "outputs": [ + { + "data": { + "text/plain": [ + "13" + ] + }, + "execution_count": 48, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "0b1101" + ] + }, + { + "cell_type": "code", + "execution_count": 49, + "metadata": { + "slideshow": { + "slide_type": "-" + } + }, + "outputs": [ + { + "data": { + "text/plain": [ + "13" + ] + }, + "execution_count": 49, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "9 | 13" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "slideshow": { + "slide_type": "skip" + } + }, + "source": [ + "The **bitwise XOR** operator `^` is a special case of the `|` operator in that it evaluates to an `int` object whose bits are set to $1$ if the corresponding bit of *exactly one* of the two operands is $1$. Colloquially, the \"X\" stands for \"exclusive.\" The `^` operator must *not* be confused with the exponentiation operator `**`! In the example, `9 ^ 13`, only the third bit differs between the two operands, which is why it evaluates to `\"0b100\"` omitting the fourth bit." + ] + }, + { + "cell_type": "code", + "execution_count": 50, + "metadata": { + "slideshow": { + "slide_type": "slide" + } + }, + "outputs": [ + { + "data": { + "text/plain": [ + "'0b1001 ^ 0b1101'" + ] + }, + "execution_count": 50, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "bin(9) + \" ^ \" + bin(13) # to show the operands' binary representations" + ] + }, + { + "cell_type": "code", + "execution_count": 51, + "metadata": { + "slideshow": { + "slide_type": "-" + } + }, + "outputs": [ + { + "data": { + "text/plain": [ + "'0b100'" + ] + }, + "execution_count": 51, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "bin(9 ^ 13)" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "slideshow": { + "slide_type": "skip" + } + }, + "source": [ + "`0b100` evaluates to the `int` object `4`." + ] + }, + { + "cell_type": "code", + "execution_count": 52, + "metadata": { + "slideshow": { + "slide_type": "fragment" + } + }, + "outputs": [ + { + "data": { + "text/plain": [ + "4" + ] + }, + "execution_count": 52, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "0b100" + ] + }, + { + "cell_type": "code", + "execution_count": 53, + "metadata": { + "slideshow": { + "slide_type": "-" + } + }, + "outputs": [ + { + "data": { + "text/plain": [ + "4" + ] + }, + "execution_count": 53, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "9 ^ 13" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "slideshow": { + "slide_type": "skip" + } + }, + "source": [ + "The **bitwise NOT** operator `~`, sometimes also called **inversion** operator, is said to \"flip\" the $0$s into $1$s and the $1$s into $0$s. However, it is based on the aforementioned [Two's Complement](https://en.wikipedia.org/wiki/Two%27s_complement) convention and `~x = -(x + 1)` by definition (cf., the [reference](https://docs.python.org/3/reference/expressions.html#unary-arithmetic-and-bitwise-operations)). The full logic behind this is considered out of scope in this book.\n", + "\n", + "We can at least verify the definition by comparing the binary representations of `8` and `-9`: They are indeed the same." + ] + }, + { + "cell_type": "code", + "execution_count": 54, + "metadata": { + "slideshow": { + "slide_type": "slide" + } + }, + "outputs": [ + { + "data": { + "text/plain": [ + "'-0b1001'" + ] + }, + "execution_count": 54, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "bin(~8)" + ] + }, + { + "cell_type": "code", + "execution_count": 55, + "metadata": { + "slideshow": { + "slide_type": "-" + } + }, + "outputs": [ + { + "data": { + "text/plain": [ + "'-0b1001'" + ] + }, + "execution_count": 55, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "bin(-(8 + 1))" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "slideshow": { + "slide_type": "skip" + } + }, + "source": [ + "`~x = -(x + 1)` can be reformulated as `~x + x = -1`, which is slightly easier to check." + ] + }, + { + "cell_type": "code", + "execution_count": 56, + "metadata": { + "slideshow": { + "slide_type": "fragment" + } + }, + "outputs": [ + { + "data": { + "text/plain": [ + "-1" + ] + }, + "execution_count": 56, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "~8 + 8" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "slideshow": { + "slide_type": "skip" + } + }, + "source": [ + "Lastly, the **bitwise left and right shift** operators, `<<` and `>>`, shift all the bits either to the left or to the right. This corresponds to multiplying or dividing an integer by powers of $2$.\n", + "\n", + "When shifting left, $0$s are filled in." + ] + }, + { + "cell_type": "code", + "execution_count": 57, + "metadata": { + "slideshow": { + "slide_type": "slide" + } + }, + "outputs": [ + { + "data": { + "text/plain": [ + "'0b1001'" + ] + }, + "execution_count": 57, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "bin(9)" + ] + }, + { + "cell_type": "code", + "execution_count": 58, + "metadata": { + "slideshow": { + "slide_type": "-" + } + }, + "outputs": [ + { + "data": { + "text/plain": [ + "'0b100100'" + ] + }, + "execution_count": 58, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "bin(9 << 2)" + ] + }, + { + "cell_type": "code", + "execution_count": 59, + "metadata": { + "slideshow": { + "slide_type": "-" + } + }, + "outputs": [ + { + "data": { + "text/plain": [ + "36" + ] + }, + "execution_count": 59, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "9 << 2" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "slideshow": { + "slide_type": "skip" + } + }, + "source": [ + "When shifting right, some bits are always lost." + ] + }, + { + "cell_type": "code", + "execution_count": 60, + "metadata": { + "slideshow": { + "slide_type": "fragment" + } + }, + "outputs": [ + { + "data": { + "text/plain": [ + "'0b10'" + ] + }, + "execution_count": 60, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "bin(9 >> 2)" + ] + }, + { + "cell_type": "code", + "execution_count": 61, + "metadata": { + "slideshow": { + "slide_type": "-" + } + }, + "outputs": [ + { + "data": { + "text/plain": [ + "2" + ] + }, + "execution_count": 61, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "9 >> 2" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "slideshow": { + "slide_type": "slide" + } + }, + "source": [ + "## The `float` Type" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "slideshow": { + "slide_type": "skip" + } + }, + "source": [ + "As we have seen above, some assumptions need to be made as to how the $0$s and $1$s in a computer's memory are to be translated into numbers. This process becomes a lot more involved when we go beyond integers and model [real numbers](https://en.wikipedia.org/wiki/Real_number) (i.e., the set $\\mathbb{R}$) with possibly infinitely many digits to the right of the period like $1.23$.\n", + "\n", + "The **[Institute of Electrical and Electronics Engineers](https://en.wikipedia.org/wiki/Institute_of_Electrical_and_Electronics_Engineers)** (IEEE, pronounced \"eye-triple-E\") is one of the important professional associations when it comes to standardizing all kinds of aspects regarding the implementation of soft- and hardware.\n", + "\n", + "The **[IEEE 754](https://en.wikipedia.org/wiki/IEEE_754)** standard defines the so-called **floating-point arithmetic** that is commonly used today by all major programming languages. The standard not only defines how the $0$s and $1$s are organized in memory but also, for example, how values are to be rounded, what happens in exceptional cases like divisions by zero, or what is a zero value in the first place.\n", + "\n", + "In Python, the simplest way to create a `float` object is to use a literal notation with a dot `.` in it." + ] + }, + { + "cell_type": "code", + "execution_count": 62, + "metadata": { + "slideshow": { + "slide_type": "slide" + } + }, + "outputs": [], + "source": [ + "f = 1.23" + ] + }, + { + "cell_type": "code", + "execution_count": 63, + "metadata": { + "slideshow": { + "slide_type": "fragment" + } + }, + "outputs": [ + { + "data": { + "text/plain": [ + "139928698489496" + ] + }, + "execution_count": 63, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "id(f)" + ] + }, + { + "cell_type": "code", + "execution_count": 64, + "metadata": { + "slideshow": { + "slide_type": "-" + } + }, + "outputs": [ + { + "data": { + "text/plain": [ + "float" + ] + }, + "execution_count": 64, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "type(f)" + ] + }, + { + "cell_type": "code", + "execution_count": 65, + "metadata": { + "slideshow": { + "slide_type": "-" + } + }, + "outputs": [ + { + "data": { + "text/plain": [ + "1.23" + ] + }, + "execution_count": 65, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "f" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "slideshow": { + "slide_type": "skip" + } + }, + "source": [ + "As with integer literals above, we may use underscores `_` to make longer `float` objects easier to read." + ] + }, + { + "cell_type": "code", + "execution_count": 66, + "metadata": { + "slideshow": { + "slide_type": "skip" + } + }, + "outputs": [ + { + "data": { + "text/plain": [ + "0.123456789" + ] + }, + "execution_count": 66, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "0.123_456_789" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "slideshow": { + "slide_type": "skip" + } + }, + "source": [ + "In cases where the dot `.` is unnecessary from a mathematical point of view, we either need to end the number with it nevertheless or use the [float()](https://docs.python.org/3/library/functions.html#float) built-in to cast the number explicitly. [float()](https://docs.python.org/3/library/functions.html#float) can process any numeric object or a properly formatted `str` object." + ] + }, + { + "cell_type": "code", + "execution_count": 67, + "metadata": { + "slideshow": { + "slide_type": "slide" + } + }, + "outputs": [ + { + "data": { + "text/plain": [ + "1.0" + ] + }, + "execution_count": 67, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "1. # on the contrary, 1 creates an int object" + ] + }, + { + "cell_type": "code", + "execution_count": 68, + "metadata": { + "slideshow": { + "slide_type": "-" + } + }, + "outputs": [ + { + "data": { + "text/plain": [ + "1.0" + ] + }, + "execution_count": 68, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "float(1)" + ] + }, + { + "cell_type": "code", + "execution_count": 69, + "metadata": { + "slideshow": { + "slide_type": "skip" + } + }, + "outputs": [ + { + "data": { + "text/plain": [ + "1.0" + ] + }, + "execution_count": 69, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "float(\"1.000\")" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "slideshow": { + "slide_type": "skip" + } + }, + "source": [ + "`float` objects are implicitly created as the result of dividing an `int` object by another with the division operator `/`." + ] + }, + { + "cell_type": "code", + "execution_count": 70, + "metadata": { + "slideshow": { + "slide_type": "slide" + } + }, + "outputs": [ + { + "data": { + "text/plain": [ + "0.3333333333333333" + ] + }, + "execution_count": 70, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "1 / 3" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "slideshow": { + "slide_type": "skip" + } + }, + "source": [ + "In general, if we combine `float` and `int` objects in arithmetic operations, we always end up with a `float` type: Python uses the \"broader\" representation." + ] + }, + { + "cell_type": "code", + "execution_count": 71, + "metadata": { + "slideshow": { + "slide_type": "fragment" + } + }, + "outputs": [ + { + "data": { + "text/plain": [ + "42.0" + ] + }, + "execution_count": 71, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "40.0 + 2" + ] + }, + { + "cell_type": "code", + "execution_count": 72, + "metadata": { + "slideshow": { + "slide_type": "-" + } + }, + "outputs": [ + { + "data": { + "text/plain": [ + "42.0" + ] + }, + "execution_count": 72, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "21 * 2.0" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "slideshow": { + "slide_type": "slide" + } + }, + "source": [ + "### Scientific Notation" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "slideshow": { + "slide_type": "skip" + } + }, + "source": [ + "`float` objects may also be created with the **scientific literal notation**: We use the symbol `e` to indicate powers of $10$, so $1.23 * 10^0$ translates into `1.23e0`." + ] + }, + { + "cell_type": "code", + "execution_count": 73, + "metadata": { + "slideshow": { + "slide_type": "slide" + } + }, + "outputs": [ + { + "data": { + "text/plain": [ + "1.23" + ] + }, + "execution_count": 73, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "1.23e0" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "slideshow": { + "slide_type": "skip" + } + }, + "source": [ + "Syntactically, `e` needs a `float` or `int` object in its literal notation on its left and an `int` object on its right, both without a space. Otherwise, we get a `SyntaxError`." + ] + }, + { + "cell_type": "code", + "execution_count": 74, + "metadata": { + "slideshow": { + "slide_type": "skip" + } + }, + "outputs": [ + { + "ename": "SyntaxError", + "evalue": "invalid syntax (, line 1)", + "output_type": "error", + "traceback": [ + "\u001b[0;36m File \u001b[0;32m\"\"\u001b[0;36m, line \u001b[0;32m1\u001b[0m\n\u001b[0;31m 1.23 e0\u001b[0m\n\u001b[0m ^\u001b[0m\n\u001b[0;31mSyntaxError\u001b[0m\u001b[0;31m:\u001b[0m invalid syntax\n" + ] + } + ], + "source": [ + "1.23 e0" + ] + }, + { + "cell_type": "code", + "execution_count": 75, + "metadata": { + "slideshow": { + "slide_type": "skip" + } + }, + "outputs": [ + { + "ename": "SyntaxError", + "evalue": "invalid syntax (, line 1)", + "output_type": "error", + "traceback": [ + "\u001b[0;36m File \u001b[0;32m\"\"\u001b[0;36m, line \u001b[0;32m1\u001b[0m\n\u001b[0;31m 1.23e 0\u001b[0m\n\u001b[0m ^\u001b[0m\n\u001b[0;31mSyntaxError\u001b[0m\u001b[0;31m:\u001b[0m invalid syntax\n" + ] + } + ], + "source": [ + "1.23e 0" + ] + }, + { + "cell_type": "code", + "execution_count": 76, + "metadata": { + "slideshow": { + "slide_type": "skip" + } + }, + "outputs": [ + { + "ename": "SyntaxError", + "evalue": "invalid syntax (, line 1)", + "output_type": "error", + "traceback": [ + "\u001b[0;36m File \u001b[0;32m\"\"\u001b[0;36m, line \u001b[0;32m1\u001b[0m\n\u001b[0;31m 1.23e0.0\u001b[0m\n\u001b[0m ^\u001b[0m\n\u001b[0;31mSyntaxError\u001b[0m\u001b[0;31m:\u001b[0m invalid syntax\n" + ] + } + ], + "source": [ + "1.23e0.0" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "slideshow": { + "slide_type": "skip" + } + }, + "source": [ + "If we leave out the number to the left, Python raises a `NameError` as it unsuccessfully tries to look up a variable named `e1`." + ] + }, + { + "cell_type": "code", + "execution_count": 77, + "metadata": { + "slideshow": { + "slide_type": "fragment" + } + }, + "outputs": [ + { + "ename": "NameError", + "evalue": "name 'e1' is not defined", + "output_type": "error", + "traceback": [ + "\u001b[0;31m---------------------------------------------------------------------------\u001b[0m", + "\u001b[0;31mNameError\u001b[0m Traceback (most recent call last)", + "\u001b[0;32m\u001b[0m in \u001b[0;36m\u001b[0;34m\u001b[0m\n\u001b[0;32m----> 1\u001b[0;31m \u001b[0me1\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0m", + "\u001b[0;31mNameError\u001b[0m: name 'e1' is not defined" + ] + } + ], + "source": [ + "e1" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "slideshow": { + "slide_type": "skip" + } + }, + "source": [ + "So, to write $10^1$ in Python, we need to think of it as $1*10^1$ and write `1e1`." + ] + }, + { + "cell_type": "code", + "execution_count": 78, + "metadata": { + "slideshow": { + "slide_type": "fragment" + } + }, + "outputs": [ + { + "data": { + "text/plain": [ + "10.0" + ] + }, + "execution_count": 78, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "1e1" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "slideshow": { + "slide_type": "slide" + } + }, + "source": [ + "### Special Values" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "slideshow": { + "slide_type": "skip" + } + }, + "source": [ + "There are also three special values representing \"**not a number,**\" called `nan`, and positive or negative **infinity**, called `inf` or `-inf`, that are created by passing in the corresponding abbreviation as a `str` object to the [float()](https://docs.python.org/3/library/functions.html#float) built-in. These values could be used, for example, as the result of a mathematically undefined operation like division by zero or to model the value of a mathematical function as it goes to infinity." + ] + }, + { + "cell_type": "code", + "execution_count": 79, + "metadata": { + "slideshow": { + "slide_type": "slide" + } + }, + "outputs": [ + { + "data": { + "text/plain": [ + "nan" + ] + }, + "execution_count": 79, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "float(\"nan\") # also works as float(\"NaN\")" + ] + }, + { + "cell_type": "code", + "execution_count": 80, + "metadata": { + "slideshow": { + "slide_type": "skip" + } + }, + "outputs": [ + { + "data": { + "text/plain": [ + "inf" + ] + }, + "execution_count": 80, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "float(\"+inf\") # also works as float(\"+infinity\")" + ] + }, + { + "cell_type": "code", + "execution_count": 81, + "metadata": { + "slideshow": { + "slide_type": "-" + } + }, + "outputs": [ + { + "data": { + "text/plain": [ + "inf" + ] + }, + "execution_count": 81, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "float(\"inf\") # by omitting the plus sign we mean positive infinity" + ] + }, + { + "cell_type": "code", + "execution_count": 82, + "metadata": { + "slideshow": { + "slide_type": "-" + } + }, + "outputs": [ + { + "data": { + "text/plain": [ + "-inf" + ] + }, + "execution_count": 82, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "float(\"-inf\")" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "slideshow": { + "slide_type": "skip" + } + }, + "source": [ + "`nan` objects *never* compare equal to *anything*, not even to themselves. This happens in accordance with the [IEEE 754](https://en.wikipedia.org/wiki/IEEE_754) standard." + ] + }, + { + "cell_type": "code", + "execution_count": 83, + "metadata": { + "slideshow": { + "slide_type": "slide" + } + }, + "outputs": [ + { + "data": { + "text/plain": [ + "False" + ] + }, + "execution_count": 83, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "float(\"nan\") == float(\"nan\")" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "slideshow": { + "slide_type": "skip" + } + }, + "source": [ + "On the contrary, as two values go to infinity, there is no such concept as difference and *everything* compares equal." + ] + }, + { + "cell_type": "code", + "execution_count": 84, + "metadata": { + "slideshow": { + "slide_type": "fragment" + } + }, + "outputs": [ + { + "data": { + "text/plain": [ + "True" + ] + }, + "execution_count": 84, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "float(\"inf\") == float(\"inf\")" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "slideshow": { + "slide_type": "skip" + } + }, + "source": [ + "Adding `42` to `inf` makes no difference." + ] + }, + { + "cell_type": "code", + "execution_count": 85, + "metadata": { + "slideshow": { + "slide_type": "skip" + } + }, + "outputs": [ + { + "data": { + "text/plain": [ + "inf" + ] + }, + "execution_count": 85, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "float(\"inf\") + 42" + ] + }, + { + "cell_type": "code", + "execution_count": 86, + "metadata": { + "slideshow": { + "slide_type": "fragment" + } + }, + "outputs": [ + { + "data": { + "text/plain": [ + "True" + ] + }, + "execution_count": 86, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "float(\"inf\") + 42 == float(\"inf\")" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "slideshow": { + "slide_type": "skip" + } + }, + "source": [ + "We observe the same for multiplication ..." + ] + }, + { + "cell_type": "code", + "execution_count": 87, + "metadata": { + "slideshow": { + "slide_type": "skip" + } + }, + "outputs": [ + { + "data": { + "text/plain": [ + "inf" + ] + }, + "execution_count": 87, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "42 * float(\"inf\")" + ] + }, + { + "cell_type": "code", + "execution_count": 88, + "metadata": { + "slideshow": { + "slide_type": "skip" + } + }, + "outputs": [ + { + "data": { + "text/plain": [ + "True" + ] + }, + "execution_count": 88, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "42 * float(\"inf\") == float(\"inf\")" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "slideshow": { + "slide_type": "skip" + } + }, + "source": [ + "... and even exponentiation!" + ] + }, + { + "cell_type": "code", + "execution_count": 89, + "metadata": { + "slideshow": { + "slide_type": "skip" + } + }, + "outputs": [ + { + "data": { + "text/plain": [ + "inf" + ] + }, + "execution_count": 89, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "float(\"inf\") ** 42" + ] + }, + { + "cell_type": "code", + "execution_count": 90, + "metadata": { + "slideshow": { + "slide_type": "skip" + } + }, + "outputs": [ + { + "data": { + "text/plain": [ + "True" + ] + }, + "execution_count": 90, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "float(\"inf\") ** 42 == float(\"inf\")" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "slideshow": { + "slide_type": "skip" + } + }, + "source": [ + "Although absolute differences become unmeaningful as we approach infinity, signs are still respected." + ] + }, + { + "cell_type": "code", + "execution_count": 91, + "metadata": { + "slideshow": { + "slide_type": "skip" + } + }, + "outputs": [ + { + "data": { + "text/plain": [ + "inf" + ] + }, + "execution_count": 91, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "-42 * float(\"-inf\")" + ] + }, + { + "cell_type": "code", + "execution_count": 92, + "metadata": { + "slideshow": { + "slide_type": "fragment" + } + }, + "outputs": [ + { + "data": { + "text/plain": [ + "True" + ] + }, + "execution_count": 92, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "-42 * float(\"-inf\") == float(\"inf\")" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "slideshow": { + "slide_type": "skip" + } + }, + "source": [ + "As a caveat, adding infinities of different signs is an *undefined operation* in math and results in a `nan` object. So, if we (accidentally or unknowingly) do this on a real dataset, we do *not* see any error messages, and our program may continue to run with non-meaningful results! This is an example of a piece of code **failing silently**." + ] + }, + { + "cell_type": "code", + "execution_count": 93, + "metadata": { + "slideshow": { + "slide_type": "slide" + } + }, + "outputs": [ + { + "data": { + "text/plain": [ + "nan" + ] + }, + "execution_count": 93, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "float(\"inf\") + float(\"-inf\")" + ] + }, + { + "cell_type": "code", + "execution_count": 94, + "metadata": { + "slideshow": { + "slide_type": "-" + } + }, + "outputs": [ + { + "data": { + "text/plain": [ + "nan" + ] + }, + "execution_count": 94, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "float(\"inf\") - float(\"inf\")" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "slideshow": { + "slide_type": "slide" + } + }, + "source": [ + "### Imprecision" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "slideshow": { + "slide_type": "skip" + } + }, + "source": [ + "`float` objects are *inherently* imprecise, and there is *nothing* we can do about it! In particular, arithmetic operations with two `float` objects may result in \"weird\" rounding \"errors\" that are strictly deterministic and occur in accordance with the [IEEE 754](https://en.wikipedia.org/wiki/IEEE_754) standard.\n", + "\n", + "For example, let's add `1e0` to `1e15` and `1e16`, respectively. In the latter case, the `1e0` somehow gets \"lost\"." + ] + }, + { + "cell_type": "code", + "execution_count": 95, + "metadata": { + "slideshow": { + "slide_type": "skip" + } + }, + "outputs": [ + { + "data": { + "text/plain": [ + "1000000000000001.0" + ] + }, + "execution_count": 95, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "1e15 + 1e0" + ] + }, + { + "cell_type": "code", + "execution_count": 96, + "metadata": { + "slideshow": { + "slide_type": "skip" + } + }, + "outputs": [ + { + "data": { + "text/plain": [ + "1e+16" + ] + }, + "execution_count": 96, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "1e16 + 1e0" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "slideshow": { + "slide_type": "skip" + } + }, + "source": [ + "Of course, we may also add the `int` object `1` to the `float` objects in the literal `e` notation with the same outcome." + ] + }, + { + "cell_type": "code", + "execution_count": 97, + "metadata": { + "slideshow": { + "slide_type": "slide" + } + }, + "outputs": [ + { + "data": { + "text/plain": [ + "1000000000000001.0" + ] + }, + "execution_count": 97, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "1e15 + 1" + ] + }, + { + "cell_type": "code", + "execution_count": 98, + "metadata": { + "slideshow": { + "slide_type": "-" + } + }, + "outputs": [ + { + "data": { + "text/plain": [ + "1e+16" + ] + }, + "execution_count": 98, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "1e16 + 1" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "slideshow": { + "slide_type": "skip" + } + }, + "source": [ + "Interactions between sufficiently large and small `float` objects are not the only source of imprecision." + ] + }, + { + "cell_type": "code", + "execution_count": 99, + "metadata": { + "slideshow": { + "slide_type": "slide" + } + }, + "outputs": [], + "source": [ + "from math import sqrt" + ] + }, + { + "cell_type": "code", + "execution_count": 100, + "metadata": { + "slideshow": { + "slide_type": "-" + } + }, + "outputs": [ + { + "data": { + "text/plain": [ + "2.0000000000000004" + ] + }, + "execution_count": 100, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "sqrt(2) ** 2" + ] + }, + { + "cell_type": "code", + "execution_count": 101, + "metadata": { + "slideshow": { + "slide_type": "fragment" + } + }, + "outputs": [ + { + "data": { + "text/plain": [ + "0.30000000000000004" + ] + }, + "execution_count": 101, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "0.1 + 0.2" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "slideshow": { + "slide_type": "skip" + } + }, + "source": [ + "This may become a problem if we rely on equality checks in our programs." + ] + }, + { + "cell_type": "code", + "execution_count": 102, + "metadata": { + "slideshow": { + "slide_type": "fragment" + } + }, + "outputs": [ + { + "data": { + "text/plain": [ + "False" + ] + }, + "execution_count": 102, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "sqrt(2) ** 2 == 2" + ] + }, + { + "cell_type": "code", + "execution_count": 103, + "metadata": { + "slideshow": { + "slide_type": "-" + } + }, + "outputs": [ + { + "data": { + "text/plain": [ + "False" + ] + }, + "execution_count": 103, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "0.1 + 0.2 == 0.3" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "slideshow": { + "slide_type": "skip" + } + }, + "source": [ + "A popular workaround is to benchmark the difference between the two numbers to be checked for equality against a pre-defined `threshold` *sufficiently* close to `0`, for example, `1e-15`." + ] + }, + { + "cell_type": "code", + "execution_count": 104, + "metadata": { + "slideshow": { + "slide_type": "slide" + } + }, + "outputs": [], + "source": [ + "threshold = 1e-15" + ] + }, + { + "cell_type": "code", + "execution_count": 105, + "metadata": { + "slideshow": { + "slide_type": "fragment" + } + }, + "outputs": [ + { + "data": { + "text/plain": [ + "True" + ] + }, + "execution_count": 105, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "(sqrt(2) ** 2) - 2 < threshold" + ] + }, + { + "cell_type": "code", + "execution_count": 106, + "metadata": { + "slideshow": { + "slide_type": "-" + } + }, + "outputs": [ + { + "data": { + "text/plain": [ + "True" + ] + }, + "execution_count": 106, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "(0.1 + 0.2) - 0.3 < threshold" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "slideshow": { + "slide_type": "skip" + } + }, + "source": [ + "The built-in [format()](https://docs.python.org/3/library/functions.html#format) function allows us to show the **significant digits** of a `float` number as they exist in memory to arbitrary precision. To exemplify it, let's view a couple of `float` objects with `50` digits. This analysis reveals that almost no `float` number is precise! After $14$ or $15$ digits \"weird\" things happen. As we see further below, the \"random\" digits ending the `float` numbers do *not* \"physically\" exist in memory.\n", + "\n", + "The [format()](https://docs.python.org/3/library/functions.html#format) function is different from the [format()](https://docs.python.org/3/library/stdtypes.html#str.format) method on `str` objects introduced in the next chapter and both work with the so-called [format specification mini-language](https://docs.python.org/3/library/string.html#format-specification-mini-language): `\".50f\"` is the instruction to show `50` digits of a `float` number. But let's not worry too much about these details for now." + ] + }, + { + "cell_type": "code", + "execution_count": 107, + "metadata": { + "slideshow": { + "slide_type": "slide" + } + }, + "outputs": [ + { + "data": { + "text/plain": [ + "'0.10000000000000000555111512312578270211815834045410'" + ] + }, + "execution_count": 107, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "format(0.1, \".50f\")" + ] + }, + { + "cell_type": "code", + "execution_count": 108, + "metadata": { + "slideshow": { + "slide_type": "-" + } + }, + "outputs": [ + { + "data": { + "text/plain": [ + "'0.20000000000000001110223024625156540423631668090820'" + ] + }, + "execution_count": 108, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "format(0.2, \".50f\")" + ] + }, + { + "cell_type": "code", + "execution_count": 109, + "metadata": { + "slideshow": { + "slide_type": "-" + } + }, + "outputs": [ + { + "data": { + "text/plain": [ + "'0.29999999999999998889776975374843459576368331909180'" + ] + }, + "execution_count": 109, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "format(0.3, \".50f\")" + ] + }, + { + "cell_type": "code", + "execution_count": 110, + "metadata": { + "slideshow": { + "slide_type": "slide" + } + }, + "outputs": [ + { + "data": { + "text/plain": [ + "'0.33333333333333331482961625624739099293947219848633'" + ] + }, + "execution_count": 110, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "format(1 / 3, \".50f\")" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "slideshow": { + "slide_type": "skip" + } + }, + "source": [ + "The [format()](https://docs.python.org/3/library/functions.html#format) function does *not* round a `float` object in the mathematical sense! It just allows us to show an arbitrary number of the digits as stored in memory, and it also does *not* change these.\n", + "\n", + "On the contrary, the built-in [round()](https://docs.python.org/3/library/functions.html#round) function creates a *new* numeric object that is a rounded version of the one passed in as the argument. It adheres to the common rules of math.\n", + "\n", + "For example, let's round `1 / 3` to five decimals. The obtained value for `roughly_a_third` is also *imprecise* but different from the \"exact\" representation of `1 / 3` above." + ] + }, + { + "cell_type": "code", + "execution_count": 111, + "metadata": { + "slideshow": { + "slide_type": "fragment" + } + }, + "outputs": [], + "source": [ + "roughly_a_third = round(1 / 3, 5)" + ] + }, + { + "cell_type": "code", + "execution_count": 112, + "metadata": { + "slideshow": { + "slide_type": "-" + } + }, + "outputs": [ + { + "data": { + "text/plain": [ + "0.33333" + ] + }, + "execution_count": 112, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "roughly_a_third" + ] + }, + { + "cell_type": "code", + "execution_count": 113, + "metadata": { + "slideshow": { + "slide_type": "-" + } + }, + "outputs": [ + { + "data": { + "text/plain": [ + "'0.33333000000000001517008740847813896834850311279297'" + ] + }, + "execution_count": 113, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "format(roughly_a_third, \".50f\")" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "slideshow": { + "slide_type": "skip" + } + }, + "source": [ + "Surprisingly, `0.125` and `0.25` appear to be *precise*, and equality comparison works without the `threshold` workaround: Both are powers of $2$ in disguise." + ] + }, + { + "cell_type": "code", + "execution_count": 114, + "metadata": { + "slideshow": { + "slide_type": "slide" + } + }, + "outputs": [ + { + "data": { + "text/plain": [ + "'0.12500000000000000000000000000000000000000000000000'" + ] + }, + "execution_count": 114, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "format(0.125, \".50f\")" + ] + }, + { + "cell_type": "code", + "execution_count": 115, + "metadata": { + "slideshow": { + "slide_type": "-" + } + }, + "outputs": [ + { + "data": { + "text/plain": [ + "'0.25000000000000000000000000000000000000000000000000'" + ] + }, + "execution_count": 115, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "format(0.25, \".50f\")" + ] + }, + { + "cell_type": "code", + "execution_count": 116, + "metadata": { + "slideshow": { + "slide_type": "-" + } + }, + "outputs": [ + { + "data": { + "text/plain": [ + "True" + ] + }, + "execution_count": 116, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "0.125 + 0.125 == 0.25" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "slideshow": { + "slide_type": "slide" + } + }, + "source": [ + "### Binary Representations" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "slideshow": { + "slide_type": "skip" + } + }, + "source": [ + "To understand these subtleties, we need to look at the **[binary representation of floats](https://en.wikipedia.org/wiki/Double-precision_floating-point_format)** and review the basics of the **[IEEE 754](https://en.wikipedia.org/wiki/IEEE_754)** standard. On modern machines, floats are modeled in so-called double precision with $64$ bits that are grouped as in the figure below. The first bit determines the sign ($0$ for plus, $1$ for minus), the next $11$ bits represent an $exponent$ term, and the last $52$ bits resemble the actual significant digits, the so-called $fraction$ part. The three groups are put together like so:\n", + "\n", + "$$float = (-1)^{sign} * 1.fraction * 2^{exponent-1023}$$\n", + "\n", + "A $1.$ is implicitly prepended as the first digit, and both, $fraction$ and $exponent$, are stored in base $2$ representation (i.e., they both are interpreted like integers above). As $exponent$ is consequently non-negative, between $0_{10}$ and $2047_{10}$ to be precise, the $-1023$ centers the entire $2^{exponent-1023}$ term around $1$ and allows the period within the $1.fraction$ part be shifted into either direction by the same amount. Floating-point numbers received their name as the period, formally called the **[radix point](https://en.wikipedia.org/wiki/Radix_point)**, \"floats\" along the significant digits. As an aside, an $exponent$ of all $0$s or all $1$s is used to model the special values `nan` or `inf`.\n", + "\n", + "As the standard defines the exponent part to come as a power of $2$, we now see why `0.125` is a *precise* float: It can be represented as a power of $2$, i.e., $0.125 = (-1)^0 * 1.0 * 2^{1020-1023} = 2^{-3} = \\frac{1}{8}$. In other words, the floating-point representation of $0.125_{10}$ is $0_2$, $1111111100_2 = 1020_{10}$, and $0_2$ for the three groups, respectively." + ] + }, + { + "cell_type": "markdown", + "metadata": { + "slideshow": { + "slide_type": "slide" + } + }, + "source": [ + "" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "slideshow": { + "slide_type": "skip" + } + }, + "source": [ + "The crucial fact for the data science practitioner to understand is that mapping the *infinite* set of the real numbers $\\mathbb{R}$ to a *finite* set of bits leads to the imprecisions shown above!\n", + "\n", + "So, floats are usually good approximations of real numbers only with their first $14$ or $15$ digits. If more precision is required, we need to revert to other data types such as a `Decimal` or a `Fraction`, as shown in the next two sections.\n", + "\n", + "This [blog post](http://fabiensanglard.net/floating_point_visually_explained/) gives another neat and *visual* way as to how to think of floats. It also explains why floats become worse approximations of the reals as their absolute values increase.\n", + "\n", + "The Python [documentation](https://docs.python.org/3/tutorial/floatingpoint.html) provides another good discussion of floats and the goodness of their approximations.\n", + "\n", + "If we are interested in the exact bits behind a `float` object, we use the [hex()](https://docs.python.org/3/library/stdtypes.html#float.hex) method that returns a `str` object beginning with `\"0x1.\"` followed by the $fraction$ in hexadecimal notation and the $exponent$ as an integer after subtraction of $1023$ and separated by a `\"p\"`." + ] + }, + { + "cell_type": "code", + "execution_count": 117, + "metadata": { + "slideshow": { + "slide_type": "slide" + } + }, + "outputs": [], + "source": [ + "one_eighth = 1 / 8" + ] + }, + { + "cell_type": "code", + "execution_count": 118, + "metadata": { + "slideshow": { + "slide_type": "-" + } + }, + "outputs": [ + { + "data": { + "text/plain": [ + "'0x1.0000000000000p-3'" + ] + }, + "execution_count": 118, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "one_eighth.hex() # this basically says 2 ** (-3)" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "slideshow": { + "slide_type": "skip" + } + }, + "source": [ + "Also, the [as_integer_ratio()](https://docs.python.org/3/library/stdtypes.html#float.as_integer_ratio) method returns the two smallest integers whose ratio best approximates a `float` object." + ] + }, + { + "cell_type": "code", + "execution_count": 119, + "metadata": { + "slideshow": { + "slide_type": "-" + } + }, + "outputs": [ + { + "data": { + "text/plain": [ + "(1, 8)" + ] + }, + "execution_count": 119, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "one_eighth.as_integer_ratio()" + ] + }, + { + "cell_type": "code", + "execution_count": 120, + "metadata": { + "slideshow": { + "slide_type": "fragment" + } + }, + "outputs": [ + { + "data": { + "text/plain": [ + "'0x1.555475a31a4bep-2'" + ] + }, + "execution_count": 120, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "roughly_a_third.hex()" + ] + }, + { + "cell_type": "code", + "execution_count": 121, + "metadata": { + "slideshow": { + "slide_type": "-" + } + }, + "outputs": [ + { + "data": { + "text/plain": [ + "(3002369727582815, 9007199254740992)" + ] + }, + "execution_count": 121, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "roughly_a_third.as_integer_ratio()" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "slideshow": { + "slide_type": "skip" + } + }, + "source": [ + "`0.0` is also a power of $2$ and thus a *precise* `float` number." + ] + }, + { + "cell_type": "code", + "execution_count": 122, + "metadata": { + "slideshow": { + "slide_type": "slide" + } + }, + "outputs": [], + "source": [ + "zero = 0.0" + ] + }, + { + "cell_type": "code", + "execution_count": 123, + "metadata": { + "slideshow": { + "slide_type": "-" + } + }, + "outputs": [ + { + "data": { + "text/plain": [ + "'0x0.0p+0'" + ] + }, + "execution_count": 123, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "zero.hex()" + ] + }, + { + "cell_type": "code", + "execution_count": 124, + "metadata": { + "slideshow": { + "slide_type": "-" + } + }, + "outputs": [ + { + "data": { + "text/plain": [ + "(0, 1)" + ] + }, + "execution_count": 124, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "zero.as_integer_ratio()" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "slideshow": { + "slide_type": "skip" + } + }, + "source": [ + "As seen in [Chapter 1](https://nbviewer.jupyter.org/github/webartifex/intro-to-python/blob/master/01_elements.ipynb#%28Data%29-Type-%2F-%22Behavior%22), the [is_integer()](https://docs.python.org/3/library/stdtypes.html#float.is_integer) method tells us if a `float` can be casted as an `int` object without any loss in precision." + ] + }, + { + "cell_type": "code", + "execution_count": 125, + "metadata": { + "slideshow": { + "slide_type": "skip" + } + }, + "outputs": [ + { + "data": { + "text/plain": [ + "False" + ] + }, + "execution_count": 125, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "roughly_a_third.is_integer()" + ] + }, + { + "cell_type": "code", + "execution_count": 126, + "metadata": { + "slideshow": { + "slide_type": "skip" + } + }, + "outputs": [ + { + "data": { + "text/plain": [ + "True" + ] + }, + "execution_count": 126, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "one = roughly_a_third / roughly_a_third\n", + "\n", + "one.is_integer()" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "slideshow": { + "slide_type": "skip" + } + }, + "source": [ + "As the exact implementation of floats may vary and be dependent on a particular Python installation, we look up the [float_info](https://docs.python.org/3/library/sys.html#sys.float_info) attribute in the [sys](https://docs.python.org/3/library/sys.html) module in the [standard library](https://docs.python.org/3/library/index.html) to check the details. Usually, this is not necessary." + ] + }, + { + "cell_type": "code", + "execution_count": 127, + "metadata": { + "slideshow": { + "slide_type": "skip" + } + }, + "outputs": [], + "source": [ + "import sys" + ] + }, + { + "cell_type": "code", + "execution_count": 128, + "metadata": { + "slideshow": { + "slide_type": "skip" + } + }, + "outputs": [ + { + "data": { + "text/plain": [ + "sys.float_info(max=1.7976931348623157e+308, max_exp=1024, max_10_exp=308, min=2.2250738585072014e-308, min_exp=-1021, min_10_exp=-307, dig=15, mant_dig=53, epsilon=2.220446049250313e-16, radix=2, rounds=1)" + ] + }, + "execution_count": 128, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "sys.float_info" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "slideshow": { + "slide_type": "slide" + } + }, + "source": [ + "## The `Decimal` Type" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "slideshow": { + "slide_type": "skip" + } + }, + "source": [ + "The [decimal](https://docs.python.org/3/library/decimal.html) module in the [standard library](https://docs.python.org/3/library/index.html) provides a [Decimal](https://docs.python.org/3/library/decimal.html#decimal.Decimal) type that may be used to represent any real number to a user-defined level of precision: \"User-defined\" does *not* mean an infinite or \"exact\" precision! The `Decimal` type merely allows us to work with a number of bits *different* from the $64$ as specified for the `float` type and also to customize the rounding rules and some other settings.\n", + "\n", + "We import the `Decimal` type and also the [getcontext()](https://docs.python.org/3/library/decimal.html#decimal.getcontext) function from the [decimal](https://docs.python.org/3/library/decimal.html) module." + ] + }, + { + "cell_type": "code", + "execution_count": 129, + "metadata": { + "slideshow": { + "slide_type": "slide" + } + }, + "outputs": [], + "source": [ + "from decimal import Decimal, getcontext" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "slideshow": { + "slide_type": "skip" + } + }, + "source": [ + "[getcontext()](https://docs.python.org/3/library/decimal.html#decimal.getcontext) shows us how the [decimal](https://docs.python.org/3/library/decimal.html) module is set up. By default, the precision is set to `28` significant digits, which is roughly twice as many as with `float` objects." + ] + }, + { + "cell_type": "code", + "execution_count": 130, + "metadata": { + "slideshow": { + "slide_type": "fragment" + } + }, + "outputs": [ + { + "data": { + "text/plain": [ + "Context(prec=28, rounding=ROUND_HALF_EVEN, Emin=-999999, Emax=999999, capitals=1, clamp=0, flags=[], traps=[InvalidOperation, DivisionByZero, Overflow])" + ] + }, + "execution_count": 130, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "getcontext()" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "slideshow": { + "slide_type": "skip" + } + }, + "source": [ + "The two simplest ways to create a `Decimal` object is to either **instantiate** it with an `int` or a `str` object consisting of all the significant digits. In the latter case, the scientific notation is also possible." + ] + }, + { + "cell_type": "code", + "execution_count": 131, + "metadata": { + "slideshow": { + "slide_type": "slide" + } + }, + "outputs": [ + { + "data": { + "text/plain": [ + "Decimal('42')" + ] + }, + "execution_count": 131, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "Decimal(42)" + ] + }, + { + "cell_type": "code", + "execution_count": 132, + "metadata": { + "slideshow": { + "slide_type": "-" + } + }, + "outputs": [ + { + "data": { + "text/plain": [ + "Decimal('0.1')" + ] + }, + "execution_count": 132, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "Decimal(\"0.1\")" + ] + }, + { + "cell_type": "code", + "execution_count": 133, + "metadata": { + "slideshow": { + "slide_type": "-" + } + }, + "outputs": [ + { + "data": { + "text/plain": [ + "Decimal('1E+5')" + ] + }, + "execution_count": 133, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "Decimal(\"1e5\")" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "slideshow": { + "slide_type": "skip" + } + }, + "source": [ + "It is *not* a good idea to create a `Decimal` from a `float` object. If we did so, we would create a `Decimal` object that internally used extra bits to store the \"random\" digits that are not stored in the `float` object in the first place." + ] + }, + { + "cell_type": "code", + "execution_count": 134, + "metadata": { + "slideshow": { + "slide_type": "fragment" + } + }, + "outputs": [ + { + "data": { + "text/plain": [ + "Decimal('0.1000000000000000055511151231257827021181583404541015625')" + ] + }, + "execution_count": 134, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "Decimal(0.1) # do not create a Decimal from a float" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "slideshow": { + "slide_type": "skip" + } + }, + "source": [ + "With the `Decimal` type, the imprecisions in the arithmetic and equality comparisons from above go away." + ] + }, + { + "cell_type": "code", + "execution_count": 135, + "metadata": { + "slideshow": { + "slide_type": "slide" + } + }, + "outputs": [ + { + "data": { + "text/plain": [ + "Decimal('0.3')" + ] + }, + "execution_count": 135, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "Decimal(\"0.1\") + Decimal(\"0.2\")" + ] + }, + { + "cell_type": "code", + "execution_count": 136, + "metadata": { + "slideshow": { + "slide_type": "-" + } + }, + "outputs": [ + { + "data": { + "text/plain": [ + "True" + ] + }, + "execution_count": 136, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "Decimal(\"0.1\") + Decimal(\"0.2\") == Decimal(\"0.3\")" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "slideshow": { + "slide_type": "skip" + } + }, + "source": [ + "`Decimal` numbers *preserve* the **significant digits**, even in cases where this is not needed." + ] + }, + { + "cell_type": "code", + "execution_count": 137, + "metadata": { + "slideshow": { + "slide_type": "fragment" + } + }, + "outputs": [ + { + "data": { + "text/plain": [ + "Decimal('0.30000')" + ] + }, + "execution_count": 137, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "Decimal(\"0.10000\") + Decimal(\"0.20000\")" + ] + }, + { + "cell_type": "code", + "execution_count": 138, + "metadata": { + "slideshow": { + "slide_type": "-" + } + }, + "outputs": [ + { + "data": { + "text/plain": [ + "True" + ] + }, + "execution_count": 138, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "Decimal(\"0.10000\") + Decimal(\"0.20000\") == Decimal(\"0.3\")" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "slideshow": { + "slide_type": "skip" + } + }, + "source": [ + "Arithmetic operations between `Decimal` and `int` objects work as the latter are inherently precise: The results are *new* `Decimal` objects." + ] + }, + { + "cell_type": "code", + "execution_count": 139, + "metadata": { + "slideshow": { + "slide_type": "slide" + } + }, + "outputs": [ + { + "data": { + "text/plain": [ + "Decimal('42')" + ] + }, + "execution_count": 139, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "42 + Decimal(42) - 42" + ] + }, + { + "cell_type": "code", + "execution_count": 140, + "metadata": { + "slideshow": { + "slide_type": "-" + } + }, + "outputs": [ + { + "data": { + "text/plain": [ + "Decimal('42')" + ] + }, + "execution_count": 140, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "10 * Decimal(42) / 10" + ] + }, + { + "cell_type": "code", + "execution_count": 141, + "metadata": { + "slideshow": { + "slide_type": "slide" + } + }, + "outputs": [ + { + "data": { + "text/plain": [ + "Decimal('0.1')" + ] + }, + "execution_count": 141, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "Decimal(1) / 10" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "slideshow": { + "slide_type": "skip" + } + }, + "source": [ + "To verify the precision, we apply the built-in [format()](https://docs.python.org/3/library/functions.html#format) function to the previous code cell and compare it with the same division resulting in a `float` object." + ] + }, + { + "cell_type": "code", + "execution_count": 142, + "metadata": { + "slideshow": { + "slide_type": "fragment" + } + }, + "outputs": [ + { + "data": { + "text/plain": [ + "'0.10000000000000000000000000000000000000000000000000'" + ] + }, + "execution_count": 142, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "format(Decimal(1) / 10, \".50f\")" + ] + }, + { + "cell_type": "code", + "execution_count": 143, + "metadata": { + "slideshow": { + "slide_type": "-" + } + }, + "outputs": [ + { + "data": { + "text/plain": [ + "'0.10000000000000000555111512312578270211815834045410'" + ] + }, + "execution_count": 143, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "format(1 / 10, \".50f\")" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "slideshow": { + "slide_type": "skip" + } + }, + "source": [ + "However, mixing `Decimal` and `float` objects raises a `TypeError`: So, Python prevents us from potentially introducing imprecisions via innocent-looking arithmetic by **failing loudly**." + ] + }, + { + "cell_type": "code", + "execution_count": 144, + "metadata": { + "slideshow": { + "slide_type": "slide" + } + }, + "outputs": [ + { + "ename": "TypeError", + "evalue": "unsupported operand type(s) for *: 'float' and 'decimal.Decimal'", + "output_type": "error", + "traceback": [ + "\u001b[0;31m---------------------------------------------------------------------------\u001b[0m", + "\u001b[0;31mTypeError\u001b[0m Traceback (most recent call last)", + "\u001b[0;32m\u001b[0m in \u001b[0;36m\u001b[0;34m\u001b[0m\n\u001b[0;32m----> 1\u001b[0;31m \u001b[0;36m1.0\u001b[0m \u001b[0;34m*\u001b[0m \u001b[0mDecimal\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0;36m42\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0m", + "\u001b[0;31mTypeError\u001b[0m: unsupported operand type(s) for *: 'float' and 'decimal.Decimal'" + ] + } + ], + "source": [ + "1.0 * Decimal(42)" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "slideshow": { + "slide_type": "skip" + } + }, + "source": [ + "To preserve the precision for more advanced mathematical functions, `Decimal` objects come with many **methods bound** on them. For example, [ln()](https://docs.python.org/3/library/decimal.html#decimal.Decimal.ln) and [log10()](https://docs.python.org/3/library/decimal.html#decimal.Decimal.log10) take the logarithm while [sqrt()](https://docs.python.org/3/library/decimal.html#decimal.Decimal.sqrt) calculates the square root. In general, the functions in the [math](https://docs.python.org/3/library/math.html) module in the [standard library](https://docs.python.org/3/library/index.html) should only be used with `float` objects as they do *not* preserve precision." + ] + }, + { + "cell_type": "code", + "execution_count": 145, + "metadata": { + "slideshow": { + "slide_type": "slide" + } + }, + "outputs": [ + { + "data": { + "text/plain": [ + "Decimal('2')" + ] + }, + "execution_count": 145, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "Decimal(100).log10()" + ] + }, + { + "cell_type": "code", + "execution_count": 146, + "metadata": { + "slideshow": { + "slide_type": "-" + } + }, + "outputs": [ + { + "data": { + "text/plain": [ + "Decimal('1.414213562373095048801688724')" + ] + }, + "execution_count": 146, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "Decimal(2).sqrt()" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "slideshow": { + "slide_type": "skip" + } + }, + "source": [ + "The object returned by the [sqrt()](https://docs.python.org/3/library/decimal.html#decimal.Decimal.sqrt) method is still limited in precision: This must be so as, for example, $\\sqrt{2}$ is an **[irrational number](https://en.wikipedia.org/wiki/Irrational_number)** that *cannot* be expressed with absolute precision using *any* number of bits, even in theory.\n", + "\n", + "We see this as raising $\\sqrt{2}$ to the power of $2$ results in an imprecise value as before!" + ] + }, + { + "cell_type": "code", + "execution_count": 147, + "metadata": { + "slideshow": { + "slide_type": "fragment" + } + }, + "outputs": [ + { + "data": { + "text/plain": [ + "Decimal('1.999999999999999999999999999')" + ] + }, + "execution_count": 147, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "two = Decimal(2).sqrt() ** 2\n", + "\n", + "two" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "slideshow": { + "slide_type": "skip" + } + }, + "source": [ + "The [quantize()](https://docs.python.org/3/library/decimal.html#decimal.Decimal.quantize) method allows us to [quantize](https://www.dictionary.com/browse/quantize) (i.e., \"round\") a `Decimal` number at any precision that is *smaller* than the set precision. It looks at the number of decimals (i.e., to the right of the period) of the numeric argument we pass in.\n", + "\n", + "For example, as the overall imprecise value of `two` still has an internal precision of `28` digits, we can correctly round it to *four* decimals (i.e., `Decimal(\"0.0001\")` has four decimals)." + ] + }, + { + "cell_type": "code", + "execution_count": 148, + "metadata": { + "slideshow": { + "slide_type": "fragment" + } + }, + "outputs": [ + { + "data": { + "text/plain": [ + "Decimal('2.0000')" + ] + }, + "execution_count": 148, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "two.quantize(Decimal(\"0.0001\"))" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "We can never round a `Decimal` number and obtain a greater precision than before: The `InvalidOperation` exception tells us that *loudly*." + ] + }, + { + "cell_type": "code", + "execution_count": 149, + "metadata": {}, + "outputs": [ + { + "ename": "InvalidOperation", + "evalue": "[]", + "output_type": "error", + "traceback": [ + "\u001b[0;31m---------------------------------------------------------------------------\u001b[0m", + "\u001b[0;31mInvalidOperation\u001b[0m Traceback (most recent call last)", + "\u001b[0;32m\u001b[0m in \u001b[0;36m\u001b[0;34m\u001b[0m\n\u001b[0;32m----> 1\u001b[0;31m \u001b[0mtwo\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0mquantize\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0mDecimal\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0;34m\"0.1\"\u001b[0m\u001b[0;34m)\u001b[0m \u001b[0;34m**\u001b[0m \u001b[0;36m28\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0m", + "\u001b[0;31mInvalidOperation\u001b[0m: []" + ] + } + ], + "source": [ + "two.quantize(Decimal(\"0.1\") ** 28)" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "slideshow": { + "slide_type": "skip" + } + }, + "source": [ + "Consequently, with this little workaround $\\sqrt{2}^2 = 2$ works, even in Python." + ] + }, + { + "cell_type": "code", + "execution_count": 150, + "metadata": { + "slideshow": { + "slide_type": "skip" + } + }, + "outputs": [ + { + "data": { + "text/plain": [ + "True" + ] + }, + "execution_count": 150, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "two.quantize(Decimal(\"0.0001\")) == 2" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "slideshow": { + "slide_type": "skip" + } + }, + "source": [ + "The downside is that the entire expression is not as pretty as `sqrt(2) ** 2 == 2` from above." + ] + }, + { + "cell_type": "code", + "execution_count": 151, + "metadata": { + "slideshow": { + "slide_type": "fragment" + } + }, + "outputs": [ + { + "data": { + "text/plain": [ + "True" + ] + }, + "execution_count": 151, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "(Decimal(2).sqrt() ** 2).quantize(Decimal(\"0.0001\")) == 2" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "slideshow": { + "slide_type": "skip" + } + }, + "source": [ + "`nan` and positive and negative `inf` exist as well, and the same remarks from above apply." + ] + }, + { + "cell_type": "code", + "execution_count": 152, + "metadata": { + "slideshow": { + "slide_type": "skip" + } + }, + "outputs": [ + { + "data": { + "text/plain": [ + "Decimal('NaN')" + ] + }, + "execution_count": 152, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "Decimal(\"nan\")" + ] + }, + { + "cell_type": "code", + "execution_count": 153, + "metadata": { + "slideshow": { + "slide_type": "skip" + } + }, + "outputs": [ + { + "data": { + "text/plain": [ + "False" + ] + }, + "execution_count": 153, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "Decimal(\"nan\") == Decimal(\"nan\") # nan's never compare equal to anything, not even to themselves" + ] + }, + { + "cell_type": "code", + "execution_count": 154, + "metadata": { + "slideshow": { + "slide_type": "skip" + } + }, + "outputs": [ + { + "data": { + "text/plain": [ + "Decimal('Infinity')" + ] + }, + "execution_count": 154, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "Decimal(\"inf\")" + ] + }, + { + "cell_type": "code", + "execution_count": 155, + "metadata": { + "slideshow": { + "slide_type": "skip" + } + }, + "outputs": [ + { + "data": { + "text/plain": [ + "Decimal('-Infinity')" + ] + }, + "execution_count": 155, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "Decimal(\"-inf\")" + ] + }, + { + "cell_type": "code", + "execution_count": 156, + "metadata": { + "slideshow": { + "slide_type": "skip" + } + }, + "outputs": [ + { + "data": { + "text/plain": [ + "Decimal('Infinity')" + ] + }, + "execution_count": 156, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "Decimal(\"inf\") + 42 # Infinity is infinity, concrete numbers loose its meaning" + ] + }, + { + "cell_type": "code", + "execution_count": 157, + "metadata": { + "slideshow": { + "slide_type": "skip" + } + }, + "outputs": [ + { + "data": { + "text/plain": [ + "True" + ] + }, + "execution_count": 157, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "Decimal(\"inf\") + 42 == Decimal(\"inf\")" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "slideshow": { + "slide_type": "skip" + } + }, + "source": [ + "As with `float` objects, we cannot add infinities of different signs: Now get a module-specific `InvalidOperation` exception instead of a `nan` value. Here, **failing loudly** is a good thing as it prevents us from working with invalid results." + ] + }, + { + "cell_type": "code", + "execution_count": 158, + "metadata": { + "slideshow": { + "slide_type": "skip" + } + }, + "outputs": [ + { + "ename": "InvalidOperation", + "evalue": "[]", + "output_type": "error", + "traceback": [ + "\u001b[0;31m---------------------------------------------------------------------------\u001b[0m", + "\u001b[0;31mInvalidOperation\u001b[0m Traceback (most recent call last)", + "\u001b[0;32m\u001b[0m in \u001b[0;36m\u001b[0;34m\u001b[0m\n\u001b[0;32m----> 1\u001b[0;31m \u001b[0mDecimal\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0;34m\"inf\"\u001b[0m\u001b[0;34m)\u001b[0m \u001b[0;34m+\u001b[0m \u001b[0mDecimal\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0;34m\"-inf\"\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0m", + "\u001b[0;31mInvalidOperation\u001b[0m: []" + ] + } + ], + "source": [ + "Decimal(\"inf\") + Decimal(\"-inf\")" + ] + }, + { + "cell_type": "code", + "execution_count": 159, + "metadata": { + "slideshow": { + "slide_type": "skip" + } + }, + "outputs": [ + { + "ename": "InvalidOperation", + "evalue": "[]", + "output_type": "error", + "traceback": [ + "\u001b[0;31m---------------------------------------------------------------------------\u001b[0m", + "\u001b[0;31mInvalidOperation\u001b[0m Traceback (most recent call last)", + "\u001b[0;32m\u001b[0m in \u001b[0;36m\u001b[0;34m\u001b[0m\n\u001b[0;32m----> 1\u001b[0;31m \u001b[0mDecimal\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0;34m\"inf\"\u001b[0m\u001b[0;34m)\u001b[0m \u001b[0;34m-\u001b[0m \u001b[0mDecimal\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0;34m\"inf\"\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0m", + "\u001b[0;31mInvalidOperation\u001b[0m: []" + ] + } + ], + "source": [ + "Decimal(\"inf\") - Decimal(\"inf\")" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "slideshow": { + "slide_type": "skip" + } + }, + "source": [ + "For more information on the `Decimal` type, see the tutorial at [PYMOTW](https://pymotw.com/3/decimal/index.html) or the official [documentation](https://docs.python.org/3/library/decimal.html)." + ] + }, + { + "cell_type": "markdown", + "metadata": { + "slideshow": { + "slide_type": "slide" + } + }, + "source": [ + "## The `Fraction` Type" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "slideshow": { + "slide_type": "skip" + } + }, + "source": [ + "If the numbers in an application can be expressed as [rational numbers](https://en.wikipedia.org/wiki/Rational_number) (i.e., the set $\\mathbb{Q}$), we may model them as a [Fraction](https://docs.python.org/3/library/fractions.html#fractions.Fraction) type from the [fractions](https://docs.python.org/3/library/fractions.html) module in the [standard library](https://docs.python.org/3/library/index.html). As any fraction can always be formulated as the division of one integer by another, `Fraction` objects are inherently precise, just as `int` objects on their own. Further, we maintain the precision as long as we do not use them in a mathematical operation that could result in an irrational number (e.g., taking the square root).\n", + "\n", + "We import the `Fraction` type from the [fractions](https://docs.python.org/3/library/fractions.html) module." + ] + }, + { + "cell_type": "code", + "execution_count": 160, + "metadata": { + "slideshow": { + "slide_type": "slide" + } + }, + "outputs": [], + "source": [ + "from fractions import Fraction" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "slideshow": { + "slide_type": "skip" + } + }, + "source": [ + "Among others, there are two simple ways to create a `Fraction` object: We either instantiate one with two `int` objects representing the numerator and denominator or with a `str` object. In the latter case, we have two options again and use either the format \"numerator/denominator\" (i.e., *without* any spaces) or the same format as for `float` and `Decimal` objects above." + ] + }, + { + "cell_type": "code", + "execution_count": 161, + "metadata": { + "slideshow": { + "slide_type": "fragment" + } + }, + "outputs": [ + { + "data": { + "text/plain": [ + "Fraction(1, 3)" + ] + }, + "execution_count": 161, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "Fraction(1, 3) # this is now 1/3 with \"full\" precision" + ] + }, + { + "cell_type": "code", + "execution_count": 162, + "metadata": { + "slideshow": { + "slide_type": "-" + } + }, + "outputs": [ + { + "data": { + "text/plain": [ + "Fraction(1, 3)" + ] + }, + "execution_count": 162, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "Fraction(\"1/3\") # this is 1/3 with \"full\" precision again" + ] + }, + { + "cell_type": "code", + "execution_count": 163, + "metadata": { + "slideshow": { + "slide_type": "-" + } + }, + "outputs": [ + { + "data": { + "text/plain": [ + "Fraction(3333333333, 10000000000)" + ] + }, + "execution_count": 163, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "Fraction(\"0.3333333333\") # this is 1/3 with a precision of 10 significant digits" + ] + }, + { + "cell_type": "code", + "execution_count": 164, + "metadata": { + "slideshow": { + "slide_type": "-" + } + }, + "outputs": [ + { + "data": { + "text/plain": [ + "Fraction(3333333333, 10000000000)" + ] + }, + "execution_count": 164, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "Fraction(\"3333333333e-10\") # the same in scientific notation" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "slideshow": { + "slide_type": "skip" + } + }, + "source": [ + "Only the lowest common denominator version is maintained after creation: For example, $\\frac{3}{2}$ and $\\frac{6}{4}$ are the same, and both become `Fraction(3, 2)`." + ] + }, + { + "cell_type": "code", + "execution_count": 165, + "metadata": { + "slideshow": { + "slide_type": "slide" + } + }, + "outputs": [ + { + "data": { + "text/plain": [ + "Fraction(3, 2)" + ] + }, + "execution_count": 165, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "Fraction(3, 2)" + ] + }, + { + "cell_type": "code", + "execution_count": 166, + "metadata": { + "slideshow": { + "slide_type": "-" + } + }, + "outputs": [ + { + "data": { + "text/plain": [ + "Fraction(3, 2)" + ] + }, + "execution_count": 166, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "Fraction(6, 4)" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "slideshow": { + "slide_type": "skip" + } + }, + "source": [ + "We could also cast a `Decimal` object as a `Fraction` object: This only makes sense as `Decimal` objects come with a pre-defined precision." + ] + }, + { + "cell_type": "code", + "execution_count": 167, + "metadata": { + "slideshow": { + "slide_type": "slide" + } + }, + "outputs": [ + { + "data": { + "text/plain": [ + "Fraction(1, 10)" + ] + }, + "execution_count": 167, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "Fraction(Decimal(\"0.1\"))" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "slideshow": { + "slide_type": "skip" + } + }, + "source": [ + "`float` objects may *syntactically* be cast as `Fraction` objects as well. However, then we create a `Fraction` object that precisely remembers the `float` object's imprecision: A *bad* idea!" + ] + }, + { + "cell_type": "code", + "execution_count": 168, + "metadata": { + "slideshow": { + "slide_type": "-" + } + }, + "outputs": [ + { + "data": { + "text/plain": [ + "Fraction(3602879701896397, 36028797018963968)" + ] + }, + "execution_count": 168, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "Fraction(0.1)" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "slideshow": { + "slide_type": "skip" + } + }, + "source": [ + "`Fraction` objects follow the arithmetic rules from middle school and may be mixed with `int` objects *without* any loss of precision. The result is always a *new* `Fraction` object." + ] + }, + { + "cell_type": "code", + "execution_count": 169, + "metadata": { + "slideshow": { + "slide_type": "slide" + } + }, + "outputs": [ + { + "data": { + "text/plain": [ + "Fraction(7, 4)" + ] + }, + "execution_count": 169, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "Fraction(3, 2) + Fraction(1, 4)" + ] + }, + { + "cell_type": "code", + "execution_count": 170, + "metadata": { + "slideshow": { + "slide_type": "-" + } + }, + "outputs": [ + { + "data": { + "text/plain": [ + "Fraction(1, 2)" + ] + }, + "execution_count": 170, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "Fraction(5, 2) - 2" + ] + }, + { + "cell_type": "code", + "execution_count": 171, + "metadata": { + "slideshow": { + "slide_type": "-" + } + }, + "outputs": [ + { + "data": { + "text/plain": [ + "Fraction(1, 1)" + ] + }, + "execution_count": 171, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "3 * Fraction(1, 3)" + ] + }, + { + "cell_type": "code", + "execution_count": 172, + "metadata": { + "slideshow": { + "slide_type": "-" + } + }, + "outputs": [ + { + "data": { + "text/plain": [ + "Fraction(1, 1)" + ] + }, + "execution_count": 172, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "Fraction(3, 2) * Fraction(2, 3)" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "slideshow": { + "slide_type": "skip" + } + }, + "source": [ + "`Fraction` and `float` objects may also be mixed *syntactically*. However, then the results may exhibit imprecision again, even if we do not see them at first sight! This is another example of code **failing silently**." + ] + }, + { + "cell_type": "code", + "execution_count": 173, + "metadata": { + "slideshow": { + "slide_type": "slide" + } + }, + "outputs": [ + { + "data": { + "text/plain": [ + "0.1" + ] + }, + "execution_count": 173, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "10.0 * Fraction(1, 100)" + ] + }, + { + "cell_type": "code", + "execution_count": 174, + "metadata": { + "slideshow": { + "slide_type": "fragment" + } + }, + "outputs": [ + { + "data": { + "text/plain": [ + "'0.10000000000000000555111512312578270211815834045410'" + ] + }, + "execution_count": 174, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "format(10.0 * Fraction(1, 100), \".50f\")" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "slideshow": { + "slide_type": "skip" + } + }, + "source": [ + "For more examples and discussions, see the tutorial at [PYMOTW](https://pymotw.com/3/fractions/index.html) or the official [documentation](https://docs.python.org/3/library/fractions.html)." + ] + }, + { + "cell_type": "markdown", + "metadata": { + "slideshow": { + "slide_type": "slide" + } + }, + "source": [ + "## The `complex` Type" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "slideshow": { + "slide_type": "skip" + } + }, + "source": [ + "Some mathematical equations cannot be solved if the solution has to be in the set of the real numbers $\\mathbb{R}$. For example, $x^2 = -1$ can be rearranged into $x = \\sqrt{-1}$, but the square root is not defined for negative numbers. To mitigate this, mathematicians introduced the concept of an [imaginary number](https://en.wikipedia.org/wiki/Imaginary_number) $\\textbf{i}$ that is *defined* as $\\textbf{i} = \\sqrt{-1}$ or often as the solution to the equation $\\textbf{i}^2 = -1$. So, the solution to $x = \\sqrt{-1}$ then becomes $x = \\textbf{i}$.\n", + "\n", + "If we generalize the example equation into $(mx-n)^2 = -1 \\implies x = \\frac{1}{m}(\\sqrt{-1} + n)$ where $m$ and $n$ are constants chosen from the reals $\\mathbb{R}$, then the solution to the equation comes in the form $x = a + b\\textbf{i}$, the sum of a real number and an imaginary number, with $a=\\frac{n}{m}$ and $b = \\frac{1}{m}$.\n", + "\n", + "Such \"compound\" numbers are called **[complex numbers](https://en.wikipedia.org/wiki/Complex_number)**, and the set of all such numbers is commonly denoted by $\\mathbb{C}$. The reals $\\mathbb{R}$ are a strict subset of $\\mathbb{C}$ with $b=0$. Further, $a$ is referred to as the **real part** and $b$ as the **imaginary part** of the complex number.\n", + "\n", + "Complex numbers are often visualized in a plane like below, where the real part is depicted on the x-axis and the imaginary part on the y-axis." + ] + }, + { + "cell_type": "markdown", + "metadata": { + "slideshow": { + "slide_type": "slide" + } + }, + "source": [ + "" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "slideshow": { + "slide_type": "skip" + } + }, + "source": [ + "`complex` numbers are part of core Python. The simplest way to create one is to write an arithmetic expression with the literal `j` notation for $\\textbf{i}$. The `j` is commonly used in many engineering disciplines instead of the symbol $\\textbf{i}$ from math as $I$ in engineering more often than not means [electric current](https://en.wikipedia.org/wiki/Electric_current).\n", + "\n", + "For example, the answer to $x^2 = -1$ can be written in Python as `1j` like below. This creates a `complex` object with value `1j`. The same syntactic rules apply as with the above `e` notation: No spaces are allowed between the number and the `j`. The number may be any `int` or `float` literal; however, it is stored as a `float` internally. So, `complex` numbers suffer from the same imprecision as `float` numbers." + ] + }, + { + "cell_type": "code", + "execution_count": 175, + "metadata": { + "slideshow": { + "slide_type": "slide" + } + }, + "outputs": [], + "source": [ + "x = 1j" + ] + }, + { + "cell_type": "code", + "execution_count": 176, + "metadata": { + "slideshow": { + "slide_type": "fragment" + } + }, + "outputs": [ + { + "data": { + "text/plain": [ + "139928697740272" + ] + }, + "execution_count": 176, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "id(x)" + ] + }, + { + "cell_type": "code", + "execution_count": 177, + "metadata": { + "slideshow": { + "slide_type": "-" + } + }, + "outputs": [ + { + "data": { + "text/plain": [ + "complex" + ] + }, + "execution_count": 177, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "type(x)" + ] + }, + { + "cell_type": "code", + "execution_count": 178, + "metadata": { + "slideshow": { + "slide_type": "-" + } + }, + "outputs": [ + { + "data": { + "text/plain": [ + "1j" + ] + }, + "execution_count": 178, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "x" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "slideshow": { + "slide_type": "skip" + } + }, + "source": [ + "To verify that it solves the equation, let's raise it to the power of $2$." + ] + }, + { + "cell_type": "code", + "execution_count": 179, + "metadata": { + "slideshow": { + "slide_type": "slide" + } + }, + "outputs": [ + { + "data": { + "text/plain": [ + "True" + ] + }, + "execution_count": 179, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "x ** 2 == -1" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "slideshow": { + "slide_type": "skip" + } + }, + "source": [ + "Often, we write an expression of the form $a + b\\textbf{i}$." + ] + }, + { + "cell_type": "code", + "execution_count": 180, + "metadata": { + "slideshow": { + "slide_type": "slide" + } + }, + "outputs": [ + { + "data": { + "text/plain": [ + "(2+0.5j)" + ] + }, + "execution_count": 180, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "2 + 0.5j" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "slideshow": { + "slide_type": "skip" + } + }, + "source": [ + "Alternatively, we may use the [complex()](https://docs.python.org/3/library/functions.html#complex) built-in: This takes two parameters where the second is optional and defaults to `0`. We may either call it with one or two arguments of any numeric type or a `str` object in the format of the previous code cell without any spaces." + ] + }, + { + "cell_type": "code", + "execution_count": 181, + "metadata": { + "slideshow": { + "slide_type": "-" + } + }, + "outputs": [ + { + "data": { + "text/plain": [ + "(2+0.5j)" + ] + }, + "execution_count": 181, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "complex(2, 0.5) # an integer and a float work just fine as arguments" + ] + }, + { + "cell_type": "code", + "execution_count": 182, + "metadata": { + "slideshow": { + "slide_type": "-" + } + }, + "outputs": [ + { + "data": { + "text/plain": [ + "(2+0j)" + ] + }, + "execution_count": 182, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "complex(2) # omitting the second argument sets the imaginary part to 0" + ] + }, + { + "cell_type": "code", + "execution_count": 183, + "metadata": { + "slideshow": { + "slide_type": "skip" + } + }, + "outputs": [ + { + "data": { + "text/plain": [ + "(2+0.5j)" + ] + }, + "execution_count": 183, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "complex(Decimal(\"2.0\"), Fraction(1, 2)) # the arguments may be any numeric type" + ] + }, + { + "cell_type": "code", + "execution_count": 184, + "metadata": { + "slideshow": { + "slide_type": "skip" + } + }, + "outputs": [ + { + "data": { + "text/plain": [ + "(2+0.5j)" + ] + }, + "execution_count": 184, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "complex(\"2+0.5j\")" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "slideshow": { + "slide_type": "skip" + } + }, + "source": [ + "Arithmetic expressions work with `complex` numbers. They may be mixed with the other numeric types, and the result is always a `complex` number." + ] + }, + { + "cell_type": "code", + "execution_count": 185, + "metadata": { + "slideshow": { + "slide_type": "slide" + } + }, + "outputs": [], + "source": [ + "c1 = 1 + 2j\n", + "c2 = 3 + 4j" + ] + }, + { + "cell_type": "code", + "execution_count": 186, + "metadata": { + "slideshow": { + "slide_type": "fragment" + } + }, + "outputs": [ + { + "data": { + "text/plain": [ + "(4+6j)" + ] + }, + "execution_count": 186, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "c1 + c2" + ] + }, + { + "cell_type": "code", + "execution_count": 187, + "metadata": { + "slideshow": { + "slide_type": "-" + } + }, + "outputs": [ + { + "data": { + "text/plain": [ + "(-2-2j)" + ] + }, + "execution_count": 187, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "c1 - c2" + ] + }, + { + "cell_type": "code", + "execution_count": 188, + "metadata": { + "slideshow": { + "slide_type": "skip" + } + }, + "outputs": [ + { + "data": { + "text/plain": [ + "(40+2j)" + ] + }, + "execution_count": 188, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "c1 + 39" + ] + }, + { + "cell_type": "code", + "execution_count": 189, + "metadata": { + "slideshow": { + "slide_type": "skip" + } + }, + "outputs": [ + { + "data": { + "text/plain": [ + "-4j" + ] + }, + "execution_count": 189, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "3.0 - c2" + ] + }, + { + "cell_type": "code", + "execution_count": 190, + "metadata": { + "slideshow": { + "slide_type": "skip" + } + }, + "outputs": [ + { + "data": { + "text/plain": [ + "(5+10j)" + ] + }, + "execution_count": 190, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "5 * c1" + ] + }, + { + "cell_type": "code", + "execution_count": 191, + "metadata": { + "slideshow": { + "slide_type": "skip" + } + }, + "outputs": [ + { + "data": { + "text/plain": [ + "(0.5+0.6666666666666666j)" + ] + }, + "execution_count": 191, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "c2 / 6" + ] + }, + { + "cell_type": "code", + "execution_count": 192, + "metadata": { + "slideshow": { + "slide_type": "fragment" + } + }, + "outputs": [ + { + "data": { + "text/plain": [ + "(-5+10j)" + ] + }, + "execution_count": 192, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "c1 * c2" + ] + }, + { + "cell_type": "code", + "execution_count": 193, + "metadata": { + "slideshow": { + "slide_type": "-" + } + }, + "outputs": [ + { + "data": { + "text/plain": [ + "(0.44+0.08j)" + ] + }, + "execution_count": 193, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "c1 / c2" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "slideshow": { + "slide_type": "skip" + } + }, + "source": [ + "The absolute value of a `complex` number $x$ is defined with the Pythagorean Theorem where $\\lVert x \\rVert = \\sqrt{a^2 + b^2}$ and $a$ and $b$ are the real and imaginary parts. The [abs()](https://docs.python.org/3/library/functions.html#abs) built-in function implements that in Python." + ] + }, + { + "cell_type": "code", + "execution_count": 194, + "metadata": { + "slideshow": { + "slide_type": "slide" + } + }, + "outputs": [ + { + "data": { + "text/plain": [ + "5.0" + ] + }, + "execution_count": 194, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "abs(3 + 4j)" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "slideshow": { + "slide_type": "skip" + } + }, + "source": [ + "A `complex` number comes with two **attributes** `real` and `imag` that return the two parts as `float` objects on their own." + ] + }, + { + "cell_type": "code", + "execution_count": 195, + "metadata": { + "slideshow": { + "slide_type": "fragment" + } + }, + "outputs": [ + { + "data": { + "text/plain": [ + "1.0" + ] + }, + "execution_count": 195, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "c1.real" + ] + }, + { + "cell_type": "code", + "execution_count": 196, + "metadata": { + "slideshow": { + "slide_type": "-" + } + }, + "outputs": [ + { + "data": { + "text/plain": [ + "2.0" + ] + }, + "execution_count": 196, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "c1.imag" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "slideshow": { + "slide_type": "skip" + } + }, + "source": [ + "Also, a conjugate() method is bound to every `complex` object. The [complex conjugate](https://en.wikipedia.org/wiki/Complex_conjugate) is defined to be the complex number with identical real part but an imaginary part reversed in sign." + ] + }, + { + "cell_type": "code", + "execution_count": 197, + "metadata": { + "slideshow": { + "slide_type": "fragment" + } + }, + "outputs": [ + { + "data": { + "text/plain": [ + "(1-2j)" + ] + }, + "execution_count": 197, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "c1.conjugate()" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "slideshow": { + "slide_type": "skip" + } + }, + "source": [ + "The [cmath](https://docs.python.org/3/library/cmath.html) module in the [standard library](https://docs.python.org/3/library/index.html) implements many of the functions from the [math](https://docs.python.org/3/library/math.html) module such that they work with complex numbers." + ] + }, + { + "cell_type": "markdown", + "metadata": { + "slideshow": { + "slide_type": "slide" + } + }, + "source": [ + "## The Numeric Tower" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "slideshow": { + "slide_type": "skip" + } + }, + "source": [ + "Analogous to the discussion of containers and iterables in [Chapter 4](https://nbviewer.jupyter.org/github/webartifex/intro-to-python/blob/master/04_iteration.ipynb#Containers-vs.-Iterables), we contrast the *concrete* numeric data types in this chapter with the *abstract* ideas behind [numbers in mathematics](https://en.wikipedia.org/wiki/Number).\n", + "\n", + "The figure below summarizes five *major* sets of [numbers in mathematics](https://en.wikipedia.org/wiki/Number) as we know them from high school:\n", + "\n", + "- $\\mathbb{N}$: [Natural numbers](https://en.wikipedia.org/wiki/Natural_number) are all non-negative count numbers, e.g., $0, 1, 2, ...$\n", + "- $\\mathbb{Z}$: [Integers](https://en.wikipedia.org/wiki/Integer) are all numbers *without* a fractional component, e.g., $-1, 0, 1, ...$\n", + "- $\\mathbb{Q}$: [Rational numbers](https://en.wikipedia.org/wiki/Rational_number) are all numbers that can be expressed as a quotient of two integers, e.g., $-\\frac{1}{2}, 0, \\frac{1}{2}, ...$\n", + "- $\\mathbb{R}$: [Real numbers](https://en.wikipedia.org/wiki/Real_number) are all numbers that can be represented as a distance along a line (negative means \"reversed\"), e.g., $\\sqrt{2}, \\pi, \\text{e}, ...$\n", + "- $\\mathbb{C}$: [Complex numbers](https://en.wikipedia.org/wiki/Complex_number) are all numbers of the form $a + b\\textbf{i}$ where $a$ and $b$ are real numbers and $\\textbf{i}$ is the [imaginary number](https://en.wikipedia.org/wiki/Imaginary_number), e.g., $0, \\textbf{i}, 1 + \\textbf{i}, ...$\n", + "\n", + "In the listed order, the five sets are perfect subsets and $\\mathbb{C}$ is the largest set (to be precise, all sets are infinite, but they still have a different number of elements)." + ] + }, + { + "cell_type": "markdown", + "metadata": { + "slideshow": { + "slide_type": "slide" + } + }, + "source": [ + "" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "slideshow": { + "slide_type": "skip" + } + }, + "source": [ + "The *concrete* data types introduced in this chapter are all *imperfect* models of *abstract* mathematical ideas.\n", + "\n", + "The `int` and `Fraction` types are the models \"closest\" to the idea they implement: Whereas $\\mathbb{Z}$ and $\\mathbb{Q}$ are, by definition, infinite, every computer runs out of bits when representing sufficiently large integers or fractions with a sufficiently large number of decimals. However, within a system-dependent minimum and maximum integer range, we can model an integer or fraction without any loss in precision.\n", + "\n", + "For the other types, in particular, the `float` type, the implications of their imprecision are discussed in detail above.\n", + "\n", + "The abstract concepts behind the four outer-most mathematical sets are part of Python since [PEP 3141](https://www.python.org/dev/peps/pep-3141/) in 2007. The [numbers](https://docs.python.org/3/library/numbers.html) module in the [standard library](https://docs.python.org/3/library/index.html) defines what Pythonistas call the **numeric tower**, a collection of five **[abstract data types](https://en.wikipedia.org/wiki/Abstract_data_type)**, or **abstract base classes** as they are called in Python jargon:\n", + "\n", + "- `numbers.Number`: \"any number\" (cf., [documentation](https://docs.python.org/3/library/numbers.html#numbers.Number))\n", + "- `numbers.Complex`: \"all complex numbers\" (cf., [documentation](https://docs.python.org/3/library/numbers.html#numbers.Complex))\n", + "- `numbers.Real`: \"all real numbers\" (cf., [documentation](https://docs.python.org/3/library/numbers.html#numbers.Real))\n", + "- `numbers.Rational`: \"all rational numbers\" (cf., [documentation](https://docs.python.org/3/library/numbers.html#numbers.Rational))\n", + "- `numbers.Integral`: \"all integers\" (cf., [documentation](https://docs.python.org/3/library/numbers.html#numbers.Integral))" + ] + }, + { + "cell_type": "code", + "execution_count": 198, + "metadata": { + "slideshow": { + "slide_type": "slide" + } + }, + "outputs": [], + "source": [ + "import numbers" + ] + }, + { + "cell_type": "code", + "execution_count": 199, + "metadata": { + "slideshow": { + "slide_type": "-" + } + }, + "outputs": [ + { + "data": { + "text/plain": [ + "['ABCMeta',\n", + " 'Complex',\n", + " 'Integral',\n", + " 'Number',\n", + " 'Rational',\n", + " 'Real',\n", + " '__all__',\n", + " '__builtins__',\n", + " '__cached__',\n", + " '__doc__',\n", + " '__file__',\n", + " '__loader__',\n", + " '__name__',\n", + " '__package__',\n", + " '__spec__',\n", + " 'abstractmethod']" + ] + }, + "execution_count": 199, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "dir(numbers)" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "slideshow": { + "slide_type": "skip" + } + }, + "source": [ + "As a reminder, the built-in [help()](https://docs.python.org/3/library/functions.html#help) function is always our friend.\n", + "\n", + "The abstract types' docstrings are unsurprisingly similar to the corresponding concrete types' docstrings (for now, let's not worry about the dunder-style names in the docstrings).\n", + "\n", + "For example, both `numbers.Complex` and `complex` list the `imag` and `real` attributes." + ] + }, + { + "cell_type": "code", + "execution_count": 200, + "metadata": { + "scrolled": true, + "slideshow": { + "slide_type": "slide" + } + }, + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "Help on class Complex in module numbers:\n", + "\n", + "class Complex(Number)\n", + " | Complex defines the operations that work on the builtin complex type.\n", + " | \n", + " | In short, those are: a conversion to complex, .real, .imag, +, -,\n", + " | *, /, abs(), .conjugate, ==, and !=.\n", + " | \n", + " | If it is given heterogeneous arguments, and doesn't have special\n", + " | knowledge about them, it should fall back to the builtin complex\n", + " | type as described below.\n", + " | \n", + " | Method resolution order:\n", + " | Complex\n", + " | Number\n", + " | builtins.object\n", + " | \n", + " | Methods defined here:\n", + " | \n", + " | __abs__(self)\n", + " | Returns the Real distance from 0. Called for abs(self).\n", + " | \n", + " | __add__(self, other)\n", + " | self + other\n", + " | \n", + " | __bool__(self)\n", + " | True if self != 0. Called for bool(self).\n", + " | \n", + " | __complex__(self)\n", + " | Return a builtin complex instance. Called for complex(self).\n", + " | \n", + " | __eq__(self, other)\n", + " | self == other\n", + " | \n", + " | __mul__(self, other)\n", + " | self * other\n", + " | \n", + " | __neg__(self)\n", + " | -self\n", + " | \n", + " | __pos__(self)\n", + " | +self\n", + " | \n", + " | __pow__(self, exponent)\n", + " | self**exponent; should promote to float or complex when necessary.\n", + " | \n", + " | __radd__(self, other)\n", + " | other + self\n", + " | \n", + " | __rmul__(self, other)\n", + " | other * self\n", + " | \n", + " | __rpow__(self, base)\n", + " | base ** self\n", + " | \n", + " | __rsub__(self, other)\n", + " | other - self\n", + " | \n", + " | __rtruediv__(self, other)\n", + " | other / self\n", + " | \n", + " | __sub__(self, other)\n", + " | self - other\n", + " | \n", + " | __truediv__(self, other)\n", + " | self / other: Should promote to float when necessary.\n", + " | \n", + " | conjugate(self)\n", + " | (x+y*i).conjugate() returns (x-y*i).\n", + " | \n", + " | ----------------------------------------------------------------------\n", + " | Data descriptors defined here:\n", + " | \n", + " | imag\n", + " | Retrieve the imaginary component of this number.\n", + " | \n", + " | This should subclass Real.\n", + " | \n", + " | real\n", + " | Retrieve the real component of this number.\n", + " | \n", + " | This should subclass Real.\n", + " | \n", + " | ----------------------------------------------------------------------\n", + " | Data and other attributes defined here:\n", + " | \n", + " | __abstractmethods__ = frozenset({'__abs__', '__add__', '__complex__', ...\n", + " | \n", + " | __hash__ = None\n", + "\n" + ] + } + ], + "source": [ + "help(numbers.Complex)" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "slideshow": { + "slide_type": "skip" + } + }, + "source": [ + "For sure, Python understands the built-in types as literals." + ] + }, + { + "cell_type": "code", + "execution_count": 201, + "metadata": { + "slideshow": { + "slide_type": "skip" + } + }, + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "Help on class complex in module builtins:\n", + "\n", + "class complex(object)\n", + " | complex(real=0, imag=0)\n", + " | \n", + " | Create a complex number from a real part and an optional imaginary part.\n", + " | \n", + " | This is equivalent to (real + imag*1j) where imag defaults to 0.\n", + " | \n", + " | Methods defined here:\n", + " | \n", + " | __abs__(self, /)\n", + " | abs(self)\n", + " | \n", + " | __add__(self, value, /)\n", + " | Return self+value.\n", + " | \n", + " | __bool__(self, /)\n", + " | self != 0\n", + " | \n", + " | __divmod__(self, value, /)\n", + " | Return divmod(self, value).\n", + " | \n", + " | __eq__(self, value, /)\n", + " | Return self==value.\n", + " | \n", + " | __float__(self, /)\n", + " | float(self)\n", + " | \n", + " | __floordiv__(self, value, /)\n", + " | Return self//value.\n", + " | \n", + " | __format__(...)\n", + " | complex.__format__() -> str\n", + " | \n", + " | Convert to a string according to format_spec.\n", + " | \n", + " | __ge__(self, value, /)\n", + " | Return self>=value.\n", + " | \n", + " | __getattribute__(self, name, /)\n", + " | Return getattr(self, name).\n", + " | \n", + " | __getnewargs__(...)\n", + " | \n", + " | __gt__(self, value, /)\n", + " | Return self>value.\n", + " | \n", + " | __hash__(self, /)\n", + " | Return hash(self).\n", + " | \n", + " | __int__(self, /)\n", + " | int(self)\n", + " | \n", + " | __le__(self, value, /)\n", + " | Return self<=value.\n", + " | \n", + " | __lt__(self, value, /)\n", + " | Return self complex\n", + " | \n", + " | Return the complex conjugate of its argument. (3-4j).conjugate() == 3+4j.\n", + " | \n", + " | ----------------------------------------------------------------------\n", + " | Static methods defined here:\n", + " | \n", + " | __new__(*args, **kwargs) from builtins.type\n", + " | Create and return a new object. See help(type) for accurate signature.\n", + " | \n", + " | ----------------------------------------------------------------------\n", + " | Data descriptors defined here:\n", + " | \n", + " | imag\n", + " | the imaginary part of a complex number\n", + " | \n", + " | real\n", + " | the real part of a complex number\n", + "\n" + ] + } + ], + "source": [ + "help(complex)" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "slideshow": { + "slide_type": "skip" + } + }, + "source": [ + "One way to use *abstract* data types is to use them in place of a *concrete* data type.\n", + "\n", + "For example, we may pass them as arguments to the built-in [isinstance()](https://docs.python.org/3/library/functions.html#isinstance) function and check in which of the five mathematical sets the object `1 / 10` is." + ] + }, + { + "cell_type": "code", + "execution_count": 202, + "metadata": { + "slideshow": { + "slide_type": "slide" + } + }, + "outputs": [ + { + "data": { + "text/plain": [ + "True" + ] + }, + "execution_count": 202, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "isinstance(1 / 10, float)" + ] + }, + { + "cell_type": "code", + "execution_count": 203, + "metadata": { + "slideshow": { + "slide_type": "fragment" + } + }, + "outputs": [ + { + "data": { + "text/plain": [ + "True" + ] + }, + "execution_count": 203, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "isinstance(1 / 10, numbers.Number)" + ] + }, + { + "cell_type": "code", + "execution_count": 204, + "metadata": { + "slideshow": { + "slide_type": "-" + } + }, + "outputs": [ + { + "data": { + "text/plain": [ + "True" + ] + }, + "execution_count": 204, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "isinstance(1 / 10, numbers.Complex)" + ] + }, + { + "cell_type": "code", + "execution_count": 205, + "metadata": { + "slideshow": { + "slide_type": "-" + } + }, + "outputs": [ + { + "data": { + "text/plain": [ + "True" + ] + }, + "execution_count": 205, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "isinstance(1 / 10, numbers.Real)" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "slideshow": { + "slide_type": "skip" + } + }, + "source": [ + "Due to the `float` type's inherent imprecision, `1 / 10` is *not* a rational number." + ] + }, + { + "cell_type": "code", + "execution_count": 206, + "metadata": { + "slideshow": { + "slide_type": "slide" + } + }, + "outputs": [ + { + "data": { + "text/plain": [ + "False" + ] + }, + "execution_count": 206, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "isinstance(1 / 10, numbers.Rational)" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "slideshow": { + "slide_type": "skip" + } + }, + "source": [ + "The `Fraction` type qualifies as a rational number; however, the `Decimal` type does not." + ] + }, + { + "cell_type": "code", + "execution_count": 207, + "metadata": { + "slideshow": { + "slide_type": "fragment" + } + }, + "outputs": [ + { + "data": { + "text/plain": [ + "True" + ] + }, + "execution_count": 207, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "isinstance(Fraction(1, 10), numbers.Rational)" + ] + }, + { + "cell_type": "code", + "execution_count": 208, + "metadata": { + "slideshow": { + "slide_type": "-" + } + }, + "outputs": [ + { + "data": { + "text/plain": [ + "False" + ] + }, + "execution_count": 208, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "isinstance(Decimal(\"0.1\"), numbers.Rational)" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "slideshow": { + "slide_type": "slide" + } + }, + "source": [ + "#### Example: [Factorial](https://en.wikipedia.org/wiki/Factorial) (revisited)" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "slideshow": { + "slide_type": "skip" + } + }, + "source": [ + "Replacing *concrete* data types with *abstract* ones is particularly valuable in the context of input validation: The revised version of the `factorial()` function below allows its user to *take advantage of duck typing*. If a real but non-integer argument `n` is passed in, `factorial()` tries to cast `n` as an `int` object with the [trunc()](https://docs.python.org/3/library/math.html#math.trunc) function from the [math](https://docs.python.org/3/library/math.html) module in the [standard library](https://docs.python.org/3/library/index.html). [trunc()](https://docs.python.org/3/library/math.html#math.trunc) cuts off all decimals and any *concrete* numeric type implementing the *abstract* `numbers.Real` type supports it (cf., [documentation](https://docs.python.org/3/library/numbers.html#numbers.Real))." + ] + }, + { + "cell_type": "code", + "execution_count": 209, + "metadata": { + "slideshow": { + "slide_type": "slide" + } + }, + "outputs": [], + "source": [ + "import math" + ] + }, + { + "cell_type": "code", + "execution_count": 210, + "metadata": { + "slideshow": { + "slide_type": "-" + } + }, + "outputs": [ + { + "data": { + "text/plain": [ + "0" + ] + }, + "execution_count": 210, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "math.trunc(1 / 10)" + ] + }, + { + "cell_type": "code", + "execution_count": 211, + "metadata": { + "slideshow": { + "slide_type": "slide" + } + }, + "outputs": [], + "source": [ + "def factorial(n, *, strict=True):\n", + " \"\"\"Calculate the factorial of a number.\n", + "\n", + " Args:\n", + " n (int): number to calculate the factorial for; must be positive\n", + " strict (bool): if n must not contain decimals; defaults to True\n", + "\n", + " Returns:\n", + " factorial (int)\n", + "\n", + " Raises:\n", + " TypeError: if n is not an integer or cannot be cast as such\n", + " ValueError: if n is negative\n", + " \"\"\"\n", + " if not isinstance(n, numbers.Integral):\n", + " if isinstance(n, numbers.Real):\n", + " if n != math.trunc(n) and strict:\n", + " raise TypeError(\"n is not an integer-like value; it has decimals\")\n", + " n = math.trunc(n)\n", + " else:\n", + " raise TypeError(\"Factorial is only defined for integers\")\n", + "\n", + " if n < 0:\n", + " raise ValueError(\"Factorial is not defined for negative integers\")\n", + " elif n == 0:\n", + " return 1\n", + " return n * factorial(n - 1)" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "slideshow": { + "slide_type": "skip" + } + }, + "source": [ + "`factorial()` works as before, but now also accepts, for example, `float` numbers." + ] + }, + { + "cell_type": "code", + "execution_count": 212, + "metadata": { + "slideshow": { + "slide_type": "slide" + } + }, + "outputs": [ + { + "data": { + "text/plain": [ + "1" + ] + }, + "execution_count": 212, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "factorial(0)" + ] + }, + { + "cell_type": "code", + "execution_count": 213, + "metadata": { + "slideshow": { + "slide_type": "-" + } + }, + "outputs": [ + { + "data": { + "text/plain": [ + "6" + ] + }, + "execution_count": 213, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "factorial(3)" + ] + }, + { + "cell_type": "code", + "execution_count": 214, + "metadata": { + "slideshow": { + "slide_type": "fragment" + } + }, + "outputs": [ + { + "data": { + "text/plain": [ + "6" + ] + }, + "execution_count": 214, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "factorial(3.0)" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "slideshow": { + "slide_type": "skip" + } + }, + "source": [ + "With the keyword-only argument `strict`, we can control whether or not a passed in `float` object may be rounded. By default, this is not allowed." + ] + }, + { + "cell_type": "code", + "execution_count": 215, + "metadata": { + "slideshow": { + "slide_type": "slide" + } + }, + "outputs": [ + { + "ename": "TypeError", + "evalue": "n is not an integer-like value; it has decimals", + "output_type": "error", + "traceback": [ + "\u001b[0;31m---------------------------------------------------------------------------\u001b[0m", + "\u001b[0;31mTypeError\u001b[0m Traceback (most recent call last)", + "\u001b[0;32m\u001b[0m in \u001b[0;36m\u001b[0;34m\u001b[0m\n\u001b[0;32m----> 1\u001b[0;31m \u001b[0mfactorial\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0;36m3.1\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0m", + "\u001b[0;32m\u001b[0m in \u001b[0;36mfactorial\u001b[0;34m(n, strict)\u001b[0m\n\u001b[1;32m 16\u001b[0m \u001b[0;32mif\u001b[0m \u001b[0misinstance\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0mn\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0mnumbers\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0mReal\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m:\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[1;32m 17\u001b[0m \u001b[0;32mif\u001b[0m \u001b[0mn\u001b[0m \u001b[0;34m!=\u001b[0m \u001b[0mmath\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0mtrunc\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0mn\u001b[0m\u001b[0;34m)\u001b[0m \u001b[0;32mand\u001b[0m \u001b[0mstrict\u001b[0m\u001b[0;34m:\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0;32m---> 18\u001b[0;31m \u001b[0;32mraise\u001b[0m \u001b[0mTypeError\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0;34m\"n is not an integer-like value; it has decimals\"\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0m\u001b[1;32m 19\u001b[0m \u001b[0mn\u001b[0m \u001b[0;34m=\u001b[0m \u001b[0mmath\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0mtrunc\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0mn\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[1;32m 20\u001b[0m \u001b[0;32melse\u001b[0m\u001b[0;34m:\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n", + "\u001b[0;31mTypeError\u001b[0m: n is not an integer-like value; it has decimals" + ] + } + ], + "source": [ + "factorial(3.1)" + ] + }, + { + "cell_type": "code", + "execution_count": 216, + "metadata": { + "slideshow": { + "slide_type": "fragment" + } + }, + "outputs": [ + { + "data": { + "text/plain": [ + "6" + ] + }, + "execution_count": 216, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "factorial(3.1, strict=False)" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "slideshow": { + "slide_type": "skip" + } + }, + "source": [ + "For `complex` numbers, `factorial()` still raises a `TypeError`." + ] + }, + { + "cell_type": "code", + "execution_count": 217, + "metadata": { + "slideshow": { + "slide_type": "slide" + } + }, + "outputs": [ + { + "ename": "TypeError", + "evalue": "Factorial is only defined for integers", + "output_type": "error", + "traceback": [ + "\u001b[0;31m---------------------------------------------------------------------------\u001b[0m", + "\u001b[0;31mTypeError\u001b[0m Traceback (most recent call last)", + "\u001b[0;32m\u001b[0m in \u001b[0;36m\u001b[0;34m\u001b[0m\n\u001b[0;32m----> 1\u001b[0;31m \u001b[0mfactorial\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0;36m1\u001b[0m \u001b[0;34m+\u001b[0m \u001b[0;36m2j\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0m", + "\u001b[0;32m\u001b[0m in \u001b[0;36mfactorial\u001b[0;34m(n, strict)\u001b[0m\n\u001b[1;32m 19\u001b[0m \u001b[0mn\u001b[0m \u001b[0;34m=\u001b[0m \u001b[0mmath\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0mtrunc\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0mn\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[1;32m 20\u001b[0m \u001b[0;32melse\u001b[0m\u001b[0;34m:\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0;32m---> 21\u001b[0;31m \u001b[0;32mraise\u001b[0m \u001b[0mTypeError\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0;34m\"Factorial is only defined for integers\"\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0m\u001b[1;32m 22\u001b[0m \u001b[0;34m\u001b[0m\u001b[0m\n\u001b[1;32m 23\u001b[0m \u001b[0;32mif\u001b[0m \u001b[0mn\u001b[0m \u001b[0;34m<\u001b[0m \u001b[0;36m0\u001b[0m\u001b[0;34m:\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n", + "\u001b[0;31mTypeError\u001b[0m: Factorial is only defined for integers" + ] + } + ], + "source": [ + "factorial(1 + 2j)" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "slideshow": { + "slide_type": "skip" + } + }, + "source": [ + "## TL;DR" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "slideshow": { + "slide_type": "skip" + } + }, + "source": [ + "There exist three numeric types in core Python:\n", + "- `int`: a perfect model for whole numbers (i.e., the set $\\mathbb{Z}$); inherently precise\n", + "- `float`: the \"gold\" standard to approximate real numbers (i.e., the set $\\mathbb{R}$); inherently imprecise\n", + "- `complex`: layer on top of the `float` type; therefore inherently imprecise\n", + "\n", + "Furthermore, the [standard library](https://docs.python.org/3/library/index.html) adds two more types that can be used as substitutes for `float` objects:\n", + "- `Decimal`: similar to `float` but allows customizing the precision; still inherently imprecise\n", + "- `Fraction`: a perfect model for rational numbers (i.e., the set $\\mathbb{Q}$); built on top of the `int` type and therefore inherently precise\n", + "\n", + "The *important* takeaways for the data science practitioner are:\n", + "\n", + "1. **Do not mix** precise and imprecise data types, and\n", + "2. actively expect `nan` results when working with `float` numbers as there are no **loud failures**.\n", + "\n", + "The **numeric tower** is Python's way of implementing the various **abstract** ideas of what a number is in mathematics." + ] + } + ], + "metadata": { + "kernelspec": { + "display_name": "Python 3", + "language": "python", + "name": "python3" + }, + "language_info": { + "codemirror_mode": { + "name": "ipython", + "version": 3 + }, + "file_extension": ".py", + "mimetype": "text/x-python", + "name": "python", + "nbconvert_exporter": "python", + "pygments_lexer": "ipython3", + "version": "3.7.3" + }, + "livereveal": { + "auto_select": "code", + "auto_select_fragment": true, + "scroll": true, + "theme": "serif" + }, + "toc": { + "base_numbering": 1, + "nav_menu": {}, + "number_sections": false, + "sideBar": true, + "skip_h1_title": true, + "title_cell": "Table of Contents", + "title_sidebar": "Contents", + "toc_cell": false, + "toc_position": { + "height": "calc(100% - 180px)", + "left": "10px", + "top": "150px", + "width": "384px" + }, + "toc_section_display": false, + "toc_window_display": false + } + }, + "nbformat": 4, + "nbformat_minor": 2 +} diff --git a/05_numbers_review_and_exercises.ipynb b/05_numbers_review_and_exercises.ipynb new file mode 100644 index 0000000..4246965 --- /dev/null +++ b/05_numbers_review_and_exercises.ipynb @@ -0,0 +1,722 @@ +{ + "cells": [ + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "\n", + "# Chapter 5: Numbers" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "## Content Review" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "Read [Chapter 5](https://nbviewer.jupyter.org/github/webartifex/intro-to-python/blob/master/05_numbers.ipynb) of the book. Then work through the eighteen review questions." + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "### Essay Questions " + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "Answer the following questions briefly with *at most* 300 characters per question!" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "**Q1**: In what way is the **binary representation** of `int` objects *similar* to the **decimal system** taught in elementary school?" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + " " + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "**Q2**: Why may objects of type `bool` be regarded a **numeric type** as well?" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + " " + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "**Q3**: Why is it *inefficient* to store `bool` objects in bits resembling a **hexadecimal representation**?" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + " " + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "**Q4**: Colors are commonly expressed in the **hexadecimal system** in websites (cf., the [HTML](https://en.wikipedia.org/wiki/HTML) and [CSS](https://en.wikipedia.org/wiki/Cascading_Style_Sheets) formats).\n", + "\n", + "For example, $#000000$, $#ff9900$, and $#ffffff$ turn out to be black, orange, and white. The six digits are read in *pairs of two* from left to right, and the *three pairs* correspond to the proportions of red, green, and blue mixed together. They reach from $0_{16} = 0_{10}$ for $0$% to $\\text{ff}_{16} = 255_{10}$ for $100$% (cf., this [article](https://en.wikipedia.org/wiki/RGB_color_model) for an in-depth discussion).\n", + "\n", + "In percent, what are the proportions of red, green, and blue that make up orange? Calculate the three percentages separately! How many **bytes** are needed to encode a color? How many **bits** are that?" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + " " + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "**Q5**: What does it mean for a code fragment to **fail silently**?" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + " " + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "**Q6**: Explain why the mathematical set of all real numbers $\\mathbb{R}$ can only be **approximated** by floating-point numbers on a computer!" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + " " + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "**Q7**: How do we deal with a `float` object's imprecision if we need to **check for equality**?" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + " " + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "**Q8**: What data type, built-in or from the [standard library](https://docs.python.org/3/library/index.html), is best suited to represent the [transcendental numbers](https://en.wikipedia.org/wiki/Transcendental_number) $\\pi$ and $\\text{e}$?" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + " " + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "**Q9**: How can **abstract data types**, for example, as defined in the **numeric tower**, be helpful in enabling **duck typing**?" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + " " + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "### True / False Questions" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "Motivate your answer with *one short* sentence!" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "**Q10**: The precision of `int` objects depends on how we choose to represent them in memory. For example, using a **hexadecimal representation** gives us $16^8$ digits whereas with a **binary representation** an `int` object can have *at most* $2^8$ digits." + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + " " + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "**Q11**: With the built-in [round()](https://docs.python.org/3/library/functions.html#round) function, we obtain a *precise* representation for any `float` object if we can live with *less than* $15$ digits of precision." + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + " " + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "**Q12**: As most currencies operate with $2$ or $3$ decimals (e.g., EUR $9.99$), the `float` type's limitation of *at most* $15$ digits is *not* a problem in practice." + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + " " + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "**Q13**: The [IEEE 754](https://en.wikipedia.org/wiki/IEEE_754) standard's **special values** provide no benefit in practice as we could always use a **[sentinel](https://en.wikipedia.org/wiki/Sentinel_value)** value (i.e., a \"dummy\"). For example, instead of `nan`, we can always use `0` to indicate a *missing* value." + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + " " + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "**Q14**: The following code fragment raises an `InvalidOperation` exception. That is an example of code **failing loudly**.\n", + "```python\n", + "float(\"inf\") + float(\"-inf\")\n", + "```" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + " " + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "**Q15**: Python provides a `scientific` type (e.g., `1.23e4`) that is useful mainly to model problems in the domains of physics or astrophysics." + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + " " + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "**Q16**: From a practitioner's point of view, the built-in [format()](https://docs.python.org/3/library/functions.html#format) function does the *same* as the built-in [round()](https://docs.python.org/3/library/functions.html#round) function." + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + " " + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "**Q17**: The `Decimal` type from the [decimal](https://docs.python.org/3/library/decimal.html) module in the [standard library](https://docs.python.org/3/library/index.html) allows us to model the set of the real numbers $\\mathbb{R}$ *precisely*." + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + " " + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "**Q18**: The `Fraction` type from the [fractions](https://docs.python.org/3/library/fractions.html) module in the [standard library](https://docs.python.org/3/library/index.html) allows us to model the set of the rational numbers $\\mathbb{Q}$ *precisely*." + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + " " + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "## Coding Exercises" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "### Discounting Customer Orders (revisited)" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "**Q11** in [Chapter 2's Review & Exercises](https://nbviewer.jupyter.org/github/webartifex/intro-to-python/blob/master/02_functions_review_and_exercises.ipynb#Volume-of-a-Sphere) section already revealed that we must consider the effects of the `float` type's imprecision.\n", + "\n", + "This becomes even more important when we deal with numeric data modeling accounting or finance data (cf., [this comment](https://stackoverflow.com/a/24976426) on \"falsehoods programmers believe about money\").\n", + "\n", + "In addition to the *inherent imprecision* of numbers in general, the topic of **[rounding numbers](https://en.wikipedia.org/wiki/Rounding)** is also not as trivial as we might expect! [This article](https://realpython.com/python-rounding/) summarizes everything the data science practitioner needs to know.\n", + "\n", + "In this exercise, we revisit **Q9** from [Chapter 3's Review & Exercises](https://nbviewer.jupyter.org/github/webartifex/intro-to-python/blob/master/03_conditionals_review_and_exercises.ipynb#Discounting-Customer-Orders) section, and make the `discounted_price()` function work *correctly* for real-life sales data." + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "**Q19.1**: Execute the code cells below! What results would you have *expected*, and why?" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "round(1.5)" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "round(2.5)" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "round(2.675, 2)" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + " " + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "**Q19.2**: The built-in [round()](https://docs.python.org/3/library/functions.html#round) function implements the \"**[round half to even](https://en.wikipedia.org/wiki/Rounding#Round_half_to_even)**\" strategy. Describe in one or two sentences what that means!" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + " " + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "**Q19.3**: For the revised `discounted_price()` function, we have to tackle *two* issues: First, we have to replace the built-in `float` type with a data type that allows us to control the precision. Second, the discounted price should be rounded according to a more human-friendly rounding strategy, namely \"**[round half away from zero](https://en.wikipedia.org/wiki/Rounding#Round_half_away_from_zero)**.\"\n", + "\n", + "Describe in one or two sentences how \"**[round half away from zero](https://en.wikipedia.org/wiki/Rounding#Round_half_away_from_zero)**\" is more in line with how humans think of rounding!" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + " " + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "**Q19.4**: We use the `Decimal` type from the [decimal](https://docs.python.org/3/library/decimal.html) module in the [standard library](https://docs.python.org/3/library/index.html) to tackle *both* issues simultaneously.\n", + "\n", + "Assign `euro` a numeric object such that both `Decimal(\"1.5\")` and `Decimal(\"2.5\")` are rounded to `Decimal(\"2\")` (i.e., no decimal) with the [quantize()](https://docs.python.org/3/library/decimal.html#decimal.Decimal.quantize) method!" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "from decimal import Decimal" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "euro = ..." + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "Decimal(\"1.5\").quantize(...)" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "Decimal(\"2.5\").quantize(...)" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "**Q19.5**: Obviously, the two preceding code cells still [round half to even](https://en.wikipedia.org/wiki/Rounding#Round_half_to_even).\n", + "\n", + "The [decimal](https://docs.python.org/3/library/decimal.html) module defines a `ROUND_HALF_UP` flag that we can pass as the second argument to the [quantize()](https://docs.python.org/3/library/decimal.html#decimal.Decimal.quantize) method. Then, it [rounds half away from zero](https://en.wikipedia.org/wiki/Rounding#Round_half_away_from_zero).\n", + "\n", + "Add `ROUND_HALF_UP` to the code cells! `Decimal(\"2.5\")` should now be rounded to `Decimal(\"3\")`." + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "from decimal import ROUND_HALF_UP" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "Decimal(\"1.5\").quantize(...)" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "Decimal(\"2.5\").quantize(...)" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "**Q19.6**: Instead of `euro`, define `cents` such that rounding occurs to *two* decimals! `Decimal(\"2.675\")` should now be rounded to `Decimal(\"2.68\")`. Do *not* forget to include the `ROUND_HALF_UP` flag!" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "cents = ..." + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "Decimal(\"2.675\").quantize(...)" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "**Q19.7**: Re-write the function `discounted_price()` from [Chapter 3's Review & Exercises](https://nbviewer.jupyter.org/github/webartifex/intro-to-python/blob/master/03_conditionals_review_and_exercises.ipynb#Discounting-Customer-Orders) section!\n", + "\n", + "It takes the *positional* arguments `unit_price` and `quantity` and implements a discount scheme for a line item in a customer order as follows:\n", + "\n", + "- if the unit price is over 100 dollars, grant 10% relative discount\n", + "- if a customer orders more than 10 items, one in every five items is for free\n", + "\n", + "Only one of the two discounts is granted, whichever is better for the customer.\n", + "\n", + "The function then returns the overall price for the line item as a `Decimal` number with a precision of *two* decimals.\n", + "\n", + "Enable **duck typing** by allowing the function to be called with various numeric types as the arguments, in particular, `quantity` may be a non-integer as well: Use an appropriate **abstract data type** from the [numbers](https://docs.python.org/3/library/numbers.html) module in the [standard library](https://docs.python.org/3/library/index.html) to verify the arguments' types and also that they are both positive!\n", + "\n", + "It is considered a *best practice* to only round towards the *end* of the calculations." + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "import numbers" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "def discounted_price(...):\n", + " ...\n", + " ...\n", + " ...\n", + " ...\n", + " ...\n", + " ...\n", + " ...\n", + " ...\n", + " ...\n", + " ...\n", + " ...\n", + " ...\n", + " ...\n", + " ...\n", + " ...\n", + " ...\n", + " ..." + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "**Q19.8**: Execute the code cells below and verify the final price for the following four test cases:\n", + "\n", + "- $7$ smartphones @ $99.00$ USD\n", + "- $3$ workstations @ $999.00$ USD\n", + "- $19$ GPUs @ $879.95$ USD\n", + "- $14$ Raspberry Pis @ $35.00$ USD\n", + "\n", + "The output should now *always* be a `Decimal` number with *two* decimals!" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "discounted_price(99, 7)" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "discounted_price(999, 3)" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "discounted_price(879.95, 19)" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "discounted_price(35, 14)" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "This also works if `quantity` is passed in as a `float` type." + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "discounted_price(99, 7.0)" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "Decimals beyond the first two are gracefully discarded (i.e., *without* rounding errors accumulating)." + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "discounted_price(99.0001, 7)" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "The basic input validation ensures that the user of `discounted_price()` does not pass in invalid data. Here, the `\"abc\"` creates a `TypeError`." + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "discounted_price(\"abc\", 7)" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "A `-1` passed in as `unit_price` results in a `ValueError`." + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "discounted_price(-1, 7)" + ] + } + ], + "metadata": { + "kernelspec": { + "display_name": "Python 3", + "language": "python", + "name": "python3" + }, + "language_info": { + "codemirror_mode": { + "name": "ipython", + "version": 3 + }, + "file_extension": ".py", + "mimetype": "text/x-python", + "name": "python", + "nbconvert_exporter": "python", + "pygments_lexer": "ipython3", + "version": "3.7.3" + }, + "toc": { + "base_numbering": 1, + "nav_menu": {}, + "number_sections": false, + "sideBar": true, + "skip_h1_title": true, + "title_cell": "Table of Contents", + "title_sidebar": "Contents", + "toc_cell": false, + "toc_position": {}, + "toc_section_display": false, + "toc_window_display": false + } + }, + "nbformat": 4, + "nbformat_minor": 2 +} diff --git a/06_text.ipynb b/06_text.ipynb new file mode 100644 index 0000000..4cdd4ab --- /dev/null +++ b/06_text.ipynb @@ -0,0 +1,2787 @@ +{ + "cells": [ + { + "cell_type": "markdown", + "metadata": { + "slideshow": { + "slide_type": "slide" + } + }, + "source": [ + "# Chapter 6: Text" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "slideshow": { + "slide_type": "skip" + } + }, + "source": [ + "In this chapter, we continue the study of the built-in data types. Building on our knowledge of numbers, the next layer consists of textual data that are modeled primarily with the `str` type in Python. `str` objects are naturally more \"complex\" than numeric objects as any text consists of an arbitrary and possibly large number of individual characters that may be chosen from any alphabet in the history of humankind. Luckily, Python abstracts away most of this complexity." + ] + }, + { + "cell_type": "markdown", + "metadata": { + "slideshow": { + "slide_type": "slide" + } + }, + "source": [ + "## The `str` Type" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "slideshow": { + "slide_type": "skip" + } + }, + "source": [ + "The `str` type is the default way of modeling **textual data**. To create a `str` object, we use a **literal notation** and type the text between enclosing **double quotes** `\"`." + ] + }, + { + "cell_type": "code", + "execution_count": 1, + "metadata": { + "slideshow": { + "slide_type": "slide" + } + }, + "outputs": [], + "source": [ + "school = \"WHU - Otto Beisheim School of Management\"" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "slideshow": { + "slide_type": "skip" + } + }, + "source": [ + "Like everything in Python, `school` is an object." + ] + }, + { + "cell_type": "code", + "execution_count": 2, + "metadata": { + "slideshow": { + "slide_type": "fragment" + } + }, + "outputs": [ + { + "data": { + "text/plain": [ + "140148947850512" + ] + }, + "execution_count": 2, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "id(school)" + ] + }, + { + "cell_type": "code", + "execution_count": 3, + "metadata": { + "slideshow": { + "slide_type": "-" + } + }, + "outputs": [ + { + "data": { + "text/plain": [ + "str" + ] + }, + "execution_count": 3, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "type(school)" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "slideshow": { + "slide_type": "skip" + } + }, + "source": [ + "A `str` object evaluates to itself in a literal notation with enclosing **single quotes** `'` by default. In [Chapter 1](https://nbviewer.jupyter.org/github/webartifex/intro-to-python/blob/master/01_elements.ipynb), we already specified the double quotes `\"` convention we stick to in this book. Yet, single quotes `'` and double quotes `\"` are *perfect* substitutes for all `str` objects that do *not* contain any of the two symbols in it. We could have used the reverse convention, as well.\n", + "\n", + "As [this discussion](https://stackoverflow.com/questions/56011/single-quotes-vs-double-quotes-in-python) shows, many programmers have *strong* opinions about that and make up *new* conventions for their projects. Consequently, the discussion was \"closed as not constructive\" by the moderators." + ] + }, + { + "cell_type": "code", + "execution_count": 4, + "metadata": { + "slideshow": { + "slide_type": "-" + } + }, + "outputs": [ + { + "data": { + "text/plain": [ + "'WHU - Otto Beisheim School of Management'" + ] + }, + "execution_count": 4, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "school" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "slideshow": { + "slide_type": "skip" + } + }, + "source": [ + "As the single quote `'` is often used in the English language as a shortener, we could make an argument in favor of using the double quotes `\"`: There are possibly fewer situations like in the two code cells below, in which we must revert to using a `\\` to **escape** a single quote `'` in a text (cf., the section on special characters further below). However, double quotes `\"` are often used as well. So, this argument is somewhat not convincing.\n", + "\n", + "Many proponents of the single quote `'` usage claim that double quotes `\"` make more **visual noise** on the screen. This argument is also not convincing. On the contrary, one could claim that *two* single quotes `''` look so similar to *one* double quote `\"` that it might not be apparent right away what we are looking at. By sticking to double quotes `\"`, we avoid such danger of confusion.\n", + "\n", + "This discussion is an exellent example of a [flame war](https://en.wikipedia.org/wiki/Flaming_%28Internet%29#Flame_war) in the programming world that leads to *no* result.\n", + "\n", + "An *important* fact to know is that enclosing quotes of either kind are *not* part of the `str` object's *value*! They are merely *syntax* to make the text in a code cell a *literal* that Python converts into a `str` object upon reading." + ] + }, + { + "cell_type": "code", + "execution_count": 5, + "metadata": { + "slideshow": { + "slide_type": "slide" + } + }, + "outputs": [ + { + "data": { + "text/plain": [ + "\"It's cool that strings are so versatile in Python!\"" + ] + }, + "execution_count": 5, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "\"It's cool that strings are so versatile in Python!\"" + ] + }, + { + "cell_type": "code", + "execution_count": 6, + "metadata": { + "slideshow": { + "slide_type": "-" + } + }, + "outputs": [ + { + "data": { + "text/plain": [ + "\"It's cool that strings are so versatile in Python!\"" + ] + }, + "execution_count": 6, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "'It\\'s cool that strings are so versatile in Python!'" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "slideshow": { + "slide_type": "skip" + } + }, + "source": [ + "Alternatively, we can use the [str()](https://docs.python.org/3/library/stdtypes.html#str) built-in to **cast** non-`str` objects as a `str`." + ] + }, + { + "cell_type": "code", + "execution_count": 7, + "metadata": { + "slideshow": { + "slide_type": "slide" + } + }, + "outputs": [ + { + "data": { + "text/plain": [ + "'123'" + ] + }, + "execution_count": 7, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "str(123)" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "slideshow": { + "slide_type": "slide" + } + }, + "source": [ + "## A \"String\" of Characters" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "slideshow": { + "slide_type": "skip" + } + }, + "source": [ + "The idea of a **sequence** is yet another *abstract* concept.\n", + "\n", + "It unifies *four* orthogonal *abstract* concepts into one: Any *concrete* data type, such as `str` in this chapter, is considered a sequence if it simultaneously\n", + "\n", + "1. behaves like a **container** and\n", + "2. an **iterable**, and \n", + "3. comes with a predictable **order** of its\n", + "4. **finite** number of elements.\n", + "\n", + "Chapter 7 discusses sequences in a generic sense and in greater detail. Here, we keep our focus on the `str` type that historically received its name as it models a \"**[string of characters](https://en.wikipedia.org/wiki/String_%28computer_science%29)**,\" and a \"string\" is more formally called a sequence in the computer science literature.\n", + "\n", + "Behaving like a sequence, `str` objects may be treated like `list` objects in many cases. For example, the built-in [len()](https://docs.python.org/3/library/functions.html#len) function tells us how many elements (i.e., characters) make up `school`. [len()](https://docs.python.org/3/library/functions.html#len) would not work on an \"*infinite*\" object: As anything modeled in a program must fit into a computer's finite memory at runtime, there cannot exist objects containing a truly infinite number of elements; however, Chapter 7 introduces a *concrete* iterable data type that can be used to model an *infinite* series of elements and that, consequently, has no concept of \"length.\"" + ] + }, + { + "cell_type": "code", + "execution_count": 8, + "metadata": { + "slideshow": { + "slide_type": "slide" + } + }, + "outputs": [ + { + "data": { + "text/plain": [ + "40" + ] + }, + "execution_count": 8, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "len(school)" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "slideshow": { + "slide_type": "skip" + } + }, + "source": [ + "Being iterable, we can iterate over a `str` object, for example, with a `for`-loop, and do something with the individual characters, for example, print them out with extra space in between them.\n", + "\n", + "[Chapter 4](https://nbviewer.jupyter.org/github/webartifex/intro-to-python/blob/master/04_iteration.ipynb#Containers-vs.-Iterables) already shows that we can loop over *different* concrete types: The example there first loops over the `list` object `[0, 1, 2, 3, 4]` and then the `range` object `range(5)`, with the same outcome. Now, we add the `str` type to the list of *concrete* types we can loop over. *Abstractly* speaking, all three are *iterable*, and there are many more iterable types in Python." + ] + }, + { + "cell_type": "code", + "execution_count": 9, + "metadata": { + "slideshow": { + "slide_type": "fragment" + } + }, + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "W H U - O t t o B e i s h e i m S c h o o l o f M a n a g e m e n t " + ] + } + ], + "source": [ + "for letter in school:\n", + " print(letter, end=\" \")" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "slideshow": { + "slide_type": "skip" + } + }, + "source": [ + "Being a container, we can check if a given object is a member of a sequence with the `in` operator. In the context of strings, the `in` operator has *two* usages: First, it checks if a *single* character is contained in a `str` object. Second, it may also check if a shorter `str` object, then called a **substring**, is contained in a longer one." + ] + }, + { + "cell_type": "code", + "execution_count": 10, + "metadata": { + "slideshow": { + "slide_type": "fragment" + } + }, + "outputs": [ + { + "data": { + "text/plain": [ + "True" + ] + }, + "execution_count": 10, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "\"O\" in school" + ] + }, + { + "cell_type": "code", + "execution_count": 11, + "metadata": { + "slideshow": { + "slide_type": "-" + } + }, + "outputs": [ + { + "data": { + "text/plain": [ + "True" + ] + }, + "execution_count": 11, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "\"WHU\" in school" + ] + }, + { + "cell_type": "code", + "execution_count": 12, + "metadata": { + "slideshow": { + "slide_type": "-" + } + }, + "outputs": [ + { + "data": { + "text/plain": [ + "False" + ] + }, + "execution_count": 12, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "\"EBS\" in school" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "slideshow": { + "slide_type": "slide" + } + }, + "source": [ + "## Indexing" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "slideshow": { + "slide_type": "skip" + } + }, + "source": [ + "As `str` objects have the additional property of being *ordered*, we may **index** into them to obtain individual characters with the **indexing operator** `[]`. This is analogous to how we obtained individual elements of a `list` object in [Chapter 1](https://nbviewer.jupyter.org/github/webartifex/intro-to-python/blob/master/01_elements.ipynb#Who-am-I?-And-how-many?)." + ] + }, + { + "cell_type": "code", + "execution_count": 13, + "metadata": { + "slideshow": { + "slide_type": "slide" + } + }, + "outputs": [ + { + "data": { + "text/plain": [ + "'W'" + ] + }, + "execution_count": 13, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "school[0]" + ] + }, + { + "cell_type": "code", + "execution_count": 14, + "metadata": { + "slideshow": { + "slide_type": "-" + } + }, + "outputs": [ + { + "data": { + "text/plain": [ + "'H'" + ] + }, + "execution_count": 14, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "school[1]" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "slideshow": { + "slide_type": "skip" + } + }, + "source": [ + "The index must be of type `int`." + ] + }, + { + "cell_type": "code", + "execution_count": 15, + "metadata": { + "slideshow": { + "slide_type": "slide" + } + }, + "outputs": [ + { + "ename": "TypeError", + "evalue": "string indices must be integers", + "output_type": "error", + "traceback": [ + "\u001b[0;31m---------------------------------------------------------------------------\u001b[0m", + "\u001b[0;31mTypeError\u001b[0m Traceback (most recent call last)", + "\u001b[0;32m\u001b[0m in \u001b[0;36m\u001b[0;34m\u001b[0m\n\u001b[0;32m----> 1\u001b[0;31m \u001b[0mschool\u001b[0m\u001b[0;34m[\u001b[0m\u001b[0;36m1.0\u001b[0m\u001b[0;34m]\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0m", + "\u001b[0;31mTypeError\u001b[0m: string indices must be integers" + ] + } + ], + "source": [ + "school[1.0]" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "slideshow": { + "slide_type": "skip" + } + }, + "source": [ + "The last index is one less than the above \"length\" of the `str` object as we start counting at 0." + ] + }, + { + "cell_type": "code", + "execution_count": 16, + "metadata": { + "slideshow": { + "slide_type": "slide" + } + }, + "outputs": [ + { + "data": { + "text/plain": [ + "'t'" + ] + }, + "execution_count": 16, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "school[39]" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "slideshow": { + "slide_type": "skip" + } + }, + "source": [ + "An `IndexError` is raised whenever the index is too large." + ] + }, + { + "cell_type": "code", + "execution_count": 17, + "metadata": { + "slideshow": { + "slide_type": "-" + } + }, + "outputs": [ + { + "ename": "IndexError", + "evalue": "string index out of range", + "output_type": "error", + "traceback": [ + "\u001b[0;31m---------------------------------------------------------------------------\u001b[0m", + "\u001b[0;31mIndexError\u001b[0m Traceback (most recent call last)", + "\u001b[0;32m\u001b[0m in \u001b[0;36m\u001b[0;34m\u001b[0m\n\u001b[0;32m----> 1\u001b[0;31m \u001b[0mschool\u001b[0m\u001b[0;34m[\u001b[0m\u001b[0;36m40\u001b[0m\u001b[0;34m]\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0m", + "\u001b[0;31mIndexError\u001b[0m: string index out of range" + ] + } + ], + "source": [ + "school[40]" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "slideshow": { + "slide_type": "skip" + } + }, + "source": [ + "We may use *negative* indexes to start counting from the end of the `str` object." + ] + }, + { + "cell_type": "code", + "execution_count": 18, + "metadata": { + "slideshow": { + "slide_type": "slide" + } + }, + "outputs": [ + { + "data": { + "text/plain": [ + "'t'" + ] + }, + "execution_count": 18, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "school[-1]" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "slideshow": { + "slide_type": "skip" + } + }, + "source": [ + "One reason why programmers like to start counting at 0 is that a positive index and its *corresponding* negative index always add up to the length of the sequence." + ] + }, + { + "cell_type": "code", + "execution_count": 19, + "metadata": { + "slideshow": { + "slide_type": "fragment" + } + }, + "outputs": [ + { + "data": { + "text/plain": [ + "'O'" + ] + }, + "execution_count": 19, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "school[6]" + ] + }, + { + "cell_type": "code", + "execution_count": 20, + "metadata": { + "slideshow": { + "slide_type": "-" + } + }, + "outputs": [ + { + "data": { + "text/plain": [ + "'O'" + ] + }, + "execution_count": 20, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "school[-34]" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "slideshow": { + "slide_type": "slide" + } + }, + "source": [ + "## Slicing" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "slideshow": { + "slide_type": "skip" + } + }, + "source": [ + "A **slice** is a substring of a `str` object.\n", + "\n", + "The **slicing operator** is a generalization of the indexing operator: We can put one, two, or three integers within the brackets, separated by colons `:`. The three integers are then referred to as the *start*, *end*, and *step* values." + ] + }, + { + "cell_type": "code", + "execution_count": 21, + "metadata": { + "slideshow": { + "slide_type": "slide" + } + }, + "outputs": [ + { + "data": { + "text/plain": [ + "'WHU'" + ] + }, + "execution_count": 21, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "school[0:3]" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "slideshow": { + "slide_type": "skip" + } + }, + "source": [ + "Whereas the *start* is always included in the result, the *end* is not. Counter-intuitive at first, this makes working with individual slices easier as they add up to the original `str` object again. As the *end* is not included, we must end the second slice with `len(school)` or `40` below." + ] + }, + { + "cell_type": "code", + "execution_count": 22, + "metadata": { + "slideshow": { + "slide_type": "fragment" + } + }, + "outputs": [ + { + "data": { + "text/plain": [ + "'WHU - Otto Beisheim School of Management'" + ] + }, + "execution_count": 22, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "school[0:3] + school[3:40]" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "slideshow": { + "slide_type": "skip" + } + }, + "source": [ + "For convenience, the indexes do not need to lie in the range from 0 to the string's \"length\" when slicing. This is *not* the case for indexing as the `IndexError` above shows." + ] + }, + { + "cell_type": "code", + "execution_count": 23, + "metadata": { + "slideshow": { + "slide_type": "fragment" + } + }, + "outputs": [ + { + "data": { + "text/plain": [ + "'WHU - Otto Beisheim School of Management'" + ] + }, + "execution_count": 23, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "school[0:999]" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "slideshow": { + "slide_type": "skip" + } + }, + "source": [ + "Commonly, we leave out the `0` for the *start* and the *end* if it is equal to the \"length.\"" + ] + }, + { + "cell_type": "code", + "execution_count": 24, + "metadata": { + "slideshow": { + "slide_type": "fragment" + } + }, + "outputs": [ + { + "data": { + "text/plain": [ + "'WHU - Otto Beisheim School of Management'" + ] + }, + "execution_count": 24, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "school[:3] + school[3:]" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "slideshow": { + "slide_type": "skip" + } + }, + "source": [ + "Slicing makes it easy to obtain shorter versions of the original `str` object." + ] + }, + { + "cell_type": "code", + "execution_count": 25, + "metadata": { + "slideshow": { + "slide_type": "skip" + } + }, + "outputs": [ + { + "data": { + "text/plain": [ + "'WHU Otto Beisheim School'" + ] + }, + "execution_count": 25, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "school[:3] + school[5:26]" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "slideshow": { + "slide_type": "skip" + } + }, + "source": [ + "A *step* value of $i$ can be used to obtain only every $i$th letter." + ] + }, + { + "cell_type": "code", + "execution_count": 26, + "metadata": { + "slideshow": { + "slide_type": "slide" + } + }, + "outputs": [ + { + "data": { + "text/plain": [ + "'WU-Ot esemSho fMngmn'" + ] + }, + "execution_count": 26, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "school[::2]" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "slideshow": { + "slide_type": "skip" + } + }, + "source": [ + "A negative *step* size reverses the order of the characters." + ] + }, + { + "cell_type": "code", + "execution_count": 27, + "metadata": { + "slideshow": { + "slide_type": "fragment" + } + }, + "outputs": [ + { + "data": { + "text/plain": [ + "'tnemeganaM fo loohcS miehsieB ottO - UHW'" + ] + }, + "execution_count": 27, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "school[::-1]" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "slideshow": { + "slide_type": "slide" + } + }, + "source": [ + "## Immutability" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "slideshow": { + "slide_type": "skip" + } + }, + "source": [ + "Whereas elements of a `list` object *may* be *re-assigned*, as shortly hinted at in [Chapter 1](https://nbviewer.jupyter.org/github/webartifex/intro-to-python/blob/master/01_elements.ipynb#Who-am-I?-And-how-many?), this is *not* allowed for `str` objects. Once created, they *cannot* be *changed*. Formally, we say that they are **immutable**. In that regard, `str` objects and all the numeric types in [Chapter 5](https://nbviewer.jupyter.org/github/webartifex/intro-to-python/blob/master/05_numbers.ipynb) are alike.\n", + "\n", + "On the contrary, objects that may be changed after creation, are called **mutable**. We already saw in [Chapter 1](https://nbviewer.jupyter.org/github/webartifex/intro-to-python/blob/master/01_elements.ipynb#Who-am-I?-And-how-many?) how mutable objects are more difficult to reason about for a beginner, in particular, if more than *one* variable point to one. Yet, mutability does have its place in a programmer's toolbox, and we revisit this idea in the next chapters.\n", + "\n", + "`str` objects are *immutable* as the `TypeError` indicates: Assignment to an index is *not* supported by this type." + ] + }, + { + "cell_type": "code", + "execution_count": 28, + "metadata": { + "slideshow": { + "slide_type": "slide" + } + }, + "outputs": [ + { + "ename": "TypeError", + "evalue": "'str' object does not support item assignment", + "output_type": "error", + "traceback": [ + "\u001b[0;31m---------------------------------------------------------------------------\u001b[0m", + "\u001b[0;31mTypeError\u001b[0m Traceback (most recent call last)", + "\u001b[0;32m\u001b[0m in \u001b[0;36m\u001b[0;34m\u001b[0m\n\u001b[0;32m----> 1\u001b[0;31m \u001b[0mschool\u001b[0m\u001b[0;34m[\u001b[0m\u001b[0;36m0\u001b[0m\u001b[0;34m]\u001b[0m \u001b[0;34m=\u001b[0m \u001b[0;34m\"E\"\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0m", + "\u001b[0;31mTypeError\u001b[0m: 'str' object does not support item assignment" + ] + } + ], + "source": [ + "school[0] = \"E\"" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "slideshow": { + "slide_type": "skip" + } + }, + "source": [ + "The only thing we can do is to create a *new* `str` object in memory." + ] + }, + { + "cell_type": "code", + "execution_count": 29, + "metadata": { + "slideshow": { + "slide_type": "slide" + } + }, + "outputs": [], + "source": [ + "new_school = \"EBS\" + school[3:]" + ] + }, + { + "cell_type": "code", + "execution_count": 30, + "metadata": { + "slideshow": { + "slide_type": "fragment" + } + }, + "outputs": [ + { + "data": { + "text/plain": [ + "'EBS - Otto Beisheim School of Management'" + ] + }, + "execution_count": 30, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "new_school" + ] + }, + { + "cell_type": "code", + "execution_count": 31, + "metadata": { + "slideshow": { + "slide_type": "-" + } + }, + "outputs": [ + { + "data": { + "text/plain": [ + "140148946876144" + ] + }, + "execution_count": 31, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "id(new_school)" + ] + }, + { + "cell_type": "code", + "execution_count": 32, + "metadata": { + "slideshow": { + "slide_type": "-" + } + }, + "outputs": [ + { + "data": { + "text/plain": [ + "140148947850512" + ] + }, + "execution_count": 32, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "id(school)" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "slideshow": { + "slide_type": "slide" + } + }, + "source": [ + "## String Operations" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "slideshow": { + "slide_type": "skip" + } + }, + "source": [ + "As mentioned before, the `+` and `*` operators are *overloaded* and used for **string concatenation**." + ] + }, + { + "cell_type": "code", + "execution_count": 33, + "metadata": { + "slideshow": { + "slide_type": "slide" + } + }, + "outputs": [], + "source": [ + "greeting = \"Hello \"" + ] + }, + { + "cell_type": "code", + "execution_count": 34, + "metadata": { + "slideshow": { + "slide_type": "-" + } + }, + "outputs": [ + { + "data": { + "text/plain": [ + "'Hello WHU'" + ] + }, + "execution_count": 34, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "greeting + school[:3]" + ] + }, + { + "cell_type": "code", + "execution_count": 35, + "metadata": { + "slideshow": { + "slide_type": "-" + } + }, + "outputs": [ + { + "data": { + "text/plain": [ + "'WHU WHU WHU WHU WHU WHU WHU WHU WHU WHU '" + ] + }, + "execution_count": 35, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "10 * school[:4]" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "slideshow": { + "slide_type": "slide" + } + }, + "source": [ + "## String Methods" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "slideshow": { + "slide_type": "skip" + } + }, + "source": [ + "Objects of type `str` come with many **methods** bound on them (cf., the [documentation](https://docs.python.org/3/library/stdtypes.html#string-methods) for a full list). As seen before, they work like *normal* functions and are accessed via the **dot operator** `.`. Calling a method is also referred to as **method invocation**.\n", + "\n", + "The [find()](https://docs.python.org/3/library/stdtypes.html#str.find) method returns the index of the first occurrence of a character or a substring. If no match is found, it returns `-1`." + ] + }, + { + "cell_type": "code", + "execution_count": 36, + "metadata": { + "slideshow": { + "slide_type": "slide" + } + }, + "outputs": [ + { + "data": { + "text/plain": [ + "6" + ] + }, + "execution_count": 36, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "school.find(\"O\")" + ] + }, + { + "cell_type": "code", + "execution_count": 37, + "metadata": { + "slideshow": { + "slide_type": "-" + } + }, + "outputs": [ + { + "data": { + "text/plain": [ + "-1" + ] + }, + "execution_count": 37, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "school.find(\"Z\")" + ] + }, + { + "cell_type": "code", + "execution_count": 38, + "metadata": { + "slideshow": { + "slide_type": "fragment" + } + }, + "outputs": [ + { + "data": { + "text/plain": [ + "11" + ] + }, + "execution_count": 38, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "school.find(\"Beisheim\")" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "slideshow": { + "slide_type": "skip" + } + }, + "source": [ + "[find()](https://docs.python.org/3/library/stdtypes.html#str.find) takes optional *start* and *end* arguments that allow us to find occurrences other than the first one." + ] + }, + { + "cell_type": "code", + "execution_count": 39, + "metadata": { + "slideshow": { + "slide_type": "slide" + } + }, + "outputs": [ + { + "data": { + "text/plain": [ + "12" + ] + }, + "execution_count": 39, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "school.find(\"e\")" + ] + }, + { + "cell_type": "code", + "execution_count": 40, + "metadata": { + "slideshow": { + "slide_type": "fragment" + } + }, + "outputs": [ + { + "data": { + "text/plain": [ + "16" + ] + }, + "execution_count": 40, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "school.find(\"e\", 13) # 13 not 12 as otherwise the same character is found" + ] + }, + { + "cell_type": "code", + "execution_count": 41, + "metadata": { + "slideshow": { + "slide_type": "-" + } + }, + "outputs": [ + { + "data": { + "text/plain": [ + "-1" + ] + }, + "execution_count": 41, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "school.find(\"e\", 13, 15) # \"e\" does not occur in the specified slice" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "slideshow": { + "slide_type": "skip" + } + }, + "source": [ + "[count()](https://docs.python.org/3/library/stdtypes.html#str.count) does what we expect." + ] + }, + { + "cell_type": "code", + "execution_count": 42, + "metadata": { + "slideshow": { + "slide_type": "slide" + } + }, + "outputs": [ + { + "data": { + "text/plain": [ + "4" + ] + }, + "execution_count": 42, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "school.count(\"o\")" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "slideshow": { + "slide_type": "skip" + } + }, + "source": [ + "As [count()](https://docs.python.org/3/library/stdtypes.html#str.count) is *case-sensitive*, we must **chain** it with the [lower()](https://docs.python.org/3/library/stdtypes.html#str.lower) method to get the count of all \"O\"s and \"o\"s." + ] + }, + { + "cell_type": "code", + "execution_count": 43, + "metadata": { + "slideshow": { + "slide_type": "fragment" + } + }, + "outputs": [ + { + "data": { + "text/plain": [ + "5" + ] + }, + "execution_count": 43, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "school.lower().count(\"o\")" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "slideshow": { + "slide_type": "skip" + } + }, + "source": [ + "Alternatively, we may use the [upper()](https://docs.python.org/3/library/stdtypes.html#str.upper) method and search for \"O\"s." + ] + }, + { + "cell_type": "code", + "execution_count": 44, + "metadata": { + "slideshow": { + "slide_type": "skip" + } + }, + "outputs": [ + { + "data": { + "text/plain": [ + "5" + ] + }, + "execution_count": 44, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "school.upper().count(\"O\")" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "slideshow": { + "slide_type": "skip" + } + }, + "source": [ + "Because `str` objects are *immutable*, the methods always return *new* objects, even if a method does *not* change the value of the `str` object at all." + ] + }, + { + "cell_type": "code", + "execution_count": 45, + "metadata": { + "slideshow": { + "slide_type": "slide" + } + }, + "outputs": [], + "source": [ + "example = \"test\"" + ] + }, + { + "cell_type": "code", + "execution_count": 46, + "metadata": { + "slideshow": { + "slide_type": "-" + } + }, + "outputs": [ + { + "data": { + "text/plain": [ + "140149036799680" + ] + }, + "execution_count": 46, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "id(example)" + ] + }, + { + "cell_type": "code", + "execution_count": 47, + "metadata": { + "slideshow": { + "slide_type": "fragment" + } + }, + "outputs": [], + "source": [ + "lower = example.lower()" + ] + }, + { + "cell_type": "code", + "execution_count": 48, + "metadata": { + "slideshow": { + "slide_type": "-" + } + }, + "outputs": [ + { + "data": { + "text/plain": [ + "140148946626016" + ] + }, + "execution_count": 48, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "id(lower)" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "slideshow": { + "slide_type": "skip" + } + }, + "source": [ + "`example` and `lower` are *different* objects with the *same* value." + ] + }, + { + "cell_type": "code", + "execution_count": 49, + "metadata": { + "slideshow": { + "slide_type": "fragment" + } + }, + "outputs": [ + { + "data": { + "text/plain": [ + "False" + ] + }, + "execution_count": 49, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "example is lower" + ] + }, + { + "cell_type": "code", + "execution_count": 50, + "metadata": { + "slideshow": { + "slide_type": "-" + } + }, + "outputs": [ + { + "data": { + "text/plain": [ + "True" + ] + }, + "execution_count": 50, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "example == lower" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "slideshow": { + "slide_type": "skip" + } + }, + "source": [ + "Another popular string method is [split()](https://docs.python.org/3/library/stdtypes.html#str.split): It separates a longer `str` object into a list of smaller ones. By default, groups of whitespace are used as the *separator*." + ] + }, + { + "cell_type": "code", + "execution_count": 51, + "metadata": { + "slideshow": { + "slide_type": "slide" + } + }, + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "WHU\n", + "-\n", + "Otto\n", + "Beisheim\n", + "School\n", + "of\n", + "Management\n" + ] + } + ], + "source": [ + "for word in school.split():\n", + " print(word)" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "slideshow": { + "slide_type": "skip" + } + }, + "source": [ + "The opposite of splitting is done with the [join()](https://docs.python.org/3/library/stdtypes.html#str.join) method. It is typically invoked on a `str` object that represents a separator (e.g., `\" \"` or `\", \"`) and connects the elements of an *iterable* argument passed in (e.g., `words` below) into one *new* `str` object." + ] + }, + { + "cell_type": "code", + "execution_count": 52, + "metadata": { + "slideshow": { + "slide_type": "slide" + } + }, + "outputs": [], + "source": [ + "words = [\"This\", \"will\", \"become\", \"a\", \"sentence\"]" + ] + }, + { + "cell_type": "code", + "execution_count": 53, + "metadata": { + "slideshow": { + "slide_type": "-" + } + }, + "outputs": [], + "source": [ + "sentence = \" \".join(words)" + ] + }, + { + "cell_type": "code", + "execution_count": 54, + "metadata": { + "slideshow": { + "slide_type": "-" + } + }, + "outputs": [ + { + "data": { + "text/plain": [ + "'This will become a sentence'" + ] + }, + "execution_count": 54, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "sentence" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "slideshow": { + "slide_type": "skip" + } + }, + "source": [ + "As the `str` object `\"abcde\"` below is an *iterable* itself, its elements (i.e., characters) are joined together with a space `\" \"` in between." + ] + }, + { + "cell_type": "code", + "execution_count": 55, + "metadata": { + "slideshow": { + "slide_type": "fragment" + } + }, + "outputs": [ + { + "data": { + "text/plain": [ + "'a b c d e'" + ] + }, + "execution_count": 55, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "\" \".join(\"abcde\")" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "slideshow": { + "slide_type": "skip" + } + }, + "source": [ + "The [replace()](https://docs.python.org/3/library/stdtypes.html#str.replace) method creates a *new* `str` object with parts of the text replaced." + ] + }, + { + "cell_type": "code", + "execution_count": 56, + "metadata": { + "slideshow": { + "slide_type": "slide" + } + }, + "outputs": [ + { + "data": { + "text/plain": [ + "'This is a sentence'" + ] + }, + "execution_count": 56, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "sentence.replace(\"will become\", \"is\")" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "slideshow": { + "slide_type": "slide" + } + }, + "source": [ + "## String Comparison" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "slideshow": { + "slide_type": "skip" + } + }, + "source": [ + "The *relational* operators also work with `str` objects, another example of operator overloading. Comparison is done one character at a time until the first pair differs or one operand ends. However, `str` objects are sorted in a \"weird\" way. The reason for this is that computers store characters internally as numbers (i.e., $0$s and $1$s). Depending on the character encoding, these numbers vary. Commonly, characters and symbols used in the American language are encoded with the numbers 0 through 127, the so-called [ASCII standard](https://en.wikipedia.org/wiki/ASCII). However, Python works with the more general [Unicode/UTF-8 standard](https://en.wikipedia.org/wiki/UTF-8) that understands every language ever used by humans, even emojis." + ] + }, + { + "cell_type": "code", + "execution_count": 57, + "metadata": { + "slideshow": { + "slide_type": "slide" + } + }, + "outputs": [], + "source": [ + "A = \"Apple\" # ignore snake_case for variable names in this example\n", + "a = \"apple\"\n", + "B = \"Banana\"" + ] + }, + { + "cell_type": "code", + "execution_count": 58, + "metadata": { + "slideshow": { + "slide_type": "fragment" + } + }, + "outputs": [ + { + "data": { + "text/plain": [ + "True" + ] + }, + "execution_count": 58, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "A < B" + ] + }, + { + "cell_type": "code", + "execution_count": 59, + "metadata": { + "slideshow": { + "slide_type": "-" + } + }, + "outputs": [ + { + "data": { + "text/plain": [ + "False" + ] + }, + "execution_count": 59, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "a < B" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "slideshow": { + "slide_type": "skip" + } + }, + "source": [ + "One way to fix this is to only compare lower-case strings." + ] + }, + { + "cell_type": "code", + "execution_count": 60, + "metadata": { + "slideshow": { + "slide_type": "fragment" + } + }, + "outputs": [ + { + "data": { + "text/plain": [ + "True" + ] + }, + "execution_count": 60, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "a < B.lower()" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "slideshow": { + "slide_type": "skip" + } + }, + "source": [ + "To provide a simple intuition for the \"weird\" sorting above, let's think of the American alphabet as being represented by the numbers as listed below. Then `\"Banana\"` is clearly \"smaller\" than `\"apple\"`." + ] + }, + { + "cell_type": "code", + "execution_count": 61, + "metadata": { + "slideshow": { + "slide_type": "slide" + } + }, + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "A -> 65 \t a -> 97\n", + "B -> 66 \t b -> 98\n", + "C -> 67 \t c -> 99\n", + "D -> 68 \t d -> 100\n", + "E -> 69 \t e -> 101\n", + "F -> 70 \t f -> 102\n", + "G -> 71 \t g -> 103\n", + "H -> 72 \t h -> 104\n", + "I -> 73 \t i -> 105\n", + "J -> 74 \t j -> 106\n", + "K -> 75 \t k -> 107\n", + "L -> 76 \t l -> 108\n", + "M -> 77 \t m -> 109\n", + "N -> 78 \t n -> 110\n", + "O -> 79 \t o -> 111\n", + "P -> 80 \t p -> 112\n", + "Q -> 81 \t q -> 113\n", + "R -> 82 \t r -> 114\n", + "S -> 83 \t s -> 115\n", + "T -> 84 \t t -> 116\n", + "U -> 85 \t u -> 117\n", + "V -> 86 \t v -> 118\n", + "W -> 87 \t w -> 119\n", + "X -> 88 \t x -> 120\n", + "Y -> 89 \t y -> 121\n", + "Z -> 90 \t z -> 122\n" + ] + } + ], + "source": [ + "for lower_i in range(65, 91):\n", + " upper_i = lower_i + 32 # all the upper case characters are offset by 32\n", + " lower_char = chr(lower_i) # from their lower case counterpart\n", + " upper_char = chr(upper_i)\n", + " print(f\"{lower_char} -> {lower_i} \\t {upper_char} -> {upper_i}\")" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "slideshow": { + "slide_type": "slide" + } + }, + "source": [ + "## String Interpolation" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "slideshow": { + "slide_type": "skip" + } + }, + "source": [ + "The previous code cell shows an example of a so-called **f-string**, as introduced by [PEP 498](https://www.python.org/dev/peps/pep-0498/) only in 2016.\n", + "\n", + "So far, we have used the built-in [print()](https://docs.python.org/3/library/functions.html#print) function only with \"plain\" `str` objects (e.g., `\"example\"`) or variables. Sometimes, it is more convenient to fill a value determined at runtime in a \"draft\" `str` object. This is called **string interpolation**. There are three ways to achieve that in Python, but only two are commonly used." + ] + }, + { + "cell_type": "markdown", + "metadata": { + "slideshow": { + "slide_type": "slide" + } + }, + "source": [ + "### f-strings" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "slideshow": { + "slide_type": "skip" + } + }, + "source": [ + "f-strings are the new and most readable way. Prepend the literal notation with an `f`, and put variables/expressions within curly braces. The latter are then filled in when the [print()](https://docs.python.org/3/library/functions.html#print) function is executed." + ] + }, + { + "cell_type": "code", + "execution_count": 62, + "metadata": { + "slideshow": { + "slide_type": "slide" + } + }, + "outputs": [], + "source": [ + "name = \"Alexander\"\n", + "time_of_day = \"morning\"" + ] + }, + { + "cell_type": "code", + "execution_count": 63, + "metadata": { + "slideshow": { + "slide_type": "-" + } + }, + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "Hello Alexander! Good morning.\n" + ] + } + ], + "source": [ + "print(f\"Hello {name}! Good {time_of_day}.\")" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "slideshow": { + "slide_type": "skip" + } + }, + "source": [ + "Separated by a colon `:`, various formatting options are available. In the beginning, only the ability to round is useful, and this can be achieved by adding `:.2f` to the variable name to cast the number as a float and round it to two digits." + ] + }, + { + "cell_type": "code", + "execution_count": 64, + "metadata": { + "slideshow": { + "slide_type": "slide" + } + }, + "outputs": [], + "source": [ + "pi = 3.141592653" + ] + }, + { + "cell_type": "code", + "execution_count": 65, + "metadata": { + "slideshow": { + "slide_type": "-" + } + }, + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "Pi is 3.14\n" + ] + } + ], + "source": [ + "print(f\"Pi is {pi:.2f}\")" + ] + }, + { + "cell_type": "code", + "execution_count": 66, + "metadata": { + "slideshow": { + "slide_type": "-" + } + }, + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "Pi is 3.142\n" + ] + } + ], + "source": [ + "print(f\"Pi is {pi:.3f}\")" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "slideshow": { + "slide_type": "slide" + } + }, + "source": [ + "### format() Method" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "slideshow": { + "slide_type": "skip" + } + }, + "source": [ + "`str` objects also provide a [format()](https://docs.python.org/3/library/stdtypes.html#str.format) method that accepts an arbitrary number of *positional* arguments that are inserted into the `str` object in the same order replacing curly brackets. See the official [documentation](https://docs.python.org/3/library/string.html#formatspec) for a full reference. This is the more traditional way of string interpolation and many code examples on the internet use it. f-strings are the officially recommended way going forward, but the usage of the [format()](https://docs.python.org/3/library/stdtypes.html#str.format) method is most likely not going down any time soon." + ] + }, + { + "cell_type": "code", + "execution_count": 67, + "metadata": { + "slideshow": { + "slide_type": "slide" + } + }, + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "Hello Alexander! Good morning.\n" + ] + } + ], + "source": [ + "print(\"Hello {}! Good {}.\".format(name, time_of_day))" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "slideshow": { + "slide_type": "skip" + } + }, + "source": [ + "Use index numbers if the order is different in the `str` object." + ] + }, + { + "cell_type": "code", + "execution_count": 68, + "metadata": { + "slideshow": { + "slide_type": "fragment" + } + }, + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "Good morning, Alexander\n" + ] + } + ], + "source": [ + "print(\"Good {1}, {0}\".format(name, time_of_day))" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "slideshow": { + "slide_type": "skip" + } + }, + "source": [ + "The [format()](https://docs.python.org/3/library/stdtypes.html#str.format) method may alternatively be used with *keyword* arguments as well. Then, we must put the keyword names within the curly brackets." + ] + }, + { + "cell_type": "code", + "execution_count": 69, + "metadata": { + "slideshow": { + "slide_type": "fragment" + } + }, + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "Hello Alexander! Good morning.\n" + ] + } + ], + "source": [ + "print(\"Hello {name}! Good {time}.\".format(name=name, time=time_of_day))" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "slideshow": { + "slide_type": "skip" + } + }, + "source": [ + "Numbers are treated as in the f-strings case." + ] + }, + { + "cell_type": "code", + "execution_count": 70, + "metadata": { + "slideshow": { + "slide_type": "skip" + } + }, + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "Pi is 3.14\n" + ] + } + ], + "source": [ + "print(\"Pi is {:.2f}\".format(pi))" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "slideshow": { + "slide_type": "slide" + } + }, + "source": [ + "## Special Characters" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "slideshow": { + "slide_type": "skip" + } + }, + "source": [ + "Some symbols have a special meaning within `str` objects. Popular examples are the newline `\\n` and tab `\\t` \"characters.\" The backslash symbol `\\` is also referred to as an **escape character** in this context, indicating that the following character has a meaning other than its literal meaning." + ] + }, + { + "cell_type": "code", + "execution_count": 71, + "metadata": { + "slideshow": { + "slide_type": "slide" + } + }, + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "This is a sentence\n", + "that is printed\n", + "on three lines.\n" + ] + } + ], + "source": [ + "print(\"This is a sentence\\nthat is printed\\non three lines.\")" + ] + }, + { + "cell_type": "code", + "execution_count": 72, + "metadata": { + "slideshow": { + "slide_type": "fragment" + } + }, + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "Words\taligned\twith\ttabs.\n" + ] + } + ], + "source": [ + "print(\"Words\\taligned\\twith\\ttabs.\")" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "slideshow": { + "slide_type": "skip" + } + }, + "source": [ + "As emojis are important as well, they can be inserted with the corresponding **unicode code point** number starting with `\\U`. See this [list](https://en.wikipedia.org/wiki/List_of_Unicode_characters) of unicode characters for an overview." + ] + }, + { + "cell_type": "code", + "execution_count": 73, + "metadata": { + "slideshow": { + "slide_type": "fragment" + } + }, + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "😄\n" + ] + } + ], + "source": [ + "print(\"\\U0001f604\")" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "slideshow": { + "slide_type": "slide" + } + }, + "source": [ + "## Raw Strings" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "slideshow": { + "slide_type": "skip" + } + }, + "source": [ + "Sometimes we do *not* want the backslash `\\` and its following character be interpreted as special characters.\n", + "\n", + "For example, let's print a typical installation path on a Windows systems. Obviously, the newline character `\\n` does *not* makes sense here." + ] + }, + { + "cell_type": "code", + "execution_count": 74, + "metadata": { + "slideshow": { + "slide_type": "slide" + } + }, + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "C:\\Programs\n", + "ew_application\n" + ] + } + ], + "source": [ + "print(\"C:\\Programs\\new_application\")" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "slideshow": { + "slide_type": "skip" + } + }, + "source": [ + "Some strings even produce a `SyntaxError` because the `\\U` *cannot* be interpreted as a unicode code point." + ] + }, + { + "cell_type": "code", + "execution_count": 75, + "metadata": { + "slideshow": { + "slide_type": "fragment" + } + }, + "outputs": [ + { + "ename": "SyntaxError", + "evalue": "(unicode error) 'unicodeescape' codec can't decode bytes in position 2-3: truncated \\UXXXXXXXX escape (, line 1)", + "output_type": "error", + "traceback": [ + "\u001b[0;36m File \u001b[0;32m\"\"\u001b[0;36m, line \u001b[0;32m1\u001b[0m\n\u001b[0;31m print(\"C:\\Users\\Administrator\\Desktop\\Project_Folder\")\u001b[0m\n\u001b[0m ^\u001b[0m\n\u001b[0;31mSyntaxError\u001b[0m\u001b[0;31m:\u001b[0m (unicode error) 'unicodeescape' codec can't decode bytes in position 2-3: truncated \\UXXXXXXXX escape\n" + ] + } + ], + "source": [ + "print(\"C:\\Users\\Administrator\\Desktop\\Project_Folder\")" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "slideshow": { + "slide_type": "skip" + } + }, + "source": [ + "A simple solution would be to escape the escape character with a *second* backslash `\\`." + ] + }, + { + "cell_type": "code", + "execution_count": 76, + "metadata": { + "slideshow": { + "slide_type": "slide" + } + }, + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "C:\\Programs\\new_application\n" + ] + } + ], + "source": [ + "print(\"C:\\\\Programs\\\\new_application\")" + ] + }, + { + "cell_type": "code", + "execution_count": 77, + "metadata": { + "slideshow": { + "slide_type": "-" + } + }, + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "C:\\Users\\Administrator\\Desktop\\Project_Folder\n" + ] + } + ], + "source": [ + "print(\"C:\\\\Users\\\\Administrator\\\\Desktop\\\\Project_Folder\")" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "slideshow": { + "slide_type": "skip" + } + }, + "source": [ + "However, this is tedious to remember and type. Luckily, Python allows treating any string literal as a \"raw\" string, and this is indicated in the string literal by a `r` prefix." + ] + }, + { + "cell_type": "code", + "execution_count": 78, + "metadata": { + "slideshow": { + "slide_type": "slide" + } + }, + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "C:\\Programs\\new_application\n" + ] + } + ], + "source": [ + "print(r\"C:\\Programs\\new_application\")" + ] + }, + { + "cell_type": "code", + "execution_count": 79, + "metadata": { + "slideshow": { + "slide_type": "-" + } + }, + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "C:\\Users\\Administrator\\Desktop\\Project_Folder\n" + ] + } + ], + "source": [ + "print(r\"C:\\Users\\Administrator\\Desktop\\Project_Folder\")" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "slideshow": { + "slide_type": "slide" + } + }, + "source": [ + "## Multi-line Strings" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "slideshow": { + "slide_type": "skip" + } + }, + "source": [ + "Sometimes, it is convenient to split text across multiple lines in source code. For example, to make lines fit into the 79 characters requirement of [PEP 8](https://www.python.org/dev/peps/pep-0008/) or because the text naturally contains many newlines. Using double quotes `\"` around multiple lines results in a `SyntaxError`." + ] + }, + { + "cell_type": "code", + "execution_count": 80, + "metadata": { + "slideshow": { + "slide_type": "slide" + } + }, + "outputs": [ + { + "ename": "SyntaxError", + "evalue": "EOL while scanning string literal (, line 1)", + "output_type": "error", + "traceback": [ + "\u001b[0;36m File \u001b[0;32m\"\"\u001b[0;36m, line \u001b[0;32m1\u001b[0m\n\u001b[0;31m \"\u001b[0m\n\u001b[0m ^\u001b[0m\n\u001b[0;31mSyntaxError\u001b[0m\u001b[0;31m:\u001b[0m EOL while scanning string literal\n" + ] + } + ], + "source": [ + "\"\n", + "Do not break the lines like this\n", + "\"" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "slideshow": { + "slide_type": "skip" + } + }, + "source": [ + "However, by enclosing a string literal with either **triple-double** quotes `\"\"\"` or **triple-single** quotes `'''`, Python creates a \"plain\" `str` object. Docstrings are precisely that, and, by convention, always written in triple-double quotes `\"\"\"`." + ] + }, + { + "cell_type": "code", + "execution_count": 81, + "metadata": { + "slideshow": { + "slide_type": "fragment" + } + }, + "outputs": [], + "source": [ + "multi_line = \"\"\"\n", + "I am a multi-line string\n", + "consisting of 4 lines.\n", + "\"\"\"" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "slideshow": { + "slide_type": "skip" + } + }, + "source": [ + "Linebreaks are kept and implicitly converted into `\\n` characters." + ] + }, + { + "cell_type": "code", + "execution_count": 82, + "metadata": { + "slideshow": { + "slide_type": "fragment" + } + }, + "outputs": [ + { + "data": { + "text/plain": [ + "'\\nI am a multi-line string\\nconsisting of 4 lines.\\n'" + ] + }, + "execution_count": 82, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "multi_line" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "slideshow": { + "slide_type": "skip" + } + }, + "source": [ + "The built-in [print()](https://docs.python.org/3/library/functions.html#print) function correctly prints out the `\\n` characters." + ] + }, + { + "cell_type": "code", + "execution_count": 83, + "metadata": { + "slideshow": { + "slide_type": "fragment" + } + }, + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "\n", + "I am a multi-line string\n", + "consisting of 4 lines.\n", + "\n" + ] + } + ], + "source": [ + "print(multi_line)" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "slideshow": { + "slide_type": "skip" + } + }, + "source": [ + "Using the [split()](https://docs.python.org/3/library/stdtypes.html#str.split) method with the optional *sep* argument, we confirm that `multi_line` consists of *four* lines with the first and last linebreaks being the first and last characters in the `str` object." + ] + }, + { + "cell_type": "code", + "execution_count": 84, + "metadata": { + "slideshow": { + "slide_type": "slide" + } + }, + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "0 \n", + "1 I am a multi-line string\n", + "2 consisting of 4 lines.\n", + "3 \n" + ] + } + ], + "source": [ + "for i, line in enumerate(multi_line.split(\"\\n\")):\n", + " print(i, line)" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "slideshow": { + "slide_type": "skip" + } + }, + "source": [ + "## TL;DR" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "slideshow": { + "slide_type": "skip" + } + }, + "source": [ + "Textual data is modeled with the **immutable** `str` type.\n", + "\n", + "The `str` type supports *four* orthogonal **abstract concepts** that together make up the idea of a **sequence**: Every `str` object is an iterable container of a finite number of ordered characters." + ] + } + ], + "metadata": { + "kernelspec": { + "display_name": "Python 3", + "language": "python", + "name": "python3" + }, + "language_info": { + "codemirror_mode": { + "name": "ipython", + "version": 3 + }, + "file_extension": ".py", + "mimetype": "text/x-python", + "name": "python", + "nbconvert_exporter": "python", + "pygments_lexer": "ipython3", + "version": "3.7.3" + }, + "livereveal": { + "auto_select": "code", + "auto_select_fragment": true, + "scroll": true, + "theme": "serif" + }, + "toc": { + "base_numbering": 1, + "nav_menu": {}, + "number_sections": false, + "sideBar": true, + "skip_h1_title": true, + "title_cell": "Table of Contents", + "title_sidebar": "Contents", + "toc_cell": false, + "toc_position": { + "height": "calc(100% - 180px)", + "left": "10px", + "top": "150px", + "width": "384px" + }, + "toc_section_display": false, + "toc_window_display": false + } + }, + "nbformat": 4, + "nbformat_minor": 2 +} diff --git a/06_text_review_and_exercises.ipynb b/06_text_review_and_exercises.ipynb new file mode 100644 index 0000000..304a91d --- /dev/null +++ b/06_text_review_and_exercises.ipynb @@ -0,0 +1,205 @@ +{ + "cells": [ + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "\n", + "# Chapter 6: Text" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "## Content Review" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "Read [Chapter 6](https://nbviewer.jupyter.org/github/webartifex/intro-to-python/blob/master/06_text.ipynb) of the book. Then work through the eight review questions." + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "### Essay Questions " + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "Answer the following questions briefly with *at most* 300 characters per question!" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "**Q1**: In what sense is a **\"string\" of characters** a **sequence**? What is a **sequence** after all? A *concrete* **data type**, or something else?" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + " " + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "**Q2**: What is a direct consequence of the `str` type's property of being **ordered**? What operations could we *not* do with it if it were *unordered*?" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + " " + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "**Q3**: What does it mean for an object to be **immutable**? Discuss how we can verify the `str` type's immutability by comparing the two variables `example` and `full_slice` below. Are they pointing to the *same* object in memory?\n", + "```python\n", + "example = \"text\"\n", + "full_slice = example[:]\n", + "```\n", + "Hint: Find out what `[:]` does first!" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + " " + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "**Q4**: Describe in your own words what we mean with **string interpolation**!" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + " " + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "### True / False Questions" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "Motivate your answer with *one short* sentence!" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "**Q5**: **Triple-double** quotes `\"\"\"` and **triple-single** quotes `'''` create a *new* object of type `text` that model so-called **multi-line strings**." + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + " " + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "**Q6**: A **substring** is a string that *subsumes* another string." + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + " " + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "**Q7**: Indexing into a `str` object with a *negative* index **fails silently**: It does *not* raise an error but also does *not* do anything useful." + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + " " + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "**Q8**: We *cannot* assign a *different* character to an index or slice of a `str` object because it is **immutable**." + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + " " + ] + } + ], + "metadata": { + "kernelspec": { + "display_name": "Python 3", + "language": "python", + "name": "python3" + }, + "language_info": { + "codemirror_mode": { + "name": "ipython", + "version": 3 + }, + "file_extension": ".py", + "mimetype": "text/x-python", + "name": "python", + "nbconvert_exporter": "python", + "pygments_lexer": "ipython3", + "version": "3.7.3" + }, + "toc": { + "base_numbering": 1, + "nav_menu": {}, + "number_sections": false, + "sideBar": true, + "skip_h1_title": true, + "title_cell": "Table of Contents", + "title_sidebar": "Contents", + "toc_cell": false, + "toc_position": {}, + "toc_section_display": false, + "toc_window_display": false + } + }, + "nbformat": 4, + "nbformat_minor": 2 +} diff --git a/README.md b/README.md index 4561afb..daa2e93 100644 --- a/README.md +++ b/README.md @@ -19,6 +19,8 @@ As such they can be viewed in a plain web browser: - [02 - Functions & Modularization](https://nbviewer.jupyter.org/github/webartifex/intro-to-python/blob/master/02_functions.ipynb) - [03 - Conditionals & Exceptions](https://nbviewer.jupyter.org/github/webartifex/intro-to-python/blob/master/03_conditionals.ipynb) - [04 - Recursion & Looping](https://nbviewer.jupyter.org/github/webartifex/intro-to-python/blob/master/04_iteration.ipynb) +- [05 - Numbers](https://nbviewer.jupyter.org/github/webartifex/intro-to-python/blob/master/05_numbers.ipynb) +- [06 - Text](https://nbviewer.jupyter.org/github/webartifex/intro-to-python/blob/master/06_text.ipynb) However, it is recommended that students **install Python and Jupyter locally** and run the code in the notebooks on their own. diff --git a/pyproject.toml b/pyproject.toml index 1e69fc5..c2a9bdc 100644 --- a/pyproject.toml +++ b/pyproject.toml @@ -1,6 +1,6 @@ [tool.poetry] name = "intro-to-python" -version = "0.1.0" +version = "0.4.0" description = "An introduction to Python and programming for wanna-be data scientists" authors = ["Alexander Hess "] license = "MIT" diff --git a/static/complex_numbers.png b/static/complex_numbers.png new file mode 100644 index 0000000..807eb46 Binary files /dev/null and b/static/complex_numbers.png differ diff --git a/static/floating_point.png b/static/floating_point.png new file mode 100644 index 0000000..345aaba Binary files /dev/null and b/static/floating_point.png differ diff --git a/static/numbers.png b/static/numbers.png new file mode 100644 index 0000000..6d75726 Binary files /dev/null and b/static/numbers.png differ