2180 lines
48 KiB
Text
2180 lines
48 KiB
Text
{
|
|
"cells": [
|
|
{
|
|
"cell_type": "markdown",
|
|
"metadata": {
|
|
"slideshow": {
|
|
"slide_type": "skip"
|
|
}
|
|
},
|
|
"source": [
|
|
"**Note**: Click on \"*Kernel*\" > \"*Restart Kernel and Clear All Outputs*\" in [JupyterLab](https://jupyterlab.readthedocs.io/en/stable/) *before* reading this notebook to reset its output. If you cannot run this file on your machine, you may want to open it [in the cloud <img height=\"12\" style=\"display: inline-block\" src=\"../static/link/to_mb.png\">](https://mybinder.org/v2/gh/webartifex/intro-to-python/develop?urlpath=lab/tree/05_numbers/01_content.ipynb)."
|
|
]
|
|
},
|
|
{
|
|
"cell_type": "markdown",
|
|
"metadata": {
|
|
"slideshow": {
|
|
"slide_type": "slide"
|
|
}
|
|
},
|
|
"source": [
|
|
"# Chapter 5: Numbers & Bits (continued)"
|
|
]
|
|
},
|
|
{
|
|
"cell_type": "markdown",
|
|
"metadata": {
|
|
"slideshow": {
|
|
"slide_type": "skip"
|
|
}
|
|
},
|
|
"source": [
|
|
"In this second part of the chapter, we look at the `float` type in detail. It is probably the most commonly used one in all of data science, even across programming languages."
|
|
]
|
|
},
|
|
{
|
|
"cell_type": "markdown",
|
|
"metadata": {
|
|
"slideshow": {
|
|
"slide_type": "slide"
|
|
}
|
|
},
|
|
"source": [
|
|
"## The `float` Type"
|
|
]
|
|
},
|
|
{
|
|
"cell_type": "markdown",
|
|
"metadata": {
|
|
"slideshow": {
|
|
"slide_type": "skip"
|
|
}
|
|
},
|
|
"source": [
|
|
"As we have seen before, some assumptions need to be made as to how the $0$s and $1$s in a computer's memory are to be translated into numbers. This process becomes a lot more involved when we go beyond integers and model [real numbers <img height=\"12\" style=\"display: inline-block\" src=\"../static/link/to_wiki.png\">](https://en.wikipedia.org/wiki/Real_number) (i.e., the set $\\mathbb{R}$) with possibly infinitely many digits to the right of the period like $1.23$.\n",
|
|
"\n",
|
|
"The **[Institute of Electrical and Electronics Engineers <img height=\"12\" style=\"display: inline-block\" src=\"../static/link/to_wiki.png\">](https://en.wikipedia.org/wiki/Institute_of_Electrical_and_Electronics_Engineers)** (IEEE, pronounced \"eye-triple-E\") is one of the important professional associations when it comes to standardizing all kinds of aspects regarding the implementation of soft- and hardware.\n",
|
|
"\n",
|
|
"The **[IEEE 754 <img height=\"12\" style=\"display: inline-block\" src=\"../static/link/to_wiki.png\">](https://en.wikipedia.org/wiki/IEEE_754)** standard defines the so-called **floating-point arithmetic** that is commonly used today by all major programming languages. The standard not only defines how the $0$s and $1$s are organized in memory but also, for example, how values are to be rounded, what happens in exceptional cases like divisions by zero, or what is a zero value in the first place.\n",
|
|
"\n",
|
|
"In Python, the simplest way to create a `float` object is to use a literal notation with a dot `.` in it."
|
|
]
|
|
},
|
|
{
|
|
"cell_type": "code",
|
|
"execution_count": 1,
|
|
"metadata": {
|
|
"slideshow": {
|
|
"slide_type": "slide"
|
|
}
|
|
},
|
|
"outputs": [],
|
|
"source": [
|
|
"b = 42.0"
|
|
]
|
|
},
|
|
{
|
|
"cell_type": "code",
|
|
"execution_count": 2,
|
|
"metadata": {
|
|
"slideshow": {
|
|
"slide_type": "fragment"
|
|
}
|
|
},
|
|
"outputs": [
|
|
{
|
|
"data": {
|
|
"text/plain": [
|
|
"139923238853936"
|
|
]
|
|
},
|
|
"execution_count": 2,
|
|
"metadata": {},
|
|
"output_type": "execute_result"
|
|
}
|
|
],
|
|
"source": [
|
|
"id(b)"
|
|
]
|
|
},
|
|
{
|
|
"cell_type": "code",
|
|
"execution_count": 3,
|
|
"metadata": {
|
|
"slideshow": {
|
|
"slide_type": "fragment"
|
|
}
|
|
},
|
|
"outputs": [
|
|
{
|
|
"data": {
|
|
"text/plain": [
|
|
"float"
|
|
]
|
|
},
|
|
"execution_count": 3,
|
|
"metadata": {},
|
|
"output_type": "execute_result"
|
|
}
|
|
],
|
|
"source": [
|
|
"type(b)"
|
|
]
|
|
},
|
|
{
|
|
"cell_type": "code",
|
|
"execution_count": 4,
|
|
"metadata": {
|
|
"slideshow": {
|
|
"slide_type": "fragment"
|
|
}
|
|
},
|
|
"outputs": [
|
|
{
|
|
"data": {
|
|
"text/plain": [
|
|
"42.0"
|
|
]
|
|
},
|
|
"execution_count": 4,
|
|
"metadata": {},
|
|
"output_type": "execute_result"
|
|
}
|
|
],
|
|
"source": [
|
|
"b"
|
|
]
|
|
},
|
|
{
|
|
"cell_type": "markdown",
|
|
"metadata": {
|
|
"slideshow": {
|
|
"slide_type": "skip"
|
|
}
|
|
},
|
|
"source": [
|
|
"As with `int` literals, we may use underscores `_` to make longer `float` objects easier to read."
|
|
]
|
|
},
|
|
{
|
|
"cell_type": "code",
|
|
"execution_count": 5,
|
|
"metadata": {
|
|
"slideshow": {
|
|
"slide_type": "skip"
|
|
}
|
|
},
|
|
"outputs": [
|
|
{
|
|
"data": {
|
|
"text/plain": [
|
|
"0.123456789"
|
|
]
|
|
},
|
|
"execution_count": 5,
|
|
"metadata": {},
|
|
"output_type": "execute_result"
|
|
}
|
|
],
|
|
"source": [
|
|
"0.123_456_789"
|
|
]
|
|
},
|
|
{
|
|
"cell_type": "markdown",
|
|
"metadata": {
|
|
"slideshow": {
|
|
"slide_type": "skip"
|
|
}
|
|
},
|
|
"source": [
|
|
"In cases where the dot `.` is unnecessary from a mathematical point of view, we either need to end the number with it nevertheless or use the [float() <img height=\"12\" style=\"display: inline-block\" src=\"../static/link/to_py.png\">](https://docs.python.org/3/library/functions.html#float) built-in to cast the number explicitly. [float() <img height=\"12\" style=\"display: inline-block\" src=\"../static/link/to_py.png\">](https://docs.python.org/3/library/functions.html#float) can process any numeric object or a properly formatted `str` object."
|
|
]
|
|
},
|
|
{
|
|
"cell_type": "code",
|
|
"execution_count": 6,
|
|
"metadata": {
|
|
"slideshow": {
|
|
"slide_type": "slide"
|
|
}
|
|
},
|
|
"outputs": [
|
|
{
|
|
"data": {
|
|
"text/plain": [
|
|
"42.0"
|
|
]
|
|
},
|
|
"execution_count": 6,
|
|
"metadata": {},
|
|
"output_type": "execute_result"
|
|
}
|
|
],
|
|
"source": [
|
|
"42."
|
|
]
|
|
},
|
|
{
|
|
"cell_type": "code",
|
|
"execution_count": 7,
|
|
"metadata": {
|
|
"slideshow": {
|
|
"slide_type": "fragment"
|
|
}
|
|
},
|
|
"outputs": [
|
|
{
|
|
"data": {
|
|
"text/plain": [
|
|
"42.0"
|
|
]
|
|
},
|
|
"execution_count": 7,
|
|
"metadata": {},
|
|
"output_type": "execute_result"
|
|
}
|
|
],
|
|
"source": [
|
|
"float(42)"
|
|
]
|
|
},
|
|
{
|
|
"cell_type": "code",
|
|
"execution_count": 8,
|
|
"metadata": {
|
|
"slideshow": {
|
|
"slide_type": "fragment"
|
|
}
|
|
},
|
|
"outputs": [
|
|
{
|
|
"data": {
|
|
"text/plain": [
|
|
"42.0"
|
|
]
|
|
},
|
|
"execution_count": 8,
|
|
"metadata": {},
|
|
"output_type": "execute_result"
|
|
}
|
|
],
|
|
"source": [
|
|
"float(\"42\")"
|
|
]
|
|
},
|
|
{
|
|
"cell_type": "markdown",
|
|
"metadata": {
|
|
"slideshow": {
|
|
"slide_type": "skip"
|
|
}
|
|
},
|
|
"source": [
|
|
"Leading and trailing whitespace is ignored ..."
|
|
]
|
|
},
|
|
{
|
|
"cell_type": "code",
|
|
"execution_count": 9,
|
|
"metadata": {
|
|
"slideshow": {
|
|
"slide_type": "skip"
|
|
}
|
|
},
|
|
"outputs": [
|
|
{
|
|
"data": {
|
|
"text/plain": [
|
|
"42.87"
|
|
]
|
|
},
|
|
"execution_count": 9,
|
|
"metadata": {},
|
|
"output_type": "execute_result"
|
|
}
|
|
],
|
|
"source": [
|
|
"float(\" 42.87 \")"
|
|
]
|
|
},
|
|
{
|
|
"cell_type": "markdown",
|
|
"metadata": {
|
|
"slideshow": {
|
|
"slide_type": "skip"
|
|
}
|
|
},
|
|
"source": [
|
|
"... but not whitespace in between."
|
|
]
|
|
},
|
|
{
|
|
"cell_type": "code",
|
|
"execution_count": 10,
|
|
"metadata": {
|
|
"slideshow": {
|
|
"slide_type": "skip"
|
|
}
|
|
},
|
|
"outputs": [
|
|
{
|
|
"ename": "ValueError",
|
|
"evalue": "could not convert string to float: '42. 87'",
|
|
"output_type": "error",
|
|
"traceback": [
|
|
"\u001b[0;31m---------------------------------------------------------------------------\u001b[0m",
|
|
"\u001b[0;31mValueError\u001b[0m Traceback (most recent call last)",
|
|
"Cell \u001b[0;32mIn[10], line 1\u001b[0m\n\u001b[0;32m----> 1\u001b[0m \u001b[38;5;28;43mfloat\u001b[39;49m\u001b[43m(\u001b[49m\u001b[38;5;124;43m\"\u001b[39;49m\u001b[38;5;124;43m42. 87\u001b[39;49m\u001b[38;5;124;43m\"\u001b[39;49m\u001b[43m)\u001b[49m\n",
|
|
"\u001b[0;31mValueError\u001b[0m: could not convert string to float: '42. 87'"
|
|
]
|
|
}
|
|
],
|
|
"source": [
|
|
"float(\"42. 87\")"
|
|
]
|
|
},
|
|
{
|
|
"cell_type": "markdown",
|
|
"metadata": {
|
|
"slideshow": {
|
|
"slide_type": "skip"
|
|
}
|
|
},
|
|
"source": [
|
|
"`float` objects are implicitly created as the result of dividing an `int` object by another with the division operator `/`."
|
|
]
|
|
},
|
|
{
|
|
"cell_type": "code",
|
|
"execution_count": 11,
|
|
"metadata": {
|
|
"slideshow": {
|
|
"slide_type": "slide"
|
|
}
|
|
},
|
|
"outputs": [
|
|
{
|
|
"data": {
|
|
"text/plain": [
|
|
"0.3333333333333333"
|
|
]
|
|
},
|
|
"execution_count": 11,
|
|
"metadata": {},
|
|
"output_type": "execute_result"
|
|
}
|
|
],
|
|
"source": [
|
|
"1 / 3"
|
|
]
|
|
},
|
|
{
|
|
"cell_type": "markdown",
|
|
"metadata": {
|
|
"slideshow": {
|
|
"slide_type": "skip"
|
|
}
|
|
},
|
|
"source": [
|
|
"In general, if we combine `float` and `int` objects in arithmetic operations, we always end up with a `float` type: Python uses the \"broader\" representation."
|
|
]
|
|
},
|
|
{
|
|
"cell_type": "code",
|
|
"execution_count": 12,
|
|
"metadata": {
|
|
"slideshow": {
|
|
"slide_type": "fragment"
|
|
}
|
|
},
|
|
"outputs": [
|
|
{
|
|
"data": {
|
|
"text/plain": [
|
|
"42.0"
|
|
]
|
|
},
|
|
"execution_count": 12,
|
|
"metadata": {},
|
|
"output_type": "execute_result"
|
|
}
|
|
],
|
|
"source": [
|
|
"40.0 + 2"
|
|
]
|
|
},
|
|
{
|
|
"cell_type": "code",
|
|
"execution_count": 13,
|
|
"metadata": {
|
|
"slideshow": {
|
|
"slide_type": "skip"
|
|
}
|
|
},
|
|
"outputs": [
|
|
{
|
|
"data": {
|
|
"text/plain": [
|
|
"42.0"
|
|
]
|
|
},
|
|
"execution_count": 13,
|
|
"metadata": {},
|
|
"output_type": "execute_result"
|
|
}
|
|
],
|
|
"source": [
|
|
"21 * 2.0"
|
|
]
|
|
},
|
|
{
|
|
"cell_type": "markdown",
|
|
"metadata": {
|
|
"slideshow": {
|
|
"slide_type": "slide"
|
|
}
|
|
},
|
|
"source": [
|
|
"### Scientific Notation"
|
|
]
|
|
},
|
|
{
|
|
"cell_type": "markdown",
|
|
"metadata": {
|
|
"slideshow": {
|
|
"slide_type": "skip"
|
|
}
|
|
},
|
|
"source": [
|
|
"`float` objects may also be created with the **scientific literal notation**: We use the symbol `e` to indicate powers of $10$, so $1.23 * 10^0$ translates into `1.23e0`."
|
|
]
|
|
},
|
|
{
|
|
"cell_type": "code",
|
|
"execution_count": 14,
|
|
"metadata": {
|
|
"slideshow": {
|
|
"slide_type": "slide"
|
|
}
|
|
},
|
|
"outputs": [
|
|
{
|
|
"data": {
|
|
"text/plain": [
|
|
"1.23"
|
|
]
|
|
},
|
|
"execution_count": 14,
|
|
"metadata": {},
|
|
"output_type": "execute_result"
|
|
}
|
|
],
|
|
"source": [
|
|
"1.23e0"
|
|
]
|
|
},
|
|
{
|
|
"cell_type": "markdown",
|
|
"metadata": {
|
|
"slideshow": {
|
|
"slide_type": "skip"
|
|
}
|
|
},
|
|
"source": [
|
|
"Syntactically, `e` needs a `float` or `int` object in its literal notation on its left and an `int` object on its right, both without a space. Otherwise, we get a `SyntaxError`."
|
|
]
|
|
},
|
|
{
|
|
"cell_type": "code",
|
|
"execution_count": 15,
|
|
"metadata": {
|
|
"slideshow": {
|
|
"slide_type": "skip"
|
|
}
|
|
},
|
|
"outputs": [
|
|
{
|
|
"ename": "SyntaxError",
|
|
"evalue": "invalid syntax (1434619204.py, line 1)",
|
|
"output_type": "error",
|
|
"traceback": [
|
|
"\u001b[0;36m Cell \u001b[0;32mIn[15], line 1\u001b[0;36m\u001b[0m\n\u001b[0;31m 1.23 e0\u001b[0m\n\u001b[0m ^\u001b[0m\n\u001b[0;31mSyntaxError\u001b[0m\u001b[0;31m:\u001b[0m invalid syntax\n"
|
|
]
|
|
}
|
|
],
|
|
"source": [
|
|
"1.23 e0"
|
|
]
|
|
},
|
|
{
|
|
"cell_type": "code",
|
|
"execution_count": 16,
|
|
"metadata": {
|
|
"slideshow": {
|
|
"slide_type": "skip"
|
|
}
|
|
},
|
|
"outputs": [
|
|
{
|
|
"ename": "SyntaxError",
|
|
"evalue": "invalid decimal literal (3971374856.py, line 1)",
|
|
"output_type": "error",
|
|
"traceback": [
|
|
"\u001b[0;36m Cell \u001b[0;32mIn[16], line 1\u001b[0;36m\u001b[0m\n\u001b[0;31m 1.23e 0\u001b[0m\n\u001b[0m ^\u001b[0m\n\u001b[0;31mSyntaxError\u001b[0m\u001b[0;31m:\u001b[0m invalid decimal literal\n"
|
|
]
|
|
}
|
|
],
|
|
"source": [
|
|
"1.23e 0"
|
|
]
|
|
},
|
|
{
|
|
"cell_type": "code",
|
|
"execution_count": 17,
|
|
"metadata": {
|
|
"slideshow": {
|
|
"slide_type": "skip"
|
|
}
|
|
},
|
|
"outputs": [
|
|
{
|
|
"ename": "SyntaxError",
|
|
"evalue": "invalid syntax (2795275998.py, line 1)",
|
|
"output_type": "error",
|
|
"traceback": [
|
|
"\u001b[0;36m Cell \u001b[0;32mIn[17], line 1\u001b[0;36m\u001b[0m\n\u001b[0;31m 1.23e0.0\u001b[0m\n\u001b[0m ^\u001b[0m\n\u001b[0;31mSyntaxError\u001b[0m\u001b[0;31m:\u001b[0m invalid syntax\n"
|
|
]
|
|
}
|
|
],
|
|
"source": [
|
|
"1.23e0.0"
|
|
]
|
|
},
|
|
{
|
|
"cell_type": "markdown",
|
|
"metadata": {
|
|
"slideshow": {
|
|
"slide_type": "skip"
|
|
}
|
|
},
|
|
"source": [
|
|
"If we leave out the number to the left, Python raises a `NameError` as it unsuccessfully tries to look up a variable named `e0`."
|
|
]
|
|
},
|
|
{
|
|
"cell_type": "code",
|
|
"execution_count": 18,
|
|
"metadata": {
|
|
"slideshow": {
|
|
"slide_type": "skip"
|
|
}
|
|
},
|
|
"outputs": [
|
|
{
|
|
"ename": "NameError",
|
|
"evalue": "name 'e0' is not defined",
|
|
"output_type": "error",
|
|
"traceback": [
|
|
"\u001b[0;31m---------------------------------------------------------------------------\u001b[0m",
|
|
"\u001b[0;31mNameError\u001b[0m Traceback (most recent call last)",
|
|
"Cell \u001b[0;32mIn[18], line 1\u001b[0m\n\u001b[0;32m----> 1\u001b[0m \u001b[43me0\u001b[49m\n",
|
|
"\u001b[0;31mNameError\u001b[0m: name 'e0' is not defined"
|
|
]
|
|
}
|
|
],
|
|
"source": [
|
|
"e0"
|
|
]
|
|
},
|
|
{
|
|
"cell_type": "markdown",
|
|
"metadata": {
|
|
"slideshow": {
|
|
"slide_type": "skip"
|
|
}
|
|
},
|
|
"source": [
|
|
"So, to write $10^0$ in Python, we need to think of it as $1*10^0$ and write `1e0`."
|
|
]
|
|
},
|
|
{
|
|
"cell_type": "code",
|
|
"execution_count": 19,
|
|
"metadata": {
|
|
"slideshow": {
|
|
"slide_type": "fragment"
|
|
}
|
|
},
|
|
"outputs": [
|
|
{
|
|
"data": {
|
|
"text/plain": [
|
|
"1.0"
|
|
]
|
|
},
|
|
"execution_count": 19,
|
|
"metadata": {},
|
|
"output_type": "execute_result"
|
|
}
|
|
],
|
|
"source": [
|
|
"1e0"
|
|
]
|
|
},
|
|
{
|
|
"cell_type": "markdown",
|
|
"metadata": {
|
|
"slideshow": {
|
|
"slide_type": "skip"
|
|
}
|
|
},
|
|
"source": [
|
|
"To express thousands of something (i.e., $10^3$), we write `1e3`."
|
|
]
|
|
},
|
|
{
|
|
"cell_type": "code",
|
|
"execution_count": 20,
|
|
"metadata": {
|
|
"slideshow": {
|
|
"slide_type": "fragment"
|
|
}
|
|
},
|
|
"outputs": [
|
|
{
|
|
"data": {
|
|
"text/plain": [
|
|
"1000.0"
|
|
]
|
|
},
|
|
"execution_count": 20,
|
|
"metadata": {},
|
|
"output_type": "execute_result"
|
|
}
|
|
],
|
|
"source": [
|
|
"1e3 # = thousands"
|
|
]
|
|
},
|
|
{
|
|
"cell_type": "markdown",
|
|
"metadata": {
|
|
"slideshow": {
|
|
"slide_type": "skip"
|
|
}
|
|
},
|
|
"source": [
|
|
"Similarly, to express, for example, milliseconds (i.e., $10^{-3} s$), we write `1e-3`."
|
|
]
|
|
},
|
|
{
|
|
"cell_type": "code",
|
|
"execution_count": 21,
|
|
"metadata": {
|
|
"slideshow": {
|
|
"slide_type": "fragment"
|
|
}
|
|
},
|
|
"outputs": [
|
|
{
|
|
"data": {
|
|
"text/plain": [
|
|
"0.001"
|
|
]
|
|
},
|
|
"execution_count": 21,
|
|
"metadata": {},
|
|
"output_type": "execute_result"
|
|
}
|
|
],
|
|
"source": [
|
|
"1e-3 # = milli"
|
|
]
|
|
},
|
|
{
|
|
"cell_type": "markdown",
|
|
"metadata": {
|
|
"slideshow": {
|
|
"slide_type": "slide"
|
|
}
|
|
},
|
|
"source": [
|
|
"## Special Values"
|
|
]
|
|
},
|
|
{
|
|
"cell_type": "markdown",
|
|
"metadata": {
|
|
"slideshow": {
|
|
"slide_type": "skip"
|
|
}
|
|
},
|
|
"source": [
|
|
"There are also three special values representing \"**not a number,**\" called `nan`, and positive or negative **infinity**, called `inf` or `-inf`, that are created by passing in the corresponding abbreviation as a `str` object to the [float() <img height=\"12\" style=\"display: inline-block\" src=\"../static/link/to_py.png\">](https://docs.python.org/3/library/functions.html#float) built-in. These values could be used, for example, as the result of a mathematically undefined operation like division by zero or to model the value of a mathematical function as it goes to infinity."
|
|
]
|
|
},
|
|
{
|
|
"cell_type": "code",
|
|
"execution_count": 22,
|
|
"metadata": {
|
|
"slideshow": {
|
|
"slide_type": "slide"
|
|
}
|
|
},
|
|
"outputs": [
|
|
{
|
|
"data": {
|
|
"text/plain": [
|
|
"nan"
|
|
]
|
|
},
|
|
"execution_count": 22,
|
|
"metadata": {},
|
|
"output_type": "execute_result"
|
|
}
|
|
],
|
|
"source": [
|
|
"float(\"nan\") # also float(\"NaN\")"
|
|
]
|
|
},
|
|
{
|
|
"cell_type": "code",
|
|
"execution_count": 23,
|
|
"metadata": {
|
|
"slideshow": {
|
|
"slide_type": "skip"
|
|
}
|
|
},
|
|
"outputs": [
|
|
{
|
|
"data": {
|
|
"text/plain": [
|
|
"inf"
|
|
]
|
|
},
|
|
"execution_count": 23,
|
|
"metadata": {},
|
|
"output_type": "execute_result"
|
|
}
|
|
],
|
|
"source": [
|
|
"float(\"+inf\") # also float(\"+infinity\") or float(\"infinity\")"
|
|
]
|
|
},
|
|
{
|
|
"cell_type": "code",
|
|
"execution_count": 24,
|
|
"metadata": {
|
|
"slideshow": {
|
|
"slide_type": "fragment"
|
|
}
|
|
},
|
|
"outputs": [
|
|
{
|
|
"data": {
|
|
"text/plain": [
|
|
"inf"
|
|
]
|
|
},
|
|
"execution_count": 24,
|
|
"metadata": {},
|
|
"output_type": "execute_result"
|
|
}
|
|
],
|
|
"source": [
|
|
"float(\"inf\") # also float(\"+inf\")"
|
|
]
|
|
},
|
|
{
|
|
"cell_type": "code",
|
|
"execution_count": 25,
|
|
"metadata": {
|
|
"slideshow": {
|
|
"slide_type": "fragment"
|
|
}
|
|
},
|
|
"outputs": [
|
|
{
|
|
"data": {
|
|
"text/plain": [
|
|
"-inf"
|
|
]
|
|
},
|
|
"execution_count": 25,
|
|
"metadata": {},
|
|
"output_type": "execute_result"
|
|
}
|
|
],
|
|
"source": [
|
|
"float(\"-inf\")"
|
|
]
|
|
},
|
|
{
|
|
"cell_type": "markdown",
|
|
"metadata": {
|
|
"slideshow": {
|
|
"slide_type": "skip"
|
|
}
|
|
},
|
|
"source": [
|
|
"`nan` objects *never* compare equal to *anything*, not even to themselves. This happens in accordance with the [IEEE 754 <img height=\"12\" style=\"display: inline-block\" src=\"../static/link/to_wiki.png\">](https://en.wikipedia.org/wiki/IEEE_754) standard."
|
|
]
|
|
},
|
|
{
|
|
"cell_type": "code",
|
|
"execution_count": 26,
|
|
"metadata": {
|
|
"slideshow": {
|
|
"slide_type": "slide"
|
|
}
|
|
},
|
|
"outputs": [
|
|
{
|
|
"data": {
|
|
"text/plain": [
|
|
"False"
|
|
]
|
|
},
|
|
"execution_count": 26,
|
|
"metadata": {},
|
|
"output_type": "execute_result"
|
|
}
|
|
],
|
|
"source": [
|
|
"float(\"nan\") == float(\"nan\")"
|
|
]
|
|
},
|
|
{
|
|
"cell_type": "markdown",
|
|
"metadata": {
|
|
"slideshow": {
|
|
"slide_type": "skip"
|
|
}
|
|
},
|
|
"source": [
|
|
"Another caveat is that any arithmetic involving a `nan` object results in `nan`. In other words, the addition below **fails silently** as no error is raised. As this also happens in accordance with the [IEEE 754 <img height=\"12\" style=\"display: inline-block\" src=\"../static/link/to_wiki.png\">](https://en.wikipedia.org/wiki/IEEE_754) standard, we *need* to be aware of that and check any data we work with for any `nan` occurrences *before* doing any calculations."
|
|
]
|
|
},
|
|
{
|
|
"cell_type": "code",
|
|
"execution_count": 27,
|
|
"metadata": {
|
|
"slideshow": {
|
|
"slide_type": "fragment"
|
|
}
|
|
},
|
|
"outputs": [
|
|
{
|
|
"data": {
|
|
"text/plain": [
|
|
"nan"
|
|
]
|
|
},
|
|
"execution_count": 27,
|
|
"metadata": {},
|
|
"output_type": "execute_result"
|
|
}
|
|
],
|
|
"source": [
|
|
"42 + float(\"nan\")"
|
|
]
|
|
},
|
|
{
|
|
"cell_type": "markdown",
|
|
"metadata": {
|
|
"slideshow": {
|
|
"slide_type": "skip"
|
|
}
|
|
},
|
|
"source": [
|
|
"On the contrary, as two values go to infinity, there is no such concept as difference and *everything* compares equal."
|
|
]
|
|
},
|
|
{
|
|
"cell_type": "code",
|
|
"execution_count": 28,
|
|
"metadata": {
|
|
"slideshow": {
|
|
"slide_type": "slide"
|
|
}
|
|
},
|
|
"outputs": [
|
|
{
|
|
"data": {
|
|
"text/plain": [
|
|
"True"
|
|
]
|
|
},
|
|
"execution_count": 28,
|
|
"metadata": {},
|
|
"output_type": "execute_result"
|
|
}
|
|
],
|
|
"source": [
|
|
"float(\"inf\") == float(\"inf\")"
|
|
]
|
|
},
|
|
{
|
|
"cell_type": "markdown",
|
|
"metadata": {
|
|
"slideshow": {
|
|
"slide_type": "skip"
|
|
}
|
|
},
|
|
"source": [
|
|
"Adding `42` to `inf` makes no difference."
|
|
]
|
|
},
|
|
{
|
|
"cell_type": "code",
|
|
"execution_count": 29,
|
|
"metadata": {
|
|
"slideshow": {
|
|
"slide_type": "skip"
|
|
}
|
|
},
|
|
"outputs": [
|
|
{
|
|
"data": {
|
|
"text/plain": [
|
|
"inf"
|
|
]
|
|
},
|
|
"execution_count": 29,
|
|
"metadata": {},
|
|
"output_type": "execute_result"
|
|
}
|
|
],
|
|
"source": [
|
|
"float(\"inf\") + 42"
|
|
]
|
|
},
|
|
{
|
|
"cell_type": "code",
|
|
"execution_count": 30,
|
|
"metadata": {
|
|
"slideshow": {
|
|
"slide_type": "fragment"
|
|
}
|
|
},
|
|
"outputs": [
|
|
{
|
|
"data": {
|
|
"text/plain": [
|
|
"True"
|
|
]
|
|
},
|
|
"execution_count": 30,
|
|
"metadata": {},
|
|
"output_type": "execute_result"
|
|
}
|
|
],
|
|
"source": [
|
|
"float(\"inf\") + 42 == float(\"inf\")"
|
|
]
|
|
},
|
|
{
|
|
"cell_type": "markdown",
|
|
"metadata": {
|
|
"slideshow": {
|
|
"slide_type": "skip"
|
|
}
|
|
},
|
|
"source": [
|
|
"We observe the same for multiplication ..."
|
|
]
|
|
},
|
|
{
|
|
"cell_type": "code",
|
|
"execution_count": 31,
|
|
"metadata": {
|
|
"slideshow": {
|
|
"slide_type": "skip"
|
|
}
|
|
},
|
|
"outputs": [
|
|
{
|
|
"data": {
|
|
"text/plain": [
|
|
"inf"
|
|
]
|
|
},
|
|
"execution_count": 31,
|
|
"metadata": {},
|
|
"output_type": "execute_result"
|
|
}
|
|
],
|
|
"source": [
|
|
"42 * float(\"inf\")"
|
|
]
|
|
},
|
|
{
|
|
"cell_type": "code",
|
|
"execution_count": 32,
|
|
"metadata": {
|
|
"slideshow": {
|
|
"slide_type": "skip"
|
|
}
|
|
},
|
|
"outputs": [
|
|
{
|
|
"data": {
|
|
"text/plain": [
|
|
"True"
|
|
]
|
|
},
|
|
"execution_count": 32,
|
|
"metadata": {},
|
|
"output_type": "execute_result"
|
|
}
|
|
],
|
|
"source": [
|
|
"42 * float(\"inf\") == float(\"inf\")"
|
|
]
|
|
},
|
|
{
|
|
"cell_type": "markdown",
|
|
"metadata": {
|
|
"slideshow": {
|
|
"slide_type": "skip"
|
|
}
|
|
},
|
|
"source": [
|
|
"... and even exponentiation!"
|
|
]
|
|
},
|
|
{
|
|
"cell_type": "code",
|
|
"execution_count": 33,
|
|
"metadata": {
|
|
"slideshow": {
|
|
"slide_type": "skip"
|
|
}
|
|
},
|
|
"outputs": [
|
|
{
|
|
"data": {
|
|
"text/plain": [
|
|
"inf"
|
|
]
|
|
},
|
|
"execution_count": 33,
|
|
"metadata": {},
|
|
"output_type": "execute_result"
|
|
}
|
|
],
|
|
"source": [
|
|
"float(\"inf\") ** 42"
|
|
]
|
|
},
|
|
{
|
|
"cell_type": "code",
|
|
"execution_count": 34,
|
|
"metadata": {
|
|
"slideshow": {
|
|
"slide_type": "skip"
|
|
}
|
|
},
|
|
"outputs": [
|
|
{
|
|
"data": {
|
|
"text/plain": [
|
|
"True"
|
|
]
|
|
},
|
|
"execution_count": 34,
|
|
"metadata": {},
|
|
"output_type": "execute_result"
|
|
}
|
|
],
|
|
"source": [
|
|
"float(\"inf\") ** 42 == float(\"inf\")"
|
|
]
|
|
},
|
|
{
|
|
"cell_type": "markdown",
|
|
"metadata": {
|
|
"slideshow": {
|
|
"slide_type": "skip"
|
|
}
|
|
},
|
|
"source": [
|
|
"Although absolute differences become unmeaningful as we approach infinity, signs are still respected."
|
|
]
|
|
},
|
|
{
|
|
"cell_type": "code",
|
|
"execution_count": 35,
|
|
"metadata": {
|
|
"slideshow": {
|
|
"slide_type": "skip"
|
|
}
|
|
},
|
|
"outputs": [
|
|
{
|
|
"data": {
|
|
"text/plain": [
|
|
"inf"
|
|
]
|
|
},
|
|
"execution_count": 35,
|
|
"metadata": {},
|
|
"output_type": "execute_result"
|
|
}
|
|
],
|
|
"source": [
|
|
"-42 * float(\"-inf\")"
|
|
]
|
|
},
|
|
{
|
|
"cell_type": "code",
|
|
"execution_count": 36,
|
|
"metadata": {
|
|
"slideshow": {
|
|
"slide_type": "skip"
|
|
}
|
|
},
|
|
"outputs": [
|
|
{
|
|
"data": {
|
|
"text/plain": [
|
|
"True"
|
|
]
|
|
},
|
|
"execution_count": 36,
|
|
"metadata": {},
|
|
"output_type": "execute_result"
|
|
}
|
|
],
|
|
"source": [
|
|
"-42 * float(\"-inf\") == float(\"inf\")"
|
|
]
|
|
},
|
|
{
|
|
"cell_type": "markdown",
|
|
"metadata": {
|
|
"slideshow": {
|
|
"slide_type": "skip"
|
|
}
|
|
},
|
|
"source": [
|
|
"As a caveat, adding infinities of different signs is an *undefined operation* in math and results in a `nan` object. So, if we (accidentally or unknowingly) do this on a real dataset, we do *not* see any error messages, and our program may continue to run with non-meaningful results! This is another example of a piece of code **failing silently**."
|
|
]
|
|
},
|
|
{
|
|
"cell_type": "code",
|
|
"execution_count": 37,
|
|
"metadata": {
|
|
"slideshow": {
|
|
"slide_type": "fragment"
|
|
}
|
|
},
|
|
"outputs": [
|
|
{
|
|
"data": {
|
|
"text/plain": [
|
|
"nan"
|
|
]
|
|
},
|
|
"execution_count": 37,
|
|
"metadata": {},
|
|
"output_type": "execute_result"
|
|
}
|
|
],
|
|
"source": [
|
|
"float(\"inf\") + float(\"-inf\")"
|
|
]
|
|
},
|
|
{
|
|
"cell_type": "code",
|
|
"execution_count": 38,
|
|
"metadata": {
|
|
"slideshow": {
|
|
"slide_type": "fragment"
|
|
}
|
|
},
|
|
"outputs": [
|
|
{
|
|
"data": {
|
|
"text/plain": [
|
|
"nan"
|
|
]
|
|
},
|
|
"execution_count": 38,
|
|
"metadata": {},
|
|
"output_type": "execute_result"
|
|
}
|
|
],
|
|
"source": [
|
|
"float(\"inf\") - float(\"inf\")"
|
|
]
|
|
},
|
|
{
|
|
"cell_type": "markdown",
|
|
"metadata": {
|
|
"slideshow": {
|
|
"slide_type": "slide"
|
|
}
|
|
},
|
|
"source": [
|
|
"## Imprecision"
|
|
]
|
|
},
|
|
{
|
|
"cell_type": "markdown",
|
|
"metadata": {
|
|
"slideshow": {
|
|
"slide_type": "skip"
|
|
}
|
|
},
|
|
"source": [
|
|
"`float` objects are *inherently* imprecise, and there is *nothing* we can do about it! In particular, arithmetic operations with two `float` objects may result in \"weird\" rounding \"errors\" that are strictly deterministic and occur in accordance with the [IEEE 754 <img height=\"12\" style=\"display: inline-block\" src=\"../static/link/to_wiki.png\">](https://en.wikipedia.org/wiki/IEEE_754) standard.\n",
|
|
"\n",
|
|
"For example, let's add `1` to `1e15` and `1e16`, respectively. In the latter case, the `1` somehow gets \"lost.\""
|
|
]
|
|
},
|
|
{
|
|
"cell_type": "code",
|
|
"execution_count": 39,
|
|
"metadata": {
|
|
"slideshow": {
|
|
"slide_type": "slide"
|
|
}
|
|
},
|
|
"outputs": [
|
|
{
|
|
"data": {
|
|
"text/plain": [
|
|
"1000000000000001.0"
|
|
]
|
|
},
|
|
"execution_count": 39,
|
|
"metadata": {},
|
|
"output_type": "execute_result"
|
|
}
|
|
],
|
|
"source": [
|
|
"1e15 + 1"
|
|
]
|
|
},
|
|
{
|
|
"cell_type": "code",
|
|
"execution_count": 40,
|
|
"metadata": {
|
|
"slideshow": {
|
|
"slide_type": "fragment"
|
|
}
|
|
},
|
|
"outputs": [
|
|
{
|
|
"data": {
|
|
"text/plain": [
|
|
"1e+16"
|
|
]
|
|
},
|
|
"execution_count": 40,
|
|
"metadata": {},
|
|
"output_type": "execute_result"
|
|
}
|
|
],
|
|
"source": [
|
|
"1e16 + 1"
|
|
]
|
|
},
|
|
{
|
|
"cell_type": "markdown",
|
|
"metadata": {
|
|
"slideshow": {
|
|
"slide_type": "skip"
|
|
}
|
|
},
|
|
"source": [
|
|
"Interactions between sufficiently large and small `float` objects are not the only source of imprecision."
|
|
]
|
|
},
|
|
{
|
|
"cell_type": "code",
|
|
"execution_count": 41,
|
|
"metadata": {
|
|
"slideshow": {
|
|
"slide_type": "slide"
|
|
}
|
|
},
|
|
"outputs": [],
|
|
"source": [
|
|
"from math import sqrt"
|
|
]
|
|
},
|
|
{
|
|
"cell_type": "code",
|
|
"execution_count": 42,
|
|
"metadata": {
|
|
"slideshow": {
|
|
"slide_type": "fragment"
|
|
}
|
|
},
|
|
"outputs": [
|
|
{
|
|
"data": {
|
|
"text/plain": [
|
|
"2.0000000000000004"
|
|
]
|
|
},
|
|
"execution_count": 42,
|
|
"metadata": {},
|
|
"output_type": "execute_result"
|
|
}
|
|
],
|
|
"source": [
|
|
"sqrt(2) ** 2"
|
|
]
|
|
},
|
|
{
|
|
"cell_type": "code",
|
|
"execution_count": 43,
|
|
"metadata": {
|
|
"slideshow": {
|
|
"slide_type": "fragment"
|
|
}
|
|
},
|
|
"outputs": [
|
|
{
|
|
"data": {
|
|
"text/plain": [
|
|
"0.30000000000000004"
|
|
]
|
|
},
|
|
"execution_count": 43,
|
|
"metadata": {},
|
|
"output_type": "execute_result"
|
|
}
|
|
],
|
|
"source": [
|
|
"0.1 + 0.2"
|
|
]
|
|
},
|
|
{
|
|
"cell_type": "markdown",
|
|
"metadata": {
|
|
"slideshow": {
|
|
"slide_type": "skip"
|
|
}
|
|
},
|
|
"source": [
|
|
"This may become a problem if we rely on equality checks in our programs."
|
|
]
|
|
},
|
|
{
|
|
"cell_type": "code",
|
|
"execution_count": 44,
|
|
"metadata": {
|
|
"slideshow": {
|
|
"slide_type": "fragment"
|
|
}
|
|
},
|
|
"outputs": [
|
|
{
|
|
"data": {
|
|
"text/plain": [
|
|
"False"
|
|
]
|
|
},
|
|
"execution_count": 44,
|
|
"metadata": {},
|
|
"output_type": "execute_result"
|
|
}
|
|
],
|
|
"source": [
|
|
"sqrt(2) ** 2 == 2"
|
|
]
|
|
},
|
|
{
|
|
"cell_type": "code",
|
|
"execution_count": 45,
|
|
"metadata": {
|
|
"slideshow": {
|
|
"slide_type": "fragment"
|
|
}
|
|
},
|
|
"outputs": [
|
|
{
|
|
"data": {
|
|
"text/plain": [
|
|
"False"
|
|
]
|
|
},
|
|
"execution_count": 45,
|
|
"metadata": {},
|
|
"output_type": "execute_result"
|
|
}
|
|
],
|
|
"source": [
|
|
"0.1 + 0.2 == 0.3"
|
|
]
|
|
},
|
|
{
|
|
"cell_type": "markdown",
|
|
"metadata": {
|
|
"slideshow": {
|
|
"slide_type": "skip"
|
|
}
|
|
},
|
|
"source": [
|
|
"A popular workaround is to benchmark the absolute value of the difference between the two numbers to be checked for equality against a pre-defined `threshold` *sufficiently* close to `0`, for example, `1e-15`."
|
|
]
|
|
},
|
|
{
|
|
"cell_type": "code",
|
|
"execution_count": 46,
|
|
"metadata": {
|
|
"slideshow": {
|
|
"slide_type": "slide"
|
|
}
|
|
},
|
|
"outputs": [],
|
|
"source": [
|
|
"threshold = 1e-15"
|
|
]
|
|
},
|
|
{
|
|
"cell_type": "code",
|
|
"execution_count": 47,
|
|
"metadata": {
|
|
"slideshow": {
|
|
"slide_type": "fragment"
|
|
}
|
|
},
|
|
"outputs": [
|
|
{
|
|
"data": {
|
|
"text/plain": [
|
|
"True"
|
|
]
|
|
},
|
|
"execution_count": 47,
|
|
"metadata": {},
|
|
"output_type": "execute_result"
|
|
}
|
|
],
|
|
"source": [
|
|
"abs((sqrt(2) ** 2) - 2) < threshold"
|
|
]
|
|
},
|
|
{
|
|
"cell_type": "code",
|
|
"execution_count": 48,
|
|
"metadata": {
|
|
"slideshow": {
|
|
"slide_type": "fragment"
|
|
}
|
|
},
|
|
"outputs": [
|
|
{
|
|
"data": {
|
|
"text/plain": [
|
|
"True"
|
|
]
|
|
},
|
|
"execution_count": 48,
|
|
"metadata": {},
|
|
"output_type": "execute_result"
|
|
}
|
|
],
|
|
"source": [
|
|
"abs((0.1 + 0.2) - 0.3) < threshold"
|
|
]
|
|
},
|
|
{
|
|
"cell_type": "markdown",
|
|
"metadata": {
|
|
"slideshow": {
|
|
"slide_type": "skip"
|
|
}
|
|
},
|
|
"source": [
|
|
"The built-in [format() <img height=\"12\" style=\"display: inline-block\" src=\"../static/link/to_py.png\">](https://docs.python.org/3/library/functions.html#format) function allows us to show the **significant digits** of a `float` number as they exist in memory to arbitrary precision. To exemplify it, let's view a couple of `float` objects with `50` digits. This analysis reveals that almost no `float` number is precise! After 14 or 15 digits \"weird\" things happen. As we see further below, the \"random\" digits ending the `float` numbers do *not* \"physically\" exist in memory! Rather, they are \"calculated\" by the [format() <img height=\"12\" style=\"display: inline-block\" src=\"../static/link/to_py.png\">](https://docs.python.org/3/library/functions.html#format) function that is forced to show `50` digits.\n",
|
|
"\n",
|
|
"The [format() <img height=\"12\" style=\"display: inline-block\" src=\"../static/link/to_py.png\">](https://docs.python.org/3/library/functions.html#format) function is different from the [format() <img height=\"12\" style=\"display: inline-block\" src=\"../static/link/to_py.png\">](https://docs.python.org/3/library/stdtypes.html#str.format) method on `str` objects introduced in the next chapter (cf., [Chapter 6 <img height=\"12\" style=\"display: inline-block\" src=\"../static/link/to_nb.png\">](https://nbviewer.jupyter.org/github/webartifex/intro-to-python/blob/develop/06_text/00_content.ipynb#format%28%29-Method)): Yet, both work with the so-called [format specification mini-language <img height=\"12\" style=\"display: inline-block\" src=\"../static/link/to_py.png\">](https://docs.python.org/3/library/string.html#format-specification-mini-language): `\".50f\"` is the instruction to show `50` digits of a `float` number."
|
|
]
|
|
},
|
|
{
|
|
"cell_type": "code",
|
|
"execution_count": 49,
|
|
"metadata": {
|
|
"slideshow": {
|
|
"slide_type": "slide"
|
|
}
|
|
},
|
|
"outputs": [
|
|
{
|
|
"data": {
|
|
"text/plain": [
|
|
"'0.10000000000000000555111512312578270211815834045410'"
|
|
]
|
|
},
|
|
"execution_count": 49,
|
|
"metadata": {},
|
|
"output_type": "execute_result"
|
|
}
|
|
],
|
|
"source": [
|
|
"format(0.1, \".50f\")"
|
|
]
|
|
},
|
|
{
|
|
"cell_type": "code",
|
|
"execution_count": 50,
|
|
"metadata": {
|
|
"slideshow": {
|
|
"slide_type": "fragment"
|
|
}
|
|
},
|
|
"outputs": [
|
|
{
|
|
"data": {
|
|
"text/plain": [
|
|
"'0.20000000000000001110223024625156540423631668090820'"
|
|
]
|
|
},
|
|
"execution_count": 50,
|
|
"metadata": {},
|
|
"output_type": "execute_result"
|
|
}
|
|
],
|
|
"source": [
|
|
"format(0.2, \".50f\")"
|
|
]
|
|
},
|
|
{
|
|
"cell_type": "code",
|
|
"execution_count": 51,
|
|
"metadata": {
|
|
"slideshow": {
|
|
"slide_type": "fragment"
|
|
}
|
|
},
|
|
"outputs": [
|
|
{
|
|
"data": {
|
|
"text/plain": [
|
|
"'0.29999999999999998889776975374843459576368331909180'"
|
|
]
|
|
},
|
|
"execution_count": 51,
|
|
"metadata": {},
|
|
"output_type": "execute_result"
|
|
}
|
|
],
|
|
"source": [
|
|
"format(0.3, \".50f\")"
|
|
]
|
|
},
|
|
{
|
|
"cell_type": "code",
|
|
"execution_count": 52,
|
|
"metadata": {
|
|
"slideshow": {
|
|
"slide_type": "slide"
|
|
}
|
|
},
|
|
"outputs": [
|
|
{
|
|
"data": {
|
|
"text/plain": [
|
|
"'0.33333333333333331482961625624739099293947219848633'"
|
|
]
|
|
},
|
|
"execution_count": 52,
|
|
"metadata": {},
|
|
"output_type": "execute_result"
|
|
}
|
|
],
|
|
"source": [
|
|
"format(1 / 3, \".50f\")"
|
|
]
|
|
},
|
|
{
|
|
"cell_type": "markdown",
|
|
"metadata": {
|
|
"slideshow": {
|
|
"slide_type": "skip"
|
|
}
|
|
},
|
|
"source": [
|
|
"The [format() <img height=\"12\" style=\"display: inline-block\" src=\"../static/link/to_py.png\">](https://docs.python.org/3/library/functions.html#format) function does *not* round a `float` object in the mathematical sense! It just allows us to show an arbitrary number of the digits as stored in memory, and it also does *not* change these.\n",
|
|
"\n",
|
|
"On the contrary, the built-in [round() <img height=\"12\" style=\"display: inline-block\" src=\"../static/link/to_py.png\">](https://docs.python.org/3/library/functions.html#round) function creates a *new* numeric object that is a rounded version of the one passed in as the argument. It adheres to the common rules of math.\n",
|
|
"\n",
|
|
"For example, let's round `1 / 3` to five decimals. The obtained value for `roughly_a_third` is also *imprecise* but different from the \"exact\" representation of `1 / 3` above."
|
|
]
|
|
},
|
|
{
|
|
"cell_type": "code",
|
|
"execution_count": 53,
|
|
"metadata": {
|
|
"slideshow": {
|
|
"slide_type": "fragment"
|
|
}
|
|
},
|
|
"outputs": [],
|
|
"source": [
|
|
"roughly_a_third = round(1 / 3, 5)"
|
|
]
|
|
},
|
|
{
|
|
"cell_type": "code",
|
|
"execution_count": 54,
|
|
"metadata": {
|
|
"slideshow": {
|
|
"slide_type": "fragment"
|
|
}
|
|
},
|
|
"outputs": [
|
|
{
|
|
"data": {
|
|
"text/plain": [
|
|
"0.33333"
|
|
]
|
|
},
|
|
"execution_count": 54,
|
|
"metadata": {},
|
|
"output_type": "execute_result"
|
|
}
|
|
],
|
|
"source": [
|
|
"roughly_a_third"
|
|
]
|
|
},
|
|
{
|
|
"cell_type": "code",
|
|
"execution_count": 55,
|
|
"metadata": {
|
|
"slideshow": {
|
|
"slide_type": "fragment"
|
|
}
|
|
},
|
|
"outputs": [
|
|
{
|
|
"data": {
|
|
"text/plain": [
|
|
"'0.33333000000000001517008740847813896834850311279297'"
|
|
]
|
|
},
|
|
"execution_count": 55,
|
|
"metadata": {},
|
|
"output_type": "execute_result"
|
|
}
|
|
],
|
|
"source": [
|
|
"format(roughly_a_third, \".50f\")"
|
|
]
|
|
},
|
|
{
|
|
"cell_type": "markdown",
|
|
"metadata": {
|
|
"slideshow": {
|
|
"slide_type": "skip"
|
|
}
|
|
},
|
|
"source": [
|
|
"Surprisingly, `0.125` and `0.25` appear to be *precise*, and equality comparison works without the `threshold` workaround: Both are powers of $2$ in disguise."
|
|
]
|
|
},
|
|
{
|
|
"cell_type": "code",
|
|
"execution_count": 56,
|
|
"metadata": {
|
|
"slideshow": {
|
|
"slide_type": "slide"
|
|
}
|
|
},
|
|
"outputs": [
|
|
{
|
|
"data": {
|
|
"text/plain": [
|
|
"'0.12500000000000000000000000000000000000000000000000'"
|
|
]
|
|
},
|
|
"execution_count": 56,
|
|
"metadata": {},
|
|
"output_type": "execute_result"
|
|
}
|
|
],
|
|
"source": [
|
|
"format(0.125, \".50f\")"
|
|
]
|
|
},
|
|
{
|
|
"cell_type": "code",
|
|
"execution_count": 57,
|
|
"metadata": {
|
|
"slideshow": {
|
|
"slide_type": "fragment"
|
|
}
|
|
},
|
|
"outputs": [
|
|
{
|
|
"data": {
|
|
"text/plain": [
|
|
"'0.25000000000000000000000000000000000000000000000000'"
|
|
]
|
|
},
|
|
"execution_count": 57,
|
|
"metadata": {},
|
|
"output_type": "execute_result"
|
|
}
|
|
],
|
|
"source": [
|
|
"format(0.25, \".50f\")"
|
|
]
|
|
},
|
|
{
|
|
"cell_type": "code",
|
|
"execution_count": 58,
|
|
"metadata": {
|
|
"slideshow": {
|
|
"slide_type": "fragment"
|
|
}
|
|
},
|
|
"outputs": [
|
|
{
|
|
"data": {
|
|
"text/plain": [
|
|
"True"
|
|
]
|
|
},
|
|
"execution_count": 58,
|
|
"metadata": {},
|
|
"output_type": "execute_result"
|
|
}
|
|
],
|
|
"source": [
|
|
"0.125 + 0.125 == 0.25"
|
|
]
|
|
},
|
|
{
|
|
"cell_type": "markdown",
|
|
"metadata": {
|
|
"slideshow": {
|
|
"slide_type": "slide"
|
|
}
|
|
},
|
|
"source": [
|
|
"## Binary Representations"
|
|
]
|
|
},
|
|
{
|
|
"cell_type": "markdown",
|
|
"metadata": {
|
|
"slideshow": {
|
|
"slide_type": "skip"
|
|
}
|
|
},
|
|
"source": [
|
|
"To understand these subtleties, we need to look at the **[binary representation of floats <img height=\"12\" style=\"display: inline-block\" src=\"../static/link/to_wiki.png\">](https://en.wikipedia.org/wiki/Double-precision_floating-point_format)** and review the basics of the **[IEEE 754 <img height=\"12\" style=\"display: inline-block\" src=\"../static/link/to_wiki.png\">](https://en.wikipedia.org/wiki/IEEE_754)** standard. On modern machines, floats are modeled in so-called double precision with $64$ bits that are grouped as in the figure below. The first bit determines the sign ($0$ for plus, $1$ for minus), the next $11$ bits represent an $exponent$ term, and the last $52$ bits resemble the actual significant digits, the so-called $fraction$ part. The three groups are put together like so:"
|
|
]
|
|
},
|
|
{
|
|
"cell_type": "markdown",
|
|
"metadata": {
|
|
"slideshow": {
|
|
"slide_type": "slide"
|
|
}
|
|
},
|
|
"source": [
|
|
"$$float = (-1)^{sign} * 1.fraction * 2^{exponent-1023}$$"
|
|
]
|
|
},
|
|
{
|
|
"cell_type": "markdown",
|
|
"metadata": {
|
|
"slideshow": {
|
|
"slide_type": "skip"
|
|
}
|
|
},
|
|
"source": [
|
|
"A $1.$ is implicitly prepended as the first digit, and both, $fraction$ and $exponent$, are stored in base $2$ representation (i.e., they both are interpreted like integers above). As $exponent$ is consequently non-negative, between $0_{10}$ and $2047_{10}$ to be precise, the $-1023$, called the exponent bias, centers the entire $2^{exponent-1023}$ term around $1$ and allows the period within the $1.fraction$ part be shifted into either direction by the same amount. Floating-point numbers received their name as the period, formally called the **[radix point <img height=\"12\" style=\"display: inline-block\" src=\"../static/link/to_wiki.png\">](https://en.wikipedia.org/wiki/Radix_point)**, \"floats\" along the significant digits. As an aside, an $exponent$ of all $0$s or all $1$s is used to model the special values `nan` or `inf`.\n",
|
|
"\n",
|
|
"As the standard defines the exponent part to come as a power of $2$, we now see why `0.125` is a *precise* float: It can be represented as a power of $2$, i.e., $0.125 = (-1)^0 * 1.0 * 2^{1020-1023} = 2^{-3} = \\frac{1}{8}$. In other words, the floating-point representation of $0.125_{10}$ is $0_2$, $1111111100_2 = 1020_{10}$, and $0_2$ for the three groups, respectively."
|
|
]
|
|
},
|
|
{
|
|
"cell_type": "markdown",
|
|
"metadata": {
|
|
"slideshow": {
|
|
"slide_type": "-"
|
|
}
|
|
},
|
|
"source": [
|
|
"<img src=\"static/floating_point.png\" width=\"85%\" align=\"center\">"
|
|
]
|
|
},
|
|
{
|
|
"cell_type": "markdown",
|
|
"metadata": {
|
|
"slideshow": {
|
|
"slide_type": "skip"
|
|
}
|
|
},
|
|
"source": [
|
|
"The crucial fact for the data science practitioner to understand is that mapping the *infinite* set of the real numbers $\\mathbb{R}$ to a *finite* set of bits leads to the imprecisions shown above!\n",
|
|
"\n",
|
|
"So, floats are usually good approximations of real numbers only with their first $14$ or $15$ digits. If more precision is required, we need to revert to other data types such as a `Decimal` or a `Fraction`, as shown in the next two sections.\n",
|
|
"\n",
|
|
"This [blog post](http://fabiensanglard.net/floating_point_visually_explained/) gives another neat and *visual* way as to how to think of floats. It also explains why floats become worse approximations of the reals as their absolute values increase.\n",
|
|
"\n",
|
|
"The Python [documentation <img height=\"12\" style=\"display: inline-block\" src=\"../static/link/to_py.png\">](https://docs.python.org/3/tutorial/floatingpoint.html) provides another good discussion of floats and the goodness of their approximations.\n",
|
|
"\n",
|
|
"If we are interested in the exact bits behind a `float` object, we use the [.hex() <img height=\"12\" style=\"display: inline-block\" src=\"../static/link/to_py.png\">](https://docs.python.org/3/library/stdtypes.html#float.hex) method that returns a `str` object beginning with `\"0x1.\"` followed by the $fraction$ in hexadecimal notation and the $exponent$ as an integer after subtraction of $1023$ and separated by a `\"p\"`."
|
|
]
|
|
},
|
|
{
|
|
"cell_type": "code",
|
|
"execution_count": 59,
|
|
"metadata": {
|
|
"slideshow": {
|
|
"slide_type": "slide"
|
|
}
|
|
},
|
|
"outputs": [],
|
|
"source": [
|
|
"one_eighth = 1 / 8"
|
|
]
|
|
},
|
|
{
|
|
"cell_type": "code",
|
|
"execution_count": 60,
|
|
"metadata": {
|
|
"slideshow": {
|
|
"slide_type": "fragment"
|
|
}
|
|
},
|
|
"outputs": [
|
|
{
|
|
"data": {
|
|
"text/plain": [
|
|
"'0x1.0000000000000p-3'"
|
|
]
|
|
},
|
|
"execution_count": 60,
|
|
"metadata": {},
|
|
"output_type": "execute_result"
|
|
}
|
|
],
|
|
"source": [
|
|
"one_eighth.hex()"
|
|
]
|
|
},
|
|
{
|
|
"cell_type": "markdown",
|
|
"metadata": {
|
|
"slideshow": {
|
|
"slide_type": "skip"
|
|
}
|
|
},
|
|
"source": [
|
|
"Also, the [.as_integer_ratio() <img height=\"12\" style=\"display: inline-block\" src=\"../static/link/to_py.png\">](https://docs.python.org/3/library/stdtypes.html#float.as_integer_ratio) method returns the two smallest integers whose ratio best approximates a `float` object."
|
|
]
|
|
},
|
|
{
|
|
"cell_type": "code",
|
|
"execution_count": 61,
|
|
"metadata": {
|
|
"slideshow": {
|
|
"slide_type": "fragment"
|
|
}
|
|
},
|
|
"outputs": [
|
|
{
|
|
"data": {
|
|
"text/plain": [
|
|
"(1, 8)"
|
|
]
|
|
},
|
|
"execution_count": 61,
|
|
"metadata": {},
|
|
"output_type": "execute_result"
|
|
}
|
|
],
|
|
"source": [
|
|
"one_eighth.as_integer_ratio()"
|
|
]
|
|
},
|
|
{
|
|
"cell_type": "code",
|
|
"execution_count": 62,
|
|
"metadata": {
|
|
"slideshow": {
|
|
"slide_type": "slide"
|
|
}
|
|
},
|
|
"outputs": [
|
|
{
|
|
"data": {
|
|
"text/plain": [
|
|
"'0x1.555475a31a4bep-2'"
|
|
]
|
|
},
|
|
"execution_count": 62,
|
|
"metadata": {},
|
|
"output_type": "execute_result"
|
|
}
|
|
],
|
|
"source": [
|
|
"roughly_a_third.hex()"
|
|
]
|
|
},
|
|
{
|
|
"cell_type": "code",
|
|
"execution_count": 63,
|
|
"metadata": {
|
|
"slideshow": {
|
|
"slide_type": "fragment"
|
|
}
|
|
},
|
|
"outputs": [
|
|
{
|
|
"data": {
|
|
"text/plain": [
|
|
"(3002369727582815, 9007199254740992)"
|
|
]
|
|
},
|
|
"execution_count": 63,
|
|
"metadata": {},
|
|
"output_type": "execute_result"
|
|
}
|
|
],
|
|
"source": [
|
|
"roughly_a_third.as_integer_ratio()"
|
|
]
|
|
},
|
|
{
|
|
"cell_type": "markdown",
|
|
"metadata": {
|
|
"slideshow": {
|
|
"slide_type": "skip"
|
|
}
|
|
},
|
|
"source": [
|
|
"`0.0` is also a power of $2$ and thus a *precise* `float` number."
|
|
]
|
|
},
|
|
{
|
|
"cell_type": "code",
|
|
"execution_count": 64,
|
|
"metadata": {
|
|
"slideshow": {
|
|
"slide_type": "skip"
|
|
}
|
|
},
|
|
"outputs": [],
|
|
"source": [
|
|
"zero = 0.0"
|
|
]
|
|
},
|
|
{
|
|
"cell_type": "code",
|
|
"execution_count": 65,
|
|
"metadata": {
|
|
"slideshow": {
|
|
"slide_type": "skip"
|
|
}
|
|
},
|
|
"outputs": [
|
|
{
|
|
"data": {
|
|
"text/plain": [
|
|
"'0x0.0p+0'"
|
|
]
|
|
},
|
|
"execution_count": 65,
|
|
"metadata": {},
|
|
"output_type": "execute_result"
|
|
}
|
|
],
|
|
"source": [
|
|
"zero.hex()"
|
|
]
|
|
},
|
|
{
|
|
"cell_type": "code",
|
|
"execution_count": 66,
|
|
"metadata": {
|
|
"slideshow": {
|
|
"slide_type": "skip"
|
|
}
|
|
},
|
|
"outputs": [
|
|
{
|
|
"data": {
|
|
"text/plain": [
|
|
"(0, 1)"
|
|
]
|
|
},
|
|
"execution_count": 66,
|
|
"metadata": {},
|
|
"output_type": "execute_result"
|
|
}
|
|
],
|
|
"source": [
|
|
"zero.as_integer_ratio()"
|
|
]
|
|
},
|
|
{
|
|
"cell_type": "markdown",
|
|
"metadata": {
|
|
"slideshow": {
|
|
"slide_type": "skip"
|
|
}
|
|
},
|
|
"source": [
|
|
"As seen in [Chapter 1 <img height=\"12\" style=\"display: inline-block\" src=\"../static/link/to_nb.png\">](https://nbviewer.jupyter.org/github/webartifex/intro-to-python/blob/develop/01_elements/00_content.ipynb#%28Data%29-Type-%2F-%22Behavior%22), the [.is_integer() <img height=\"12\" style=\"display: inline-block\" src=\"../static/link/to_py.png\">](https://docs.python.org/3/library/stdtypes.html#float.is_integer) method tells us if a `float` can be casted as an `int` object without any loss in precision."
|
|
]
|
|
},
|
|
{
|
|
"cell_type": "code",
|
|
"execution_count": 67,
|
|
"metadata": {
|
|
"slideshow": {
|
|
"slide_type": "skip"
|
|
}
|
|
},
|
|
"outputs": [
|
|
{
|
|
"data": {
|
|
"text/plain": [
|
|
"False"
|
|
]
|
|
},
|
|
"execution_count": 67,
|
|
"metadata": {},
|
|
"output_type": "execute_result"
|
|
}
|
|
],
|
|
"source": [
|
|
"roughly_a_third.is_integer()"
|
|
]
|
|
},
|
|
{
|
|
"cell_type": "code",
|
|
"execution_count": 68,
|
|
"metadata": {
|
|
"slideshow": {
|
|
"slide_type": "skip"
|
|
}
|
|
},
|
|
"outputs": [
|
|
{
|
|
"data": {
|
|
"text/plain": [
|
|
"True"
|
|
]
|
|
},
|
|
"execution_count": 68,
|
|
"metadata": {},
|
|
"output_type": "execute_result"
|
|
}
|
|
],
|
|
"source": [
|
|
"one = roughly_a_third / roughly_a_third\n",
|
|
"\n",
|
|
"one.is_integer()"
|
|
]
|
|
},
|
|
{
|
|
"cell_type": "markdown",
|
|
"metadata": {
|
|
"slideshow": {
|
|
"slide_type": "skip"
|
|
}
|
|
},
|
|
"source": [
|
|
"As the exact implementation of floats may vary and be dependent on a particular Python installation, we look up the [.float_info <img height=\"12\" style=\"display: inline-block\" src=\"../static/link/to_py.png\">](https://docs.python.org/3/library/sys.html#sys.float_info) attribute in the [sys <img height=\"12\" style=\"display: inline-block\" src=\"../static/link/to_py.png\">](https://docs.python.org/3/library/sys.html) module in the [standard library <img height=\"12\" style=\"display: inline-block\" src=\"../static/link/to_py.png\">](https://docs.python.org/3/library/index.html) to check the details. Usually, this is not necessary."
|
|
]
|
|
},
|
|
{
|
|
"cell_type": "code",
|
|
"execution_count": 69,
|
|
"metadata": {
|
|
"slideshow": {
|
|
"slide_type": "skip"
|
|
}
|
|
},
|
|
"outputs": [],
|
|
"source": [
|
|
"import sys"
|
|
]
|
|
},
|
|
{
|
|
"cell_type": "code",
|
|
"execution_count": 70,
|
|
"metadata": {
|
|
"slideshow": {
|
|
"slide_type": "skip"
|
|
}
|
|
},
|
|
"outputs": [
|
|
{
|
|
"data": {
|
|
"text/plain": [
|
|
"sys.float_info(max=1.7976931348623157e+308, max_exp=1024, max_10_exp=308, min=2.2250738585072014e-308, min_exp=-1021, min_10_exp=-307, dig=15, mant_dig=53, epsilon=2.220446049250313e-16, radix=2, rounds=1)"
|
|
]
|
|
},
|
|
"execution_count": 70,
|
|
"metadata": {},
|
|
"output_type": "execute_result"
|
|
}
|
|
],
|
|
"source": [
|
|
"sys.float_info"
|
|
]
|
|
}
|
|
],
|
|
"metadata": {
|
|
"kernelspec": {
|
|
"display_name": "Python 3",
|
|
"language": "python",
|
|
"name": "python3"
|
|
},
|
|
"language_info": {
|
|
"codemirror_mode": {
|
|
"name": "ipython",
|
|
"version": 3
|
|
},
|
|
"file_extension": ".py",
|
|
"mimetype": "text/x-python",
|
|
"name": "python",
|
|
"nbconvert_exporter": "python",
|
|
"pygments_lexer": "ipython3",
|
|
"version": "3.12.2"
|
|
},
|
|
"livereveal": {
|
|
"auto_select": "code",
|
|
"auto_select_fragment": true,
|
|
"scroll": true,
|
|
"theme": "serif"
|
|
},
|
|
"toc": {
|
|
"base_numbering": 1,
|
|
"nav_menu": {},
|
|
"number_sections": false,
|
|
"sideBar": true,
|
|
"skip_h1_title": true,
|
|
"title_cell": "Table of Contents",
|
|
"title_sidebar": "Contents",
|
|
"toc_cell": false,
|
|
"toc_position": {
|
|
"height": "calc(100% - 180px)",
|
|
"left": "10px",
|
|
"top": "150px",
|
|
"width": "384px"
|
|
},
|
|
"toc_section_display": false,
|
|
"toc_window_display": false
|
|
}
|
|
},
|
|
"nbformat": 4,
|
|
"nbformat_minor": 4
|
|
}
|