intro-to-data-science/01_scientific_stack.ipynb

663 lines
51 KiB
Text
Raw Normal View History

{
"cells": [
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# Chapter 1: Python's Scientific Stack"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Python itself does not come with any scientific algorithms. However, over time, many third-party libraries emerged that are useful to build machine learning applications. In this context, \"third-party\" means that the libraries are *not* part of Python's standard library.\n",
"\n",
"Among the popular ones are [numpy](https://numpy.org/) (numerical computations, linear algebra), [pandas](https://pandas.pydata.org/) (data processing), [matplotlib](https://matplotlib.org/) (visualisations), and [scikit-learn](https://scikit-learn.org/stable/index.html) (machine learning algorithms).\n",
"\n",
"Before we can import these libraries, we must ensure that they installed on our computers. If you installed Python via the Anaconda Distribution that should already be the case. Otherwise, we can use Python's **package manager** `pip` to install them manually.\n",
"\n",
"`pip` is a so-called command-line interface (CLI), meaning it is a program that is run within a terminal window. JupyterLab allows us to run such a CLI tool from within a notebook by starting a code cell with a single `%` symbol. Here, this does not mean Python's modulo operator but is just an instruction to JupyterLab that the following code is *not* Python."
]
},
{
"cell_type": "code",
"execution_count": 1,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"Requirement already satisfied: numpy in ./.venv/lib/python3.8/site-packages (1.20.3)\n",
"Note: you may need to restart the kernel to use updated packages.\n"
]
}
],
"source": [
"%pip install numpy"
]
},
{
"cell_type": "code",
"execution_count": 2,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"Requirement already satisfied: pandas in ./.venv/lib/python3.8/site-packages (1.2.4)\n",
"Requirement already satisfied: numpy>=1.16.5 in ./.venv/lib/python3.8/site-packages (from pandas) (1.20.3)\n",
"Requirement already satisfied: python-dateutil>=2.7.3 in ./.venv/lib/python3.8/site-packages (from pandas) (2.8.1)\n",
"Requirement already satisfied: pytz>=2017.3 in ./.venv/lib/python3.8/site-packages (from pandas) (2021.1)\n",
"Requirement already satisfied: six>=1.5 in ./.venv/lib/python3.8/site-packages (from python-dateutil>=2.7.3->pandas) (1.16.0)\n",
"Note: you may need to restart the kernel to use updated packages.\n"
]
}
],
"source": [
"%pip install pandas"
]
},
{
"cell_type": "code",
"execution_count": 3,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"Requirement already satisfied: matplotlib in ./.venv/lib/python3.8/site-packages (3.4.2)\n",
"Requirement already satisfied: python-dateutil>=2.7 in ./.venv/lib/python3.8/site-packages (from matplotlib) (2.8.1)\n",
"Requirement already satisfied: pyparsing>=2.2.1 in ./.venv/lib/python3.8/site-packages (from matplotlib) (2.4.7)\n",
"Requirement already satisfied: pillow>=6.2.0 in ./.venv/lib/python3.8/site-packages (from matplotlib) (8.2.0)\n",
"Requirement already satisfied: kiwisolver>=1.0.1 in ./.venv/lib/python3.8/site-packages (from matplotlib) (1.3.1)\n",
"Requirement already satisfied: numpy>=1.16 in ./.venv/lib/python3.8/site-packages (from matplotlib) (1.20.3)\n",
"Requirement already satisfied: cycler>=0.10 in ./.venv/lib/python3.8/site-packages (from matplotlib) (0.10.0)\n",
"Requirement already satisfied: six in ./.venv/lib/python3.8/site-packages (from cycler>=0.10->matplotlib) (1.16.0)\n",
"Note: you may need to restart the kernel to use updated packages.\n"
]
}
],
"source": [
"%pip install matplotlib"
]
},
{
"cell_type": "code",
"execution_count": 4,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"Requirement already satisfied: scikit-learn in ./.venv/lib/python3.8/site-packages (0.24.2)\n",
"Requirement already satisfied: joblib>=0.11 in ./.venv/lib/python3.8/site-packages (from scikit-learn) (1.0.1)\n",
"Requirement already satisfied: threadpoolctl>=2.0.0 in ./.venv/lib/python3.8/site-packages (from scikit-learn) (2.1.0)\n",
"Requirement already satisfied: numpy>=1.13.3 in ./.venv/lib/python3.8/site-packages (from scikit-learn) (1.20.3)\n",
"Requirement already satisfied: scipy>=0.19.1 in ./.venv/lib/python3.8/site-packages (from scikit-learn) (1.6.1)\n",
"Note: you may need to restart the kernel to use updated packages.\n"
]
}
],
"source": [
"%pip install scikit-learn"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"After we have ensured that the third-party libraries are installed locally, we can simply go ahead with the `import` statement. All the libraries are commonly imported with shorter prefixes for convenient use later on."
]
},
{
"cell_type": "code",
"execution_count": 5,
"metadata": {},
"outputs": [],
"source": [
"import numpy as np\n",
"import pandas as pd\n",
"import matplotlib.pyplot as plt"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Let's see how the data type provided by these scientific libraries differ from Python's built-in ones.\n",
"\n",
"As an example, we create a `list` object similar to the one from Chapter 0."
]
},
{
"cell_type": "code",
"execution_count": 6,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"[1, 2, 3]"
]
},
"execution_count": 6,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"vector = [1, 2, 3]\n",
"\n",
"vector"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"We call the `list` object by the name `vector` as that is what the data mean conceptually. As we remember from our linear algebra courses, vectors should implement scalar-multiplication. So, the following code cell should result in `[3, 6, 9]` as the answer. Surprisingly, the result is a new `list` with all the elements in `vector` repeated three times. That operation is called **concatenation** and is an example of a concept called **operator overloading**. That means that an operator, like `*` in the example, may exhibit a different behavior depending on the data type of its operands."
]
},
{
"cell_type": "code",
"execution_count": 7,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"[1, 2, 3, 1, 2, 3, 1, 2, 3]"
]
},
"execution_count": 7,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"3 * vector"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"`numpy`, among others, provides a data type called an **n-dimensional array**. This may sound fancy at first but when used with only 1 or 2 dimensions, it basically represents vectors and matrices as we know them from linear algebra. Additionally, arrays allow for much faster computations as they are implemented in the very efficient [C language](https://en.wikipedia.org/wiki/C_%28programming_language%29) and optimized for numerical operations."
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"To create an array, we use the [array()](https://docs.scipy.org/doc/numpy/reference/generated/numpy.array.html#numpy-array) constructor from the imported `np` module and provide it with a `list` of values."
]
},
{
"cell_type": "code",
"execution_count": 8,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"array([1, 2, 3])"
]
},
"execution_count": 8,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"v1 = np.array([1, 2, 3])\n",
"\n",
"v1"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"The vector `v1` can now be multiplied with a scalar yielding a result meaningful in the context of linear algebra."
]
},
{
"cell_type": "code",
"execution_count": 9,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"array([3, 6, 9])"
]
},
"execution_count": 9,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"v2 = 3 * v1\n",
"\n",
"v2"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"To create a matrix, we just use a `list` of (row) `list`s of values."
]
},
{
"cell_type": "code",
"execution_count": 10,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"array([[1, 2, 3],\n",
" [4, 5, 6]])"
]
},
"execution_count": 10,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"m1 = np.array([\n",
" [1, 2, 3],\n",
" [4, 5, 6],\n",
"])\n",
"\n",
"m1"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Now we can use `numpy`'s `dot()` function to multiply a matrix with a vector to obtain a new vector ..."
]
},
{
"cell_type": "code",
"execution_count": 11,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"array([14, 32])"
]
},
"execution_count": 11,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"v3 = np.dot(m1, v1)\n",
"\n",
"v3"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"... or simply transpose it by accessing its `.T` attribute."
]
},
{
"cell_type": "code",
"execution_count": 12,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"array([[1, 4],\n",
" [2, 5],\n",
" [3, 6]])"
]
},
"execution_count": 12,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"m1.T"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"The rules from maths still apply and it makes a difference if a vector is multiplied from the left or the right by a matrix. The following operation will fail."
]
},
{
"cell_type": "code",
"execution_count": 13,
"metadata": {},
"outputs": [
{
"ename": "ValueError",
"evalue": "shapes (3,) and (2,3) not aligned: 3 (dim 0) != 2 (dim 0)",
"output_type": "error",
"traceback": [
"\u001b[0;31m---------------------------------------------------------------------------\u001b[0m",
"\u001b[0;31mValueError\u001b[0m Traceback (most recent call last)",
"\u001b[0;32m<ipython-input-13-c170f5d663e1>\u001b[0m in \u001b[0;36m<module>\u001b[0;34m\u001b[0m\n\u001b[0;32m----> 1\u001b[0;31m \u001b[0mnp\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0mdot\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0mv1\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0mm1\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0m",
"\u001b[0;32m<__array_function__ internals>\u001b[0m in \u001b[0;36mdot\u001b[0;34m(*args, **kwargs)\u001b[0m\n",
"\u001b[0;31mValueError\u001b[0m: shapes (3,) and (2,3) not aligned: 3 (dim 0) != 2 (dim 0)"
]
}
],
"source": [
"np.dot(v1, m1)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"In order to retrieve only a **slice** (i.e., subset) of an array's data, we index into it. For example, the first row of the matrix is ..."
]
},
{
"cell_type": "code",
"execution_count": 14,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"array([1, 2, 3])"
]
},
"execution_count": 14,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"m1[0, :]"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"... while the second column is:"
]
},
{
"cell_type": "code",
"execution_count": 15,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"array([2, 5])"
]
},
"execution_count": 15,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"m1[:, 1]"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"To acces the lowest element in the right column, two indices can be used."
]
},
{
"cell_type": "code",
"execution_count": 16,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"6"
]
},
"execution_count": 16,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"m1[1, 2]"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"`numpy` also provides various other functions and constants, such as `linspace()` to create an array of equidistant numbers, `sin()` to calculate the sinus values for all numbers in an array, or simple an approximation for `pi`. To further illustrate the concept of **vectorization**, let us calculate the sinus curve over a range of 100 values, going from negative to positive $3\\pi$."
]
},
{
"cell_type": "code",
"execution_count": 17,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"array([-9.42477796, -9.23437841, -9.04397885, -8.8535793 , -8.66317974,\n",
" -8.47278019, -8.28238063, -8.09198108, -7.90158152, -7.71118197,\n",
" -7.52078241, -7.33038286, -7.1399833 , -6.94958375, -6.75918419,\n",
" -6.56878464, -6.37838508, -6.18798553, -5.99758598, -5.80718642,\n",
" -5.61678687, -5.42638731, -5.23598776, -5.0455882 , -4.85518865,\n",
" -4.66478909, -4.47438954, -4.28398998, -4.09359043, -3.90319087,\n",
" -3.71279132, -3.52239176, -3.33199221, -3.14159265, -2.9511931 ,\n",
" -2.76079354, -2.57039399, -2.37999443, -2.18959488, -1.99919533,\n",
" -1.80879577, -1.61839622, -1.42799666, -1.23759711, -1.04719755,\n",
" -0.856798 , -0.66639844, -0.47599889, -0.28559933, -0.09519978,\n",
" 0.09519978, 0.28559933, 0.47599889, 0.66639844, 0.856798 ,\n",
" 1.04719755, 1.23759711, 1.42799666, 1.61839622, 1.80879577,\n",
" 1.99919533, 2.18959488, 2.37999443, 2.57039399, 2.76079354,\n",
" 2.9511931 , 3.14159265, 3.33199221, 3.52239176, 3.71279132,\n",
" 3.90319087, 4.09359043, 4.28398998, 4.47438954, 4.66478909,\n",
" 4.85518865, 5.0455882 , 5.23598776, 5.42638731, 5.61678687,\n",
" 5.80718642, 5.99758598, 6.18798553, 6.37838508, 6.56878464,\n",
" 6.75918419, 6.94958375, 7.1399833 , 7.33038286, 7.52078241,\n",
" 7.71118197, 7.90158152, 8.09198108, 8.28238063, 8.47278019,\n",
" 8.66317974, 8.8535793 , 9.04397885, 9.23437841, 9.42477796])"
]
},
"execution_count": 17,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"x = np.linspace(-3 * np.pi, 3 * np.pi, 100)\n",
"\n",
"x"
]
},
{
"cell_type": "code",
"execution_count": 18,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"array([-3.67394040e-16, -1.89251244e-01, -3.71662456e-01, -5.40640817e-01,\n",
" -6.90079011e-01, -8.14575952e-01, -9.09631995e-01, -9.71811568e-01,\n",
" -9.98867339e-01, -9.89821442e-01, -9.45000819e-01, -8.66025404e-01,\n",
" -7.55749574e-01, -6.18158986e-01, -4.58226522e-01, -2.81732557e-01,\n",
" -9.50560433e-02, 9.50560433e-02, 2.81732557e-01, 4.58226522e-01,\n",
" 6.18158986e-01, 7.55749574e-01, 8.66025404e-01, 9.45000819e-01,\n",
" 9.89821442e-01, 9.98867339e-01, 9.71811568e-01, 9.09631995e-01,\n",
" 8.14575952e-01, 6.90079011e-01, 5.40640817e-01, 3.71662456e-01,\n",
" 1.89251244e-01, -1.22464680e-16, -1.89251244e-01, -3.71662456e-01,\n",
" -5.40640817e-01, -6.90079011e-01, -8.14575952e-01, -9.09631995e-01,\n",
" -9.71811568e-01, -9.98867339e-01, -9.89821442e-01, -9.45000819e-01,\n",
" -8.66025404e-01, -7.55749574e-01, -6.18158986e-01, -4.58226522e-01,\n",
" -2.81732557e-01, -9.50560433e-02, 9.50560433e-02, 2.81732557e-01,\n",
" 4.58226522e-01, 6.18158986e-01, 7.55749574e-01, 8.66025404e-01,\n",
" 9.45000819e-01, 9.89821442e-01, 9.98867339e-01, 9.71811568e-01,\n",
" 9.09631995e-01, 8.14575952e-01, 6.90079011e-01, 5.40640817e-01,\n",
" 3.71662456e-01, 1.89251244e-01, 1.22464680e-16, -1.89251244e-01,\n",
" -3.71662456e-01, -5.40640817e-01, -6.90079011e-01, -8.14575952e-01,\n",
" -9.09631995e-01, -9.71811568e-01, -9.98867339e-01, -9.89821442e-01,\n",
" -9.45000819e-01, -8.66025404e-01, -7.55749574e-01, -6.18158986e-01,\n",
" -4.58226522e-01, -2.81732557e-01, -9.50560433e-02, 9.50560433e-02,\n",
" 2.81732557e-01, 4.58226522e-01, 6.18158986e-01, 7.55749574e-01,\n",
" 8.66025404e-01, 9.45000819e-01, 9.89821442e-01, 9.98867339e-01,\n",
" 9.71811568e-01, 9.09631995e-01, 8.14575952e-01, 6.90079011e-01,\n",
" 5.40640817e-01, 3.71662456e-01, 1.89251244e-01, 3.67394040e-16])"
]
},
"execution_count": 18,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"y = np.sin(x)\n",
"\n",
"y"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"With `matplotlib`'s [plot()](https://matplotlib.org/api/pyplot_api.html#matplotlib.pyplot.plot) function we can visualize the sinus curve."
]
},
{
"cell_type": "code",
"execution_count": 19,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"[<matplotlib.lines.Line2D at 0x7fdadb2ba220>]"
]
},
"execution_count": 19,
"metadata": {},
"output_type": "execute_result"
},
{
"data": {
"image/png": "iVBORw0KGgoAAAANSUhEUgAAAYcAAAD4CAYAAAAHHSreAAAAOXRFWHRTb2Z0d2FyZQBNYXRwbG90bGliIHZlcnNpb24zLjQuMiwgaHR0cHM6Ly9tYXRwbG90bGliLm9yZy8rg+JYAAAACXBIWXMAAAsTAAALEwEAmpwYAABCWUlEQVR4nO29d3hc13Xo+1sz6MCgVxJsIAGwk5Ioim6SLMmyaOdJ7pHy8iLH9tV1Esc3zk0c6fN7jj+nycmLnZdcO7biJjuOJFtx0XUoy5Is0UWmxGIWgERjAdEx6DPomNnvjzkHHIEACWBmTpnZv++bDzOnLpyy115lry1KKTQajUajicZjtwAajUajcR5aOWg0Go3mKrRy0Gg0Gs1VaOWg0Wg0mqvQykGj0Wg0V5FmtwCrobS0VG3cuNFuMTQajcZVHD9+fEApVbacbV2pHDZu3MixY8fsFkOj0WhchYi0L3db7VbSaDQazVVo5aDRaDSaq9DKQaPRaDRXoZWDRqPRaK5CKweNRqPRXEVclIOIfF1E+kWkYYn1IiL/LCJtInJaRG6MWvegiLQanwfjIY9Go9FoYiNelsM3gXuusf4gUGt8HgL+FUBEioG/BG4B9gN/KSJFcZJJo9FoNKskLspBKfVzYOgam9wHfEtFOAIUikgV8HbgeaXUkFJqGHieaysZjYVcHpzgP169zOnOEbtF0RhMzoR4qamfp45eZmYubLc4GkApRVt/gG8faaelL2C3OHHDqkFwa4GOqN+dxrKlll+FiDxExOpg/fr1iZFSA8BjPz/Pk0c7uOAfByDdK3zuvbt5z43VNkuWurT0Bfi7Q+d45fwg04ZS+NHJbv71d2+iIDvdZulSk3BY8YUXWvjhyS46hiYByE738r9+5wbu3FZhs3Sx45qAtFLqMaXUPqXUvrKyZY3+1qyC/32qm7891ERJbgaf/q3t/NfH38zNG4v50++e4p9fbEVPDmU9kzMhPvrt45zsGOGB/ev51of28/fv3c1rF4d4/5dfoWtk0m4RU5LHf32Jf/lZGxtLcvnrd+3kx3/8ZraU5/HfvnWMbx9Z9kBkx2KV5dAFrIv6XW0s6wJuX7D8ZYtk0iygd3SK//uHDexdV8gT/+0Aad5I3+Gbv7+fh//zNJ9/voXxmTkeObjNZklTi0efPceFgXH+4yO38MYtpfPLq4uy+e//fpx3f/FX/Pjjb6bcl2WjlKlFW3+AR59t4o6t5XztwX2ICABPPnSAjz/xG/6fHzYwPRviI2+psVnS1WOV5fAM8HtG1tIBYFQp1QM8B9wtIkVGIPpuY5nGYpRS/PnTp5ieC/H5D+yZVwwAGWke/vEDe3jfTdV87RcX6RiasFHS1OLnLX4e/3U7H3rTptcpBoA3binlyYcOMDg+w1cOX7BJwtRjNhTmE0+dIifDy6Pv3TWvGAByM9P4yv91E3dsLeefXmhlZGLGRkljI16prE8AvwbqRaRTRD4sIh8VkY8amxwCLgBtwL8BfwiglBoC/go4anw+ayzTWMy3j7Tzi9YBPvXO7dSU5V21XkT4s7vr8YjwpZfbbJAw9RiZmOHPnz5FbXken7ynftFtdqwp4N03rOXfj7TTH5iyWMLU5F9ebOVM1yh/955di1praV4Pf3HPVoLTc3ztlxdtkDA+xCtb6QGlVJVSKl0pVa2U+ppS6stKqS8b65VS6o+UUpuVUruUUsei9v26UmqL8flGPOTRrIzRiVn+7lATt9aV8bu3LB3sryzI4oH96/jesU5tPVjA//diK4PBGb7w23vJSvcuud3H3rqFubDS1oMFdAxN8MWXz/OeG9dyz86qJberr/Txzl1VfONXl1xrPbgmIK1JHD882cXkbIhPvr3+dSbyYvzB7Vu09WABU7Mh/vN4J+/YVcXOtQXX3HZjaa62Hiziu8c6CCvFn929uCUXzcfvrHW19aCVQ4qjlOKJ1y6zY03+dRsh0NaDVfykoZexqTnuv3nd9TdGWw9WMBcK871jndxWV8aawuzrbu9260ErhxTndOcoTb0B7t+//LEj2npIPE+8dpkNJTkcqClZ1vbaekg8h1v89I5Ncf/Ny39X3Gw9aOWQ4jx59DLZ6V7u27tm2ftUFmTxrhvW8MzJbqZmQwmULjW54A/y6sUhfvvmdXg813bzRfPR22qYngvz41M9CZQudXnitQ5K8zK5c1v5svepr/Rxx9Zynj7eSTjsrjFCWjmkMOPTczxzspt37q4iP2tlo2x/a/caxmdC/LzFnyDpUpenjnbg9QjvW+GI9C3lPrZW+ni2QSuHeNM3NsVLzf2876Zq0r0razZ/a3cVPaNTnHJZGRqtHFKYH5/uZnwmxAP7l+fXjuYNm0soyE7nJw29CZAsdZmZC/OfJzq5c2s55fkrH9R2cGcVx9qH6R/TrqV48vTxTkJhtewYUDR3bqsg3Suue1e0ckhhnnitgy3ledy4fuWFcNO9Ht62vYLnz/UxPaddS/HixXN9DARneGAFMaBoDu6qRCl4rtFdDZGTCYcVTx69zBtqSthYmrvi/Quy03nTllIONfS4qvyMVg4pyuXBCU52jPDb+9ZdN311Kd6xq5LA1ByvtA3GWbrU5Ucnu6nIz+TWutXVD6stz2NzWS6HzmjlEC9+0zFCx9AkH7h59YUn37Gzio6hSRq7x+IoWWLRyiFFOdzSD8Bd21dfPfJNW0rxZaZpH3ecmA2F+VXbAHdsLce7gkB0NCLCO3ZV8erFQQaD03GWMDU53NyPR+Ct9csPRC/kbdsr8HqEQ2fc865o5ZCiHG7xs744h40lOas+Rmaalzu3lfPTs33MhvTcArFyon2YwPQct9WtvhECuGdnJWEFPz3bFyfJUpvDLX72riukMCdj1ccoys3gDTUlPNvQ6xrXklYOKcj0XIhXzg9ye33Zql1KJgd3VTEyMcuRC9q1FCuHW/ykeYQ3blne2Ial2F6Vz4aSHFf1Up3KYHCa012j3B6D1WByz85KLg6M0+ySCYG0ckhBjl0aZmImxG2r9GtHc1tdGTkZXp51WSaGEznc4ufGDUUrTiteiIhwcGcVvz4/6MqRuU7il20DKEVc3pW376hEBNfEg7RySEEOt/jJ8HqWPfr2WmSle3nr1nKeP9vnGnPZifQHpmjsHotLIwRwcGclc2HFy816HEosHG72U5ybwa5llJa5HmW+TG7eWMzzLnH3aeWQghxu9nPzpiJyM+Mz19Obt5TiD0xzYWA8LsdLRX7RMgDEp4cKsHNtAflZadrdFwPhsOLnrX7eUlu6opHq1+LNW0pp6h1zhUWnlUOK0TM6SXNfIG6NEDBvgeiGaPW83OKnzJfJjjX5cTme1yPs31Si70kMNHaPMRCcifu7ohS8etH509Zo5ZBimOUuYs2IiWZjSQ4V+ZkcueD8B96JhMKKX7T6ubU29gSBaA7UFHNpcIKeUT3H9Gow073fUhs/5bBnXQGZaR5XKO14zQR3j4g0i0ibiDy8yPoviMhJ49MiIiNR60JR656JhzyapTnc4qcyP4u6iqtne1stIsKBmkgvVccdVs7pzhFGJma5rT5+jRBcsehe1Up7VRxu8bNzbT5lvsy4HTMzzctNG4pc0ZGKWTmIiBf4InAQ2A48ICLbo7dRSn1CKbVXKbUX+Bfg+1GrJ811Sql7Y5VHszRzoTC/aB3gtrr49lAh0hD5A9Nc1HGHFXO4xY9H4C0L5oiOlW1V+TrusEpGJ2c5cXkkri4lkwM1Ja6IO8TDctgPtCmlLiilZoAngfuusf0DwBNxOK9mhTR0jxGYmuMtdfFthCA67uD8HpHTeKVtkF1rCyjKXf0gq8XQcYfV89rFIUJhFVeXkokZd3jN4XGHeCiHtUBH1O9OY9lViMgGYBPws6jFWSJyTESOiMi7ljqJiDxkbHfM79fpeavhePswADdvLI77sTeW5FDuy9QN0QqZmQtzqnOEfQm4J6DjDqvlxOVh0jzC3nWFcT/2lbhD8iuHlXA/8LRSKrqM5wal1D7gd4B/EpHNi+2olHpMKbVPKbWvrCz+2jwVOHF5mLWF2VSsohT09dBxh9VxrmeM6bkwN21
"text/plain": [
"<Figure size 432x288 with 1 Axes>"
]
},
"metadata": {
"needs_background": "light"
},
"output_type": "display_data"
}
],
"source": [
"plt.plot(x, y)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Let us quickly generate some random data and draw a scatter plot with `numpy`'s `random` module."
]
},
{
"cell_type": "code",
"execution_count": 20,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"<matplotlib.collections.PathCollection at 0x7fdadb1ba7f0>"
]
},
"execution_count": 20,
"metadata": {},
"output_type": "execute_result"
},
{
"data": {
"image/png": "iVBORw0KGgoAAAANSUhEUgAAAXAAAAD4CAYAAAD1jb0+AAAAOXRFWHRTb2Z0d2FyZQBNYXRwbG90bGliIHZlcnNpb24zLjQuMiwgaHR0cHM6Ly9tYXRwbG90bGliLm9yZy8rg+JYAAAACXBIWXMAAAsTAAALEwEAmpwYAAAaTElEQVR4nO3dfYxldX3H8feXZcRZtR2QqZVFXGoaDKAFnRojjbGrLSgUiJoqjdbHbPqQ1qpZXWqi2MS4laZq0rRmoxQTrKKI61MjUsBo8SmzLiuiUq2gMqA7pi6tOujs7rd/3DO7M3fPufc8n9/vnM8rmczcx/Pds/d+f8+/Y+6OiIjE54SuAxARkXKUwEVEIqUELiISKSVwEZFIKYGLiETqxDYPduqpp/rWrVvbPKSISPT27t37E3efH7+/1QS+detWFhcX2zykiEj0zOz7aferC0VEJFJK4CIikVICFxGJlBK4iEiklMBFRCLV6iwUEQnDnn1LXH3T3dx/cIXT5mbZceFZXH7+lq7DkoKUwEUGZs++Ja688U5WVg8DsHRwhStvvBNASTwy6kIRGZirb7r7aPJes7J6mKtvurujiKQsJXCRgbn/4Eqh+yVcSuAiA3Pa3Gyh+yVcSuAiA7PjwrOYndm04b7ZmU3suPCsjiKSsjSIKTIwawOVmoUSPyVwkQG6/PwtStg9oC4UEZFIKYGLiERKCVxEJFJK4CIikVICFxGJlBK4iEiklMBFRCKlBC4iEiklcBGRSCmBi4hESglcRCRSUxO4mV1jZgfM7Bspj73ezNzMTm0mPBERyZKnBn4tcNH4nWb2OOAPgR/UHJOIiOQwdTdCd/+8mW1NeeidwBuAj9cdlIiUo4sVD0up7WTN7DJgyd33m1nNIYlIGbpY8fAUHsQ0s83A3wJvzvn87Wa2aGaLy8vLRQ8nIjnpYsXDU2YWyhOAM4H9ZnYvcDrwNTP7zbQnu/tud19w94X5+fnykYrIRLpY8fAU7kJx9zuB31i7nSTxBXf/SY1xiUhBp83NspSSrHWx4v7KM43wg8CXgLPM7D4ze1XzYYlIUbpY8fDkmYVyxZTHt9YWjYiUposVD48uaizSI7pY8bBoKb2ISKSUwEVEIqUELiISKfWBi4g0qMntDZTARUQa0vT2BupCERFpSNPbGyiBi4g0pOntDZTARUQakrWNQV3bGyiBi4g0pOntDTSIKSLSkKa3N1ACFxFpUJPbG6gLRUQkUkrgIiKRUgIXEYmUEriISKSUwEVEIqUELiISKSVwEZFIKYGLiERKCVxEJFJK4CIikZqawM3sGjM7YGbfWHff1Wb2bTP7upl9zMzmGo1SRESOk6cGfi1w0dh9NwPnuvuTgf8Crqw5LhERmWLqZlbu/nkz2zp232fX3fwy8MKa45JANXl9PxEppo7dCF8JXJ/1oJltB7YDnHHGGTUcTrrS9PX9RKSYSoOYZvYm4BDwgaznuPtud19w94X5+fkqh5OONX19PxEppnQN3MxeDlwCPNvdvbaIJFhNX99PRIopVQM3s4uANwCXuvsv6g1JQtX09f1EpJg80wg/CHwJOMvM7jOzVwH/BDwKuNnM7jCz9zQcpwSg6ev7iUgxeWahXJFy9/saiEUC1/T1/USkGF0TUwpp8vp+IlKMltKLiERKCVxEJFJK4CIikVICFxGJlBK4iEiklMBFRCKlaYQiEh3tijmiBC4iUdGumMeoC0VEoqJdMY9RDVwqU3NW2qRdMY9RApdKumzOquAYptPmZllKSdZD3BVTXShSSVfN2bWCY+ngCs6xgmPPvqVGjyvd066YxyiBSyVdNWfVDzpsD585lrrmZmd4+/OfNMjWl7pQBqjOroeumrPqBx2m8S47gF8eOtJhRN1SDXxg6u566Ko5q6sDDZNaXhspgQ9M3V+Ay8/fwtuf/yS2zM1iwJa52Vaas+oHHSa1vDZSF8rANPEF6OIiD7o60DBpBspGSuAD06cvgK4ONDw7LjzruD7wIbe81IUyMOp6kJh11WUXKtXAB0ZdD8XEsFgohhjrpJbXMVMTuJldA1wCHHD3c5P7TgGuB7YC9wJ/7O4/bS5MqZO+APnEsGlSDDFKc/J0oVwLXDR2307gFnf/beCW5LZIr8QwZa2uGPfsW+KCXbdy5s5Pc8GuW7WiNRJTa+Du/nkz2zp292XAs5K/3w98DnhjnYGJdC20KWtpXSWTYszbtaJafLzKDmI+xt0fSP7+EfCYmuIRCUZIi4WyFmDNbZ5Jff7c5pncC7ZiaGlIusqzUNzdAc963My2m9mimS0uLy9XPZxIa0KasZOVZN1JjdGd3Ek5hJaGunDKKZvAf2xmjwVIfh/IeqK773b3BXdfmJ+fL3k4kfaFNGUtK5k+uLKaGuODK6u536frloZ2liyv7DTCTwAvA3Ylvz9eW0QiAQllxs6kBVhpMV590925F2x1vThmUhdOCOc+ZFNr4Gb2QeBLwFlmdp+ZvYpR4v4DM/sO8Jzktog0pGh3TpHnd93SCKELJ1Z5ZqFckfHQs2uORUQyFF2AVeb5XdV2+7S9Q9tsNAbZjoWFBV9cXGzteHK8oa3ak/Cl7fE9O7Np0Evkx5nZXndfGL9fS+kHRPN926XCMh9t71CeEviAaLCoPSosiwllsDg2SuADosGi/KrWnmMpLNVKiJsS+IBosCifOmrPMRSWaiXET/uBD0hIKwvbUmaFXx1Ly7teHJOHltDHTwl8QLqe79u2siv86qg9x1BYxtBKkMnUhTIwQxosKtsPXbarabw/+QVP3cJt317m/oMrzG2ewR1ee/0dXH3T3UH0NatLLX6qgUtvla1hlqk9p9X2P7p3iR0XnsU7X3QeD60e4eDKalB7fcTQSpDJVAOXwmKZuVC2hllmXvK0/uQQZ6Ro/nX8lMClkJhmLlTZpKloV1OZ2n4Ifc1D6lLrI3WhSCExzVxoc9B20qyTtmekaG/t4VANXAqJbeZCnTXMta6jpYMrbDLjsDtbkm6HabX9trZrjamFJNWpBi6FxDC/uQnrBykBDiebwK1PkFm1/TZbAjG1kKQ61cClkK43/+9KWmJcs5Ygb9+5rfPtWmNrIYUu9AF7JXApZKgzF6YlwFASpOZ21yeG7iglcClsiDMXshLj+sdDMNQWUhNi2JBMfeAiOaQtelkTUoIc2nYJTYqhO0o1cJEc1ncdpc1CCSlBDrGF1IQYuqOUwEVyUmIclhi6o5TApRahj9aLFBXDgL0SuFQWw2h9W5osyFRIti/0VlelBG5mrwVeDThwJ/AKd3+ojsAkHjGM1rehyYKsyHsr0Q9H6VkoZrYF+Gtgwd3PBTYBL64rMIlHDKP1bWhyFWTe9y57EYs6aS+W9lSdRngiMGtmJwKbgfurhySxGery+nFNFmR537vrpfQhFCBDUjqBu/sS8A/AD4AHgAfd/bPjzzOz7Wa2aGaLy8vL5SOVYOnCACNNFmR537vr1lDXBcjQVOlCORm4DDgTOA14hJm9ZPx57r7b3RfcfWF+fr58pBKsmBeP1Nncb7Igy/veXbeGui5AhqbKIOZzgHvcfRnAzG4EngFcV0dgEpZpA2Ohj9anqXvQsclpZ3nfu+u5yzEsfumTKgn8B8DTzWwzsAI8G1isJSoJSqzTBKcVOk3MnmmyIMvz3l3PXe66ABma0gnc3b9iZjcAXwMOAfuA3XUFJuGIcZpgnkKnr839LltDXRcgQ1NpHri7vwV4S02xSKCKJLpQ5iDnKXTU3G9GjN1psdJuhDJV3oGxkKaQ5Sl0pg0Maj6zhE4JXKbKOwMipClkc5tnpt4/afZMSIWRSBbthSJT5e3XDKlP+aGMy58ll7I8Kqu5H2O/vwyPErjkkqdfM5Q+5T37llhZPZL62IMrq7neo+3CKJSxA4mLErjUJpQpZJO6bPIWJkULoyoJONZpmmWooKqX+sClNnWuyKwygDiplpy3MCmyqrJqf3lIYwdN0rhC/VQDl1rVMYWsao00q/Z88uaZ3LEVmc9ctb88pLGDJmlcoX5K4BKcql/0rK6ct/zROanPz2rW5y2MqibgUMYOmjaUgqpN6kKR4FT9ohf
"text/plain": [
"<Figure size 432x288 with 1 Axes>"
]
},
"metadata": {
"needs_background": "light"
},
"output_type": "display_data"
}
],
"source": [
"x = np.random.normal(42, 3, 100)\n",
"y = np.random.gamma(7, 1, 100)\n",
"\n",
"plt.scatter(x, y)"
]
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.8.9"
},
"toc": {
"base_numbering": 1,
"nav_menu": {},
"number_sections": false,
"sideBar": true,
"skip_h1_title": false,
"title_cell": "Table of Contents",
"title_sidebar": "Contents",
"toc_cell": false,
"toc_position": {},
"toc_section_display": true,
"toc_window_display": false
}
},
"nbformat": 4,
"nbformat_minor": 4
}