{ "cells": [ { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "skip" } }, "source": [ "**Note**: Click on \"*Kernel*\" > \"*Restart Kernel and Clear All Outputs*\" in [JupyterLab](https://jupyterlab.readthedocs.io/en/stable/) *before* reading this notebook to reset its output. If you cannot run this file on your machine, you may want to open it [in the cloud ](https://mybinder.org/v2/gh/webartifex/intro-to-python/main?urlpath=lab/tree/07_sequences/05_appendix.ipynb)." ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "slide" } }, "source": [ "# Chapter 7: Sequential Data (Appendix)" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "skip" } }, "source": [ "In the [third part ](https://nbviewer.jupyter.org/github/webartifex/intro-to-python/blob/main/07_sequences/03_content.ipynb#Tuples-are-like-\"Immutable-Lists\") of the chapter, we proposed the idea that `tuple` objects are like \"immutable lists.\" Often, however, we use `tuple` objects to represent a **record** of related **fields**. Then, each element has a *semantic* meaning (i.e., a descriptive name).\n", "\n", "As an example, think of a spreadsheet with information on students in a course. Each row represents a record and holds all the data associated with an individual student. The columns (e.g., matriculation number, first name, last name) are the fields that may come as *different* data types (e.g., `int` for the matriculation number, `str` for the names).\n", "\n", "A simple way of modeling a single student is as a `tuple` object, for example, `(123456, \"John\", \"Doe\")`. A disadvantage of this approach is that we must remember the order and meaning of the elements/fields in the `tuple` object.\n", "\n", "An example from a different domain is the representation of $(x, y)$-points in the $x$-$y$-plane. Again, we could use a `tuple` object like `current_position` below to model the point $(4, 2)$." 
] }, { "cell_type": "code", "execution_count": 1, "metadata": { "slideshow": { "slide_type": "slide" } }, "outputs": [], "source": [ "current_position = (4, 2)" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "skip" } }, "source": [ "We implicitly assume that the first element represents the $x$ and the second the $y$ coordinate. While that follows intuitively from convention in math, we should at least add comments somewhere in the code to document this assumption." ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "slide" } }, "source": [ "## The `namedtuple` Type" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "skip" } }, "source": [ "A better way is to create a *custom* data type. While that is covered in depth in [Chapter 11 ](https://nbviewer.jupyter.org/github/webartifex/intro-to-python/blob/main/11_classes/00_content.ipynb), the [collections ](https://docs.python.org/3/library/collections.html) module in the [standard library ](https://docs.python.org/3/library/index.html) provides a [namedtuple() ](https://docs.python.org/3/library/collections.html#collections.namedtuple) **factory function** that creates \"simple\" custom data types on top of the standard `tuple` type." ] }, { "cell_type": "code", "execution_count": 2, "metadata": { "slideshow": { "slide_type": "slide" } }, "outputs": [], "source": [ "from collections import namedtuple" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "skip" } }, "source": [ "[namedtuple() ](https://docs.python.org/3/library/collections.html#collections.namedtuple) takes two arguments. The first argument is the name of the data type. That could be different from the variable `Point` we use to refer to the new type, but in most cases it is best to keep them in sync. The second argument is a sequence with the field names as `str` objects. The names' order corresponds to the one assumed in `current_position`." 
] }, { "cell_type": "code", "execution_count": 3, "metadata": { "slideshow": { "slide_type": "fragment" } }, "outputs": [], "source": [ "Point = namedtuple(\"Point\", [\"x\", \"y\"])" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "skip" } }, "source": [ "The `Point` object is a so-called **class**. That is what it means if an object is of type `type`. It can be used as a **factory** to create *new* `tuple`-like objects of type `Point`. In effect, [namedtuple() ](https://docs.python.org/3/library/collections.html#collections.namedtuple) lets us create our own custom **constructors**." ] }, { "cell_type": "code", "execution_count": 4, "metadata": { "slideshow": { "slide_type": "fragment" } }, "outputs": [ { "data": { "text/plain": [ "94457911453856" ] }, "execution_count": 4, "metadata": {}, "output_type": "execute_result" } ], "source": [ "id(Point)" ] }, { "cell_type": "code", "execution_count": 5, "metadata": { "slideshow": { "slide_type": "fragment" } }, "outputs": [ { "data": { "text/plain": [ "type" ] }, "execution_count": 5, "metadata": {}, "output_type": "execute_result" } ], "source": [ "type(Point)" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "skip" } }, "source": [ "Evaluating `Point` shows the module in which the class was defined, followed by its name." ] }, { "cell_type": "code", "execution_count": 6, "metadata": { "slideshow": { "slide_type": "fragment" } }, "outputs": [ { "data": { "text/plain": [ "__main__.Point" ] }, "execution_count": 6, "metadata": {}, "output_type": "execute_result" } ], "source": [ "Point" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "skip" } }, "source": [ "We write `Point(4, 2)` to create a *new* object of type `Point`." 
] }, { "cell_type": "code", "execution_count": 7, "metadata": { "slideshow": { "slide_type": "slide" } }, "outputs": [], "source": [ "current_position = Point(4, 2)" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "skip" } }, "source": [ "Now, `current_position` has a more descriptive text representation: the coordinates are labeled `x` and `y`." ] }, { "cell_type": "code", "execution_count": 8, "metadata": { "slideshow": { "slide_type": "fragment" } }, "outputs": [ { "data": { "text/plain": [ "Point(x=4, y=2)" ] }, "execution_count": 8, "metadata": {}, "output_type": "execute_result" } ], "source": [ "current_position" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "skip" } }, "source": [ "Its type is no longer `tuple` itself but `Point`, a subclass of `tuple`." ] }, { "cell_type": "code", "execution_count": 9, "metadata": { "slideshow": { "slide_type": "fragment" } }, "outputs": [ { "data": { "text/plain": [ "140376178109184" ] }, "execution_count": 9, "metadata": {}, "output_type": "execute_result" } ], "source": [ "id(current_position)" ] }, { "cell_type": "code", "execution_count": 10, "metadata": { "slideshow": { "slide_type": "fragment" } }, "outputs": [ { "data": { "text/plain": [ "__main__.Point" ] }, "execution_count": 10, "metadata": {}, "output_type": "execute_result" } ], "source": [ "type(current_position)" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "skip" } }, "source": [ "We use the dot operator `.` to access the defined attributes." 
] }, { "cell_type": "code", "execution_count": 11, "metadata": { "slideshow": { "slide_type": "slide" } }, "outputs": [ { "data": { "text/plain": [ "4" ] }, "execution_count": 11, "metadata": {}, "output_type": "execute_result" } ], "source": [ "current_position.x" ] }, { "cell_type": "code", "execution_count": 12, "metadata": { "slideshow": { "slide_type": "fragment" } }, "outputs": [ { "data": { "text/plain": [ "2" ] }, "execution_count": 12, "metadata": {}, "output_type": "execute_result" } ], "source": [ "current_position.y" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "skip" } }, "source": [ "As before, we get an `AttributeError` if we try to access an undefined attribute." ] }, { "cell_type": "code", "execution_count": 13, "metadata": { "slideshow": { "slide_type": "skip" } }, "outputs": [ { "ename": "AttributeError", "evalue": "'Point' object has no attribute 'z'", "output_type": "error", "traceback": [ "\u001b[0;31m---------------------------------------------------------------------------\u001b[0m", "\u001b[0;31mAttributeError\u001b[0m Traceback (most recent call last)", "Cell \u001b[0;32mIn[13], line 1\u001b[0m\n\u001b[0;32m----> 1\u001b[0m \u001b[43mcurrent_position\u001b[49m\u001b[38;5;241;43m.\u001b[39;49m\u001b[43mz\u001b[49m\n", "\u001b[0;31mAttributeError\u001b[0m: 'Point' object has no attribute 'z'" ] } ], "source": [ "current_position.z" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "skip" } }, "source": [ "`current_position` continues to work like a `tuple` object! That is why we can use `namedtuple` as a replacement for `tuple`. The underlying implementations exhibit the *same* computational efficiencies and memory usages.\n", "\n", "For example, we can index into or loop over `current_position` as it is still a sequence with the familiar four properties." 
] }, { "cell_type": "code", "execution_count": 14, "metadata": { "slideshow": { "slide_type": "slide" } }, "outputs": [ { "data": { "text/plain": [ "4" ] }, "execution_count": 14, "metadata": {}, "output_type": "execute_result" } ], "source": [ "current_position[0]" ] }, { "cell_type": "code", "execution_count": 15, "metadata": { "slideshow": { "slide_type": "skip" } }, "outputs": [ { "data": { "text/plain": [ "2" ] }, "execution_count": 15, "metadata": {}, "output_type": "execute_result" } ], "source": [ "current_position[1]" ] }, { "cell_type": "code", "execution_count": 16, "metadata": { "slideshow": { "slide_type": "fragment" } }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "4\n", "2\n" ] } ], "source": [ "for number in current_position:\n", "    print(number)" ] }, { "cell_type": "code", "execution_count": 17, "metadata": { "slideshow": { "slide_type": "fragment" } }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "2\n", "4\n" ] } ], "source": [ "for number in reversed(current_position):\n", "    print(number)" ] }, { "cell_type": "code", "execution_count": 18, "metadata": { "slideshow": { "slide_type": "fragment" } }, "outputs": [ { "data": { "text/plain": [ "2" ] }, "execution_count": 18, "metadata": {}, "output_type": "execute_result" } ], "source": [ "len(current_position)" ] } ], "metadata": { "kernelspec": { "display_name": "Python 3", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.12.2" }, "livereveal": { "auto_select": "code", "auto_select_fragment": true, "scroll": true, "theme": "serif" }, "toc": { "base_numbering": 1, "nav_menu": {}, "number_sections": false, "sideBar": true, "skip_h1_title": true, "title_cell": "Table of Contents", "title_sidebar": "Contents", "toc_cell": false, "toc_position": 
{ "height": "calc(100% - 180px)", "left": "10px", "top": "150px", "width": "384px" }, "toc_section_display": false, "toc_window_display": false } }, "nbformat": 4, "nbformat_minor": 4 }