
Create notebook for the fifth application of tidying

Alexander Hess 2018-08-26 16:36:50 +02:00
commit 0723da59a8

@@ -0,0 +1,55 @@
{
"cells": [
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# One Type in Multiple Tables\n",
"\n",
"The repository with the original R code provides no code for this case. It only refers to other projects that can no longer be replicated because their source website is no longer available."
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Messy Data\n",
"\n",
"> It's also common to find data values about a single type of observational unit spread out over multiple tables or files. These tables and files are often split up by another variable, so that each represents a single year, person, or location."
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Tidy Data\n",
"\n",
"> As long as the format for individual records is consistent, this is an easy problem to fix:\n",
"> 1. Read the files into a list of tables.\n",
"> 2. For each table, add a new column that records the original file name (because the file name is often the value of an important variable).\n",
"> 3. Combine all tables into a single table."
]
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.6.5"
}
},
"nbformat": 4,
"nbformat_minor": 2
}
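
The notebook itself contains no code for this case. The three steps quoted under "Tidy Data" can be sketched as follows. This is only a minimal illustration using the standard library: the file names, the column names, and the `source_file` column are hypothetical and do not come from the original R repository, and in-memory strings stand in for files on disk.

```python
import csv
import io

# Hypothetical "files" split up by year; names and contents are made up.
# In practice these would be paths on disk opened with open().
files = {
    "2017.csv": "name,value\nalice,1\nbob,2\n",
    "2018.csv": "name,value\nalice,3\nbob,4\n",
}


def read_table(text):
    """Parse CSV text into a table, i.e. a list of row dictionaries."""
    return list(csv.DictReader(io.StringIO(text)))


# 1. Read the files into a list of tables (kept together with their names).
tables = [(file_name, read_table(text)) for file_name, text in files.items()]

# 2. For each table, add a new column that records the original file name.
for file_name, table in tables:
    for row in table:
        row["source_file"] = file_name

# 3. Combine all tables into a single table.
combined = [row for _, table in tables for row in table]

for row in combined:
    print(row)
```

This only works as long as the individual records share one format, as the quote requires. With pandas, the same steps would roughly correspond to `pandas.read_csv`, `DataFrame.assign`, and `pandas.concat`.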