Data
For this lesson, we will use a number of different data sets. Download these files to your computer and put them in a location that you can find again later. (Some browsers may require you to right click on the link to specify the download location.)
Note: this data set is 1.6GB!
Software
For this workshop we use Python version 3.x.
We will use Anaconda to install Python and the packages used in this workshop. Anaconda is a Python distribution aimed at data science.
Download and install Anaconda. Make sure to choose the Python 3.x installer for your platform.
You can download either the graphical or the command-line installer. If you download the command-line installer, you will need to run it using the sh command. For example, if you downloaded Anaconda3-4.4.0-MacOSX-x86_64.sh, you would need to run the command:
sh Anaconda3-4.4.0-MacOSX-x86_64.sh
It is usually necessary to restart your shell once you’ve installed Anaconda.
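To confirm that the installation worked after restarting your shell, a quick check like the one below may help. This is a minimal sketch using only the Python standard library; the helper name check_setup is made up for illustration, and it assumes the conda and jupyter commands are expected to be on your PATH.

```python
import shutil
import sys


def check_setup(tools=("conda", "jupyter")):
    """Return a dict with the Python version and the location of each tool."""
    report = {
        "python": "{}.{}.{}".format(*sys.version_info[:3]),
        "python3": sys.version_info.major == 3,
    }
    for tool in tools:
        # shutil.which returns the full path of the executable, or None
        # if the command cannot be found on PATH.
        report[tool] = shutil.which(tool)
    return report


if __name__ == "__main__":
    for name, value in check_setup().items():
        print(f"{name}: {value}")
```

If conda or jupyter is reported as None, the shell probably has not picked up the new installation yet.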
In addition to the Python packages, you will also need access to a text editor or a development environment for Python scripts. Nano is a good option for anyone not used to text editing. See these instructions for how to install Nano on Windows/Mac/Linux. If you would rather use another editor, that is fine too.
Another alternative is to use a Python development environment such as PyDev. Here are instructions on how to install PyDev, but its use is beyond the scope of this tutorial.
After installing either Anaconda or Miniconda and the workshop packages, launch a Jupyter notebook by typing this command from the terminal:
jupyter notebook
The notebook should open automatically in your browser. If it does not, or if you wish to use a different browser, open this link: http://localhost:8888.
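If the page does not load, it can be useful to check whether anything is listening on that port at all. The sketch below uses only the standard library; the function name is_port_open is hypothetical, and it assumes the server is on port 8888 as in the link above (Jupyter may choose another port if 8888 is already in use, in which case the terminal output shows the actual address).

```python
import socket


def is_port_open(host="localhost", port=8888, timeout=1.0):
    """Return True if something accepts TCP connections on host:port."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Connection refused, timed out, or host not reachable.
        return False


if __name__ == "__main__":
    if is_port_open():
        print("A server is listening on http://localhost:8888")
    else:
        print("Nothing is listening on port 8888; is the notebook server running?")
```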
Screenshot of a Jupyter Notebook on quantum mechanics by Robert Johansson
After typing the command jupyter notebook, the following happens:
The Jupyter Notebook server opens the Jupyter notebook client, also known as the notebook user interface, in your default web browser.
The Jupyter notebook file browser
To create a new Python notebook select the “New” dropdown on the upper right of the screen.
The Jupyter notebook file browser
When you create a new notebook and type code into the browser, the web browser and the Jupyter notebook server communicate with each other.
A new, blank Jupyter notebook
Under the “help” menu, take a quick interactive tour of how to use the notebook. Help on Jupyter and key workshop packages is available here too.
User interface tour and Help
The web browser then displays the updated notebook to you.
For example, click in the first cell and type some Python code.
A Code cell
This is a Code cell (see the cell type dropdown with the word Code). To run the cell, press Shift-Enter.
A Code cell and its output
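If you are not sure what to type, something like the following works as a first Code cell. The variable names and values are made up purely for illustration.

```python
# Example contents for a first Code cell (illustrative values only)
weights_kg = [2.5, 3.1, 4.0]

# Compute the average of the values in the list
mean_weight = sum(weights_kg) / len(weights_kg)

print(f"Mean weight: {mean_weight:.2f} kg")  # prints "Mean weight: 3.20 kg"
```

After pressing Shift-Enter, the printed line appears directly below the cell.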
Let’s look at a Markdown cell. Markdown is a lightweight markup language that stays readable as plain text yet offers additional formatting. Don’t forget to select Markdown from the cell type dropdown. Click in the cell and enter the Markdown text.
A markdown input cell
To render the cell, press Shift-Enter.
A rendered markdown cell
This workflow has several advantages:
Notebooks are saved as .ipynb files.
The notebook has two modes of operation: Command and Edit. Command mode lets you edit notebook-level features, while Edit mode lets you change the contents of a notebook cell. Remember that a notebook is made up of a number of cells, which can contain code, Markdown, HTML, visualizations, and more.
Use the Help menu and its options when needed.
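To make the idea of a notebook as a list of cells more concrete, the sketch below builds a minimal notebook structure as plain JSON. It is a simplified illustration of the version 4 notebook format, not a full implementation: the helper make_notebook is hypothetical, and real notebooks written by Jupyter carry additional metadata.

```python
import json


def make_notebook(cells):
    """Build a minimal notebook dict from (cell_type, source) pairs."""
    nb_cells = []
    for cell_type, source in cells:
        cell = {"cell_type": cell_type, "metadata": {}, "source": source}
        if cell_type == "code":
            # Code cells also record their outputs and an execution counter.
            cell["outputs"] = []
            cell["execution_count"] = None
        nb_cells.append(cell)
    return {"cells": nb_cells, "metadata": {}, "nbformat": 4, "nbformat_minor": 4}


notebook = make_notebook([
    ("markdown", "# My first notebook\nSome *formatted* text."),
    ("code", "print('Hello, world!')"),
])

# Serialize to the JSON text that would be stored in a notebook file.
notebook_json = json.dumps(notebook, indent=1)
print(notebook_json)
```

In practice you would let Jupyter create and save notebooks for you; the point here is only that each cell records its type and its source text.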