Introduction to Pandas: Setup

Data

For this lesson, we will use several different data sets. Download the zip archive containing all the files to your computer and unzip it in a location you can easily find again:

You can also download the files individually:

Once you click on a file, it should download automatically to your default download directory. Some browsers may require you to right-click the link to specify the download location.
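If you would rather unpack the archive from Python than from your file manager, the standard-library zipfile module can do it. The following is a minimal sketch, not part of the original lesson; the helper name unzip_archive and the file and folder names in the usage comment are hypothetical placeholders, so substitute the archive you actually downloaded.

```python
import zipfile
from pathlib import Path


def unzip_archive(archive_path, dest_dir):
    """Extract every file in archive_path into dest_dir.

    Returns the list of member names found in the archive.
    """
    dest = Path(dest_dir)
    # Create the destination folder (and any parents) if it is missing.
    dest.mkdir(parents=True, exist_ok=True)
    with zipfile.ZipFile(archive_path) as archive:
        archive.extractall(dest)
        return archive.namelist()


# Example call (hypothetical names, adjust to your download):
# unzip_archive("workshop-data.zip", "pandas-workshop-data")
```

Extracting into a dedicated folder keeps the workshop files together, which makes the paths used later in the lesson easier to type.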

Software

Python is a popular language for scientific computing, and great for general-purpose programming as well. Installing all of its scientific packages individually can be a bit difficult, so we recommend an all-in-one installer.

For this workshop we use Python version 3.x.
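To confirm which Python version you are running, you can ask the interpreter itself. This is a small sketch using only the standard library; the helper name is_python3 is made up for illustration.

```python
import sys


def is_python3(version_info=sys.version_info):
    """Return True when the major version number is 3."""
    return version_info[0] == 3


# Print the running version and whether it suits this workshop.
status = "OK" if is_python3() else "please install Python 3.x"
print("Python", sys.version.split()[0], "-", status)
```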

Required Python Packages for this workshop

Install the workshop packages

To install these packages, we will use Anaconda, a Python distribution aimed at data science.

Anaconda installation

Download and install Anaconda, making sure to choose the Python 3.x installer for your platform.

You can download either the graphical or the command-line installer. If you download the command-line installer, you will need to run it with the sh command. For example, if you downloaded Anaconda3-4.4.0-MacOSX-x86_64.sh, you would run:

sh Anaconda3-4.4.0-MacOSX-x86_64.sh

It is usually necessary to restart your shell once you’ve installed Anaconda.

ggplot installation

Run the command:

conda install ggplot

In some cases, installing ggplot from conda may fail with an error like:

UnsatisfiableError:The following specifications were found to be in conflict:
      - ggplot -> python3.4*
      - python 3.6*

In that case, try installing ggplot with the pip that ships with Anaconda by running this command in your terminal:

pip install -U ggplot
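Once installation finishes, one way to check that the packages can be found is to look them up from Python without importing them. The sketch below is an illustration rather than an official workshop step; the helper name missing_packages is hypothetical, and the list passed to it covers only pandas and ggplot, the two packages named on this page.

```python
import importlib.util


def missing_packages(names):
    """Return the top-level module names that Python cannot locate."""
    return [name for name in names if importlib.util.find_spec(name) is None]


# An empty list means both packages were found.
print(missing_packages(["pandas", "ggplot"]))
```

Looking a module up with find_spec only locates it; a package that is found but broken would still fail at import time, so treat an empty list as a first check rather than a guarantee.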

Launch a Jupyter notebook

After installing either Anaconda or Miniconda and the workshop packages, launch a Jupyter notebook by typing this command from the terminal:

jupyter notebook

The notebook should open automatically in your browser. If it does not, or if you wish to use a different browser, open this link: http://localhost:8888.
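If the page does not load, it can help to check whether anything is listening on the expected port. The following sketch uses only the standard socket module; the helper name port_is_open is hypothetical. It assumes the server is on port 8888, the port in the link above, so a False result may simply mean the server chose a different port (the terminal output shows the address in use).

```python
import socket


def port_is_open(host="localhost", port=8888, timeout=1.0):
    """Return True if a TCP connection to host:port succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unreachable hosts.
        return False


print("Something is listening on port 8888:", port_is_open())
```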


Overview of the Jupyter notebook (Optional)

Figure: Example Jupyter Notebook. Screenshot of a Jupyter Notebook on quantum mechanics by Robert Johansson.

How the Jupyter notebook works

After typing the command jupyter notebook, the following happens:

This workflow has several advantages:

How the notebook is stored

Notebook modes: Control and Edit

The notebook has two modes of operation: Control and Edit. Control mode (labelled Command mode in the Jupyter interface and documentation) lets you edit notebook-level features, while Edit mode lets you change the contents of a notebook cell. Remember that a notebook is made up of a number of cells, which can contain code, Markdown, HTML, visualizations, and more.
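To make the idea of cells concrete, the sketch below builds a tiny notebook as a Python dictionary and serializes it to JSON, the format used by .ipynb files. It is a simplified illustration: the cell contents are invented, the helper name cell_types is hypothetical, and real notebooks written by Jupyter carry more metadata than shown here.

```python
import json

# A minimal notebook with one Markdown cell and one code cell.
notebook = {
    "nbformat": 4,
    "nbformat_minor": 4,
    "metadata": {},
    "cells": [
        {
            "cell_type": "markdown",
            "metadata": {},
            "source": ["# Introduction to Pandas"],
        },
        {
            "cell_type": "code",
            "metadata": {},
            "execution_count": None,  # the cell has not been run yet
            "outputs": [],
            "source": ["import pandas as pd"],
        },
    ],
}


def cell_types(nb):
    """Return the type of each cell, in notebook order."""
    return [cell["cell_type"] for cell in nb["cells"]]


print(cell_types(notebook))
# The same structure as JSON text, as it would appear inside an .ipynb file.
print(json.dumps(notebook, indent=1))
```

Because the file is plain JSON, a notebook can be inspected with any text editor, although editing it through the Jupyter interface is far less error-prone.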

Help and more information

Use the Help menu and its options when needed.