Pandas Tutorial

Learn Pandas

Pandas is a Python library.

Pandas is used to analyze data.

Learning by Reading

We have created 14 tutorial pages for you to learn more about Pandas.

The pages start with a basic introduction and end with cleaning and plotting data:

Basic

Introduction
Getting Started
Pandas Series
DataFrames
Read CSV
Read JSON
Analyze Data

Cleaning Data

Clean Data
Clean Empty Cells
Clean Wrong Format
Clean Wrong Data
Remove Duplicates

Advanced

Correlations
Plotting

Learning by Examples

In our "Try it Yourself" editor, you can use the Pandas module, and modify the code to see the result.

Example

Load a CSV file into a Pandas DataFrame:

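import pandas as pd

# Read the CSV file into a DataFrame
df = pd.read_csv('data.csv')

# to_string() prints the entire DataFrame instead of a shortened preview
print(df.to_string())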

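If no data.csv file is at hand, the same call can be tried against in-memory text. The following is a minimal sketch: the column names and values are invented for illustration and are not taken from the tutorial's data file.

```python
import io

import pandas as pd

# Hypothetical CSV content standing in for 'data.csv' (illustrative values only).
csv_text = """Duration,Pulse,Calories
60,110,409.1
45,117,282.4
30,103,195.0
"""

# read_csv accepts a file path or a file-like object such as StringIO.
df = pd.read_csv(io.StringIO(csv_text))

# to_string() renders every row and column of the DataFrame.
print(df.to_string())
```

Wrapping the text in io.StringIO keeps the example self-contained, while the read_csv call itself is the same one used for a file on disk.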

Learning by Exercises

Most chapters in this tutorial end with an exercise where you can check your level of knowledge.

Pandas Introduction

What is Pandas?

Pandas is a Python library used for working with data sets.

It has functions for analyzing, cleaning, exploring, and manipulating data.

The name "Pandas" refers to both "Panel Data" and "Python Data Analysis". The library was created by Wes McKinney in 2008.

Why Use Pandas?

Pandas allows us to analyze large data sets and draw conclusions based on statistical theory.

Pandas can clean messy data sets, and make them readable and relevant.

Relevant data is very important in data science.

Data science is a branch of computer science in which we study how to store, use, and analyze data in order to derive information from it.

What Can Pandas Do?

Pandas can answer questions about the data, such as:

  • Is there a correlation between two or more columns?
  • What is the average value?
  • What is the maximum value?
  • What is the minimum value?
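As a rough sketch of how these questions map to Pandas calls, the example below builds a small DataFrame and queries it. The column names and numbers are made up for illustration.

```python
import pandas as pd

# Hypothetical workout data (illustrative values only).
df = pd.DataFrame({
    "Duration": [60, 45, 30, 60],
    "Pulse": [110, 117, 103, 109],
    "Calories": [400.0, 300.0, 200.0, 380.0],
})

# Pairwise correlation between all numeric columns.
correlations = df.corr()

# Average, maximum, and minimum of individual columns.
average_calories = df["Calories"].mean()
max_pulse = df["Pulse"].max()
min_duration = df["Duration"].min()

print(correlations)
print(average_calories, max_pulse, min_duration)
```

corr() returns a table with one row and one column per numeric column, so each cell answers the correlation question for one pair of columns.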

Pandas is also able to delete rows that are not relevant or that contain wrong values, such as empty or NULL values. This is called cleaning the data.
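A minimal sketch of such cleaning is shown below. The data is invented, and the 120-minute limit used to flag a wrong value is an arbitrary assumption chosen only for this example.

```python
import pandas as pd

# Hypothetical data with two empty cells and one implausible Duration.
df = pd.DataFrame({
    "Duration": [60, 45, None, 450],
    "Pulse": [110, None, 103, 109],
})

# Drop every row that contains at least one empty (NaN) cell.
no_empty = df.dropna()

# Keep only rows with a plausible Duration (assumed limit: 120 minutes).
cleaned = no_empty[no_empty["Duration"] <= 120]

print(cleaned.to_string())
```

dropna() returns a new DataFrame and leaves the original unchanged, which makes it easy to compare the data before and after cleaning.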

Where is the Pandas Codebase?

The source code for Pandas is hosted in this GitHub repository:

https://github.com/pandas-dev/pandas