Pandas DataFrames

What is a DataFrame?

A Pandas DataFrame is a 2 dimensional data structure, like a 2 dimensional array, or a table with rows and columns.

Example

Create a simple Pandas DataFrame:

import pandas as pd

data = {
  "calories": [420, 380, 390],
  "duration": [50, 40, 45]
}

#load data into a DataFrame object:
df = pd.DataFrame(data)

print(df)

Result

   calories  duration
0       420        50
1       380        40
2       390        45

Locate Row

As you can see from the result above, the DataFrame is like a table with rows and columns.

Pandas use the loc attribute to return one or more specified row(s)

Example

Return row 0:

#refer to the row index:
print(df.loc[0])

Result

calories    420
duration     50
Name: 0, dtype: int64

Note: This example returns a Pandas Series.

Example

Return row 0 and 1:

#use a list of indexes:
print(df.loc[[0, 1]])

Result

   calories  duration
0       420        50
1       380        40

Named Indexes

With the index argument, you can name your own indexes.

Example

Add a list of names to give each row a name:

import pandas as pd

data = {
  "calories": [420, 380, 390],
  "duration": [50, 40, 45]
}

df = pd.DataFrame(data, index = ["day1", "day2", "day3"])

print(df)

Result

      calories  duration
day1       420        50
day2       380        40
day3       390        45

Locate Named Indexes

Use the named index in the loc attribute to return the specified row(s).

Example

Return "day2":

#refer to the named index:
print(df.loc["day2"])

Result

calories    380
duration     40
Name: day2, dtype: int64

Load Files Into a DataFrame

If your data sets are stored in a file, Pandas can load them into a DataFrame.

Example

Load a comma separated file (CSV file) into a DataFrame:

import pandas as pd

df = pd.read_csv('data.csv')

print(df)

You will learn more about importing files in the next chapters.