Use date indexes with pandas DataFrames

Pandas provides convenient features to select, slice and aggregate DataFrames that have date or datetime indexes.

Setup

# Import libraries
import pandas as pd
# Read sample data with date index
df = pd.read_csv('data/finance_data.csv', parse_dates=['Date'])
df = df.set_index('Date').loc[lambda x: x['Name'] == 'Nasdaq100', ['Value']]
df
Value
Date
2019-01-02 6360.87
2019-01-03 6147.13
2019-01-04 6422.67
2019-01-07 6488.25
2019-01-08 6551.85
... ...
2020-05-04 8834.11
2020-05-05 8930.62
2020-05-06 8984.86
2020-05-07 9101.88
2020-05-08 9220.35

341 rows × 1 columns

Select dates in DataFrames

With a date index, you can easily select a specific year, month or day.

# Select year
df['2020']
Value
Date
2020-01-02 8872.22
2020-01-03 8793.90
2020-01-06 8848.52
2020-01-07 8846.45
2020-01-08 8912.37
... ...
2020-05-04 8834.11
2020-05-05 8930.62
2020-05-06 8984.86
2020-05-07 9101.88
2020-05-08 9220.35

89 rows × 1 columns

# Select month
df['2020-05']
Value
Date
2020-05-01 8718.18
2020-05-04 8834.11
2020-05-05 8930.62
2020-05-06 8984.86
2020-05-07 9101.88
2020-05-08 9220.35
# Select period interval
df['2020-02-01':'2020-02-10']
Value
Date
2020-02-03 9126.23
2020-02-04 9334.06
2020-02-05 9367.48
2020-02-06 9445.92
2020-02-07 9401.10
2020-02-10 9516.84
# Select period after or before a date
df['2020-04-30':]
Value
Date
2020-04-30 9000.51
2020-05-01 8718.18
2020-05-04 8834.11
2020-05-05 8930.62
2020-05-06 8984.86
2020-05-07 9101.88
2020-05-08 9220.35

Aggregate rows

It is also trivial to apply aggregation at a time level, like month or week for example, and compute metrics with resample().

# Number of data points per month
df['2020'].resample('M').count()
Value
Date
2020-01-31 21
2020-02-29 19
2020-03-31 22
2020-04-30 21
2020-05-31 6
# Mean value per week
df['2020-04'].resample('W').mean()
Value
Date
2020-04-05 7550.0200
2020-04-12 8149.7600
2020-04-19 8641.4200
2020-04-26 8644.4500
2020-05-03 8874.6325