Use date indexes with pandas DataFrames

Pandas provides convenient features to select, slice and aggregate DataFrames that have date or datetime indexes.

Setup

# Import libraries
import pandas as pd

# Read sample data with date index
df = pd.read_csv('data/finance_data.csv', parse_dates=['Date'])
df = df.set_index('Date').loc[lambda x: x['Name'] == 'Nasdaq100', ['Value']]

df
Value
Date
2019-01-02
6360.87
2019-01-03
6147.13
2019-01-04
6422.67
2019-01-07
6488.25
2019-01-08
6551.85
…
…
2020-05-04
8834.11
2020-05-05
8930.62
2020-05-06
8984.86
2020-05-07
9101.88
2020-05-08
9220.35

Select dates in DataFrames

With a date index, you can easily select a specific year, month or day.

# Select year
df['2020']
Value
Date
2020-01-02
8872.22
2020-01-03
8793.90
2020-01-06
8848.52
2020-01-07
8846.45
2020-01-08
8912.37
…
…
2020-05-04
8834.11
2020-05-05
8930.62
2020-05-06
8984.86
2020-05-07
9101.88
2020-05-08
9220.35
# Select month
df['2020-05']
Value
Date
2020-05-01
8718.18
2020-05-04
8834.11
2020-05-05
8930.62
2020-05-06
8984.86
2020-05-07
9101.88
2020-05-08
9220.35
# Select period interval
df['2020-02-01':'2020-02-10']
Value
Date
2020-02-03
9126.23
2020-02-04
9334.06
2020-02-05
9367.48
2020-02-06
9445.92
2020-02-07
9401.10
2020-02-10
9516.84
# Select period after or before a date
df['2020-04-30':]
Value
Date
2020-04-30
9000.51
2020-05-01
8718.18
2020-05-04
8834.11
2020-05-05
8930.62
2020-05-06
8984.86
2020-05-07
9101.88
2020-05-08
9220.35

Aggregate rows

It is also trivial to apply aggregation at a time level, like month or week for example, and compute metrics with resample().

# Number of data points per month
df['2020'].resample('M').count()
Value
Date
2020-01-31
21
2020-02-29
19
2020-03-31
22
2020-04-30
21
2020-05-31
6
# Mean value per week
df['2020-04'].resample('W').mean()
Value
Date
2020-04-05
7550.0200
2020-04-12
8149.7600
2020-04-19
8641.4200
2020-04-26
8644.4500
2020-05-03
8874.6325