Use date indexes with pandas DataFrames
Pandas provides convenient features to select, slice and aggregate DataFrames that have date or datetime indexes.
Setup
# Import libraries
import pandas as pd
# Read sample data with date index
df = pd.read_csv('data/finance_data.csv', parse_dates=['Date'])
df = df.set_index('Date').loc[lambda x: x['Name'] == 'Nasdaq100', ['Value']]
df
|
Value |
Date |
|
2019-01-02 |
6360.87 |
2019-01-03 |
6147.13 |
2019-01-04 |
6422.67 |
2019-01-07 |
6488.25 |
2019-01-08 |
6551.85 |
... |
... |
2020-05-04 |
8834.11 |
2020-05-05 |
8930.62 |
2020-05-06 |
8984.86 |
2020-05-07 |
9101.88 |
2020-05-08 |
9220.35 |
341 rows × 1 columns
Select dates in DataFrames
With a date index, you can easily select a specific year, month or day.
|
Value |
Date |
|
2020-01-02 |
8872.22 |
2020-01-03 |
8793.90 |
2020-01-06 |
8848.52 |
2020-01-07 |
8846.45 |
2020-01-08 |
8912.37 |
... |
... |
2020-05-04 |
8834.11 |
2020-05-05 |
8930.62 |
2020-05-06 |
8984.86 |
2020-05-07 |
9101.88 |
2020-05-08 |
9220.35 |
89 rows × 1 columns
# Select month
df['2020-05']
|
Value |
Date |
|
2020-05-01 |
8718.18 |
2020-05-04 |
8834.11 |
2020-05-05 |
8930.62 |
2020-05-06 |
8984.86 |
2020-05-07 |
9101.88 |
2020-05-08 |
9220.35 |
# Select period interval
df['2020-02-01':'2020-02-10']
|
Value |
Date |
|
2020-02-03 |
9126.23 |
2020-02-04 |
9334.06 |
2020-02-05 |
9367.48 |
2020-02-06 |
9445.92 |
2020-02-07 |
9401.10 |
2020-02-10 |
9516.84 |
# Select period after or before a date
df['2020-04-30':]
|
Value |
Date |
|
2020-04-30 |
9000.51 |
2020-05-01 |
8718.18 |
2020-05-04 |
8834.11 |
2020-05-05 |
8930.62 |
2020-05-06 |
8984.86 |
2020-05-07 |
9101.88 |
2020-05-08 |
9220.35 |
Aggregate rows
It is also trivial to apply aggregation at a time level, like month or week for example, and compute metrics with resample()
.
# Number of data points per month
df['2020'].resample('M').count()
|
Value |
Date |
|
2020-01-31 |
21 |
2020-02-29 |
19 |
2020-03-31 |
22 |
2020-04-30 |
21 |
2020-05-31 |
6 |
# Mean value per week
df['2020-04'].resample('W').mean()
|
Value |
Date |
|
2020-04-05 |
7550.0200 |
2020-04-12 |
8149.7600 |
2020-04-19 |
8641.4200 |
2020-04-26 |
8644.4500 |
2020-05-03 |
8874.6325 |