Aggregate a pandas DataFrame with multiple functions

Setup

# Import libraries
import pandas as pd
import seaborn as sns

# Load sample data in a DataFrame
df = sns.load_dataset('iris')[['species', 'sepal_length']]
df
       species  sepal_length
0       setosa           5.1
1       setosa           4.9
2       setosa           4.7
3       setosa           4.6
4       setosa           5.0
...        ...           ...
145  virginica           6.7
146  virginica           6.3
147  virginica           6.5
148  virginica           6.2
149  virginica           5.9

150 rows × 2 columns

Group and aggregate

To get the minimum, mean and maximum of sepal_length for each species, the classic approach is to pass agg a dictionary that maps the column name to a list of functions:

(
    df
    .groupby('species')
    .agg({'sepal_length': ['min', 'mean', 'max']})
)
           sepal_length
                    min   mean  max
species
setosa              4.3  5.006  5.8
versicolor          4.9  5.936  7.0
virginica           4.9  6.588  7.9
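
The result above has two-level (MultiIndex) columns, so each column is addressed by a tuple rather than a plain string. The sketch below illustrates this and one common way to flatten the levels afterwards. It uses a small made-up DataFrame (`toy`, with hypothetical values) instead of the iris data so that it runs without downloading anything.

```python
import pandas as pd

# Hypothetical stand-in for the iris data; the values are made up for illustration
toy = pd.DataFrame({
    'species': ['a', 'a', 'b', 'b'],
    'sepal_length': [1.0, 3.0, 2.0, 6.0],
})

# Dictionary-style aggregation produces MultiIndex columns
multi = toy.groupby('species').agg({'sepal_length': ['min', 'mean', 'max']})

# Each column label is a tuple such as ('sepal_length', 'min')
print(multi.columns.tolist())

# Join the two levels with an underscore to obtain flat column names
flat = multi.copy()
flat.columns = ['_'.join(col) for col in multi.columns]
print(flat.columns.tolist())
```

Flattening works, but it is an extra step after every aggregation of this kind.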

An alternative style, named aggregation (available since pandas 0.25), applies several functions to the same column and lets us choose each output column name directly, which avoids the pain of multi-index columns:

(
    df
    .groupby('species')
    .agg(
        sepal_length_min=('sepal_length', 'min'),
        sepal_length_mean=('sepal_length', 'mean'),
        sepal_length_max=('sepal_length', 'max'),
    )
)
            sepal_length_min  sepal_length_mean  sepal_length_max
species
setosa                   4.3              5.006               5.8
versicolor               4.9              5.936               7.0
virginica                4.9              6.588               7.9
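
When the list of functions is long, the keyword arguments can be generated rather than typed out. The following is a minimal sketch, again on a small made-up DataFrame (`toy`, hypothetical values): it builds the keyword arguments with a dictionary comprehension, unpacks them with `**`, and adds one custom aggregation (`sepal_length_range`, a name chosen here for illustration) through a lambda.

```python
import pandas as pd

# Hypothetical stand-in for the iris data; the values are made up for illustration
toy = pd.DataFrame({
    'species': ['a', 'a', 'b', 'b'],
    'sepal_length': [1.0, 3.0, 2.0, 6.0],
})

# Build {output_name: (input_column, function)} pairs programmatically
funcs = ['min', 'mean', 'max']
named = {f'sepal_length_{fn}': ('sepal_length', fn) for fn in funcs}

result = toy.groupby('species').agg(
    **named,
    # A callable can be used in place of a function name
    sepal_length_range=('sepal_length', lambda s: s.max() - s.min()),
)
print(result)
```

The output columns are flat strings in the order the keyword arguments were given, so no post-processing of the column index is needed.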