Get cumulated sum in SQL
Create a sample table
Let’s start by creating a small example table from a BigQuery public dataset, namely an extract of Google Analytics data from July 2017.
CREATE OR REPLACE TABLE `training.sql_cumsum` AS (
SELECT
PARSE_DATE("%Y%m%d", date) AS day,
EXTRACT(WEEK FROM PARSE_DATE("%Y%m%d", date)) AS week,
SUM(totals.visits) AS sessions
FROM `bigquery-public-data.google_analytics_sample.ga_sessions_201707*`
GROUP BY 1,2
ORDER BY 1
LIMIT 10
);
SELECT * FROM `training.sql_cumsum`;
day | week | sessions | |
---|---|---|---|
0 | 2017-07-01 | 26 | 2048 |
1 | 2017-07-02 | 27 | 1895 |
2 | 2017-07-03 | 27 | 2046 |
3 | 2017-07-04 | 27 | 1938 |
4 | 2017-07-05 | 27 | 2885 |
5 | 2017-07-06 | 27 | 2658 |
6 | 2017-07-07 | 27 | 2450 |
7 | 2017-07-08 | 27 | 1859 |
8 | 2017-07-09 | 28 | 1921 |
9 | 2017-07-10 | 28 | 2769 |
Running total
To compute the running sum in SQL, use the window clause SUM(metric) OVER (ORDER BY dimension)
:
SELECT
day,
sessions,
-- Running total of daily sessions
SUM(sessions) OVER (ORDER BY day) AS sessions_cumtd
FROM `training.sql_cumsum`
ORDER BY day
LIMIT 10
day | sessions | sessions_cumtd | |
---|---|---|---|
0 | 2017-07-01 | 2048 | 2048 |
1 | 2017-07-02 | 1895 | 3943 |
2 | 2017-07-03 | 2046 | 5989 |
3 | 2017-07-04 | 1938 | 7927 |
4 | 2017-07-05 | 2885 | 10812 |
5 | 2017-07-06 | 2658 | 13470 |
6 | 2017-07-07 | 2450 | 15920 |
7 | 2017-07-08 | 1859 | 17779 |
8 | 2017-07-09 | 1921 | 19700 |
9 | 2017-07-10 | 2769 | 22469 |
Running percentage of total
You can also get the cumulated percentage, dividing the cumulated sum by the total with SUM(metric) OVER()
:
SELECT
day,
sessions,
-- Cumulated percentage of daily sessions
SUM(sessions) OVER (ORDER BY day) / SUM(sessions) OVER() AS sessions_pct_cumtd,
FROM `training.sql_cumsum`
ORDER BY day
day | sessions | sessions_pct_cumtd | |
---|---|---|---|
0 | 2017-07-01 | 2048 | 0.091148 |
1 | 2017-07-02 | 1895 | 0.175486 |
2 | 2017-07-03 | 2046 | 0.266545 |
3 | 2017-07-04 | 1938 | 0.352797 |
4 | 2017-07-05 | 2885 | 0.481196 |
5 | 2017-07-06 | 2658 | 0.599493 |
6 | 2017-07-07 | 2450 | 0.708532 |
7 | 2017-07-08 | 1859 | 0.791268 |
8 | 2017-07-09 | 1921 | 0.876764 |
9 | 2017-07-10 | 2769 | 1.000000 |
Cumulated sum, by group
You can also get the running total partitioned by another group, with PARTITION BY
, as described below:
SELECT
day,
sessions,
week,
-- Cumulated sum of daily sessions
SUM(sessions) OVER (ORDER BY day) AS sessions_cumtd,
-- Cumulated sum of sessions, partitioned by week
SUM(sessions) OVER (PARTITION BY week ORDER BY day) AS sessions_cumtd_week
FROM `training.sql_cumsum`
ORDER BY day
day | sessions | week | sessions_cumtd | sessions_cumtd_week | |
---|---|---|---|---|---|
0 | 2017-07-01 | 2048 | 26 | 2048 | 2048 |
1 | 2017-07-02 | 1895 | 27 | 3943 | 1895 |
2 | 2017-07-03 | 2046 | 27 | 5989 | 3941 |
3 | 2017-07-04 | 1938 | 27 | 7927 | 5879 |
4 | 2017-07-05 | 2885 | 27 | 10812 | 8764 |
5 | 2017-07-06 | 2658 | 27 | 13470 | 11422 |
6 | 2017-07-07 | 2450 | 27 | 15920 | 13872 |
7 | 2017-07-08 | 1859 | 27 | 17779 | 15731 |
8 | 2017-07-09 | 1921 | 28 | 19700 | 1921 |
9 | 2017-07-10 | 2769 | 28 | 22469 | 4690 |