
Calculating Percentage (%) of Total Sum in SQL

How to compute the ratio of a particular row to the sum of values?

This question comes up frequently when you want to know the relative contribution of a row to the total sum of all the rows. For example:

how is the browser marketshare changing over time

what's each sales person's contribution to your company's revenue

Consider a table with the number of page views (in billions) for each browser:

Browser      Pageviews
Chrome       7.1685
Safari       1.935
Firefox      1.3455
UC Browser   1.0965
IE           1.341
Opera        0.816
Android      0.7245
Rest         1.2

What we really want to see is the browser market share. We can use a Common Table Expression (CTE) to compute the total pageviews, which we then use to compute the ratio:

    with total as (
        select sum(pageviews) as total
        from pageviews
    )
    select browser,
           pageviews / total.total as share
    from pageviews, total

Which gives a ratio of each browser to the total:

Browser      Share
Chrome       0.4587
Safari       0.1238
Firefox      0.0861
UC Browser   0.0702
IE           0.0858
Opera        0.0522
Android      0.0464
Rest         0.0768
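To try the CTE query without a database server, here is a minimal sketch that loads the table above into an in-memory SQLite database through Python's standard sqlite3 module. SQLite is only a stand-in chosen for this illustration, and the `order by` clause is an addition for readability. The query is otherwise the one shown above.

```python
import sqlite3

# In-memory SQLite database used purely as a stand-in engine.
conn = sqlite3.connect(":memory:")
conn.execute("create table pageviews (browser text, pageviews real)")
conn.executemany(
    "insert into pageviews values (?, ?)",
    [
        ("Chrome", 7.1685), ("Safari", 1.935), ("Firefox", 1.3455),
        ("UC Browser", 1.0965), ("IE", 1.341), ("Opera", 0.816),
        ("Android", 0.7245), ("Rest", 1.2),
    ],
)

# The CTE yields a single row holding the grand total; the implicit
# cross join attaches that total to every browser row.
query = """
with total as (
    select sum(pageviews) as total from pageviews
)
select browser, pageviews / total.total as share
from pageviews, total
order by share desc
"""
shares = dict(conn.execute(query).fetchall())

for browser, share in shares.items():
    print(f"{browser:<12}{share:.4f}")

# Shares of a whole should add up to 1 (up to floating-point error).
print("sum of shares:", round(sum(shares.values()), 6))
```

A quick sanity check for any share calculation is that the ratios add up to 1. If the pageviews column were stored as an integer, some engines would perform integer division here, so casting to a decimal or float type first may be needed.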

And the visualization:

Percentage to Total per Group

The next question to ask is how this is changing over time.

What we are attempting to do here is to group our data into months, compute the total for each group, and for each row within that group compute a ratio. An overall total wouldn't make sense. Consider the pageview table as before, but with an additional date field:

dt           Browser   Pageviews
2016-01-01   Chrome    7.1685
2016-01-01   Safari    1.935
2016-01-01   ...       ...
2016-01-02   Chrome    7.2485
2016-01-02   Safari    1.721
2016-01-02   ...       ...
...          ...       ...
2016-12-31   Chrome    7.864
2016-12-31   Safari    2.011
2016-12-31   ...       ...

This time we resort to window functions, with a partition over the month portion of the date:

    select date_trunc('month', dt),
           browser,
           pageviews / sum(pageviews) over(partition by date_trunc('month', dt))
    from pageviews

Let's unpack the query a bit. Our window function creates partitions (groups) for each month, and for each partition we sum up the pageviews. The ratio between the current row pageviews and the total for that group is exactly what we want.
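The partitioning can be checked on a small sample. The sketch below is an illustration under a few assumptions: it uses SQLite (3.25 or later, for window function support) through Python's sqlite3 module, it substitutes `strftime('%Y-%m', dt)` for `date_trunc('month', dt)` because SQLite has no `date_trunc`, and it loads only the six rows that are spelled out in the table above. The second query is a variant not taken from the original: it first aggregates to one row per month and browser, which may be closer to what a monthly market share chart needs when the table holds one row per day.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("create table pageviews (dt text, browser text, pageviews real)")
conn.executemany(
    "insert into pageviews values (?, ?, ?)",
    [
        # Only the rows shown explicitly in the sample table.
        ("2016-01-01", "Chrome", 7.1685), ("2016-01-01", "Safari", 1.935),
        ("2016-01-02", "Chrome", 7.2485), ("2016-01-02", "Safari", 1.721),
        ("2016-12-31", "Chrome", 7.864), ("2016-12-31", "Safari", 2.011),
    ],
)

# Per-row ratio: each daily row divided by the total of its month.
per_row = conn.execute("""
    select strftime('%Y-%m', dt) as month, dt, browser,
           pageviews / sum(pageviews) over (
               partition by strftime('%Y-%m', dt)
           ) as share
    from pageviews
    order by dt, browser
""").fetchall()

# Variant (assumption): collapse to one row per (month, browser) first,
# then take each browser's share of the monthly total.
per_browser = conn.execute("""
    with monthly as (
        select strftime('%Y-%m', dt) as month, browser,
               sum(pageviews) as views
        from pageviews
        group by strftime('%Y-%m', dt), browser
    )
    select month, browser,
           views / sum(views) over (partition by month) as share
    from monthly
    order by month, browser
""").fetchall()

for row in per_row:
    print(row)
for row in per_browser:
    print(row)
```

In both result sets the shares within a month add up to 1. The difference is the grain: the first query returns one ratio per daily row, the second one ratio per browser per month.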

Redshift has ratio_to_report

Fortunately, Redshift supports a window function called ratio_to_report, which computes the ratio of a column's value in a row to the sum of that expression over the partition.

    select date_trunc('month', dt),
           browser,
           ratio_to_report(pageviews) over(partition by date_trunc('month', dt))
    from pageviews
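To make the semantics concrete outside Redshift, the following is a rough pure-Python emulation of what the query above computes. The helper name `ratio_to_report` and its parameters are hypothetical and chosen for this illustration only; the sketch ignores NULL handling and assumes every partition has a non-zero total. The sample rows are the ones spelled out in the table earlier.

```python
from collections import defaultdict


def ratio_to_report(rows, value_key, partition_fn):
    """Return, for each row, its value divided by its partition's total.

    Hypothetical helper: no NULL handling, and a partition whose total
    is zero would raise ZeroDivisionError.
    """
    totals = defaultdict(float)
    for row in rows:
        totals[partition_fn(row)] += row[value_key]
    return [row[value_key] / totals[partition_fn(row)] for row in rows]


rows = [
    {"dt": "2016-01-01", "browser": "Chrome", "pageviews": 7.1685},
    {"dt": "2016-01-01", "browser": "Safari", "pageviews": 1.935},
    {"dt": "2016-01-02", "browser": "Chrome", "pageviews": 7.2485},
    {"dt": "2016-01-02", "browser": "Safari", "pageviews": 1.721},
    {"dt": "2016-12-31", "browser": "Chrome", "pageviews": 7.864},
    {"dt": "2016-12-31", "browser": "Safari", "pageviews": 2.011},
]

# Partition by the "YYYY-MM" prefix of the date, mirroring the
# partition by date_trunc('month', dt) in the SQL.
shares = ratio_to_report(rows, "pageviews", lambda r: r["dt"][:7])

for row, share in zip(rows, shares):
    print(row["dt"], row["browser"], round(share, 4))
```

This is the same arithmetic as dividing pageviews by `sum(pageviews) over(partition by ...)`, which is why the two SQL forms can be used interchangeably on engines that offer both.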