Blog

How many widgets did I sell?

When you’re just starting out, chances are you’ll be asking a very simple question: How many widgets did I sell each day?

Your SQL might look something like this:

select date(created_at) as day, count(1) from sales where created_at > now() - interval '30 day' group by day order by day;

With that SQL, you can expect results like these:

day count 2014-03-12 5 2014-03-13 1 2014-03-19 6 2014-03-20 4 2014-03-21 1 2014-03-24 1 2014-04-08 2 2014-04-09 3

If you viewed the results in a data visualization tool, you’d get a graph like this:

Unfortunately, since you’re just starting out, you’re not selling widgets every day. And notice the zeroes aren’t even showing up! Your investors will not be pleased with such a misleading graph.

If you’re using Postgres, generate_series can help. generate_series produces a table with a given first point, last point, and interval that you specify. In this case, we’ll choose 30 days ago, today, and each day in between. Here’s our revised SQL:

select date(d) as day, count(sales.id) from generate_series( current_date - interval '30 day', current_date, '1 day' ) d left join sales on date(sales.created_at) = d group by day order by day;

By building an explicit list of dates in our query, rather than relying on the dates, in the sales table, we made sure every date showed up in our result set.

day count 2014-03-12 5 2014-03-13 1 2014-03-14 0 2014-03-15 0 2014-03-16 0 2014-03-17 0 2014-03-18 0 2014-03-19 6 2014-03-20 4 2014-03-21 1 2014-03-22 0 2014-03-23 0 2014-03-24 1 2014-03-25 0 2014-03-26 0 2014-03-27 0 2014-03-28 0 2014-03-29 0 2014-03-30 0 2014-03-31 0 2014-03-01 0 2014-03-02 0 2014-03-03 0 2014-03-04 0 2014-03-05 0 2014-03-06 0 2014-03-07 0 2014-03-08 0 2014-03-09 0 2014-03-10 0 2014-03-11 2

And here’s the chart:

That’s more like it!

Caveats and gotchas

left join: We used generate_series to make sure we got a row for every day in the last 30 days, not just a row for every day in the sales table. So be careful not to use an inner join, which will wipe out the days on the left-hand side that have no matching day on the right-hand side.

count(sales.id): A snippet from our joined table looks like this:

As intended, there are blank rows for days where we had no sales. As a result, we can’t just count every row to get number of sales. count(sales.id) will count just the rows with non-null sales IDs, which is what we want.

Other databases

Not on Postgres? Here are some generate_series equivalents that are as good or better:

Oracle’s magic dual table can help:

select sysdate - level from dual connect by level <= 30;

SQL Server’s recursive with statements are helpful and very impressive in general:

with t as ( select 0 as d union all select d + 1 from t where d < 30 ) select getdate() - d from t

MySQL, unfortunately, is trickier. For some clever hacks, start with this answer on StackOverflow.

An easier way

Of course, a visualization tool like Sisense for Cloud Data Teams will continuous-ify the data for you! In our tool of choice, head to a chart’s settings and toggle “Continuous Axis” for the desired effect.