I found a diagram recently that helped to explain the different types of JOIN syntax. Being the type person that enjoys the visualization of abstract concepts such as set theory this diagram spoke to me. Despite the diagram being well done something seemed slightly wrong about it. I thought about it for a while and then it hit me: the diagram does not accurately reflect the complexity of joins for queries I see every day. The diagram seems better suited for a classroom, not the real world. So I decided to put together my own diagrams for you to show you the things I see frequently. You’re welcome.

1. The Three Table Join

This first diagram that I thought about was a simple three table join. In fact, I was going to put together a new diagram for all the possible three table joins but decided that I really did not have the desire to put together 127 different diagrams (I am leaving it as an exercise for the reader to explain why I could need 127 different diagrams, leave your answer in the comments below). So, here is the first type of join I thought about:

What I find worth mentioning here is that you can change your result set by altering that second join. Instead of A.c1 = C.c1, change it to B.c1 = C.c1 and you will find that you may not return the expected data in your select list (another exercise for the reader there, go ahead and try that for yourself or you can just download this quick sample script).

Another fascinating point here is that it occurred to me that this same T-SQL would also apply to the following:

This diagram seems slightly more realistic to me, as I am likely to start with one table and need to go out to two (or more) other tables with joins.

2. The Table Join For Disjointed Tables

That idea got me thinking even more (always a dangerous thing) and I then thought about other types of joins I see frequently:

Yeah, that’s right…multiple tables with no relation between them. The end user just selects everything possible and then filters the rows based upon the WHERE clause or uses features in Excel or PowerPivot to do the filtering. In a way this is what I visualize “Big Data” to be, the grabbing of as much data as possible, mashing it together, and filtering the results until you find something that helps you get one step closer to wherever it is you wanted to go today.

3. The One Big Table Join

After that graph it suddenly occurred to me that there are a lot of people who create OLTP databases with really wide tables. I mean, REALLY wide tables, hundreds of columns. So, their syntax and graph looks like this:

Everything in one table, what a perfect database design, right? What could be simpler than that? In fact most data warehouses are built this way, where they have a few tables that are really wide and you may not need to join to any other tables. All the more reason to point out how the original diagram was not very indicative of the queries I see. In fact, the last picture I have for you is the EXACT type of query I see all the time.

4. The “What Were They Thinking” Table Join

How many times have you seen code similar to the following? Ever try to draw the picture? Here is my visualization:

Yes, that’s right, I did include the infamous =’NULL’, don’t pretend like you haven’t seen that one before! I also included column names of ‘pid’ and ‘p_id’ to give you an indication about how we often see column names that aren’t very descriptive and leave you wondering if the columns are truly representing the same value type.

I think the above five diagrams are a lot closer in reality to the types of queries we see every day. And the types of queries we are expected to tune, often immediately.

I often find it easier to help with performance problems once I am able to visualize what the joins are representing. I bet your end users would appreciate seeing similar diagrams as well, it may help them to understand what they are asking for, and it would likely help lead to a healthy discussion about design considerations. If nothing else, it should help to level set the expectations for performance.

Share this: Twitter

Facebook

LinkedIn

Reddit

