If a warehouse is just a building for storing goods, then a data warehouse is just a database with data. Or is it?

Let’s play retailers and let’s have stuff. Lots of stuff. Actually, lots of stuff of different kinds: clothes, accessories, cosmetics… just to name a few.

The Storage

We buy a wooden shed and put the stuff in it. Our stuff is safe from rain, sun, snow and crows. However, when we want something from the shed, we have to sift through the pile of stuff to find what we are looking for. We buy some boxes and put the stuff in them. While boxing the stuff, we try to organise it a bit: clothes to the left, cosmetics to the right and accessories to the back.

We have: shed (storage), stuff (data), boxes (folders or tables) and some kind of organisation.

Stuff on shelves

A customer comes and asks for soap. We go to the right, go through the boxes with cosmetics, pick up a bar of soap and hand it to our customer. So far, so good.

Stuff Coming in and out

We ordered more stuff. A truck came and dumped it in front of the shed. Some things are boxed, some things are in bags, some things are loose. As we put the new stuff in our shed, we realise that it does not fit well, and having it on the floor is not very practical either. We decide to buy shelves and more boxes of different sizes, and we put the boxes on the shelves. Now we reorganise our stuff on the shelves: jackets, shirts, trousers, soaps, creams and wristbands each have their own place.

We are slightly more organised now. Stuff is easier to pick, and we know where to reach for a particular kind of stuff.

A customer calls and orders three shirts. We reach to the left for the boxes with shirts, find the appropriate size and ship them to the happy customer. Meanwhile another customer calls with an order, then a third, and a fourth… We can’t handle it, we need some help.

Warehouse shipment delivery

We have: shed (storage), stuff (data), better organised boxes (folders or tables).

Where can I find …?

We hire Brian to help us handle orders and shipping. We show him the layout of the shed and explain how to do the shipping.

A customer calls. Brian packs. Another customer calls. Brian asks: “Where can I find the wristbands?” “In the back, where the accessories are, small boxes on the top.” Brian packs. Brian asks again: “Where were the jackets again?” “On the left, lower shelves.” Brian packs.

Brian is annoying!

We buy white stickers and write labels on them: “shirts”, “jackets”, “bracelets”, “facial creams”. We stick them to the boxes. Brian stops asking us questions. We are happy.

We have: shed (storage), stuff (data), better organised boxes (tables), label stickers (content metadata).

More stuff

Another delivery comes. Brian unpacks the boxes and puts stuff onto the shelves. It does not fit. We need more space. After some planning we decide to rent a larger storage space in another part of town and move our stuff there. More orders are coming in, and as we ship more, we order more to keep everything in stock. We need more help. We hire Carlos.

Carlos messes things up constantly. We wonder why and find out that the boxes are mislabelled or the labels are missing completely. We forgot to re-label our stuff properly in the new storage space after moving. Brian did not notice, since he had already learned where everything was. We spend another day labelling our shelves and boxes.

This should not happen again. We have to constantly take care that the labelling is correct. What if we hire a third helper? What if we move once again? We should not get slowed down by going through the boxes, looking for stuff. We have to deliver to our customers.

We have: storage, data, some kind of data organisation, metadata, process for metadata.

Errors

A customer called to say that his shirt was missing a button. Another customer called to say that he got a shaving brush instead of a silk scarf. A third customer called to say that the shampoo he got is 10 years old. A fourth complained that the socks he ordered for Christmas came in March. How could we have missed that?

Everyone was busy shipping. No one was watching whether we were doing the things right. No one has time for that! But our customers are not happy. We need to do something, otherwise we lose trust and possibly our whole business.

So we start verifying what goes into the boxes, whether each box is being shipped to the correct address, and whether the goods are free of defects. It takes us time, but we save much more since we don’t have to deal with corrections, redeliveries and apologies. Most importantly, we are gaining the trust back.

We have: storage, data, some kind of data organisation, metadata, process for metadata, quality control.

When “the Stuff” is the Data

In the data warehouse, instead of cashmere cardigans, shaving creams and rings, we store invoices, customers, orders… Our end-customers are people asking questions, in most cases in order to make decisions. We want to provide the answers in an understandable way. We need to know precisely where the information needed to answer the questions is stored. We want to deliver the most accurate answers possible, and we want to deliver them in time.

Datawarehouse “shipment” delivery

To be able to do that, we do the same as with the warehouse: we store information about our stuff, which is “the data” in our case. We store metadata.
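To make “we store metadata” a little more concrete, here is a minimal sketch of the label stickers in code. It assumes a hypothetical in-memory catalog: the names `TableMetadata` and `Catalog`, the example tables and their locations are all made up for illustration and do not come from any particular product.

```python
from dataclasses import dataclass, field


@dataclass
class TableMetadata:
    """The 'label sticker' for one table: what it holds and where it lives."""
    name: str
    description: str
    location: str  # hypothetical schema.table address
    columns: list[str] = field(default_factory=list)


class Catalog:
    """Hypothetical in-memory metadata catalog (illustrative only)."""

    def __init__(self) -> None:
        self._tables: dict[str, TableMetadata] = {}

    def register(self, meta: TableMetadata) -> None:
        """Stick a label on a table so that others can find it later."""
        self._tables[meta.name] = meta

    def find(self, keyword: str) -> list[TableMetadata]:
        """Return tables whose name, description or columns mention the keyword."""
        kw = keyword.lower()
        return [
            m for m in self._tables.values()
            if kw in m.name.lower()
            or kw in m.description.lower()
            or any(kw in c.lower() for c in m.columns)
        ]


# Example entries; the tables and columns are assumptions for this sketch.
catalog = Catalog()
catalog.register(TableMetadata(
    name="invoices",
    description="Issued invoices, one row per invoice",
    location="sales.invoices",
    columns=["invoice_id", "customer_id", "amount"],
))
catalog.register(TableMetadata(
    name="customers",
    description="People and companies we sell to",
    location="crm.customers",
    columns=["customer_id", "name"],
))

# Brian no longer has to ask where things are; he looks at the labels.
for meta in catalog.find("amount"):
    print(meta.name, "->", meta.location)
```

A real data warehouse would keep such labels in a dedicated metadata store and maintain them through a defined process, but the idea is the same: the answer to “where can I find…?” lives next to the data, not in someone’s head.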

Summary

Even with this small retail warehouse, we learned that having a storage space and people handling goods is not enough.

We need to know where things are. We need to make sure that the storage place of each thing is obvious to the others working in the warehouse. We have to be aware of what we are delivering. We need to be able to deliver in time.

A data warehouse is similar, just with data. Just as a warehouse is not only a storage place, boxes, shelves and stuff, a data warehouse is not only a bunch of tables and records in a database.

A warehouse includes everything around the stuff, including the processes for handling the stuff to deliver the shipments. The point of a data warehouse lies mostly in the information about the data (metadata) and the processes for handling the data to deliver answers. A data warehouse is a concept rather than a physical place. The concept is also technology-agnostic: most traditional data warehouses live in relational databases, which might create the false impression that the concept is becoming obsolete as new data store technologies emerge.

Data Warehouse is Data and Meta-data

What have we missed in this introduction? We are dealing with stuff, but how do we know which stuff is the actual stuff we are interested in? We know that we need to organise, but do we know how to organise? We know that we need to watch for quality, but do we know what exactly quality is? The answers lie in conceptual modelling, quality measurement, and governance in general. We can talk about it later, if you are interested.

So, do you really have a data warehouse or just a database with tables?