Part 4 — Processing and storing our sensor data

So where (and what) is our sensor data? When you created your device registry you also created a Pub/Sub topic for it, named projects/<project id>/topics/<registry name>, where your device's sensed data can be accessed. Going back to the MqttBridge class, you can see how the MQTT topic we publish to is constructed by the getTelemetryTopic() method; this lets us specify which device in our registry we are supplying data for. If you run the sample dummy sensor code and subscribe to your registry topic (you can do this in a number of ways, one being Google's gcloud tools) you will see your dummy sensor data arriving.
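As a minimal sketch (not the project's actual code), a method like getTelemetryTopic() could look like this; Cloud IoT Core expects device telemetry to be published on a per-device MQTT topic, which the bridge then forwards to the registry's Pub/Sub topic:

```javascript
// Hypothetical sketch of building the MQTT telemetry topic for a device.
// Cloud IoT Core listens for telemetry on /devices/<device-id>/events
// and forwards it to the registry's Pub/Sub topic.
function getTelemetryTopic(deviceId) {
  return `/devices/${deviceId}/events`;
}

console.log(getTelemetryTopic('dummy-sensor')); // /devices/dummy-sensor/events
```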

Notice that the data is a string, not just the raw sensor value: what we send over the MQTT bridge is actually a stringified version of the SensorData class, which gives us the type of the sensor, the value and the timestamp (the 'at' property) at which the value was sampled. This timestamp, called the sample time or, more formally, the acquisition time, is quite important in the IoT world: no matter how much further processing you do with your data, this is the real time at which the associated value was read. As a result, in large-scale SCADA systems the accuracy of time across the sensor estate is crucial; large sensing plants, such as say a nuclear facility, usually synchronize time via highly accurate radio clocks and master clocks. Acquisition time should be stamped on the sensed value as soon as possible, preferably by the sensor itself, but this is not always possible.
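To make the payload concrete, here is a hedged sketch of what a SensorData class along these lines might look like. The colon-delimited "type:value:at" string form is an assumption, inferred from the logged payload (dummy:67:1525430789125) shown later in this article:

```javascript
// Hypothetical sketch of the SensorData payload. The "type:value:at"
// string form is an assumption based on the function logs shown later.
class SensorData {
  constructor(type, value) {
    this.type = type;     // e.g. 'dummy'
    this.value = value;   // the sampled reading
    this.at = Date.now(); // acquisition time, stamped as early as we can
  }
  toString() {
    return `${this.type}:${this.value}:${this.at}`;
  }
}

const sample = new SensorData('dummy', 67);
console.log(sample.toString()); // e.g. dummy:67:1525430789125
```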

So, let's now look at what we need to represent this data in BigQuery. In BigQuery create a dataset (I named mine iot_home) and a table named 'sensor_data' (this name is used in the cloud function, so beware if you change it!). It should have just three columns (type, value and at) that tie in with our incoming sensor data, see below.
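As a sketch, the schema could be expressed in the form the Node.js BigQuery client accepts. The column names come from the query used later in this article; the exact types here are my assumption (value could equally be INTEGER, and at a plain numeric column):

```javascript
// Hedged sketch of the sensor_data table schema in the shape the
// @google-cloud/bigquery client takes. Column names (type, value, at)
// match the query used later; the types are assumptions.
const sensorDataSchema = [
  { name: 'type',  type: 'STRING',    description: 'Sensor type, e.g. dummy' },
  { name: 'value', type: 'FLOAT',     description: 'Sampled reading' },
  { name: 'at',    type: 'TIMESTAMP', description: 'Acquisition time' },
];

console.log(sensorDataSchema.map(c => c.name).join(', ')); // type, value, at
```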

Right, so all we need to do now is populate it. We do this using a cloud function, which attaches itself to our registry topic (called its trigger), transforms the incoming data and updates the table above. Cloud functions on GCP run on Node.js, so they are written in JavaScript. Create a cloud function (the name isn't important) and attach it to our iot-home topic as its trigger; the topic should be visible in the topic drop-down. Now have a look at the index.js file in the cloudfunction directory of the project.

This is a small JavaScript function that imports our BigQuery interface, gets the incoming topic data, logs it, transforms it into values we can use, and inserts them into our BigQuery table. It really is that simple. You can copy this into your cloud function; also update the package.json file with the one from the project to pull in the BigQuery API.
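The heart of the function is the transform step. Here is a hedged, self-contained sketch of that step alone: Pub/Sub hands the function its message data base64-encoded, and the "type:value:at" split is an assumption based on the logged payload dummy:67:1525430789125. The BigQuery insert call itself is elided, as it needs live credentials:

```javascript
// Hedged sketch of the cloud function's transform step. Pub/Sub delivers
// the message data base64-encoded; we decode it, split the assumed
// "type:value:at" string and build a row object for the table insert.
// The actual insert into the sensor_data table is left out here.
function toRow(base64Data) {
  const payload = Buffer.from(base64Data, 'base64').toString();
  const [type, value, at] = payload.split(':');
  return {
    type,                  // e.g. 'dummy'
    value: Number(value),  // the sampled reading
    at: Number(at) / 1000, // millisecond epoch -> seconds, a form BigQuery
  };                       // accepts for TIMESTAMP columns
}

const encoded = Buffer.from('dummy:67:1525430789125').toString('base64');
console.log(toRow(encoded));
```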

Now, every time our sensor updates it should trigger our cloud function, which will update the BigQuery table. From there we can query the incoming data, export it and do whatever analytics we want.

So let's do it: run the dummy sensor code. If your cloud function is set up correctly you should be able to go to your Logging pages and see lines like this :-

D iot-home-sensor 93758829817053 Function execution started iot-home-sensor 93758829817053

I iot-home-sensor 93758829817053 dummy:67:1525430789125 iot-home-sensor 93758829817053

If you're not seeing this then something is wrong. If all is well, go to your BigQuery table and run a query on it like this :-

SELECT value, at FROM [iot_home.sensor_data]

WHERE type = 'dummy' ORDER BY at DESC

This will show you your incoming dummy sensor data values as they arrive, in real time.

From here you can export the data to any analytics tool of your choice to graph it. I simply exported the data as a CSV file and used Google Sheets; there is also a plugin for Google's Data Studio product to try if you wish. I'll leave this as an exercise for the reader.

OK, we have now set up and proven our information chain from sensor to storage. Let's have a look at interfacing to real sensors.