The software side, as expected, took a bit of work. The camera has a resolution of 640x480, but it has an option to produce lower-res images, so I have it sending 160x120 JPEGs. For some reason the camera worked with a Wemos D1 Mini v1 and a Wemos D1 Mini Lite, but not a Wemos D1 Mini v2. The camera is wired directly into the TX/RX pins of the esp8285, which means it has to be unplugged when flashing new code over USB.

Now the esp8285 that the Wemos D1 Mini Lite uses isn't typically known for image processing, but with 1 MB of built-in flash and roughly 80 KB of usable RAM it has enough memory to do the job; I needed far less than that in the end. The process starts with the esp8285 receiving a message to take a reading. It powers on the LEDs, waits 10 seconds, then tells the camera to take a picture. The camera sends back a 160x120 pixel JPEG image. Due to the way the camera was mounted, the image actually comes back upside-down.

Original image from camera (flipped)

Using the JPEGDecoder library, the image buffer is decoded into something readable; however, JPEGs are read in blocks. The top and bottom of the image were always uninteresting, so they were skipped. The remaining blocks were read into a buffer the width of the image and the height of one block so the image could be processed row by row.

Picture from meter split into blocks
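The strip-buffer idea can be sketched like this. The real code is Arduino C++ on top of JPEGDecoder's block callbacks; this Python sketch assumes plain 8x8 blocks delivered in row-major order, purely for illustration:

```python
BLOCK = 8     # JPEG blocks assumed to be 8x8 here; real MCUs can be larger
WIDTH = 160   # image width in pixels

def assemble_strip(blocks, block_row):
    """Copy one horizontal row of 8x8 blocks into a WIDTH x 8 strip
    so the pixels can then be scanned row by row."""
    per_row = WIDTH // BLOCK
    strip = [[0] * WIDTH for _ in range(BLOCK)]
    for bx in range(per_row):
        block = blocks[block_row * per_row + bx]  # one 8x8 pixel grid
        for y in range(BLOCK):
            for x in range(BLOCK):
                strip[y][bx * BLOCK + x] = block[y][x]
    return strip
```

Only one strip ever needs to be in memory at a time, which is what keeps the footprint small enough for the esp8285.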

As the image is essentially grey-scale, only the green channel is saved, reducing storage to one byte per pixel. To make sure the digits are in the expected locations even if the camera gets jiggled slightly, the program searches for the bottom-right corner of the display. It's very high contrast, so it's easy to pick out. Once that origin point is found, the digits are extracted into their own 8x16px buffers.

8x16px boxes for each digit and origin
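A sketch of that origin search and digit extraction (the darkness threshold and box geometry here are hypothetical placeholders, not the values from the firmware):

```python
WIDTH, HEIGHT = 160, 120
DIGIT_W, DIGIT_H = 8, 16
DARK = 80  # hypothetical "definitely dark" brightness threshold

def find_origin(img):
    """Scan from the bottom-right of the green-channel image for the
    first dark pixel: the high-contrast corner of the display."""
    for y in range(HEIGHT - 1, -1, -1):
        for x in range(WIDTH - 1, -1, -1):
            if img[y][x] < DARK:
                return x, y
    return None

def extract_digit(img, right, bottom):
    """Copy out an 8x16 digit box given its bottom-right pixel."""
    return [row[right - DIGIT_W + 1 : right + 1]
            for row in img[bottom - DIGIT_H + 1 : bottom + 1]]
```

Each digit's box is then just a fixed offset from the origin, so a small camera shift moves everything together.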

Now in theory it should be possible to read a seven-segment display by checking just one pixel per segment and comparing it to one pixel of the lighter middle area. In practice, the digits aren't always lined up perfectly, the lighting isn't ideal, and the edges can be a bit blurred.

Theoretical points to check

In practice, I had to boost the contrast first, then check several pixels per segment. Boosting the contrast turned out to be fairly simple: since each row of a digit contains at least one dark segment, I could find the brightest and darkest pixels in a row, then set anything above the halfway point to white and anything below to black. Doing it row by row also helped with the uneven lighting, as individual rows tended to be lit consistently even when a column was not.

Digit with increased contrast
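The per-row thresholding described above is only a few lines; a Python sketch of the idea (the firmware does the same thing in C++):

```python
def boost_contrast(digit):
    """Threshold each row at the midpoint between its brightest and
    darkest pixel. Every row of a digit contains at least one dark
    segment pixel, so the per-row range is always meaningful."""
    out = []
    for row in digit:
        mid = (min(row) + max(row)) / 2
        out.append([255 if px > mid else 0 for px in row])
    return out
```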

I went through two methods of extracting the digit from that. The first method was to check a block of pixels in each segment. If the block was mostly black, the segment was considered filled. Each segment was then assigned a bit of a byte, and the resulting byte value was mapped to a digit.

Blocks checked for segments
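The block-per-segment method can be sketched as follows. The pixel regions here are made-up placeholders (the post tuned the real block positions by hand), but the bitmask and lookup table are the standard seven-segment encodings:

```python
# Hypothetical sample regions for each segment of an 8x16 digit box.
SEGMENTS = {
    'a': [(1, x) for x in range(2, 6)],    # top
    'b': [(y, 6) for y in range(3, 7)],    # top right
    'c': [(y, 6) for y in range(9, 13)],   # bottom right
    'd': [(14, x) for x in range(2, 6)],   # bottom
    'e': [(y, 1) for y in range(9, 13)],   # bottom left
    'f': [(y, 1) for y in range(3, 7)],    # top left
    'g': [(7, x) for x in range(2, 6)],    # middle
}
BIT = {s: 1 << i for i, s in enumerate('abcdefg')}
PATTERN_TO_DIGIT = {0b0111111: 0, 0b0000110: 1, 0b1011011: 2,
                    0b1001111: 3, 0b1100110: 4, 0b1101101: 5,
                    0b1111101: 6, 0b0000111: 7, 0b1111111: 8,
                    0b1101111: 9}

def read_digit(digit):
    """digit: 16x8 grid of 0/255 values after the contrast boost.
    Returns the decoded digit, or None if no pattern matches."""
    pattern = 0
    for name, pixels in SEGMENTS.items():
        dark = sum(1 for y, x in pixels if digit[y][x] == 0)
        if dark > len(pixels) // 2:    # block mostly black => filled
            pattern |= BIT[name]
    return PATTERN_TO_DIGIT.get(pattern)
```

Returning None on an unknown pattern is what surfaces the misreads the next paragraph talks about.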

While this method worked most of the time, there were a lot of cases where the digits didn't quite line up right, so it wouldn't get a good reading. I actually made a small Python program that ran the algorithm over a list of images to see how accurate my changes were as I resized the blocks, but hand-picking the best blocks was a bit frustrating, so I figured: why not have a computer do the work?

For method two, I took my original method and had it read hundreds of images. A basic sanity check flagged suspect results (a water meter almost always goes up, but not by too much), and I manually corrected the ones that were wrong. From there I used the sklearn library to train a DecisionTreeClassifier that takes the list of 8x16 on/off pixels and produces a digit.
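The training step is only a few lines of sklearn. The dataset below is a synthetic stand-in (the real one was the hundreds of hand-corrected meter photos), but the shape is the same: each sample is a flattened 8x16 digit, 128 on/off pixels, labelled with the digit it shows:

```python
from sklearn.tree import DecisionTreeClassifier

def fake_digit(d):
    # Stand-in sample: a distinct, redundant 128-pixel pattern per digit.
    # The real samples were flattened 8x16 contrast-boosted digit images.
    return [(d >> (i % 4)) & 1 for i in range(128)]

X = [fake_digit(d) for d in range(10)]   # pixel vectors
y = list(range(10))                      # hand-checked labels

clf = DecisionTreeClassifier(random_state=0)
clf.fit(X, y)
```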

Other models could definitely have been more accurate, but a decision tree converts very easily into if-statements that run quickly on my esp8285. The worst case is only 7 comparisons, and the whole tree fit into 33 binary if-else statements, or as it's known in the industry: "AI".

The pinnacle of machine learning
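Converting the trained tree into those if-statements can be automated by walking sklearn's internal tree arrays. A hedged sketch (this assumes 0/1 pixel features, where sklearn sends values below the 0.5 threshold down the left branch; the post doesn't say how the author did the conversion):

```python
import numpy as np
from sklearn.tree import DecisionTreeClassifier

def tree_to_ifs(clf):
    """Emit a trained tree as nested C-style if/else statements over
    pixel[i] tests (left branch = pixel off for 0/1 features)."""
    t = clf.tree_
    lines = []
    def walk(node, depth):
        pad = "  " * depth
        if t.children_left[node] == -1:            # leaf node
            label = clf.classes_[np.argmax(t.value[node])]
            lines.append(f"{pad}return {label};")
        else:
            lines.append(f"{pad}if (pixel[{t.feature[node]}] == 0) {{")
            walk(t.children_left[node], depth + 1)
            lines.append(f"{pad}}} else {{")
            walk(t.children_right[node], depth + 1)
            lines.append(f"{pad}}}")
    walk(0, 0)
    return "\n".join(lines)

# Tiny demo: a tree where one pixel decides between a 1 and an 8.
demo = DecisionTreeClassifier().fit([[0], [1]], [1, 8])
code = tree_to_ifs(demo)
```

The output pastes straight into the firmware, which is the whole appeal over a heavier model.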

Once the digits are identified, the results are converted to a number and sent off to HomeAssistant. As with all of my custom devices, this one uses the MQTT protocol to communicate. HomeAssistant receives the data, checks that the water usage is higher than the previous reading (though I identified a couple of times it actually went DOWN by 0.001ft³), then stores that value to show on the front end and draw pretty graphs.
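The plausibility check on HomeAssistant's side boils down to a range test. A sketch with hypothetical names and limits (the real check lives in HomeAssistant, and only the 0.001 ft³ backward jitter is a figure from the post):

```python
MAX_DECREASE = 0.001   # ft^3 of backward jitter actually observed
MAX_INCREASE = 10.0    # hypothetical cap on one reading's increase

def accept_reading(previous, current):
    """Accept a new meter value only if it changed by a plausible
    amount; anything else is probably a misread digit."""
    return previous - MAX_DECREASE <= current <= previous + MAX_INCREASE
```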