A brief explanation of wysker’s image recognition technology.

One of wysker’s main value propositions is to provide users with new and exciting content through what we call product streams. A stream is a novel way of displaying content that is extremely fast yet easy to take in. Streams in the wysker mobile app are looped, just like a GIF. They are built by stacking visually matching product images that share a product category, a style or other visual features.

A wysker stream: a loop of visually matching product images.

Here, wysker makes use of the brain’s tremendous capacity to absorb visual data. By introducing streams and single-button navigation, we created a new browsing experience for mobile.

As these streams loop very fast, we need to make sure they still feel natural so that wyskering doesn’t become exhausting. This breaks down into several interesting image processing and machine learning tasks.

In a first step, product images and information are automatically pulled from an online shop or marketplace. These pictures usually differ in appearance, size and colour. To provide an enjoyable user experience, wysker adjusts every product image to align it with the overall product pool: our self-developed image processor normalises the position, image scale and background colour of each product image.
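The kind of normalisation described above could look roughly like the following. This is a minimal sketch and not wysker’s actual image processor: it assumes a grayscale image stored as a list of pixel rows, assumes the top-left pixel shows the background, and uses hypothetical names (`bounding_box`, `resize_nearest`, `normalise`) and parameters chosen only for illustration.

```python
from typing import List, Tuple

Image = List[List[int]]  # grayscale pixels, 0 (black) to 255 (white)


def bounding_box(img: Image, background: int, tol: int) -> Tuple[int, int, int, int]:
    """Return (top, left, bottom, right) around all pixels that differ from the background."""
    rows = [r for r, row in enumerate(img) if any(abs(p - background) > tol for p in row)]
    cols = [c for c in range(len(img[0])) if any(abs(row[c] - background) > tol for row in img)]
    if not rows:  # nothing but background: keep the whole image
        return 0, 0, len(img), len(img[0])
    return rows[0], cols[0], rows[-1] + 1, cols[-1] + 1


def resize_nearest(img: Image, height: int, width: int) -> Image:
    """Nearest-neighbour resize; crude, but enough to illustrate scale normalisation."""
    src_h, src_w = len(img), len(img[0])
    return [[img[r * src_h // height][c * src_w // width] for c in range(width)]
            for r in range(height)]


def normalise(img: Image, size: int = 8, margin: int = 1,
              out_background: int = 255, tol: int = 10) -> Image:
    """Crop to the product, unify the background, rescale and centre on a square canvas."""
    background = img[0][0]  # assumption: the top-left pixel is background
    top, left, bottom, right = bounding_box(img, background, tol)
    crop = [row[left:right] for row in img[top:bottom]]
    # Background colour: map the source background to one shared value.
    crop = [[out_background if abs(p - background) <= tol else p for p in row] for row in crop]
    # Scale: the longer side of the product fills the canvas minus the margins.
    h, w = len(crop), len(crop[0])
    scale = (size - 2 * margin) / max(h, w)
    new_h, new_w = max(1, round(h * scale)), max(1, round(w * scale))
    scaled = resize_nearest(crop, new_h, new_w)
    # Position: paste the product into the centre of a uniform canvas.
    canvas = [[out_background] * size for _ in range(size)]
    off_r, off_c = (size - new_h) // 2, (size - new_w) // 2
    for r in range(new_h):
        for c in range(new_w):
            canvas[off_r + r][off_c + c] = scaled[r][c]
    return canvas
```

After this step, every image has the same size, the same background value and a centred product, regardless of how the shop originally photographed it.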

wysker’s image processor

Generally speaking, streams in the wysker app should contain products of the same category, so that users can easily grasp and concentrate on the rapidly changing product differences. Information about product category, price, brand, available sizes, color and shipping destinations can be obtained directly from a retailer’s web presence. However, early experiments at wysker suggested that these features are not sufficient to build streams that are easy to take in: when product images are stacked randomly, the resulting flicker makes the content of a stream hard to grasp.

A fuzzy sneaker stream: Flickering makes it impossible to grasp the products

Interestingly, streams are appealing when successive product images have a similar outline and background. Stream quality increases further when the displayed products are similar in color and/or material. This makes wysker’s streams even more valuable: the app takes over the task of researching product alternatives and puts them into context, which makes getting an overview of a product range much faster than conventional scrolling.

To scale our inventory efficiently, we developed a mechanism that gradually reduces the need for manual, human-made stream creation. It matches similar-looking product images within one category and produces new streams from our existing database and connected APIs.

Nowadays, understanding images and extracting features from them to recognise objects is mainly done with convolutional neural networks (CNNs). If you are not familiar with them, we highly recommend reading further; this excellent Stanford lecture offers a more technical introduction.

Besides automatic stream creation, wysker has been, and still is, building streams manually. As a result, we have a rich dataset of more than 100k product images to train on. We used these images to train our self-developed CNN in an unsupervised fashion. The goal of the CNN is to extract a low-dimensional feature vector from a given image, representing information such as product color, structure, shape and size. This feature vector is then used to reconstruct the original image, which makes the network a standard CNN auto-encoder.
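In symbols, an auto-encoder of this kind can be written as below. The notation is ours and purely illustrative: x is an input image, f is the encoder with weights θ, g is the decoder with weights φ, z is the feature vector of dimension d (much smaller than the number of pixels), and N is the number of training images. The squared-error reconstruction loss is an assumption; this article does not state which loss wysker uses.

```latex
z = f_\theta(x) \in \mathbb{R}^{d}, \qquad
\hat{x} = g_\phi(z), \qquad
\min_{\theta,\,\phi} \; \frac{1}{N} \sum_{i=1}^{N}
  \bigl\lVert x_i - g_\phi\bigl(f_\theta(x_i)\bigr) \bigr\rVert_2^2
```

No labels appear in the objective, which is why the training counts as unsupervised: the image itself is the target.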

The result is an encoder that extracts a low-dimensional feature vector from any given image. This lets us compute meaningful differences between images efficiently and match images that are close to each other into groups. As a result, we can produce smooth product streams.
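As a rough illustration of matching close images into groups, the sketch below compares feature vectors by Euclidean distance and groups them greedily. Both choices are assumptions made for the example, as are the names `group_similar` and `threshold`; the article does not say which distance or grouping method wysker uses.

```python
import math
from typing import Dict, List, Sequence


def group_similar(features: Dict[str, Sequence[float]], threshold: float) -> List[List[str]]:
    """Greedy grouping: each image joins the first group whose first image
    lies within `threshold` (Euclidean distance); otherwise it starts a new group."""
    groups: List[List[str]] = []
    for image_id, vector in features.items():
        for group in groups:
            seed_vector = features[group[0]]
            if math.dist(vector, seed_vector) <= threshold:
                group.append(image_id)
                break
        else:  # no existing group was close enough
            groups.append([image_id])
    return groups


if __name__ == "__main__":
    # Toy 2-dimensional feature vectors; real ones would come from the encoder.
    toy = {"a": (0.0, 0.0), "b": (0.1, 0.0), "c": (5.0, 5.0), "d": (5.1, 5.0)}
    print(group_similar(toy, threshold=1.0))
```

Because the vectors are low-dimensional, each comparison is cheap, which is what makes this kind of matching practical across a large product pool.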

A long, automatically created stream

Finally, we sort the display order of the images to get the smoothest loop within the given image group. This is done by searching for a path with minimum overall distance that starts and ends at the same image. In a different context, this is known as the famous Travelling Salesman Problem (see visualisation below). This step completes wysker’s automatic stream creation process for now.
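One common way to approximate such a loop is a nearest-neighbour tour that is then improved with 2-opt swaps. The sketch below shows that heuristic on feature vectors; it is an illustration under our own assumptions (Euclidean distance, hypothetical function names), not a description of the solver wysker actually uses, and it does not guarantee the optimal tour.

```python
import math
from typing import List, Sequence

Vector = Sequence[float]


def tour_length(order: List[int], vectors: List[Vector]) -> float:
    """Length of the closed loop, including the step from the last image back to the first."""
    n = len(order)
    return sum(math.dist(vectors[order[i]], vectors[order[(i + 1) % n]]) for i in range(n))


def nearest_neighbour_tour(vectors: List[Vector]) -> List[int]:
    """Start at image 0 and always move on to the closest image not yet shown."""
    unvisited = set(range(1, len(vectors)))
    order = [0]
    while unvisited:
        last = vectors[order[-1]]
        nxt = min(unvisited, key=lambda i: math.dist(last, vectors[i]))
        unvisited.remove(nxt)
        order.append(nxt)
    return order


def two_opt(order: List[int], vectors: List[Vector]) -> List[int]:
    """Reverse segments of the tour for as long as doing so shortens the loop."""
    best = order[:]
    improved = True
    while improved:
        improved = False
        for i in range(1, len(best) - 1):
            for j in range(i + 1, len(best)):
                candidate = best[:i] + best[i:j + 1][::-1] + best[j + 1:]
                if tour_length(candidate, vectors) < tour_length(best, vectors) - 1e-12:
                    best, improved = candidate, True
    return best


def smooth_loop(vectors: List[Vector]) -> List[int]:
    """Return a display order (indices into `vectors`) that forms a short closed loop."""
    if len(vectors) < 3:  # with fewer than three images every order is equivalent
        return list(range(len(vectors)))
    return two_opt(nearest_neighbour_tour(vectors), vectors)
```

The returned list is the display order of the images; since the stream loops, the distance from the last image back to the first is part of what gets minimised.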

Travelling Salesman Problem — Source: https://media.giphy.com/media/EODiwfB7tJp1m/giphy.gif

It is fascinating how the power of CNNs and other machine learning techniques can be used to solve extremely challenging and exciting vision and data problems at wysker. This enables us to scale our content database.

Some future challenges at wysker in the field of machine learning remain untouched — contact us if you are curious or have ideas on how to develop our mechanism further.