
This article is the second in a 3-part series on data pricing. You can find Part I here.

Cost-based pricing (cost + margin)

The simplest method is to estimate how much it cost you to gather the data and maintain it in your infrastructure, add a margin on top of that, and use the result as a price. A formula for this would take C as the creation plus maintenance cost, M the margin in percent, P the price to consume the asset, and k a parameter corresponding to the expected number of sales, after which one will reach the target profit.

P = C * (1 + M/100) / k

As long as k is greater than 1 + M/100, this formula gives an initial price below the creation cost, i.e. below what it would cost another person to gather and host the same data, making the asset attractive to potential buyers. We expect to sell the data at least a few times, and after k sales, we will have reached our target profit.
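As a minimal sketch, the formula can be wrapped in a small helper. The cost, margin, and sales figures below are made up for illustration:

```python
def cost_based_price(cost, margin_pct, expected_sales):
    """Price per sale that reaches cost + margin after `expected_sales` sales.

    Implements P = C * (1 + M/100) / k.
    """
    return cost * (1 + margin_pct / 100) / expected_sales

# A dataset that cost $10,000 to gather and host, with a 30% target margin,
# expected to sell 20 times:
price = cost_based_price(10_000, 30, 20)
print(round(price, 2))  # 650.0 -- well below the $10,000 creation cost
```

Note that each individual sale is priced far below C, which is exactly what makes buying more attractive than re-collecting the data.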

Early sketch of a cost-based pricing tool

Comparison with other assets on the market

Naturally, if there are already similar assets on the market, you’d want to take a look at them and try to figure out a price for yours by comparison. The real question here is: how do you make this process both easy and insightful at the same time? Trying to automate too much of the process is likely to produce unrealistic estimates, whereas leaving the provider to do most of the work creates too much friction.

One way to do it would be as follows:

1. Relying on metadata, tags and additional information, show the assets that come up through the marketplace’s search function and may be similar.
2. Let you, the provider, select the ones you deem relevant for comparison purposes.
3. Show a graph of the price of those assets over the last month (useful to spot any market trend for this type of asset).
4. If any common metadata fields are detected, show a side-by-side comparison between your asset and the selected ones.

With the help of a guide explaining which factors are important for the quality and price of an asset, this visual comparison can help you position your asset against the existing offers on the market.
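The first step, surfacing possibly similar assets from tags, can be sketched as a simple overlap check. The catalog, tag sets, and threshold below are hypothetical; a real marketplace would search over richer metadata:

```python
def tag_similarity(tags_a, tags_b):
    """Jaccard similarity between two tag sets (0 = disjoint, 1 = identical)."""
    a, b = set(tags_a), set(tags_b)
    return len(a & b) / len(a | b) if a | b else 0.0

def similar_assets(asset_tags, catalog, threshold=0.4):
    """Return names of catalog assets whose tags overlap enough with ours."""
    return [name for name, tags in catalog.items()
            if tag_similarity(asset_tags, tags) >= threshold]

catalog = {
    "city-traffic-2023": {"traffic", "gps", "urban"},
    "retail-footfall":   {"retail", "footfall", "urban"},
    "ship-positions":    {"gps", "maritime"},
}
print(similar_assets({"gps", "traffic", "urban"}, catalog))
# ['city-traffic-2023']
```

The candidates returned here are what the provider would then hand-pick from in step 2.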

Here are some important factors to consider for comparison purposes (by no means an exhaustive list):

- Volume (in GB or number of data points, but also in other measures, like miles of autonomous vehicle data, for example)
- Frequency of updates (every 3 hours vs. every 3 days)
- Number of attributes (if your dataset is similar to an existing one and shares all its attributes, except yours contains two more fields for every point, your dataset is obviously more valuable)
- Precision (for GPS locations, for example)
- Brand (even if your asset is better, it will be hard to compete with an established and recognized provider)
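The side-by-side view of common metadata fields could look like the sketch below. The field names and values are invented; in practice they would come from whatever metadata schema the marketplace enforces:

```python
# Hypothetical metadata fields a marketplace might detect as comparable.
FIELDS = ["volume_gb", "update_every_h", "n_attributes", "gps_precision_m"]

def compare(mine, other):
    """Yield (field, my value, their value) for fields both assets declare."""
    for field in FIELDS:
        if field in mine and field in other:
            yield field, mine[field], other[field]

mine  = {"volume_gb": 120, "update_every_h": 3,  "n_attributes": 14}
their = {"volume_gb": 90,  "update_every_h": 72, "n_attributes": 12}
for field, a, b in compare(mine, their):
    print(f"{field:>16}: {a:>6} vs {b:>6}")
```

Here the comparison already tells a story: more volume, hourly instead of three-day updates, and two extra attributes, which supports pricing above the competing asset.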

To sum it up, here is how this comparison could be assisted with proper marketplace tools for data providers:

Early sketch of a comparison tool for datasets

Comparison with offers outside the data market

In some cases, your assets might not only be competing with other assets on the same data market, but also with existing offers elsewhere. For example, if you’re selling a whole package of financial data and tools around it for professional trading firms, you should definitely be looking at Bloomberg Terminal and Reuters Eikon as competitors. Comparing your product with theirs, accounting for the difference in brand strength, and adjusting your price relative to theirs will give you a strong price estimate for your product.

This is not something that can be reliably implemented within a data marketplace, but it is an important factor to keep in mind for data providers.

Problem-based pricing

Here, the approach is reversed: the customer sets the price they’re willing to pay to receive a specific product. For example, this can take the form of bounties (achieve a certain milestone and get paid X), tournaments (the best performance at the end receives X) or crowdsourcing (everyone who contributes beyond a certain threshold gets appropriately compensated). The success of platforms such as Kaggle clearly indicates there is a massive opportunity in that domain.

The advantage in terms of pricing is that the customer reveals a price they’re willing to pay, and the provider can make their decision based on that. Additionally, showing past and ongoing bounties that may be related to an asset as the provider goes through the pricing process could also provide very useful information about how much customers value this type of asset or product.

Pricing tools

Data pricing is a complex problem, involving a lot of different variables — but it is not unique in that regard. Looking at Airbnb’s dynamic pricing tool (here and here), we can see how their engineering team has approached the construction of a sophisticated tool to help homeowners tackle the complicated problem of pricing their home per night — even though most of them have little to no experience in that area. As data markets grow and the marketplaces learn from experience, it’s likely that the endgame for data pricing tools will involve discerning the context, isolating the most important factors for the value of a data asset, and incorporating them into a model to give a price estimate.

One of the many third-party pricing tools available for Airbnb hosts (source: https://www.airdna.co/)

However, simpler tools can also be useful in addition to the approaches mentioned above. Think visualizations of live and historic market data, as well as continuous feedback on listed assets.

An example: Let’s say a provider has just put a new asset on the market. The provider erred on the high side for the price, and a pattern appears: interested customers first send a small query or they request a sample to confirm their interest, and then they never come back. This type of feedback requires constant attention and is hard to gather for the provider on their own, whereas the marketplace they used could provide them with a clear analysis of what is happening: the asset seems attractive, but is either not that relevant to its potential audience, or too expensive for what it would bring them.
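A marketplace-side check for this "sample, then vanish" pattern could be as simple as the sketch below. The event format and the 50% threshold are assumptions for illustration:

```python
# Hypothetical check: flag assets where many customers sample or run a small
# query but never purchase afterwards.
def sample_to_sale_rate(events):
    """events: iterable of (customer_id, action), action in {'sample', 'purchase'}.

    Returns the fraction of sampling customers who later bought,
    or None if nobody sampled.
    """
    sampled = {c for c, action in events if action == "sample"}
    bought  = {c for c, action in events if action == "purchase"}
    return len(sampled & bought) / len(sampled) if sampled else None

events = [("a", "sample"), ("b", "sample"), ("c", "sample"), ("a", "purchase")]
rate = sample_to_sale_rate(events)
if rate is not None and rate < 0.5:
    print(f"Only {rate:.0%} of samplers bought: price or relevance issue?")
```

A single number like this, surfaced proactively, turns a pattern the provider would never notice alone into actionable feedback.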

On the other hand, imagine another provider that puts their asset on the market at the right price and registers a few quick sales, but competitors arrive around the same time, with slightly better offers. An alert could be triggered by the marketplace for the provider to notice that assets with similar tags and metadata have been registered recently, so that they can either adjust their price to the competition, or aim to differentiate by offering new versions of their product.
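The competitor alert described above could, under simple assumptions (tags as sets, a cutoff date), be triggered like this; the listing names and dates are invented:

```python
# Hypothetical competitor alert: notify the provider when new assets with
# overlapping tags appear shortly after their own listing.
from datetime import date

def new_competitors(my_tags, listings, since):
    """listings: iterable of (name, tags, listed_on).

    Returns names of assets listed on or after `since` that share a tag.
    """
    return [name for name, tags, listed_on in listings
            if listed_on >= since and my_tags & tags]

listings = [
    ("rival-traffic-feed", {"traffic", "gps"}, date(2024, 5, 2)),
    ("old-weather-data",   {"weather"},        date(2023, 1, 10)),
]
alerts = new_competitors({"traffic", "urban"}, listings,
                         since=date(2024, 4, 1))
print(alerts)  # ['rival-traffic-feed']
```

On a real marketplace this would run whenever a new asset is registered, rather than as a batch query.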

Bringing everything together

The final step is combining all those heuristics into a coherent experience for providers going through a marketplace (or similar tool) to put a data asset on the market. Most of these ideas are not new, but how they are designed and implemented in the real world is fundamental to their effectiveness. One common misconception is that, since these tools are made to be used in a professional context, it is fine to expect providers to do most of the pricing work — thereby forgetting essential design principles.

However, just because there is an economic incentive for providers to be able to monetize their data assets does not mean everything will solve itself on its own. In fact, if we want to unlock the long tail of data assets currently residing behind closed doors, we have to remember that most providers will be businesses or organizations that have collected data alongside their everyday activities and now realize it might be of interest to someone else. All in all, most will have little experience with data markets — and we should be designing accordingly.

Conclusion

Here we have introduced some initial drafts of an interface incorporating these ideas. For example, it is very important that in the vast majority of cases, all the necessary pricing work can be done directly on the pricing page of the publishing flow — without having to open another page to browse the marketplace, search for external information or even negotiate with potential buyers. Reducing friction at this step ensures a maximum number of providers are able to put their data assets on the market and turn a profit, after which they can refine their strategy as needed.