What would be ‘Predictive’?

Here’s my view on how a system for UTXO set optimization might work, which I would call ‘Predictive’.

This will not contain definitive answers, but I will try to reason about the ways this problem can be addressed, to lay out directions of further research for our team², and to incite a discussion about the topic.

Context

The view presented here is in the context of a service that accepts and sends bitcoin payments and wants to minimize the network fees it pays in the course of its operation.

The payments such a service handles will most likely follow a certain pattern, which depends on the type of service, the behavior of its customers, its payment sending policies, and other factors.

The number and value distribution of the payments will vary depending on the season, day of the week, time of day, and market situation. Market situations are hard to predict, but the payment activity characteristics that depend on time periods can be modeled with historical data, which each service most certainly collects, and predictions can be based on that model.
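As a minimal sketch of such a time-based model, the historical payments can be grouped by day of the week and hour of day, with each group summarised by its payment count and mean value. The record shape (timestamp, value in satoshis) and the function name are assumptions made for this illustration, not a description of any particular service's data.

```python
from collections import defaultdict
from datetime import datetime
from statistics import mean


def build_hourly_profile(payments):
    """Group historical payments by (weekday, hour) and summarise each bucket.

    `payments` is an iterable of (timestamp, value_sats) tuples; this record
    shape is an assumption made for the example.
    """
    buckets = defaultdict(list)
    for ts, value_sats in payments:
        # weekday() is 0 for Monday, 6 for Sunday
        buckets[(ts.weekday(), ts.hour)].append(value_sats)
    return {
        key: {"count": len(values), "mean_value": mean(values)}
        for key, values in buckets.items()
    }


# Hypothetical history: three Monday-morning payments and one on Tuesday.
history = [
    (datetime(2024, 1, 1, 9, 15), 120_000),
    (datetime(2024, 1, 1, 9, 40), 80_000),
    (datetime(2024, 1, 8, 9, 5), 100_000),
    (datetime(2024, 1, 2, 14, 0), 500_000),
]
profile = build_hourly_profile(history)
```

The counts here are totals over the whole history; dividing by the number of weeks observed would turn them into an expected number of payments per time slot.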

UTXO sources

The sources of UTXO for the wallet of a service that both accepts and sends payments are:

Incoming payments from their customers

Change outputs from outgoing payments

Top-up of hot wallets from cold wallets

You generally cannot control what payments your customers send to you — but you can try to predict the pattern of upcoming deposits: the number of incoming coins, their values, and their timing. And you can build a probability distribution from that.

You can completely control the transactions from the cold wallet, but the nature of the cold wallet is such that you want to touch it as rarely as possible, and accessing it just for potential fee savings is probably overkill.

The most suitable moment for adjusting the value distribution of the UTXO set is when you build a transaction for an outgoing payment: you have some degree of freedom in which UTXO you consume to collect the required sum, and in the value of the new coin you send back to your wallet as a change output.

Optimization

The most economical way to spend bitcoin is to use the smallest acceptable number of inputs and either generate no change output or generate one that will not be expensive to spend in the future. The larger the value of a UTXO, the cheaper it is to spend relative to that value. On the other hand, generating big change outputs means that a bigger portion of your balance will be in a zero-confirmation state.
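To illustrate the relative cost, here is a small sketch that computes the share of a UTXO's value that is consumed by the fee for spending it. It assumes a P2WPKH input of roughly 68 vbytes and a fee rate of 20 sat/vB; both numbers are illustrative assumptions, and other script types have different input sizes.

```python
P2WPKH_INPUT_VBYTES = 68  # approximate size of one P2WPKH input (assumption)


def spend_cost_ratio(value_sats, feerate_sat_per_vb, input_vbytes=P2WPKH_INPUT_VBYTES):
    """Fraction of a UTXO's value consumed by the fee for spending it as an input."""
    return input_vbytes * feerate_sat_per_vb / value_sats


# The same input fee weighs far more heavily on a small coin than on a large one.
for value in (10_000, 100_000, 1_000_000):
    print(value, f"{spend_cost_ratio(value, 20):.2%}")
```

At this fee rate the input costs 1,360 satoshis regardless of its value, which is over 13% of a 10,000-satoshi coin but a small fraction of a percent of a 1,000,000-satoshi one.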

Once you have worked out the probability distribution for upcoming outgoing payments, you can choose the most probable value and check whether your UTXO set contains a UTXO that fits that out-payment. If the next payment will be sent soon, the network fee is unlikely to move much, so you can even predict the exact value of the UTXO to create as a change output, so that the next payment can be sent with a transaction of optimal size.
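A rough sketch of that calculation follows, assuming the "optimal" next transaction is one P2WPKH input and one P2WPKH output with no change. The size constants are approximations and the function names are hypothetical.

```python
# Approximate P2WPKH sizes in vbytes (assumptions for this sketch).
TX_OVERHEAD_VBYTES = 11
P2WPKH_INPUT_VBYTES = 68
P2WPKH_OUTPUT_VBYTES = 31


def tx_vbytes(n_inputs, n_outputs):
    """Approximate virtual size of a transaction with P2WPKH inputs and outputs."""
    return (TX_OVERHEAD_VBYTES
            + n_inputs * P2WPKH_INPUT_VBYTES
            + n_outputs * P2WPKH_OUTPUT_VBYTES)


def target_change_value(next_payment_sats, expected_feerate_sat_per_vb):
    """Value the change output needs in order to fund the predicted next
    payment on its own, as a one-input, one-output transaction without change."""
    fee = tx_vbytes(1, 1) * expected_feerate_sat_per_vb
    return next_payment_sats + fee


# Predicted next payment of 250,000 sats at an expected 10 sat/vB.
change_target = target_change_value(250_000, 10)
```

With these assumptions the one-input, one-output transaction is about 110 vbytes, so the change output to aim for is 251,100 satoshis.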

That change output will likely have to be spent before it confirms, as the next payment may come soon, and in this case, you have to watch out for ‘too-long-mempool-chain’ errors from your bitcoind.
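One way to guard against this is to track how long the chain of unconfirmed transactions behind a change output already is before spending it again. The sketch below assumes a simple linear chain and a limit of 25 transactions, which has been the default for Bitcoin Core's `-limitancestorcount` and `-limitdescendantcount` options in many releases; check your own node's configuration. The helper name is hypothetical, and in practice the counts could be read from the node, for example from the `ancestorcount` field of the `getmempoolentry` RPC.

```python
DEFAULT_CHAIN_LIMIT = 25  # assumed node limit; verify against your bitcoind settings


def can_extend_chain(unconfirmed_chain_length, limit=DEFAULT_CHAIN_LIMIT):
    """Return True if one more transaction can be appended to a linear chain
    of unconfirmed transactions without exceeding the limit.

    `unconfirmed_chain_length` counts the unconfirmed transactions already in
    the chain that ends with the change output we want to spend.
    """
    return unconfirmed_chain_length + 1 <= limit


# A chain of 24 unconfirmed transactions can take one more; a chain of 25 cannot.
ok = can_extend_chain(24)
blocked = not can_extend_chain(25)
```

When the check fails, the wallet can fall back to confirmed UTXO for the payment or wait for the chain to confirm.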

You will not always hit the predicted conditions, but if the probabilities for the target UTXO sizes are high enough, you may hit good fits often enough to save on fees.

You can create a probability distribution for the next N out-payments. N depends on your UTXO churn, that is, how fast the UTXO set in your hot wallet changes completely. For payments that are not immediate, you can afford to wait for your change output to confirm, but your predictions may not be as good: you need to predict the network fee further into the future, and new UTXO from incoming payments also become a factor.

This uncertainty means you will have to store a range of slightly different UTXO that may fit your most-probable out-payments depending on the future situation.

UTXO Stacks

You can go with an approach of having ‘stacks’ of different-sized UTXO for the most probable out-payments. The UTXO in each ‘stack’ can all be slightly different to account for fee variation. The values of UTXO can also be the same within each ‘stack’, but then you will have to use some ‘free-standing’ UTXO in your wallet to make up for fee variations.
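A minimal sketch of the first variant, where each coin in a stack is sized for the same predicted payment but a different fee rate. It assumes a one-input, one-output transaction of about 110 vbytes; the function name and the chosen fee rates are illustrative.

```python
def build_stack(payment_sats, feerates_sat_per_vb, tx_vbytes=110):
    """Return one target UTXO value per anticipated fee rate, each sized to
    fund the payment exactly (payment plus fee) with no change output."""
    return sorted({payment_sats + tx_vbytes * rate for rate in feerates_sat_per_vb})


# A stack for a predicted 250,000-sat payment at 5, 10 and 20 sat/vB.
stack = build_stack(250_000, feerates_sat_per_vb=[5, 10, 20])
# stack == [250550, 251100, 252200]
```

When the payment comes due, the wallet picks the coin whose built-in fee allowance is closest to the current fee rate.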

Same-sized UTXO are convenient in that you can even top up these ‘stacks’ when you send funds from your cold wallet, and, if you analyze your coin selection results over time, you may be able to decide which UTXO sizes you most certainly want to have in your set.

The approach of analyzing the coin selection results is interesting in that there is much less uncertainty, because you are working with existing data, and not making direct predictions.

You can ask the question, “Historically, when we built the most economical transactions, which UTXO sizes turned out to be the most convenient?”, and after an analysis you could say, “For best results, based on our historical usage patterns, we need to maintain this number of UTXO of these sizes”.
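A sketch of such an analysis follows, under the assumption that "most economical" means transactions built without a change output and that input values are grouped into 50,000-satoshi buckets. The record layout and names are hypothetical.

```python
from collections import Counter


def bucket(value_sats, step=50_000):
    """Round a value down to the start of its bucket."""
    return (value_sats // step) * step


def convenient_sizes(transactions, top=3, step=50_000):
    """Count which input-value buckets appear most often in past changeless
    transactions. Each transaction is a dict with the keys `input_values`
    and `has_change` (an assumed record shape)."""
    counts = Counter()
    for tx in transactions:
        if tx["has_change"]:
            continue  # only learn from transactions that avoided change
        for value in tx["input_values"]:
            counts[bucket(value, step)] += 1
    return counts.most_common(top)


# Hypothetical coin selection history.
past = [
    {"input_values": [251_100], "has_change": False},
    {"input_values": [252_200], "has_change": False},
    {"input_values": [1_000_000, 30_000], "has_change": True},
    {"input_values": [101_100], "has_change": False},
]
sizes = convenient_sizes(past, top=2)
```

In this toy history the 250,000-satoshi bucket comes out on top, which would suggest keeping a few coins of about that size on hand.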

Potential risks

Storing extra UTXO in hot wallets to make building optimal transactions easier is a viable approach, but it also increases risk: you are storing more bitcoin in your hot wallet than necessary. Even if you implement spending limits for your hot wallet and secure it to the max, it is still an online wallet, so you need to consider your risk tolerance when going with this strategy. You also need to make sure your coin selection algorithm treats these ‘special’ UTXO differently from the others; otherwise it may spend them before the payments they are destined for are due, and the potential fee savings will not be realized.
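One possible way to give these coins special treatment is to tag each with the payment size it is earmarked for, and to hide it from coin selection unless the payment being built is close to that size. This is only a sketch: the `Utxo` type, the `reserved_for_sats` field, and the tolerance value are all assumptions.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Utxo:
    value_sats: int
    # Predicted payment this coin is earmarked for; None means unreserved.
    reserved_for_sats: Optional[int] = None


def spendable_for(payment_sats, utxos, tolerance_sats=5_000):
    """Return the coins that coin selection may consider for this payment:
    every unreserved coin, plus reserved coins whose earmarked payment is
    within `tolerance_sats` of the payment being built."""
    return [
        u for u in utxos
        if u.reserved_for_sats is None
        or abs(u.reserved_for_sats - payment_sats) <= tolerance_sats
    ]


wallet = [
    Utxo(251_100, reserved_for_sats=250_000),
    Utxo(40_000),
    Utxo(900_000),
]
small_payment_candidates = spendable_for(30_000, wallet)
matching_payment_candidates = spendable_for(250_000, wallet)
```

The filtered list can then be passed to whatever coin selection algorithm the wallet already uses, so the reserved coin only becomes visible for the payment it was created for.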