Over a year ago, the StellarX team added a GitHub issue about a strange 500 Internal Server Error returned by Horizon from time to time. After checking it out, we realized the cause: Horizon was getting offers from the Stellar-Core database, and some of those offers were coming in from ledgers not yet ingested by Horizon. To get a creation time for offers, Horizon checks the ledger close time, and if a ledger has yet to be ingested, that check comes up empty. Hence the server error.

Meanwhile, we were getting reports about very long response times for the {% code-line %}/paths{% code-line-end %} endpoint. Stellar has a built-in decentralized order book that allows users to exchange one asset for another, and the {% code-line %}/paths{% code-line-end %} endpoint helps you find the best set of intermediate trades so that you can exchange the minimum possible amount of the source asset for the maximum possible amount of the destination asset. Horizon relied on the Stellar-Core database, which just isn’t optimized for path-finding queries.

Those (mostly minor) issues weren’t exactly new — we’d been aware of similar problems for years — but they were becoming increasingly common. To address them, we had to make a fundamental change to Horizon’s architecture, and to build a completely new system for ingesting data. Over the past few months, that’s exactly what we’ve done.

The new ingestion system, which is included in the Horizon v0.20.1 release, is ready for testing, and in this short blog post I’ll explain the reasoning behind it and outline two experimental features in Horizon that wouldn’t have been possible in the old system. The new system has some big advantages over the old one: it’s more consistent and developer-friendly, allows user configuration, and doesn’t overtax Stellar-Core. Eventually we plan to move over to it completely. At the moment, both systems run concurrently, and if you enable a special feature flag in Horizon you can test the new system and its new features right now!

The new ingestion Golang package can also be used outside Horizon to build custom apps and services. In the coming months, we will publish documentation, examples, and a new blog post explaining how to use it.

Ledger entries and transactions

Let’s start with an extremely brief and simplified explanation of how Stellar-Core works. In short: Stellar-Core is a replicated state machine. Each ledger represents a state that is changed by a set of transactions. Stellar-Core then propagates the information about the new state/ledger to the network and publishes a checkpoint to the history archives every 64 ledgers.
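To make the checkpoint cadence concrete, here is a minimal sketch of how one might locate checkpoint ledgers. It assumes that checkpoints fall on sequence numbers where `seq+1` is a multiple of 64 (63, 127, 191, ...); the function names are hypothetical and not part of any Stellar package.

```go
package main

import "fmt"

// checkpointFrequency is the number of ledgers between two history
// archive checkpoints, as described in the text.
const checkpointFrequency = 64

// isCheckpoint reports whether seq is a checkpoint ledger.
// Assumption: checkpoints are the ledgers where (seq+1) is a multiple
// of 64, i.e. 63, 127, 191, ...
func isCheckpoint(seq uint32) bool {
	return (seq+1)%checkpointFrequency == 0
}

// prevCheckpoint returns the latest checkpoint ledger at or before seq.
// The boolean is false when no checkpoint has been published yet.
func prevCheckpoint(seq uint32) (uint32, bool) {
	if seq < checkpointFrequency-1 {
		return 0, false
	}
	return (seq+1)/checkpointFrequency*checkpointFrequency - 1, true
}

func main() {
	cp, ok := prevCheckpoint(200)
	fmt.Println(isCheckpoint(127), cp, ok)
}
```

A state rebuilt from the archives can therefore only start at one of these ledgers; anything in between has to be reached by applying the ledgers that follow the checkpoint.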

We can see that there are two types of data connected to each ledger:

State, which in Stellar are ledger entries: accounts, trustlines, offers, account data.

Transitions, which in Stellar are transactions with operations modifying ledger entries. A payment, for example, is a transition between two account balance states.
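The split between state and transitions can be sketched in a few lines of Go. This is a deliberately simplified model, not Stellar-Core's actual data structures: the `State`, `Payment`, and `Apply` names are hypothetical, and the state holds only account balances.

```go
package main

import (
	"errors"
	"fmt"
)

// State is a simplified ledger state: account ID -> balance.
// Real ledger entries also include trustlines, offers, and account data.
type State map[string]int64

// Payment is a simplified transition between two balance states.
type Payment struct {
	From, To string
	Amount   int64
}

// Apply returns the state that results from applying p to s.
// The input state is left untouched, so every ledger keeps its own state.
func Apply(s State, p Payment) (State, error) {
	if p.Amount <= 0 {
		return nil, errors.New("amount must be positive")
	}
	if s[p.From] < p.Amount {
		return nil, errors.New("insufficient balance")
	}
	next := make(State, len(s))
	for id, balance := range s {
		next[id] = balance
	}
	next[p.From] -= p.Amount
	next[p.To] += p.Amount
	return next, nil
}

func main() {
	ledger1 := State{"alice": 100, "bob": 20}
	ledger2, err := Apply(ledger1, Payment{From: "alice", To: "bob", Amount: 30})
	fmt.Println(ledger1, ledger2, err)
}
```

Serving historical data means storing the transitions; serving the current ledger state means keeping the latest `State` around as well.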

Prior to the new ingestion system, Horizon did not have its own view of the state (or ledger entries). Users could request the historical data (transactions, operations, payments, trades, etc.), and Horizon would serve them from its DB directly, but all requests connected to current ledger state were forwarded to Stellar-Core!

This single architectural decision caused many issues, among them:

Horizon relied on the Stellar-Core database schema, which wasn’t optimized for Horizon use cases. XDR-encoded fields and a lack of indexes made some queries slow or impossible.

High load on Stellar-Core. If queries were unoptimized, or there were simply too many of them, the Stellar-Core database would slow down and, in the worst-case scenario, lose track of consensus, which is critical to the correct functioning of the network.

Data was potentially inconsistent. Stellar-Core and Horizon are two different systems. It was possible that Stellar-Core had closed ledger X but Horizon was still ingesting ledger X-1. That’s what caused the StellarX issue mentioned at the beginning of this article.
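The inconsistency described in the last point can be illustrated with a small sketch. It assumes, for illustration only, that an offer's creation time is resolved from the close time of the ledger that last modified it; the types and the `offerCreatedAt` helper are hypothetical and do not mirror Horizon's real code.

```go
package main

import (
	"errors"
	"fmt"
	"time"
)

// errLedgerNotIngested is a hypothetical sentinel error for this sketch.
var errLedgerNotIngested = errors.New("ledger not ingested yet")

// ingestedLedgers maps a ledger sequence to its close time and stands in
// for the ledgers Horizon has ingested so far.
type ingestedLedgers map[uint32]time.Time

// offerCreatedAt looks up the close time of the ledger that last modified
// an offer. The lookup fails when Stellar-Core is ahead of Horizon.
func offerCreatedAt(ledgers ingestedLedgers, lastModifiedLedger uint32) (time.Time, error) {
	closedAt, ok := ledgers[lastModifiedLedger]
	if !ok {
		return time.Time{}, errLedgerNotIngested
	}
	return closedAt, nil
}

func main() {
	// Horizon has ingested ledger 100; Stellar-Core already closed 101.
	horizon := ingestedLedgers{100: time.Date(2019, 8, 26, 0, 0, 0, 0, time.UTC)}
	for _, seq := range []uint32{100, 101} {
		_, err := offerCreatedAt(horizon, seq)
		fmt.Println(seq, errors.Is(err, errLedgerNotIngested))
	}
}
```

When both the offers and the ledger close times come from the same ingested state, this failure mode cannot occur.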

To solve these problems, the new ingestion system has a full copy of the ledger state built using history archives. Using history archives generates no load on Stellar-Core, and allows building the state not only for the latest ledger but also for any other checkpoint ledger in the past! Having access to the state enables the creation of new features, such as the two examples found below.

Accounts for Signers

“Accounts for Signers” was a highly popular feature request in our GitHub. You can use it to look up all the accounts that match a given signer, which makes it easier for Stellar clients to implement multi-sig in a user-friendly way. If, for instance, you have a primary account that also co-signs several multi-sig accounts — a business account, an account you share with your partner, a family trust account — your Stellar wallet can find them all using your primary account, no extra effort required on your part. With a single key, you can easily manage multiple multi-sig accounts.

Before the new ingestion system, we couldn’t build the “Accounts for Signers” view because Stellar-Core has no index on the {% code-line %}signer{% code-line-end %} field in the {% code-line %}signers{% code-line-end %} table, and the query was not fast enough for real-time, production use. To make matters more complicated, a later version of Stellar-Core added a performance update that removed the signers table completely and moved the XDR-encoded array of signers to the {% code-line %}accounts{% code-line-end %} table.

Because the new system can recreate the state, we were able to create a table in Horizon’s database with all signers and proper indexes. This feature has been available since 0.19.0, but is hidden behind a feature flag. You can also try it out on our public Horizon clusters.
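The idea behind the indexed signers table is an inverted index from signer to accounts. Horizon keeps it in its database; the sketch below uses an in-memory map only to show the shape of the lookup. All type and function names are hypothetical.

```go
package main

import (
	"fmt"
	"sort"
)

// Account is a simplified ledger entry: an ID plus its signer keys.
type Account struct {
	ID      string
	Signers []string
}

// SignerIndex maps a signer key to the IDs of the accounts it can sign for.
type SignerIndex map[string][]string

// BuildSignerIndex builds the index from a full copy of the account state.
// It assumes each signer appears at most once per account.
func BuildSignerIndex(accounts []Account) SignerIndex {
	idx := SignerIndex{}
	for _, a := range accounts {
		for _, s := range a.Signers {
			idx[s] = append(idx[s], a.ID)
		}
	}
	// Sort each list so lookups return a stable order.
	for s := range idx {
		sort.Strings(idx[s])
	}
	return idx
}

// AccountsForSigner returns all account IDs that list signer as a signer.
func (idx SignerIndex) AccountsForSigner(signer string) []string {
	return idx[signer]
}

func main() {
	idx := BuildSignerIndex([]Account{
		{ID: "business", Signers: []string{"primary", "partner"}},
		{ID: "family-trust", Signers: []string{"primary"}},
	})
	fmt.Println(idx.AccountsForSigner("primary"))
}
```

Building such an index requires reading every account once, which is exactly what a full copy of the ledger state makes possible.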

Path-Finding

Another feature we released as a part of 0.20.0 is a faster path-finding algorithm. There were many past experiments that tried to improve the performance of path-finding through Horizon. The new ingestion system finally adds these improvements to Horizon.

The previous version of the path-finding algorithm was slow for two reasons:

It queried the database every time it expanded a path with a new asset. The round-trip time required for each expansion made some queries very slow.

It was using the Stellar-Core database, which is not designed for this specific use case.

To improve path-finding, we used the new ingestion system to build an in-memory order book graph of all the offers in the network. Because we keep everything in memory, access to data is extremely fast. This decreased the response time of the {% code-line %}/paths{% code-line-end %} endpoint for some queries by 10x! To check this out, start Horizon with the {% code-line %}ENABLE_EXPERIMENTAL_INGESTION=true{% code-line-end %} feature flag or use our public Horizon clusters. For more information, check out the release notes.
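As a rough illustration of why an in-memory graph helps, here is a minimal sketch of an order book graph with a depth-limited path search. It is not Horizon's implementation: it ignores prices and amounts, and the `Offer`, `OrderBookGraph`, and `FindPaths` names are hypothetical. It only shows that expanding a path becomes a map lookup instead of a database round trip.

```go
package main

import "fmt"

// Offer is a simplified open offer on the order book.
type Offer struct {
	Selling string // asset the offer's owner gives away
	Buying  string // asset the offer's owner wants in return
}

// OrderBookGraph maps an asset you hold to the assets you can obtain
// by taking at least one open offer.
type OrderBookGraph map[string]map[string]bool

// NewOrderBookGraph builds the graph from all open offers.
// Taking an offer means paying its Buying asset to receive its Selling asset.
func NewOrderBookGraph(offers []Offer) OrderBookGraph {
	g := OrderBookGraph{}
	for _, o := range offers {
		if g[o.Buying] == nil {
			g[o.Buying] = map[string]bool{}
		}
		g[o.Buying][o.Selling] = true
	}
	return g
}

// FindPaths returns every path from source to dest that visits no asset
// twice and uses at most maxHops trades. The order of paths is not stable.
func (g OrderBookGraph) FindPaths(source, dest string, maxHops int) [][]string {
	var paths [][]string
	visited := map[string]bool{source: true}
	var walk func(path []string)
	walk = func(path []string) {
		current := path[len(path)-1]
		if current == dest {
			// Copy the path, because the slice is reused while walking.
			paths = append(paths, append([]string(nil), path...))
			return
		}
		if len(path)-1 == maxHops {
			return
		}
		for next := range g[current] {
			if visited[next] {
				continue
			}
			visited[next] = true
			walk(append(path, next))
			visited[next] = false
		}
	}
	walk([]string{source})
	return paths
}

func main() {
	g := NewOrderBookGraph([]Offer{
		{Selling: "USD", Buying: "XLM"},
		{Selling: "EUR", Buying: "USD"},
		{Selling: "EUR", Buying: "XLM"},
	})
	fmt.Println(len(g.FindPaths("XLM", "EUR", 2)))
}
```

A real path finder would also track offer prices and available amounts along each edge in order to rank the candidate paths.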

Figure: Maximum response time for the /paths endpoint in the SDF cluster. The new version was deployed on 2019-08-26.

What’s next?

In Q3 2019 we will be working on moving all the Horizon features to the new system. We will also publish documentation for the new ingestion package that can be used to build custom, advanced apps and services without Horizon.