The Lightning Network Routing Problem, Explained

There are a lot of ideas floating around about the way the Lightning Network works or could work. With so many competing ideas, it’s difficult to clearly articulate what the routing problem is: different “solutions” for Lightning Network routing come with different routing problems. This article explores some of the misunderstandings about Lightning Network routing and will hopefully give readers a clearer perspective on the Lightning Network routing problem.

To understand routing, let’s take a real-world situation to provide some context to our discussion. Alice has an acquaintance, Dave, whom she owes $30 from splitting lunch. Dave says, “I’m on Lightning Network, so you can just send me the $$ on there. Send it to address XYZ.” Since Alice doesn’t interact with Dave that often, she doesn’t have a Lightning Network channel with him, nor does she want to pay a large on-chain transaction fee to open one directly with him.

The LN leaves pathfinding up to the sender, so how does Alice find a path?

Let’s start by looking at a “broadcast implementation” of the Lightning Network, and discuss how it handles routing. In this implementation, there is a network map (or graph) that is maintained. So, Alice just needs to look at the network map, find a route to Dave with at least $30 available in each connection, select that route, and boom, we’re done.
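As a rough sketch of what “look at the network map and find a route” could mean in practice, here is a breadth-first search that only follows connections with enough funds. The map contents, the names beyond Alice, Bob, Carol and Dave, and the `find_route` function are all made up for illustration; this is not how any particular Lightning implementation does it.

```python
from collections import deque

def find_route(network_map, source, target, amount):
    """Breadth-first search that only follows channels with enough capacity."""
    queue = deque([[source]])
    visited = {source}
    while queue:
        path = queue.popleft()
        node = path[-1]
        if node == target:
            return path
        for peer, capacity in network_map.get(node, {}).items():
            # Skip peers we already reached and channels that are too small.
            if peer not in visited and capacity >= amount:
                visited.add(peer)
                queue.append(path + [peer])
    return None  # no route with enough funds in every connection

# Hypothetical network map: node -> {peer: dollars available toward that peer}
network_map = {
    "Alice": {"Bob": 50, "Bill": 10},
    "Bob": {"Carol": 40},
    "Bill": {"Dave": 100},
    "Carol": {"Dave": 35},
}

route = find_route(network_map, "Alice", "Dave", 30)
print(route)  # ['Alice', 'Bob', 'Carol', 'Dave']
```

Note that the search itself is cheap. Everything it relies on is in `network_map`, and that is where the trouble starts.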

Well, that was easy. I thought you said there was a routing problem, but we found a route in about 0.2 seconds with no hassle at all. The Lightning Network seems to be working great!!!

As with all things in life, the devil is in the details. Where did the network map come from? And how is it maintained? To maintain an up-to-date network map, every single transaction needs to be broadcast to every node in the network. This is where things get ugly fast. With on-chain bitcoin, there are usually around 1,000 to 5,000 full nodes running, which can support an indefinite (and possibly unlimited) number of users. This means that to execute a typical bitcoin transaction, around 1,000 to 5,000 messages of 255 bytes each need to be sent around the globe.

With the Lightning Network, this proposed routing system, and a network size of 100,000 users, each transaction would require 100,000 messages of around 255 bytes, one to every user, because in the Lightning Network every user is a node. So, we can see where the scaling problems start to creep in, but it gets worse from there because that estimate is just for a single-hop transaction. If Alice’s path to Dave goes through Bob and Carol, there are actually 3 transactions that take place: Alice moving money to Bob, Bob moving it to Carol, and Carol moving it to Dave. All three of these transactions change important data in the network map and must be broadcast. That means 300,000 messages now need to be sent to execute that 1 payment to Dave (Note: it would also work to send 100,000 messages that are 3 times as large).

For a network size of just 100,000 users, we can already see that this routing system is two orders of magnitude worse than just sending a bitcoin transaction. As we scale to millions of users and add more hops into the routing, the inefficiency only escalates.
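The arithmetic behind that comparison can be written out in a few lines. The numbers are the ones used in this article; the function name is just for illustration, and the model assumes one message per user per hop with no batching or compression.

```python
def broadcast_messages(users, hops):
    """Messages needed if every hop's channel update reaches every user."""
    return users * hops

# Figures taken from the article's example.
onchain_low, onchain_high = 1_000, 5_000   # messages per on-chain transaction
ln_messages = broadcast_messages(users=100_000, hops=3)

print(ln_messages)                   # 300000
print(ln_messages // onchain_high)   # 60  (times more than on-chain, best case)
print(ln_messages // onchain_low)    # 300 (times more than on-chain, worst case)
```

Somewhere between 60x and 300x more messages is roughly where the “two orders of magnitude” figure comes from.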

“Wait a second, I was told Lightning Network transactions didn’t need to be broadcast.”

I hear this type of comment a lot from Lightning Network proponents, and it’s 100% true. One of the original tenets of the Lightning Network is that channels can be private and isolated between two parties. No broadcasts ever need to take place. But then the question becomes, how do you do multi-hop routing? If everyone chooses to keep their lightning transactions private, how do you find any routes for payment?

Let’s go back to our original example. Alice wants to find a path to Dave, but in this case, nobody broadcasts their channel state. Now there’s no network map to follow to find a route because there can’t be a network map if nobody has all of the data to make one.

So, how does Alice find a route? The only real way to do it is to send a message to all her known peers asking, “Do you have a channel with Dave holding at least $30?” So, Alice sends that message to Bob, Bill, Ben, Beth, Becky, and others in her network and waits to hear back. Guess what? Nobody she knows has a channel with Dave. Now she must send more messages to her same peers asking, “Can you check with all your peers to see if they have a channel with Dave?”

So now Bob, Bill, Ben, Beth, Becky and others all send out dozens of messages to their friends, and in this case, we’re in luck: Bob knows Carol, who knows Dave. With only a hundred or so messages back and forth, we found a path. But what would have happened if she needed to send money to Frank? That’s two more levels of separation, two more levels of messages back and forth, where the number of messages that need to be sent and answered grows exponentially with each level. And that’s not even the worst part: we don’t know if a path to Frank with XX dollars in it even exists. We could end up just sending messages to peers in the network all day long waiting for a positive response, clogging up the network, and never discovering a routing solution for our payment.
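To get a feel for that growth, here is a back-of-the-envelope count. It assumes every node has the same number of peers (10 is a made-up figure, not something measured), counts only the outgoing queries and not the replies, and assumes nobody filters out duplicate requests.

```python
def flood_messages(peers_per_node, depth):
    """Total queries sent when each level asks all of its peers."""
    return sum(peers_per_node ** level for level in range(1, depth + 1))

# Hypothetical fan-out of 10 peers per node.
to_dave = flood_messages(peers_per_node=10, depth=2)    # Alice -> friends -> their friends
to_frank = flood_messages(peers_per_node=10, depth=4)   # two more levels of separation

print(to_dave)    # 110
print(to_frank)   # 11110
```

Two extra levels turn “a hundred or so” messages into more than ten thousand, and that is before anyone answers.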

So, the first system doesn’t scale, and not broadcasting channels is a nightmare…

What if we take a hybrid of the two systems? Instead of broadcasting every transaction, we only broadcast opening and closing transactions, which are already broadcast on the bitcoin blockchain. This allows us to maintain a network graph of possible routes, which may or may not still be valid. We can then use that list of possible routes to guess where a path might be found.

Let’s look at the data involved in this new system. First, every opening or closing transaction needs to be broadcast to every node in the network. Like the first example, this initial activity has several orders of magnitude more overhead than a normal bitcoin transaction. For a system of 100,000 users, which are all nodes in the Lightning Network, Alice’s channel opening transaction must be sent to all 100,000 users. So, we’ve dug a deep hole relative to just performing an on-chain bitcoin transaction, but now we have a network map, so each subsequent transaction will require very little data transmission.

In this case, around 20x more data needed to be sent through the network to create the initial graph, so around 20 transactions need to be performed within Alice’s channel before we can break even on the amount of data sent. That’s a lot, but not unreasonable.

But what happens as the system scales to 100 million users? Now initial channel availability and closure need to be broadcast to all 100 million users. That’s a lot of data. Now each channel needs to carry about 20,000 transactions before it beats simply broadcasting each one as an on-chain transaction on the bitcoin network.
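The 20x and 20,000 figures can be reproduced with a simple ratio. This sketch assumes an on-chain transaction costs about 5,000 messages (the upper end of the range quoted earlier) and that each later Lightning payment costs next to nothing in comparison; both are simplifying assumptions, and the function name is illustrative.

```python
def break_even_payments(ln_users, onchain_messages=5_000):
    """Payments a channel must carry before its opening broadcast pays off."""
    # Opening a channel costs one message per Lightning user; each payment
    # moved off-chain saves roughly one on-chain broadcast.
    return ln_users // onchain_messages

print(break_even_payments(100_000))       # 20
print(break_even_payments(100_000_000))   # 20000
```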

So, it appears this option doesn’t scale either.

There are a lot of smart people working on LN; surely they must know the limitations listed above and have a better plan for routing.

They certainly know the limitations of the current system. I’ve seen comments from several LN developers indicating “the current system can scale to around 100,000 users.” This indirectly implies it can’t scale to hundreds of millions of users. I haven’t seen anyone propose a routing system that doesn’t rely on a network map, and I haven’t seen anyone propose an alternate solution to the problem.

So, what gives? That doesn’t make any sense. Why would smart, talented developers sink thousands of hours into developing something they pretty much know can’t work? Why would smart investors dump millions of dollars into a technology that obviously can’t go anywhere with no potential for return on investment?

There is an alternate solution, but it’s not what LN was advertised to be…

The folly of the current system is that everyone needs to know the network map in order to find a path, and the folly of the second system in this article is that there’s no way to efficiently guess where a path might be found.

What if only a few participants need to have a network map, or paths could be guessed because almost everyone uses a few key well-connected nodes?

Let’s assume, for example, there are only dozens of well-connected nodes. Alice, recognizing that having a channel with a well-connected node will make things easier, connects with Binance. Alice still doesn’t have a path to Dave, but now she knows who to ask to find a path efficiently. Bob, Bill, Beth, Becky and her other friends are unlikely to be connected to Dave, so instead of sending a message to her friends, she just sends a message to Binance: “Do you do business with Dave or know someone who does?” In our example, Dave isn’t connected to Binance, but Binance can reach out to Coinbase, Lightning Labs, Kraken, Blockstream, and the other handful of well-connected nodes and find out that Dave uses Coinbase in a matter of seconds. Now Alice has her path: Alice->Binance->Coinbase->Dave. Easy peasy!
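A toy version of that lookup might look like the following. It assumes every well-connected node has a channel with every other one and knows its own customers; the `hub_customers` table and the function are hypothetical, and Frank’s entry is invented for the example.

```python
def find_route_via_hubs(hub_customers, sender, recipient):
    """Route a payment through the sender's hub, asking other hubs if needed."""
    sender_hub = next(
        (hub for hub, customers in hub_customers.items() if sender in customers),
        None,
    )
    if sender_hub is None:
        return None  # sender has no channel with any hub
    if recipient in hub_customers[sender_hub]:
        return [sender, sender_hub, recipient]
    # One query per remaining hub: a handful of messages, not thousands.
    for hub, customers in hub_customers.items():
        if hub != sender_hub and recipient in customers:
            return [sender, sender_hub, hub, recipient]
    return None

# Hypothetical customer lists for a few well-connected nodes.
hub_customers = {
    "Binance": {"Alice"},
    "Coinbase": {"Dave"},
    "Kraken": {"Frank"},
}

path = find_route_via_hubs(hub_customers, "Alice", "Dave")
print(path)  # ['Alice', 'Binance', 'Coinbase', 'Dave']
```

The number of messages now depends on how many hubs exist rather than on how many users exist, which is exactly why this works and exactly why it concentrates power.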

Hopefully, anyone reading this already can guess the problems with this system, but I’ll lay a few out, just in case.

First, this system gives a ton of power to these well-connected nodes. Pathfinding might theoretically happen without them, but as described above, it’s nearly impossible to find a path on your own at scale. This power will translate into pricing power. Since there’s no good way to find a route without them, well-connected nodes set the fees for the network.

But I was told fees would be minuscule due to the ease of competition…

Much ink has been spilled discussing how fees would be kept in check because there’s no barrier to entry for new nodes or paths to arise, but in truth, there are huge barriers to entry. First, if your new node only has a few connections, there’s no incentive for the big players to try to use your connections, so you’ll need to market your node heavily. Second, you’re not very “discoverable” as a new node. Alice and Binance don’t know who you are or who you work with unless you broadcast all your connections…there we go again with lots of broadcasting data. Third, there are the legal issues. Well-connected nodes will definitely fall under money transmitter laws in many jurisdictions, so setting up your own well-connected node or even a slightly connected node can land you in legal hot water. These problems with establishing new nodes help to create a large moat around the well-connected node business.

The second obvious problem is centralization. What happens if one of these major nodes crashes? What happens if they need to liquidate and send millions of on-chain transactions to the bitcoin network? What about AML/KYC and privacy – good luck.

Conclusions

Hopefully, this article gives people a better idea of what the Lightning Network routing problem really is. As discussed, there is a working implementation of the Lightning Network on the main net right now, but that system scales worse than native bitcoin. Other solutions exist, but they are even less efficient or highly centralized. To date, there have been no published papers, or even public theoretical discussions, that solve the routing problem in a scalable way that doesn’t result in extreme centralization.

What’s really scary is that there are many other problems with the Lightning Network beyond the unsolved routing problem. The Lightning Network distorts the well-established economic incentives of the bitcoin main chain, it has a huge AML/KYC red flag, and it is incompatible with full blocks. There’s no way to drive adoption when on-chain fees are high and unreliable, and the Lightning Network provides no utility when on-chain fees are low. Those are all topics for another time.

All that said, there are still use cases where the Lightning Network could really excel. Things like micropayments and pay-for-consumption are probably right in the LN’s wheelhouse. Imagine if people could pay $0.01 per video for an ad-free YouTube; I bet they’d get a lot of takers. Simply fund your Google channel with $5 or $10 and automatically update the channel state every time you watch a video. Add to that in-app/in-game purchases where small transactions are repeated with a single party. These limited use cases don’t need routing solutions because they’re single hop, don’t destroy bitcoin’s economic incentives because they’re a niche case, and don’t get into AML/KYC territory. For these use cases, I can’t wait to see the Lightning Network on Bitcoin Cash!
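For the curious, the bookkeeping side of that single-hop case is tiny. The sketch below only tracks balances in cents and a state counter; it leaves out everything that makes a real channel trustless (signatures, commitment transactions, on-chain settlement), and the class and its names are invented for illustration.

```python
class PaymentChannel:
    """Toy two-party channel: balances in cents plus a state counter."""

    def __init__(self, customer_cents, merchant_cents=0):
        self.customer_cents = customer_cents
        self.merchant_cents = merchant_cents
        self.state_number = 0

    def pay(self, cents):
        """Move funds from customer to merchant and bump the channel state."""
        if cents <= 0 or cents > self.customer_cents:
            raise ValueError("invalid payment amount")
        self.customer_cents -= cents
        self.merchant_cents += cents
        self.state_number += 1

# Fund the channel with $5, then watch three videos at $0.01 each.
channel = PaymentChannel(customer_cents=500)
for _ in range(3):
    channel.pay(1)

print(channel.customer_cents, channel.merchant_cents, channel.state_number)  # 497 3 3
```

No route lookup appears anywhere, because both ends of the channel already know each other.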

No paid content behind the paywall. I'm just trying to get the information out. Tips are welcome. I'd also like to thank reddit user Zelgada for pointing out an error in the original version of this article.