I recently wrote a piece discussing how I defeated the Ethereum DDoS attack using QuickBlocks. Doing this was important because it freed me from the pain of a slow RPC. Speed allows me to analyze the Ethereum data iteratively, which means I can find more interesting stuff.

The Ethereum dataset is big (and growing). I want to be able to scan through the entire thing. I want to be able to do this on a laptop. This last fact makes it impossible for me to create a separate, independent copy of the data. If I want to do data analysis, I have to do it against the node’s data directly.

In this article, I am going to try to help you understand why the Fall 2016 DDoS attack presents such a severe problem if one is trying to scan the node’s data directly. By the end of the article, I hope you understand why the RPC is so freaking slow.

You may download the data I produced for this story and play around with an interactive graph here and here. Let’s dive into the data.

Traces per Transaction

I scanned each of the first 5,000,000 blocks. At each block, I scanned all the transactions in that block. At each transaction, I counted the number of traces generated by that transaction. (Parity delivers one trace for every transaction plus more traces whenever a transaction sends ether to an account, calls into a smart contract, creates a new smart contract, or suicides.)
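The per-transaction counting step can be sketched in a few lines of Python. This is only an illustration, not the QuickBlocks code I actually ran. It assumes the traces for one block arrive as a list of dictionaries, each carrying a `transactionPosition` field (the shape I would expect from Parity's block-level trace output), and that traces with no position, such as block rewards, should be skipped. The function name and the sample data are made up for the example.

```python
from collections import Counter


def traces_per_transaction(block_traces):
    """Return {transaction position: number of traces} for one block.

    Assumption: each trace is a dict with a "transactionPosition" key,
    and traces not tied to a transaction (e.g. block rewards) have None there.
    """
    counts = Counter()
    for trace in block_traces:
        position = trace.get("transactionPosition")
        if position is None:
            continue  # not generated by a transaction, so do not count it
        counts[position] += 1
    return dict(counts)


# Hypothetical block: transaction 0 produced one trace, transaction 1 produced three.
sample_block = [
    {"transactionPosition": 0, "type": "call"},
    {"transactionPosition": 1, "type": "call"},
    {"transactionPosition": 1, "type": "call"},
    {"transactionPosition": 1, "type": "create"},
    {"transactionPosition": None, "type": "reward"},
]

print(traces_per_transaction(sample_block))  # {0: 1, 1: 3}
```

A simple transfer shows up with a count of one. Anything higher means the transaction called into contracts, created them, or moved ether internally.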

I processed the blocks in groups of 50,000. Within each group, I kept one counter for each possible trace count and incremented the matching counter for every transaction. Below is the top-left corner of the data. Groups of blocks go across the top of the table, and each cell holds the number of transactions in that group that generated the given number of traces. For example, in the first 50,000 block group, there were 1,871 transactions, all of which created a single trace. Between blocks 150,000 and 200,000, there were 108 transactions that created exactly 11 traces (for a total of 1,980 traces; this will become important later).
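Here is a minimal sketch of how such a table could be built, again in Python and again only as an illustration. It assumes the scan has already produced one `(block_number, trace_count)` pair per transaction, and that groups start at multiples of 50,000 (so the first group covers blocks 0 through 49,999). The names `build_histogram` and `total_traces` and the sample records are hypothetical.

```python
from collections import Counter, defaultdict

GROUP_SIZE = 50_000  # blocks per column of the table


def build_histogram(records, group_size=GROUP_SIZE):
    """Map each block group to a Counter of {trace count: number of transactions}.

    records: iterable of (block_number, trace_count), one pair per transaction.
    """
    histogram = defaultdict(Counter)
    for block_number, trace_count in records:
        # First block of the group this transaction falls into.
        group_start = (block_number // group_size) * group_size
        histogram[group_start][trace_count] += 1
    return histogram


def total_traces(histogram, group_start):
    """Total traces in one group: sum of (trace count x transactions with that count)."""
    return sum(n * txs for n, txs in histogram[group_start].items())


# Made-up records, not the real data set.
records = [
    (46_147, 1), (49_999, 1),                    # first group
    (50_000, 1),                                 # second group
    (150_003, 11), (150_003, 2), (199_999, 11),  # group starting at 150,000
]

hist = build_histogram(records)
print(hist[0][1])                   # 2 transactions with a single trace
print(hist[150_000][11])            # 2 transactions with 11 traces
print(total_traces(hist, 150_000))  # 11*2 + 2*1 = 24
```

Multiplying each cell by its trace count, as `total_traces` does, turns a count of transactions into a count of traces, which is the quantity the parenthetical above refers to.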