Yes. The problem is that all parts of the system affect all the other parts. So even just a few extra HDD seeks will kill the streaming of data, and as it is, it lost 50% of throughput due to the additional workload of saving all the sigs.



Randomly accessing these to process them as soon as all the data is available would most likely slow things down much more, due to the swapping in/out of data, unless you have 64GB of RAM.

To give an example of how sensitive things are: switching from malloc to a fixed buffer allocation, with portions allocated internally out of it, sped things up 40%.
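As a rough illustration (the names and sizes here are mine, not from the actual code), a minimal sketch of that kind of fixed-buffer allocator in C: one big malloc up front, then sub-allocations are just pointer bumps, and a whole batch can be freed with a single reset.

[code]
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical fixed-buffer (arena) allocator: one big allocation up
   front, portions handed out by bumping an offset. No per-item free. */
typedef struct {
    uint8_t *base;  /* the single fixed buffer */
    size_t size;    /* total capacity */
    size_t used;    /* bump offset */
} arena_t;

int arena_init(arena_t *a, size_t size)
{
    a->base = malloc(size);
    a->size = size;
    a->used = 0;
    return (a->base != 0) ? 0 : -1;
}

void *arena_alloc(arena_t *a, size_t n)
{
    n = (n + 7) & ~(size_t)7;       /* keep 8-byte alignment */
    if (a->used + n > a->size)
        return 0;                   /* caller must handle exhaustion */
    void *p = a->base + a->used;
    a->used += n;
    return p;
}

void arena_reset(arena_t *a)
{
    a->used = 0;                    /* releases everything in one step */
}
[/code]

The speedup comes from skipping malloc's per-call bookkeeping and lock contention in the hot loop, plus being able to release an entire batch with one reset instead of many free() calls.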



There is also the issue of tmp file explosion. You can't have one file per block, as 400,000 files are bound to wreak havoc on most normal file systems. So you need to group them together, but then you can't easily delete them, which means the total HDD space used cannot be kept small.



So, how do you group things when the blocks arrive in random order?



My solution was to have a file per peer per bundle, concatenating all block data to that file. When the bundle processing is done, all of the peer/bundle tmp files can be safely deleted. Still, there can be quite a large number of tmp files if you have 100 peers, so controlling the request pattern of the peers to limit the number of peer/bundle tmp files is needed.
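A rough sketch of that scheme, with made-up file names (the real layout is surely different): each (peer, bundle) pair gets one append-only tmp file, and once a bundle is fully processed every tmp file belonging to it is deleted.

[code]
#include <stdio.h>

/* Hypothetical layout: tmp/bundle<ID>_peer<ID>.dat
   One append-only file per (peer, bundle) pair. */
static void tmpfname(char *buf, size_t bufsize, int bundleid, int peerid)
{
    snprintf(buf, bufsize, "tmp/bundle%d_peer%d.dat", bundleid, peerid);
}

/* Append a raw block received from a peer to its peer/bundle file. */
int append_block(int bundleid, int peerid, const void *data, size_t len)
{
    char fname[128];
    tmpfname(fname, sizeof(fname), bundleid, peerid);
    FILE *fp = fopen(fname, "ab");  /* append-only, so no seeks back */
    if (fp == 0)
        return -1;
    size_t n = fwrite(data, 1, len, fp);
    fclose(fp);
    return (n == len) ? 0 : -1;
}

/* Once the bundle is done, all of its tmp files can be safely deleted. */
void purge_bundle(int bundleid, int numpeers)
{
    char fname[128];
    for (int peerid = 0; peerid < numpeers; peerid++) {
        tmpfname(fname, sizeof(fname), bundleid, peerid);
        remove(fname);              /* ignore errors: file may not exist */
    }
}
[/code]

With 100 peers that is still up to 100 tmp files per in-flight bundle, which is why the request pattern has to cap how many bundles are in flight at once.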



All the while, it is creating the append-only data structures. Maybe somebody more skilled than me could weave in just-in-time sig validation, but I am at the limits of what I can get to work at close to full speed.



As it is, I will have to deal with the pubkeys and sigs in separate files, as the 4GB limit of 32-bit offsets is very close to being exceeded.
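To make the constraint concrete, a hedged sketch (hypothetical struct and function names) of why splitting into separate files helps: with 32-bit offsets a single file tops out at 4GB, but giving pubkeys and sigs their own files gives each one its own 32-bit-addressable 4GB of space.

[code]
#include <stdint.h>
#include <stdio.h>

/* Hypothetical: each data class (pubkeys, sigs, ...) gets its own file,
   so a 32-bit offset can still address every item inside that file. */
typedef struct {
    FILE *fp;
    uint32_t offset;    /* 32-bit offset: max ~4GB per file */
} split_file_t;

/* Append an item; return its 32-bit offset, or 0xffffffff on failure. */
uint32_t split_append(split_file_t *sf, const void *data, uint32_t len)
{
    if ((uint64_t)sf->offset + len > 0xffffffffULL)
        return 0xffffffff;          /* would exceed the 4GB limit */
    uint32_t off = sf->offset;
    if (fwrite(data, 1, len, sf->fp) != len)
        return 0xffffffff;
    sf->offset += len;
    return off;
}
[/code]

Keeping offsets at 32 bits also halves the index size compared to 64-bit offsets, which presumably matters for the append-only structures.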



Performance engineering is an iterative process. Everything can work, but there is a large range of performance within the universe of working solutions.