Exploring IOTA #ICT-3, Reverse engineering the code Part-2

Diving deeper into the small code base of ICT (IOTA controlled agenT), trying to figure out why invalid Txs (transactions) got past ICT and IRI (IOTA Reference Implementation), and finally discussing what the next step in this testing phase of ICT will be.

The ICT test phase 0.1.1 was stopped by CfB (Come-from-Beyond), who phased it out over a week by asking the participants to disconnect their ICT clients from the main net and from their connected nodes. During that week, about a month ago in late July, you could follow a few testers confirming "ICT off" on Discord. The why and the what-comes-next will be discussed later.

Now we take a deep dive into why ICT performed poorly over time during test phase 0.1.1: the cause was invalid transactions that should never have been gossiped by either ICT or IRI (the IOTA Reference Implementation, the node software that holds the distributed tangle).

The source code of the recompiled ict-0.1.1 can be found here (no worries, CfB stated that we can expose the code as he is about to publish it. With new test phases starting, the next version might change significantly anyway).

Come-from-Beyond statement exposing the code base

When we started the alpha testing of ICT 0.1.1, there were phases where my ICT was not really active even though it was connected to at least 4 neighboring IRI nodes. By "active" I mean that my ICT was gossiping almost no Tx to its neighbors.

7 days of ICT running on the f1-micro cloud instance. During the first 2–3 days there was almost no outbound Tx because invalid transactions were being gossiped.

To analyze the cause, we need to understand why the ICT logic stopped sending UDP (User Datagram Protocol) packets back to its neighbors.

Sharing the analysis on DISCORD #ICT channel

I noticed that the invalid Tx counter was non-zero in every epoch (an epoch is one minute of processing, after which ICT resets and starts over). Checking the logic shows that once ICT has seen an invalid Tx from an IRI node it is connected to, that node's Tx are ignored and no longer shared for the rest of the epoch.

That means that, depending on when in the epoch the invalid Tx is detected, few or no Tx are transmitted. In our case invalid Tx showed up only a couple of seconds into an epoch, which led to almost no outbound traffic.

After adding a few lines to Transaction.java that let me check why the Tx were invalid, and recompiling the class, I concluded that all invalid Tx were caused by a timestamp earlier than the minimum that class accepts, 1508760000.

The following command prints that timestamp in a human-readable format.

$ date -d @1508760000

Mon Oct 23 12:00:00 UTC 2017

In the if statement that rejects such a Tx I added the following line

System.out.println(Converter.trytes(trits,0,8019));

and was able to retrieve the invalid Tx that caused ICT to stop gossiping. This time it turned out to be the infamous FPS bundle:

BUNDLE: FPSJJPZO9LGRIZLLHTNCBEELJHKJSPXJDXLFGKPTTTXZMAZZNKIXHQTTOPURGGVLKNZVAS9FTCUFUIMB9
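To illustrate what the debug line above prints, here is a minimal sketch of a trits-to-trytes conversion. It assumes the standard IOTA encoding (three balanced trits per tryte, least significant trit first, alphabet 9A-Z); the class TryteSketch is hypothetical and is not ICT's own Converter. With 8019 trits, such a conversion yields 2673 trytes, the full length of a transaction.

```java
public class TryteSketch {
    // The 27-character tryte alphabet used by IOTA.
    static final String ALPHABET = "9ABCDEFGHIJKLMNOPQRSTUVWXYZ";

    /** Converts `length` balanced trits (-1, 0, 1), starting at `offset`, into trytes. */
    static String trytes(int[] trits, int offset, int length) {
        if (length % 3 != 0) {
            throw new IllegalArgumentException("length must be a multiple of 3");
        }
        StringBuilder sb = new StringBuilder(length / 3);
        for (int i = 0; i < length; i += 3) {
            // Little-endian: the first trit of each group is the least significant one.
            int value = trits[offset + i]
                    + 3 * trits[offset + i + 1]
                    + 9 * trits[offset + i + 2];
            if (value < 0) {
                value += 27; // map -13..-1 onto 14..26
            }
            sb.append(ALPHABET.charAt(value));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        int[] trits = {0, 0, 0, 1, 0, 0, -1, 0, 0};
        System.out.println(trytes(trits, 0, trits.length)); // prints "9AZ"
    }
}
```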

Looking at the transaction object (you get a better view of a transaction by converting its trytes into the representation below using iota.utils.transactionObject(TxInTrytes)):

{ hash: 'ZQNFEEIHXB9LEHAUDYMEDQCKA9CGYHIYORBYJTZ9XUDULKWAE9KDBXG9DVQEHPMFWZRUEKDVXJLCA9999',
  signatureMessageFragment: 'KTCNPN...(shortened)...UOZ',
  address: 'JNYEGRFRQQNYQNMJV9YRPRWMEGBZYLNHURIGEGQWF9AISLMQEUZOEBDBQYETETKEBLUQNGVAOGWXHQKEY',
  value: 0,
  obsoleteTag: '999999999999999999999999999',
  timestamp: 1507219591,
  currentIndex: 2,
  lastIndex: 3,
  bundle: 'FPSJJPZO9LGRIZLLHTNCBEELJHKJSPXJDXLFGKPTTTXZMAZZNKIXHQTTOPURGGVLKNZVAS9FTCUFUIMB9',
  trunkTransaction: 'NEJKMDKIPBRISDEXICIXGMNCGTV9NGKLMPNIOUWESVIIUQFOEQQHZEENBWFWLAIFREZBYRNJWOISZ9999',
  branchTransaction: 'SIQJAMJKQISZGI9J9JAKFSGVLAPLMMBVJEXLSQLLEZQTIKQPQTBZ9JFIARRFKRMLMXZNQUVEFJYEA9999',
  tag: '999999999999999999999999999',
  attachmentTimestamp: 1531119376226,
  attachmentTimestampLowerBound: 0,
  attachmentTimestampUpperBound: 12,
  nonce: 'BFA999GC99IPA99999999999999' }

As seen above, the timestamp of this Tx is earlier than the allowed minimum. This raises an error, and the sending node is ignored from that moment on.

$ date -d @1507219591

Thu Oct 5 16:06:31 UTC 2017
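The same comparison can be written down in a few lines of Java. This is only a sketch: it assumes that 1508760000 (the value converted with date earlier) is the lower bound ICT checks against, and the class and method names are made up for illustration.

```java
import java.time.Instant;

public class TimestampCheckSketch {
    // Assumed lower bound, in seconds since the Unix epoch (Mon Oct 23 12:00:00 UTC 2017).
    static final long MIN_TIMESTAMP = 1508760000L;

    /** Returns true if the Tx timestamp (in seconds) lies before the assumed lower bound. */
    static boolean isTooOld(long timestampSeconds) {
        return timestampSeconds < MIN_TIMESTAMP;
    }

    public static void main(String[] args) {
        long fpsTimestamp = 1507219591L; // timestamp field of the FPS Tx shown above

        System.out.println(Instant.ofEpochSecond(MIN_TIMESTAMP)); // 2017-10-23T12:00:00Z
        System.out.println(Instant.ofEpochSecond(fpsTimestamp));  // 2017-10-05T16:06:31Z
        System.out.println(isTooOld(fpsTimestamp));               // true
    }
}
```

The FPS Tx is roughly 18 days older than the bound, so a check of this kind rejects it every time it is reattached and gossiped again.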

In the code this is implemented by checking whether a neighbor's invalid transaction counter is non-zero. If it is, the rest of the loop body is skipped and the while loop moves on to its next iteration (via continue;).

if (neighbor2.numberOfInvalidTransactions != 0) {
    continue;
}

And this is where the counter is increased to one (it never exceeds one, because the statement above skips the neighbor for the rest of the epoch).

Snippet of ICT.java where neighbors with invalid Tx got ignored in the epoch
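The effect of this check on one epoch can be modelled with a small simulation. This is a simplified sketch, not ICT's actual code: the Neighbor class, the processEpoch helper and the list of booleans standing in for received packets are hypothetical; only the numberOfInvalidTransactions check mirrors the statement quoted above.

```java
import java.util.Arrays;
import java.util.List;

public class EpochSketch {
    static class Neighbor {
        int numberOfInvalidTransactions = 0; // reset at the start of every epoch
        int accepted = 0;                    // Tx that would be gossiped onward
    }

    /** Processes one epoch of packets from a single neighbor; `true` marks a valid Tx. */
    static void processEpoch(Neighbor neighbor, List<Boolean> packetsValid) {
        neighbor.numberOfInvalidTransactions = 0;
        neighbor.accepted = 0;
        for (boolean valid : packetsValid) {
            if (neighbor.numberOfInvalidTransactions != 0) {
                continue; // neighbor already flagged: everything else it sends is dropped
            }
            if (valid) {
                neighbor.accepted++;
            } else {
                neighbor.numberOfInvalidTransactions++;
            }
        }
    }

    public static void main(String[] args) {
        Neighbor n = new Neighbor();
        // The third packet is invalid, so the four packets after it are never looked at.
        processEpoch(n, Arrays.asList(true, true, false, true, true, false, true));
        System.out.println(n.accepted);                    // 2
        System.out.println(n.numberOfInvalidTransactions); // 1
    }
}
```

The earlier in the epoch the invalid Tx arrives, the fewer Tx from that neighbor are accepted, which matches the near-zero outbound traffic observed during the first days.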

The FPS bundle was apparently issued back on October 5th, and it kept being reattached until recently. At some point the reattachments stopped, the ICT invalid Tx counter stayed at 0, and the network became healthy again.

Marked in yellow: the invalid Tx counter returning to 0 after the FPS bundle stopped getting reattached

This leaves us with the question of why IRI passed on those invalid Tx in the first place. In my opinion the answer should be found in the IRI source code.

Heading over to https://github.com/iotaledger/iri and searching the code base for Timestamp, we would like to find out where IRI decides to ignore a transaction based on its timestamp.

After some trial and error (there is currently no documentation of the code available) I found the class TransactionValidator.java. It contains the boolean function hasInvalidTimestamp(TransactionViewModel transactionViewModel), which returns true when the attachment timestamp is earlier than snapshotTimestampMs or lies too far in the future. In other words, a valid transaction needs an attachment timestamp of at least snapshotTimestampMs:

return transactionViewModel.getAttachmentTimestamp() < snapshotTimestampMs || transactionViewModel.getAttachmentTimestamp() > System.currentTimeMillis() + MAX_TIMESTAMP_FUTURE_MS;

According to IOTA.java, the snapshot timestamp comes from the configuration:

long snapshotTimestamp = configuration.longNum(Configuration.DefaultConfSettings.SNAPSHOT_TIME);

and in Configuration.java it is currently set to

public static final String GLOBAL_SNAPSHOT_TIME = "1531148400";

But this still doesn’t tell us whether all transactions with a lower timestamp actually get ignored by IRI.
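To make the quoted check easier to experiment with, here is a standalone sketch of the same expression with all inputs passed in explicitly. Two things are assumptions on my side and not taken from the snippets above: that snapshotTimestampMs is the configured snapshot time (in seconds) multiplied by 1000, and that MAX_TIMESTAMP_FUTURE_MS is two hours. The class name is hypothetical.

```java
public class SnapshotCheckSketch {
    // Assumption: two hours of tolerated clock drift into the future, in milliseconds.
    static final long MAX_TIMESTAMP_FUTURE_MS = 2L * 60L * 60L * 1000L;

    /** Same boolean expression as quoted above, with the clock and snapshot as parameters. */
    static boolean hasInvalidTimestamp(long attachmentTimestampMs,
                                       long snapshotTimestampMs,
                                       long nowMs) {
        return attachmentTimestampMs < snapshotTimestampMs
                || attachmentTimestampMs > nowMs + MAX_TIMESTAMP_FUTURE_MS;
    }

    public static void main(String[] args) {
        // Assumption: GLOBAL_SNAPSHOT_TIME is in seconds and gets converted to milliseconds.
        long snapshotMs = 1531148400L * 1000L;
        long nowMs = snapshotMs + 60_000L; // pretend "now" is one minute after the snapshot

        System.out.println(hasInvalidTimestamp(snapshotMs - 1, snapshotMs, nowMs)); // true: before snapshot
        System.out.println(hasInvalidTimestamp(snapshotMs, snapshotMs, nowMs));     // false: inside the window
        System.out.println(hasInvalidTimestamp(
                nowMs + 3L * 60L * 60L * 1000L, snapshotMs, nowMs));                // true: too far in the future
    }
}
```

Note that the expression looks at the attachment timestamp in milliseconds, not at the timestamp field in seconds that ICT complained about.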

In Node.java we find a function preProcessReceivedData(...), which preprocesses incoming transactions from the node's neighbors. It loops through all neighbor IPs and only continues if the sender's IP address, senderAddress, matches one of them. This step prevents incoming packets from any other IP from being processed further. Then some transactions are dropped at random; I don't know the purpose of that yet.

After that, inside the try {} statement, two things happen:

a hash is created from the transaction message

the recentSeenBytes cache is checked to see whether the transaction has been seen recently

Only if the transaction is not cached does it get validated. A TransactionViewModel object is created and passed into the validation run, where the exception from earlier should be thrown if the transaction's timestamp is earlier than GLOBAL_SNAPSHOT_TIME. This exception is caught with

} catch (final TransactionValidator.StaleTimestampException e) {

which increments numberOfInvalidTransaction for that neighbor.

This whole validation and preprocessing is keyed on the hash of the actual transaction. When runValidation throws the timestamp exception, the transaction should be removed from the queue of transactions to be processed further:

transactionRequester.clearTransactionRequest(receivedTransactionHash);
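The flow described above can be summarised in a small model. This is my own simplified sketch, not IRI's code: the class PreProcessSketch, its fields and the string hashes are hypothetical stand-ins, the random dropping step is left out, and the validation only covers the "earlier than the snapshot" case.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

public class PreProcessSketch {
    static class StaleTimestampException extends Exception {}

    static class Neighbor {
        int numberOfInvalidTransactions = 0;
    }

    final Map<String, Neighbor> neighborsByAddress = new HashMap<>();
    final Set<String> recentlySeen = new HashSet<>(); // stands in for the recentSeenBytes cache
    final Set<String> requestQueue = new HashSet<>(); // stands in for the transaction requester
    final long snapshotTimestampMs;

    PreProcessSketch(long snapshotTimestampMs) {
        this.snapshotTimestampMs = snapshotTimestampMs;
    }

    void runValidation(long attachmentTimestampMs) throws StaleTimestampException {
        if (attachmentTimestampMs < snapshotTimestampMs) {
            throw new StaleTimestampException();
        }
    }

    /** Returns true only if the packet passed validation in this call. */
    boolean preProcess(String senderAddress, String txHash, long attachmentTimestampMs) {
        Neighbor neighbor = neighborsByAddress.get(senderAddress);
        if (neighbor == null) {
            return false; // sender is not one of our neighbors
        }
        if (!recentlySeen.add(txHash)) {
            return false; // already cached, so it is not validated again
        }
        try {
            runValidation(attachmentTimestampMs);
            return true;
        } catch (StaleTimestampException e) {
            neighbor.numberOfInvalidTransactions++;
            requestQueue.remove(txHash); // stale Tx must not be processed further
            return false;
        }
    }

    public static void main(String[] args) {
        PreProcessSketch node = new PreProcessSketch(1531148400L * 1000L);
        node.neighborsByAddress.put("10.0.0.1", new Neighbor());
        long fresh = node.snapshotTimestampMs + 1;
        long stale = node.snapshotTimestampMs - 1;

        System.out.println(node.preProcess("10.0.0.9", "HASHA", fresh)); // false: unknown sender
        System.out.println(node.preProcess("10.0.0.1", "HASHA", fresh)); // true
        System.out.println(node.preProcess("10.0.0.1", "HASHA", fresh)); // false: cached
        System.out.println(node.preProcess("10.0.0.1", "HASHB", stale)); // false: stale timestamp
        System.out.println(node.neighborsByAddress.get("10.0.0.1").numberOfInvalidTransactions); // 1
    }
}
```

In this model a stale Tx can never reach further processing, which is why the observed behavior points to something outside this path.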

Clearly the Tx should not have been transmitted in the first place; that much is obvious from the implementation in IRI. Still, I have not been able to figure out where the bug resides. There could be many reasons, but one possible explanation is that the faulty transaction is not removed from the queue and therefore gets processed further. Another possibility is that preProcessReceivedData is being bypassed.

As I still have no full node actively participating in the tangle, I am not able to debug this situation. I think I will leave that for another time. Maybe it will already be solved with the next IRI release, or it's actually not a bug but a feature 😅