Tick and Tickless

A little while ago I posted an example of a simulation, with platelets moving down a pipeline together. When platelets bumped into each other, they stayed stuck together forever in a clot, and moved (or stayed put) together. The original code used a tick barrier, as many of my simulation examples do, to keep the simulation in lock-step. An alternative way to keep the simulation in lock-step is to make sure that every process in the pipeline communicates exactly once with its neighbours each timestep, which makes the tick barrier redundant. In this post I will show how to make my previous platelet example tickless, using this alternative method. I will be working in the formal Communicating Sequential Processes algebra, CSP (with a small extension); a Haskell implementation will follow in my next post.

The Ticking Version, in CSPc

We will start by looking at the ticking (original) version of the simulation in CSP. In fact, I will be using CSPc (CSP with conjunction); CSP doesn’t have the idea of conjunction in it, so I am using /\ as an added conjunction operator (akin to the logical conjunction operator) that conjoins two events into a single event that will occur when, and only when, both of its constituent events occur. All of the processes in this post are site processes: they represent a piece of space in the pipeline that does not move, and may or may not be occupied by a single platelet. A site that is occupied by a platelet is said to be full; otherwise it is empty. Let’s begin with the full site process:

FULL(prev, next) = FULL_WILLING(prev, next) |~| FULL_STATIONARY(prev, next)

The full process is making an internal choice between being willing to move this time-step and being stationary for the time-step. The internal choice means that FULL will make the decision itself and no other outside process (collectively referred to in CSP as its environment) can influence the decision. In our original Haskell implementation we made the choice randomly; a site would be stationary in 5% of the time-steps, and willing in the other 95% of time-steps. The stationary process refuses to do anything until the end of the time-step at which point it loops round to become the FULL process again:

FULL_STATIONARY(prev, next) = tick -> FULL(prev, next)

In contrast, the willing process offers two choices. It will move forwards if the process before it in the pipeline signals that is empty, or if the process before it in the pipeline is willing to move too. (So it will move if the process before it is empty, or full and willing, but not if the process before it is full and stationary.) We specify this using our conjunction operator:

FULL_WILLING(prev, next) = prev.move /\ next.move -> tick -> FULL(prev, next) [] prev.empty /\ next.move -> tick -> EMPTY(prev, next) [] tick -> FULL(prev, next)

(/\ binds most tightly above, then ->, and [] binds least tightly.) This model is an accurate rendering of my original CHP program, but it contains a race hazard. It is possible that a site that is willing to move in a time-step does not do so and ticks instead; if all the sites in a clot (a contiguous group of full cells) were willing, they could just tick repeatedly and never move at all. The details of the CHP library’s implementation prevented this occurring in my original CHP program (and hence I did not realise there was a race hazard), but it is a sin to rely on a library’s implementation of synchronisation for your concurrent program to be correct. (If I had been able to model-check this code, I could have discovered this problem; see the summary at the end.) The problem could be removed if we gave priority to the movement event; see the previous discussion of priority on this blog and Gavin Lowe’s paper on implementing priority with a small extension to CSP.

The empty site is as long as the full site, but that is because it is repetitive:

EMPTY(prev, next) = prev.move -> ((next.empty -> tick -> FULL(prev, next)) [] (tick -> FULL(prev, next))) [] next.empty -> ((prev.move -> tick -> FULL(prev, next)) [] (tick -> EMPTY(prev, next))) [] tick -> EMPTY(prev, next)

The empty site is willing to optionally accept a movement from behind it in the pipeline and/or signal to the process ahead of it that it is empty, before ticking. Like the full site, this again has the race hazard that it could tick without accepting the movement of a willing platelet from behind it.

To wire up our pipeline, we start with N EMPTY sites in a row (with the next event of each connected to the prev event of the following process as you would expect) synchronising on the tick event together with a GENERATOR at the beginning, but the END process does not need to synchronise on the tick event — the latter two processes being defined as follows:

GENERATOR(next) = ON(next) ; OFF(next) ; OFF(next) ; GENERATOR(next) ON(next) = next.move -> tick -> SKIP [] tick -> SKIP OFF(next) = tick -> SKIP END(prev) = prev.move -> END(prev)

The generator sends out a platelet every three steps (and again has the aforementioned problem that EMPTY and FULL have). The END process doesn’t need to synchronise on the tick event because it all it does is synchronise on move as frequently possible; the FULL process already rate-limits itself to one move event per time-step, so this is acceptable behaviour. In this ticking CSP version, we don’t really need this END process at all, but it’s instructive to include it because our tickless version will need one. The CSP up to this point makes up the full model of our original clotting program (minus the wiring, which isn’t very interesting).

The Tickless Version, in CSPc

The design of the tickless version is more complicated than the original version. In a simulation with a tick event, we can use implicit signalling. Each process will offer to perform some actions, then eventually it must agree to synchronise on the tick event (if all the processes didn’t eventually tick, we’d get deadlock!). So you can gain information if you offer a choice of event A with your neighbour, or tick with everyone, and the tick happens. This means that your neighbour did not choose to offer event A to you before it offered tick. We often use this implicit information in the simulation. In the previous platelets code, a full site not willing to move would not offer a movement, and would wait for a tick. A full site willing to move would offer to move with its neighbours, but if tick happened instead, it knew that one of its neighbours was not willing to move, and they had implicitly agreed to stay put by synchronising on tick instead. (The idea is good, but my use of it above led to the problem where willing platelets may not end up moving — this can be solved in the tickless version, though.)

If we want to remove the tick event from our pipeline, we therefore have to add more events between the processes, to allow them to explicitly communicate what they used to implicitly communicate. Peter Welch suggested a possible solution — but he saw that his suggestion had the problem now revealed in the original version, mentioned above. I was able to improve on his idea to remove the problem, and I describe my improvement to his solution here. The tickless version involves introducing two new events (besides move and empty) that indicate further information: canstay and muststay.

If there was only one stay value, then that is all that full stationary sites would offer. But willing full sites would also have to offer to stay, in order to synchronise with stationary neighbours (if one platelet in the clot stays they all must). So all willing sites would offer to stay, and this could allow a clot of willing platelets to agree to stay even though they were all willing to move. This is the same problem as we had in our ticking version. To remove the problem, we differentiate the events offered by a full stationary site (who will offer muststay) and a full willing site (who will offer canstay). Here is how they are used in the new FULL_WILLING process:

FULL_WILLING(prev, next) = prev.empty /\ next.move -> EMPTY (prev, next) [] prev.move /\ next.move -> FULL (prev, next) [] prev.empty /\ next.canstay -> FULL (prev, next) [] prev.canstay /\ next.canstay -> FULL (prev, next) [] prev.muststay /\ next.muststay -> FULL (prev, next)

The first two cases are just as before, in the original version. The middle case is for when the process is at the beginning of the clot; it synchronises on empty with the process behind it, and canstay with the process ahead of it in the clot. The last two cases can be thought of as perpetuating the canstay/muststay event through the pipeline.

The new FULL_STATIONARY process is as follows:

FULL_STATIONARY(prev, next) prev.empty /\ next.muststay -> FULL (prev, next) [] prev.muststay /\ next.muststay -> FULL (prev, next) [] prev.canstay /\ next.muststay -> FULL (prev, next)

The first case is for if this process is at the beginning of the clot; it synchronises on empty with the process behind it, and muststay with the process ahead of it. Looking up at the FULL_WILLING process, we can see that any FULL_WILLING process (from the last case) and any FULL_STATIONARY process (from the middle case immediately above) that synchronises on muststay with the process behind it will also synchronise on muststay with the process ahead of it. So if the process at the start of the clot synchronises on muststay, all processes ahead of it in the clot will also synchronise on muststay (by induction).

The third case of the FULL_STATIONARY process indicates that the processes behind the stationary one may offer canstay, and it will then offer muststay to all the processes ahead of it. The canstay event will only be offered from the previous process if it is in the FULL_WILLING state (FULL_STATIONARY only offers muststay to the process ahead of it, and we will see shortly that EMPTY only offers empty to the process ahead of it), which must then synchronise either on canstay with the process behind that (which, again, must be a FULL_WILLING process) or empty (which means it’s the start of the clot). All the full processes after FULL_STATIONARY, following the logic in the previous paragraph, will synchronise on muststay regardless of their state.

The new EMPTY process is as follows:

EMPTY(prev, next) = prev.empty /\ next.empty -> EMPTY (prev, next) [] prev.muststay /\ next.empty -> EMPTY (prev, next) [] prev.move /\ next.empty -> FULL (prev, next)

All cases offer the empty event to the process ahead of it. It will accept from behind: the empty case (when the previous site is empty), the move case (when the previous site is full and able to move forward) and the muststay event (when the previous site is part of a clot that cannot move). It does not accept the canstay event, which is crucial, for reasons explained in the next section.

The new ON, OFF and END processes are:

ON(next) = next.move -> SKIP [] next.canstay -> SKIP OFF(next) = next.empty -> SKIP END(prev) = prev.empty -> END(prev) [] prev.muststay -> END(prev) [] prev.move -> END(prev)

You can think of as ON as being the “forward half” of a FULL_WILLING site that is receiving empty from behind it; similarly, OFF is the forward half of an EMPTY site and END is the “backward half” of an EMPTY site.

Proofs

Since conjunction is an extra feature in CSP, there is no direct model-checking support for it. (We have designed a mapping from CSPc to CSP, but that causes a state space explosion and does not yet have automated tool support.) I will offer instead, inductive proofs about clots. By proving a statement for the beginning site, and optional middle/end sites based on the neighbour behind them, this should inductively cover all non-empty clots. This can be done by considering the pairings of prev and next events, to see when offering a set of events from the previous site, what might be offered to its next neighbour.

So, let us consider a clot of willing platelets. The site at the beginning of the clot can only synchronise on prev.empty (as that is all the empty site before it will offer). Therefore the site at the beginning of a clot will only offer move or canstay to the next site. Any middle site that synchronises on move or canstay with the previous site will offer the same thing to the next site. So inductively the last site of the clot will only offer move or canstay. We can see that the empty site following the clot will only accept move, not canstay, so a clot of willing processes may only move and may not stay put. This solves the problem that we had with the ticking version, and is why the EMPTY process does not offer to synchronise on canstay. (This result also shows that any line of sites at the beginning of a clot will only offer move or canstay to the sites ahead of it.)

Now let us consider a clot with one or more stationary platelets somewhere along its length (but not the beginning). We have seen in the previous paragraph that the willing sites at the beginning of the pipeline will offer move or canstay to the first stationary site in the clot. This stationary site appearing after these willing sites will only accept canstay, and will then offer muststay ahead of it. We can see that all full sites, stationary and willing, will only synchronise on prev.muststay with next.muststay, so regardless of the stationary/willing state of sites ahead of the first stationary site, muststay will be the event offered at the end of the clot. The empty site will accept this, and so a clot with one or more stationary sites after the beginning will all agree to stay. If a stationary site is at the beginning, it will synchronise on prev.empty and next.muststay, and then the rest of the clot will also synchronise on muststay, so again the clot will stay put. Thus any clot with one or more stationary sites will stay put.

Summary

So we have a system, expressed in a few lines of CSPc, of a sticky platelet simulation without a tick event. I will show a translation to CHP in the next post, which works the same as the original version (minus the potential problem). This work is interesting from a formal perspective because we have no direct support to model check this CSP, due to the conjunction extension. We have devised a mapping from CSPc to CSP, but it generates a large number of events; I believe it would be in the order of 4^N for this problem. We don’t have a tool to generate the CSP just yet, and even if we could, I suspect the model checker may choke on that size of problem. However, by taking advantage of conceptual features of the simulation, namely clots, I was able to perform some inductive reasoning about the problem. The reasoning was aided by the neat symmetry of the problem; each site in the pipeline offered a pair of (prev, next) events in conjunction, which could be thought as a sort of normal form.

From a practical perspective, it can be seen that this is not really a very concurrent simulation. The chained conjunction along the pipeline means that all processes must resolve their actions for the time-step together, and really the power of the simulation is in the resolution of all the choices (which could be written as a single sequential/functional piece of code: transforming the list of all platelets at time-step T to the list of all platelets at time-step T+1), not in the concurrency of the sites. The advantage of the way we constructed our solution is that we have encoded the behaviour of each site by only referring to the two neighbours of a site. These are local rules, that when resolved for all sites, produce the emergent behaviour of sticky platelets bumping, forming clots, and advancing down the pipeline together. There is no global visibility of the system in our code (only in the run-time system) to complicate things. This investigation of emergent behaviour is part of the ongoing CoSMoS research project that uses process-oriented technologies to produce this kind of simulation, and which builds on the original work of the TUNA project from which this blood clotting example is taken.