State Machines

The ideas of “state” and the “state machine” are so ubiquitous in programming that we tend to use them all the time without thinking about them very much. As soon as we start modelling some physical process or business procedure, it is almost inevitable that we end up with an object or system that has a fixed number of states and a fixed set of allowable transitions from one state to another.

I quite like the much-quoted description of a state machine used in the Erlang documentation:

If we are in state S and the event E occurs, we should perform the actions A and make a transition to the state S’.

Once we string a number of these together we could consider them to form a workflow. Whole books have been written, and definition languages and software systems built, around this stuff.
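To make the quoted description concrete, here is a minimal sketch in plain Perl 6 of a hand-rolled transition table. It does not use Tinky at all, and the names %next-state, %action and handle-event are invented purely for this illustration.

```raku
# Hypothetical hand-rolled state machine: nothing here comes from Tinky.
# For each state S, the events E it accepts and the resulting state S'.
my %next-state =
    new  => %( open => 'open' ),
    open => %( done => 'done' );

# The action A to perform for each event.
my %action =
    open => -> { say 'notify the assignee' },
    done => -> { say 'archive the ticket' };

# "In state S, when event E occurs, perform A and move to S'."
sub handle-event(Str $state, Str $event --> Str) {
    my $target = %next-state{$state}{$event}
        // die "No transition for event '$event' in state '$state'";
    %action{$event}();
    $target;
}

my $state = 'new';
$state = handle-event($state, 'open');   # prints: notify the assignee
$state = handle-event($state, 'done');   # prints: archive the ticket
say $state;                              # done
```

Even at this size the constraints and the actions live in two separate structures that have to be kept in step by hand, which is the kind of drift the next section is about.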

Managing State

Often we find ourselves implementing state management in an application on an ad-hoc basis, defining the constraints on state transitions in code, often in separate parts of the application, and relying on the other parts of the application to do the right thing. I’m sure I’m not alone in having seen code in the wild that purports to implement a state machine but which has proved to be entirely non-deterministic in the face of the addition of new states or actions. Or systems where a new action has to be performed on some object when it enters a new state, and the state can be entered in different ways in different parts of the application, so new code has to be added in a number of different places, with the inevitable consequence that one gets missed and it doesn’t work as defined.

Using a single source of state management with a consistent definition within an application can alleviate these kinds of problems and can actually make the design of the application simpler and clearer than it might otherwise have been.

Thus I was inspired to write Tinky, which is basically a state management system that allows you to create workflows in an application.

Tinky allows you to compose a number of states and the transitions between them into an application workflow which can make the transitions between the states available as a set of Supplies: providing the required constraints on the allowed transitions and allowing you to implement the actions on those transitions in the appropriate place or manner for your application.

Simple Workflow

Perhaps the canonical example used for similar software is that of the bug tracking software, so let’s start with the simplest possible example with three states and two transitions between them:

    use Tinky;

    my $state-new  = Tinky::State.new(name => "new");
    my $state-open = Tinky::State.new(name => "open");
    my $state-done = Tinky::State.new(name => "done");

    my @states = ( $state-new, $state-open, $state-done );

    my $new-open  = Tinky::Transition.new(name => "open", from => $state-new,  to => $state-open);
    my $open-done = Tinky::Transition.new(name => "done", from => $state-open, to => $state-done);

    my @transitions = ( $new-open, $open-done );

    my $workflow = Tinky::Workflow.new(name => "support", :@states, :@transitions, initial-state => $state-new);

This defines our three states “new”, “open” and “done” and two transitions between them, from “new” to “open” and “open” to “done”. This defines a “workflow” in which the state must go through “open” before becoming “done”.

Obviously this doesn’t do very much without an object that can take part in this workflow, so Tinky provides a role Tinky::Object that can be applied to a class whose state you want to manage:

    class Ticket does Tinky::Object {
        has Str $.ticket-number = (^100000).pick.fmt("%08d");
    }

    my $ticket = Ticket.new;
    $ticket.apply-workflow($workflow);

    say $ticket.state.name;          # new
    $ticket.next-states>>.name.say;  # (open)

    $ticket.state = $state-open;
    say $ticket.state.name;          # open

The Tinky::Object role provides the accessors state and next-states to the object, the latter returning a list of the possible states that the object can be transitioned to (in this example there is only one, but there could be as many as your workflow definition allows). You’ll notice that the state of the object defaults to new, which is the state provided as initial-state to the Tinky::Workflow constructor.

The assignment to state is constrained by the workflow definition, so if in the above you were to do:

    $ticket.state = $state-done;

This would result in an exception “No Transition for ‘new’ to ‘done’” and the state of the object would not be changed.
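If you would rather handle the refusal than let the program die, the assignment can be wrapped in a block with a CATCH. The following is only a sketch: it assumes Tinky is installed, repeats the three-state workflow from above so that it stands alone, and catches any exception rather than naming a specific exception class.

```raku
use Tinky;   # assumes the Tinky module is installed

my $state-new  = Tinky::State.new(name => "new");
my $state-open = Tinky::State.new(name => "open");
my $state-done = Tinky::State.new(name => "done");
my @states = ( $state-new, $state-open, $state-done );

my $new-open  = Tinky::Transition.new(name => "open", from => $state-new,  to => $state-open);
my $open-done = Tinky::Transition.new(name => "done", from => $state-open, to => $state-done);
my @transitions = ( $new-open, $open-done );

my $workflow = Tinky::Workflow.new(name => "support", :@states, :@transitions, initial-state => $state-new);

class Ticket does Tinky::Object { }

my $ticket = Ticket.new;
$ticket.apply-workflow($workflow);

{
    # Not allowed: the workflow has no transition from "new" to "done".
    $ticket.state = $state-done;

    CATCH {
        default {
            # $_ is the exception; report it and carry on.
            say "Transition refused: " ~ .message;
        }
    }
}

# As described above, the state should be unchanged after the failed assignment.
say $ticket.state.name;
```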

As a convenience the workflow object defines a role which provides methods named for the transitions and which is applied to the object when apply-workflow is called, thus the setting of the state in the above could be written as:

    $ticket.open

However, this feature has an additional subtlety (that I unashamedly stole from a JavaScript library), in that if there are two transitions with the same name then it will still create a single method, which will use the current state of the object to select which transition to apply. Typically you might do this where the to state is the same, so for example if we added a new state ‘rejected’ which can be entered from both ‘new’ and ‘open’:

    use Tinky;

    my $state-new      = Tinky::State.new(name => "new");
    my $state-open     = Tinky::State.new(name => "open");
    my $state-done     = Tinky::State.new(name => "done");
    my $state-rejected = Tinky::State.new(name => "rejected");

    my @states = ( $state-new, $state-open, $state-done, $state-rejected );

    my $new-open      = Tinky::Transition.new(name => "open",   from => $state-new,  to => $state-open);
    my $new-rejected  = Tinky::Transition.new(name => "reject", from => $state-new,  to => $state-rejected);
    my $open-done     = Tinky::Transition.new(name => "done",   from => $state-open, to => $state-done);
    my $open-rejected = Tinky::Transition.new(name => "reject", from => $state-open, to => $state-rejected);

    my @transitions = ( $new-open, $new-rejected, $open-done, $open-rejected );

    my $workflow = Tinky::Workflow.new(name => "support", :@states, :@transitions, initial-state => $state-new);

    class Ticket does Tinky::Object {
        has Str $.ticket-number = (^100000).pick.fmt("%08d");
    }

    my $ticket-one = Ticket.new;
    $ticket-one.apply-workflow($workflow);
    $ticket-one.next-states>>.name.say;
    $ticket-one.reject;
    say $ticket-one.state.name;

    my $ticket-two = Ticket.new;
    $ticket-two.apply-workflow($workflow);
    $ticket-two.open;
    $ticket-two.next-states>>.name.say;
    $ticket-two.reject;
    say $ticket-two.state.name;

You are not strictly limited to having the similarly named transitions enter the same state, but they must have different from states (otherwise the method generated wouldn’t know which transition to apply). Obviously if the method is called on an object which is not in a state for which there are any transitions an exception will be thrown.

So what about this asynchronous thing?

All of this might be somewhat useful if we are merely concerned with constraining the sequence of states an object might be in, but typically we want to perform some action upon transition from one state to another (and this is explicitly stated in the definition above). So, for instance, in our ticketing example we might want to send some notification, recalculate resource scheduling or make a branch in a version control system.

Tinky provides for the state transition actions by means of a set of Supplies on the states and transitions, to which the object for which the transition has been performed is emitted. These “events” are emitted on the state that is being left, the state that is being entered and the actual transition that was performed. The supplies are conveniently aggregated at the workflow level.
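For readers less familiar with Supplies, the underlying mechanism can be sketched with nothing but core Perl 6. This is not how Tinky is implemented internally; the two Supplier objects are hypothetical stand-ins for per-state supplies, and Supply.merge stands in for the aggregation that the workflow performs.

```raku
# Hypothetical stand-ins for the supplies of two states; plain core Perl 6, not Tinky.
my $open-supplier = Supplier.new;
my $done-supplier = Supplier.new;

# One aggregate supply that sees everything emitted on either state supply.
my $all-entered = Supply.merge($open-supplier.Supply, $done-supplier.Supply);

my @seen;

# An action attached to a single state ...
$open-supplier.Supply.tap(-> $ticket { @seen.push: "opened $ticket" });

# ... and one attached to the aggregate.
$all-entered.tap(-> $ticket { @seen.push: "entered a state: $ticket" });

# "Entering" a state amounts to emitting the object on that state's supply.
$open-supplier.emit('T-0001');
$done-supplier.emit('T-0001');

.say for @seen;
```

The point of the sketch is that the code emitting the event knows nothing about who is listening, which is what lets the actions live wherever they make most sense.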

So if, in the example above, we wanted to log every transition of state of a ticket and additionally send a message when the ticket enters the “open” state, we can simply tap the appropriate Supply to perform these actions:

    use Tinky;

    my $state-new      = Tinky::State.new(name => "new");
    my $state-open     = Tinky::State.new(name => "open");
    my $state-done     = Tinky::State.new(name => "done");
    my $state-rejected = Tinky::State.new(name => "rejected");

    my @states = ( $state-new, $state-open, $state-done, $state-rejected );

    my $new-open      = Tinky::Transition.new(name => "open",   from => $state-new,  to => $state-open);
    my $new-rejected  = Tinky::Transition.new(name => "reject", from => $state-new,  to => $state-rejected);
    my $open-done     = Tinky::Transition.new(name => "done",   from => $state-open, to => $state-done);
    my $open-rejected = Tinky::Transition.new(name => "reject", from => $state-open, to => $state-rejected);

    my @transitions = ( $new-open, $new-rejected, $open-done, $open-rejected );

    my $workflow = Tinky::Workflow.new(name => "support", :@states, :@transitions, initial-state => $state-new);

    # Make the required actions

    # Log every transition on the workflow
    $workflow.transition-supply.tap(-> ($trans, $object) {
        say "Ticket '{ $object.ticket-number }' went from '{ $trans.from.name }' to '{ $trans.to.name }'";
    });

    # Send a message whenever a ticket enters the "open" state
    $state-open.enter-supply.tap(-> $object {
        say "Ticket '{ $object.ticket-number }' is opened, sending email";
    });

    class Ticket does Tinky::Object {
        has Str $.ticket-number = (^100000).pick.fmt("%08d");
    }

    my $ticket-one = Ticket.new;
    $ticket-one.apply-workflow($workflow);
    $ticket-one.next-states>>.name.say;
    $ticket-one.reject;
    say $ticket-one.state.name;

    my $ticket-two = Ticket.new;
    $ticket-two.apply-workflow($workflow);
    $ticket-two.open;
    $ticket-two.next-states>>.name.say;
    $ticket-two.reject;
    say $ticket-two.state.name;

Which will give some output like

    [open rejected]
    Ticket '00015475' went from 'new' to 'rejected'
    rejected
    Ticket '00053735' is opened, sending email
    Ticket '00053735' went from 'new' to 'open'
    [done rejected]
    Ticket '00053735' went from 'open' to 'rejected'
    rejected

The beauty of this kind of arrangement, for me at least, is that the actions can be defined at the most appropriate place in the code rather than all in one place, and can also be added and removed at run time if required. It also works nicely with other sources of asynchronous events in Perl 6 such as timers, signals or file system notifications.
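Adding and removing actions at run time falls out of how taps work in core Perl 6: tapping a Supply returns a Tap object, and closing it detaches the action. The sketch below uses a plain Supplier as a stand-in for one of the workflow supplies, so the variable names are illustrative only.

```raku
# A plain Supplier standing in for a state or transition supply.
my $supplier = Supplier.new;
my $supply   = $supplier.Supply;

my @log;

# Adding an action: tap returns a Tap we can hold on to.
my $tap = $supply.tap(-> $value { @log.push: $value });

$supplier.emit('first');    # the action runs

# Removing the action again at run time.
$tap.close;

$supplier.emit('second');   # nobody is listening any more

say @log;   # [first]
```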

Defining a Machine

Defining a large set of states and transitions could prove somewhat tiresome and error-prone if done in code like the above, so you could choose to build it from some configuration file or from a database of some sort. For convenience, though, I have recently released Tinky::JSON, which allows you to define all of your states and transitions in a single JSON document.

The above example would then become something like:

    use Tinky;
    use Tinky::JSON;

    my $json = q:to/JSON/;
    {
        "states" : [ "new", "open", "done", "rejected" ],
        "transitions" : [
            { "name" : "open",   "from" : "new",  "to" : "open" },
            { "name" : "done",   "from" : "open", "to" : "done" },
            { "name" : "reject", "from" : "new",  "to" : "rejected" },
            { "name" : "reject", "from" : "open", "to" : "rejected" }
        ],
        "initial-state" : "new"
    }
    JSON

    my $workflow = Tinky::JSON::Workflow.from-json($json);

    # Log every transition on the workflow
    $workflow.transition-supply.tap(-> ($trans, $object) {
        say "Ticket '{ $object.ticket-number }' went from '{ $trans.from.name }' to '{ $trans.to.name }'";
    });

    # The supply for a state is looked up by name rather than through the state object
    $workflow.enter-supply("open").tap(-> $object {
        say "Ticket '{ $object.ticket-number }' is opened, sending email";
    });

    class Ticket does Tinky::Object {
        has Str $.ticket-number = (^100000).pick.fmt("%08d");
    }

    my $ticket-one = Ticket.new;
    $ticket-one.apply-workflow($workflow);
    $ticket-one.next-states>>.name.say;
    $ticket-one.reject;
    say $ticket-one.state.name;

    my $ticket-two = Ticket.new;
    $ticket-two.apply-workflow($workflow);
    $ticket-two.open;
    $ticket-two.next-states>>.name.say;
    $ticket-two.reject;
    say $ticket-two.state.name;

As well as providing the means of constructing the workflow object from a JSON description it adds methods for accessing the states and transitions and their respective supplies by name rather than having to have the objects themselves to hand, which may be more convenient in your application. I’m still working out how to provide the definition of actions in a similarly convenient declarative way.

It would probably be easy to make something similar that can obtain the definition from an XML file (probably using XML::Class), so let me know if you might find that useful.

Making something useful

My prime driver for making Tinky in the first place was a still-in-progress piece of online radio management software. This could potentially have several different workflows for different types of objects: the media for streaming may need to be uploaded, possibly encoded to a streamable format, have silence detection performed and its metadata normalised and so forth before it is usable in a show; the shows themselves need to have either media added or a live streaming source configured, then be scheduled at the appropriate time and possibly also be recorded (with the recording then fed back into the media workflow). All of this might be a little too complex for a short example, but an example that ships with Tinky::JSON is inspired by the media portion of this and was actually made in response to something someone was asking about on IRC a while ago.

The basic idea is that a process waits for WAV files to appear in some directory and then copies them to another directory where they are encoded (in this case to FLAC). The nice thing about using the workflow model for this is that the code is kept quite compact and clear: failure conditions can be handled locally to the action for each step in the process, so deeply nested conditions or early returns are avoided. Also, because it all happens asynchronously, it makes the best use of processor time.

So the workflow is described in JSON as:

    {
        "states" : [ "new", "ready", "copied", "done", "failed", "rejected" ],
        "transitions" : [
            { "name" : "ready",  "from" : "new",    "to" : "ready" },
            { "name" : "reject", "from" : "new",    "to" : "rejected" },
            { "name" : "copied", "from" : "ready",  "to" : "copied" },
            { "name" : "fail",   "from" : "ready",  "to" : "failed" },
            { "name" : "done",   "from" : "copied", "to" : "done" },
            { "name" : "fail",   "from" : "copied", "to" : "failed" }
        ],
        "initial-state" : "new"
    }

Which defines our six states and the transitions between them. The “rejected” state is entered if the file has been seen before (from state “new”), and the “failed” state may occur if there was a problem with either the copying or the encoding.

The program expects this to be in a file called “encoder.json” in the same directory as the program itself.

This example uses the ‘flac’ encoder but you could alter this to something else if you want.

    use Tinky;
    use Tinky::JSON;
    use File::Which;

    class ProcessFile does Tinky::Object {
        has Str $.path    is required;
        has Str $.out-dir is required;
        has Str $.new-path;
        has Str $.flac-file;
        has     @.errors;

        method new-path() returns Str {
            $!new-path //= $!out-dir.IO.child($!path.IO.basename).Str;
        }

        method flac-file() returns Str {
            $!flac-file //= self.new-path.subst(/\.wav$/, '.flac');
            $!flac-file;
        }
    }

    multi sub MAIN($dir, Str :$out-dir = '/tmp/flac') {
        my ProcessFile @process-files;

        my $json     = $*PROGRAM.parent.child('encoder.json').slurp;
        my $workflow = Tinky::JSON::Workflow.from-json($json);

        my $flac = which('flac') or die "no flac encoder";
        my $cp   = which('cp');

        my $watch-supply = IO::Notification.watch-path($dir).grep({ $_.path ~~ /\.wav$/ }).unique(as => { $_.path }, expires => 5);

        say "Watching '$dir'";

        react {
            whenever $watch-supply -> $change {
                my $pf = ProcessFile.new(path => $change.path, :$out-dir);
                say "Processing '{ $pf.path }'";
                $pf.apply-workflow($workflow);
            }
            whenever $workflow.applied-supply() -> $pf {
                if @process-files.grep({ $_.path eq $pf.path }) {
                    $*ERR.say: "** Already processing '", $pf.path, "' **";
                    $pf.reject;
                }
                else {
                    @process-files.append: $pf;
                    $pf.ready;
                }
            }
            whenever $workflow.enter-supply('ready') -> $pf {
                my $copy = Proc::Async.new($cp, $pf.path, $pf.new-path, :r);
                whenever $copy.stderr -> $error {
                    $pf.errors.append: $error.chomp;
                }
                whenever $copy.start -> $proc {
                    if $proc.exitcode {
                        $pf.fail;
                    }
                    else {
                        $pf.copied;
                    }
                }
            }
            whenever $workflow.enter-supply('copied') -> $pf {
                my $encode = Proc::Async.new($flac, '-s', $pf.new-path, :r);
                whenever $encode.stderr -> $error {
                    $pf.errors.append: $error.chomp;
                }
                whenever $encode.start -> $proc {
                    if $proc.exitcode {
                        $pf.fail;
                    }
                    else {
                        $pf.done;
                    }
                }
            }
            whenever $workflow.enter-supply('done') -> $pf {
                say "File '{ $pf.path }' has been processed to '{ $pf.flac-file }'";
            }
            whenever $workflow.enter-supply('failed') -> $pf {
                say "Processing of file '{ $pf.path }' failed with '{ $pf.errors }'";
            }
            whenever $workflow.transition-supply -> ($trans, $pf) {
                $*ERR.say("File '{ $pf.path }' went from '{ $trans.from.name }' to '{ $trans.to.name }'");
            }
        }
    }

If you start this with an argument of the directory where you want to pick up the files from, it will wait until new files appear, then create a new ProcessFile object and apply the workflow to it. Every object to which the workflow is applied is sent to the applied-supply, which is tapped to check whether the file has already been processed. If it has (and this can happen because the directory watch may emit more than one event for the creation of the file), the object is moved to state ‘rejected’ and no further processing happens; otherwise it is moved to state ‘ready’, whereupon it is copied and, if the copy succeeds, encoded.

Additional states (and transitions to enter them) could easily be added to, for instance, store the details of the encoded file in a database, or even start playing it, or new actions could be added for existing states by adding additional “whenever” blocks. As it stands this will block forever waiting for new files; however, this could be integrated into a larger program by starting it in a new thread, for instance.
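One way to do that in core Perl 6 is to wrap the react block in start, which runs it on the thread pool and hands back a Promise. The sketch below is an assumption about how you might arrange this rather than anything shipped with the example: a short Supply.interval stands in for the directory watch, and it stops itself after three ticks so that the snippet terminates.

```raku
# Run a react block in the background; start returns a Promise straight away.
my $worker = start {
    react {
        # Stand-in for the watch supply: emits 0, 1, 2, ... every 0.1 seconds.
        whenever Supply.interval(0.1) -> $tick {
            say "background tick $tick";
            done if $tick >= 2;     # leave the react block so the sketch ends
        }
    }
    'finished';                      # becomes the result of the Promise
};

# The rest of the program is free to do other work in the meantime.
say "main program carries on";

# Wait for the background work when (and if) we need to.
my $result = await $worker;
say $result;
```

In a real program the react block would not call done, and the main program would keep the Promise around only if it wanted to notice the worker finishing or failing.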

The program and the JSON file are in the examples directory for Tinky::JSON; please feel free to grab and tinker with them.

Not quite all

Tinky has a fair bit more functionality that I don’t think I have space to describe here: there are facilities for the run-time validation of transition application and additional supplies that are emitted to at various stages of the workflow lifecycle. Hopefully your interest is sufficiently piqued that you might look at the documentation.

I am considering adding a cookbook-style document for the module for some common patterns that might arise in programs that might use it. If you have any ideas or questions please feel free to drop me a note.

Finally, I chose a deliberately un-descriptive name for the module as I didn’t want to make a claim that this would be the last word in the problem space. There are probably many more ways that a state managed workflow could be implemented nicely in Perl 6. I would be equally delighted if you totally disagree with my approach and release your own design as I would be if you decide to use Tinky.

Tinky Winky is the purple Teletubby with a red bag.