Intro

Workflows are used in many areas, from automating build processes and CI/CD to order management in online and offline stores and booking a taxi. There are many solutions, from small GitHub projects to enterprise systems, and the implementations vary widely: frameworks that you integrate into your own system, standalone solutions, and workflow engines that ship as part of a larger, more complex system. A few examples of workflow engines: Jenkins, GitLab Pipelines, Cadence, Airflow and Conductor.

In this article I would like to show the basic concept common to all workflow engines.

What are a workflow, a state machine and a workflow engine?

Before looking into the concept, let’s define the terms: workflow, state machine and workflow engine.

Here is what Wikipedia says about the term “workflow”:

A workflow consists of an orchestrated and repeatable pattern of activity, enabled by the systematic organisation of resources into processes that transform materials, provide services, or process information.

I think this description from Wikipedia is clear. I just want to point out that it describes what a workflow is in general and says nothing about any technology, development pattern or mathematical model. The mathematical model of a workflow is a state machine. Let’s see what Wikipedia says about it:

A state machine is a mathematical model of computation. It’s an abstract concept whereby the machine can have different states, but at a given time fulfils only one of them.

So there is a workflow and its mathematical model, and we need something that implements that model. A workflow engine is an implementation of the mathematical model (the state machine). Let’s read Wikipedia one more time =).

A workflow engine manages and monitors the state of activities in a workflow, such as the processing and approval of a loan application form, and determines which new activity to transition to according to defined processes (workflows).
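To make the link between a workflow and a state machine more concrete, here is a minimal sketch in Python. The states, the events and the `StateMachine` class are my own assumptions built around the loan application example from the quote; they are not taken from any particular engine.

```python
# Hypothetical states and events for a loan application workflow.
# Each key is (current state, event); the value is the next state.
TRANSITIONS = {
    ("submitted", "start_review"): "under_review",
    ("under_review", "approve"): "approved",
    ("under_review", "reject"): "rejected",
}


class StateMachine:
    """Holds exactly one current state and moves only along known transitions."""

    def __init__(self, initial, transitions):
        self.state = initial
        self.transitions = transitions

    def fire(self, event):
        key = (self.state, event)
        if key not in self.transitions:
            raise ValueError(f"event {event!r} is not allowed in state {self.state!r}")
        self.state = self.transitions[key]
        return self.state


machine = StateMachine("submitted", TRANSITIONS)
machine.fire("start_review")
machine.fire("approve")
print(machine.state)
```

The machine is in one state at any given time, and an event that is not defined for the current state is rejected instead of being silently ignored.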

Basically, any application implements some workflow and, as I said before, parts of a workflow engine show up in a lot of places. Let’s take a look at the concept of the workflow engine.

Concept

A workflow engine consists of six logical components.

Component diagram of workflow engine

Scheduler

It’s the main part of the workflow engine, where all the magic happens. It schedules the runs, knows the flow definition and monitors the results of task execution. It can be triggered by an event such as a timer or an API call, or manually.
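A very small scheduler could look like the sketch below. It is only an illustration under my own assumptions: the flow is a plain dict that maps a task name to the names it depends on, `run_flow` and the three tasks are hypothetical, and the ordering comes from `graphlib.TopologicalSorter` in the Python standard library (3.9+). Real schedulers also persist state, retry and run tasks in parallel.

```python
from graphlib import TopologicalSorter


def run_flow(flow, tasks):
    """Run tasks in dependency order and record the result of each one.

    flow:  task name -> set of task names it depends on
    tasks: task name -> callable without arguments
    """
    results = {}
    for name in TopologicalSorter(flow).static_order():
        try:
            results[name] = ("success", tasks[name]())
        except Exception as exc:
            # Stop the run at the first failed task.
            results[name] = ("failed", repr(exc))
            break
    return results


def build():
    return "artifact built"


def test():
    return "tests passed"


def deploy():
    return "deployed"


FLOW = {"build": set(), "test": {"build"}, "deploy": {"test"}}
TASKS = {"build": build, "test": test, "deploy": deploy}

results = run_flow(FLOW, TASKS)
print(results)
```

The scheduler here knows the flow definition, decides the order of the runs and keeps the outcome of every task, which are the three responsibilities described above.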

Metadata store

It’s simply the storage for the data that the scheduler needs to orchestrate the runs. Depending on the implementation, it can be an in-memory structure or a database. I’ll describe it in more detail in the next article, about the data concept.
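As a rough illustration, the sketch below keeps task statuses in SQLite through the standard `sqlite3` module. The `MetadataStore` class, the `task_runs` table and its columns are my assumptions for this example only; with the default `":memory:"` path nothing is written to disk, and passing a file path would make the same code use a database file.

```python
import sqlite3


class MetadataStore:
    """Stores the status of every task in every run."""

    def __init__(self, path=":memory:"):
        self.db = sqlite3.connect(path)
        self.db.execute(
            "CREATE TABLE IF NOT EXISTS task_runs ("
            "run_id TEXT, task TEXT, status TEXT, "
            "PRIMARY KEY (run_id, task))"
        )

    def set_status(self, run_id, task, status):
        # A later status for the same run and task overwrites the earlier one.
        self.db.execute(
            "INSERT OR REPLACE INTO task_runs VALUES (?, ?, ?)",
            (run_id, task, status),
        )
        self.db.commit()

    def statuses(self, run_id):
        rows = self.db.execute(
            "SELECT task, status FROM task_runs WHERE run_id = ?", (run_id,)
        )
        return dict(rows.fetchall())


store = MetadataStore()
store.set_status("run-1", "build", "running")
store.set_status("run-1", "build", "success")
store.set_status("run-1", "test", "running")
print(store.statuses("run-1"))
```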

Triggers

The ways in which your workflow can be started. In a framework it’s usually a function call. In a standalone solution it can be some sort of API or a time schedule. A workflow may have several different trigger implementations.
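The sketch below shows two triggers that end up in the same place. The `start_run` function and the flow name are hypothetical; the timer uses `threading.Timer` from the standard library as a stand-in for a real time schedule.

```python
import threading

runs = []


def start_run(flow_name, source):
    """Single entry point that every trigger calls to start a run."""
    runs.append((flow_name, source))


# Framework-style or manual trigger: just a function call.
start_run("nightly_build", source="manual")

# Timer trigger: the same function, called after a short delay.
timer = threading.Timer(
    0.05, start_run, args=("nightly_build",), kwargs={"source": "timer"}
)
timer.start()
timer.join()  # wait for the timer thread so the result is visible below

print(runs)
```

An API trigger would follow the same idea: an HTTP handler that validates the request and then calls the same entry point.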

Executors

The component responsible for running an atomic task and reporting the results of the run. It can be a separate thread inside your app, a container or even a VM.
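Here is a minimal sketch of the simplest kind of executor, a worker thread inside the app, built on `concurrent.futures.ThreadPoolExecutor` from the standard library. The `execute` function and the shape of its return value are my assumptions; a container or VM executor would expose the same idea behind a very different implementation.

```python
from concurrent.futures import ThreadPoolExecutor


def execute(task, *args):
    """Run one atomic task in a worker thread and report (status, value)."""
    with ThreadPoolExecutor(max_workers=1) as pool:
        future = pool.submit(task, *args)
        try:
            return ("success", future.result())
        except Exception as exc:
            # The exception raised inside the task is re-raised by result().
            return ("failed", repr(exc))


def add(a, b):
    return a + b


print(execute(add, 2, 3))
print(execute(int, "not a number"))
```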

Flow repository

The place where flow definitions are stored. Depending on the implementation, it can live in different places and provide additional features beyond storing definitions. A flow is a description of the workflow. I’ll describe it in more detail in the next article, about the data concept.
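To give a feeling for it, the sketch below stores flow definitions in memory and validates them on save, as one example of an "additional feature". The JSON layout (`tasks`, `name`, `depends_on`) and the `FlowRepository` class are assumptions made for this illustration, not the format of any real engine.

```python
import json


class FlowRepository:
    """Keeps flow definitions in memory and validates them on save."""

    def __init__(self):
        self._flows = {}

    def save(self, name, definition_json):
        definition = json.loads(definition_json)
        task_names = {task["name"] for task in definition["tasks"]}
        for task in definition["tasks"]:
            unknown = set(task.get("depends_on", [])) - task_names
            if unknown:
                raise ValueError(
                    f"task {task['name']!r} depends on unknown tasks: {sorted(unknown)}"
                )
        self._flows[name] = definition

    def get(self, name):
        return self._flows[name]


DEPLOY_FLOW = """
{
  "tasks": [
    {"name": "build"},
    {"name": "test", "depends_on": ["build"]},
    {"name": "deploy", "depends_on": ["test"]}
  ]
}
"""

repo = FlowRepository()
repo.save("deploy", DEPLOY_FLOW)
print([task["name"] for task in repo.get("deploy")["tasks"]])
```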

Shared context

As soon as your workflow becomes even a little complex, you need to pass some context from one task to another. This component is responsible for that.
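In its simplest form a shared context is a dictionary that every task receives, reads from and writes to. The sketch below assumes exactly that; the task names and the keys are hypothetical, and a real engine would usually persist the context and restrict what each task can see.

```python
def build(context):
    # The first task writes a value for later tasks.
    context["artifact"] = "app-1.0.zip"


def deploy(context):
    # The second task reads what the first one produced.
    context["deployed"] = f"deployed {context['artifact']}"


def run_tasks(tasks):
    """Run tasks one after another, passing the same context to each."""
    context = {}
    for task in tasks:
        task(context)
    return context


context = run_tasks([build, deploy])
print(context["deployed"])
```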

Conclusion: When do you need a workflow engine?

To answer this question, let’s take a look at two examples.

The first one: “a UI that shows a list of users to the client”. In this case the API has to return a result to the client, and then the client decides what to do next: maybe just analyse the result, or take some action. So the process is not fully automated; there is room for the client to make a decision.

Another example: “get the list of users, find the people whose birthday is today and send them a ‘Happy birthday’ message”. In this scenario the client cares only about, let’s call it, the “job to be done”. The client doesn’t need to analyse the list of users and make a decision, because the behaviour pattern was identified beforehand and now the client just needs to execute it.
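The second example maps naturally onto a flow of three tasks. The sketch below uses made-up users and hypothetical function names, and it returns the messages instead of really sending them; the date is passed in as a parameter so the result does not depend on the day you run it.

```python
from datetime import date

# Made-up data standing in for a real user source.
USERS = [
    {"name": "Ann", "birthday": date(1990, 3, 14)},
    {"name": "Bob", "birthday": date(1985, 7, 2)},
]


def get_users():
    return USERS


def find_birthdays(users, today):
    return [
        user
        for user in users
        if (user["birthday"].month, user["birthday"].day) == (today.month, today.day)
    ]


def send_messages(users):
    # A real task would call a mail or chat service here.
    return [f"Happy birthday, {user['name']}!" for user in users]


def birthday_flow(today):
    """The whole flow: no decision is left to the client."""
    return send_messages(find_birthdays(get_users(), today))


print(birthday_flow(date(2024, 3, 14)))
```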

Simply put, when you need to implement a fully automated process, pick a workflow engine (examples: build processes, CI/CD processes, VM provisioning, order management). When you want to give your client the ability to take an action (examples: show me a list of entities and a set of actions on them, show me the list of my friends in a social network, let me find a video and then play it or comment on it), consider another approach, because a workflow engine, like any unified solution, doesn’t come for free.