Yes, you read that title right. While working on a new way to create finite state machines (fsm), I inadvertently designed a state machine you can execute at compile-time. With branching and everything. Today, allow me to pull you down my rabbit hole. It’s always nicer with company down here.

Let’s go, Alice!

Backstory

I have this friend, you see, and he works in the c++ audio industry. DSP things and all that good stuff. Every time I show him a new piece of code I find interesting, he goes on and on about his “audio callback” and “hard real time” and “skipping a frame” and and and. It is never-ending, I tell you. While discussing some of his current issues with a project he’s working on, I recommended a fsm. It would fix all his problems and probably eradicate many state management bugs he has.

State machines are great. I love them and you should too. When you must have state, they are the way to go. However, remember that “audio callback” my dear friend keeps rambling about. Well, inside one of those, you cannot use std::function . Apparently, you can’t even use C function pointers. You really want your compiler to be able to inline everything it sees fit.

And so, I set out to make an inlinable state machine. One without std::function or function pointers. I want one anyway. [spoiler alert] That fsm isn’t done yet, I’ve been a little bit distracted from my goal.
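To make the "inlinable" idea concrete, here is a minimal sketch of a callback wrapper that stores the callable's type directly instead of erasing it. The name inline_callback is hypothetical and is not part of the fsm. It only illustrates why keeping the lambda type in a template lets the compiler see through the call.

```cpp
#include <utility>

// Hypothetical illustration (not from the fsm): the callable's type is a
// template parameter, so the call below is a direct call that the compiler
// can see through and inline. No std::function, no function pointer.
template <class Func>
struct inline_callback {
    Func func;

    template <class... Args>
    constexpr auto operator()(Args&&... args) const {
        return func(std::forward<Args>(args)...);
    }
};

// Deduction guide so that inline_callback{lambda} works in C++17.
template <class Func>
inline_callback(Func) -> inline_callback<Func>;

// A captureless lambda stored by type, with no type erasure involved.
constexpr inline_callback doubler{ [](int x) { return x * 2; } };

// The whole call chain is usable in a constant expression.
static_assert(doubler(21) == 42, "evaluated at compile time");
```

Because nothing is type-erased, the same wrapper works at runtime and at compile time.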

As my new fsm design evolved, all the while gazing absently at my sanity escaping, I realised what I had done. A constexpr executable state machine, with many of the bells and whistles you’d expect. Forgive me Bjarne for I have sinned.

What Does This Look Like

Since I’ll be going pretty deep in terms of c++ stuff later on, how about I start with a simple demonstration. This is a very theoretical example, used to demonstrate the capabilities of the system.

The API, on Windows at least, isn’t half bad if I do say so myself. I was going to write a poem for Microsoft’s compiler, but my poetry skills threw an exception. Instead, here is a heartfelt letter to Visual Studio’s compiler. Written at compile-time, of course.

OK, not necessarily impressive.

Let’s look at the code used to generate this wonderful and delightful masterpiece of literature. I will assume you’ve worked with fsms before, since explaining that concept here will turn this already too long post into a way too short novel. I will also use the Visual Studio version of the code, as it is more readable and doesn’t need any macros (clang and gcc need 1 macro, more on that later).

First, let’s define our states and transitions and extra things we’ll need.

    // Used to demonstrate branching.
    #if defined(NDEBUG)
    static constexpr bool debug_build = false;
    #else
    static constexpr bool debug_build = true;
    #endif

    // Used to turn on/off messages.
    static constexpr bool assert_val = true;

    enum class transition {
        do_intro,
        do_debug,
        do_release,
        do_paragraph,
        do_outro,
        count, // count is required by fsm
    };

    enum class state {
        intro,
        debug,
        release,
        paragraph,
        outro,
        count, // count is required by fsm
    };

Nothing special going on here. We have a few compile-time variables to branch on, and we define our letter writing state machine. We will start in intro , then either transition to debug or release , then paragraph and finally outro .

Let’s create the intro state.

    fsm_builder<transition, state> builder;

    // intro
    auto intro_transitions = builder
        .make_transition<transition::do_debug, state::debug>()
        .make_transition<transition::do_release, state::release>();

    auto intro_events = builder
        .make_event<fsm_event::on_enter>([](auto& machine) {
            static_assert(assert_val, "Dear");
            if constexpr (debug_build) {
                return machine.template trigger<transition::do_debug>();
            } else {
                return machine.template trigger<transition::do_release>();
            }
        })
        .make_event<fsm_event::on_update>([](auto&) { return 0; })
        .make_event<fsm_event::on_exit_to, state::debug>([](auto&) {
            static_assert(assert_val, "slow Visual Studio Compiler,");
        })
        .make_event<fsm_event::on_exit_to, state::release>([](auto&) {
            static_assert(assert_val, "relatively fast Visual Studio Compiler ;)");
        });

    auto intro_state
        = builder.make_state<state::intro>(intro_transitions, intro_events);

OK! Things are getting interesting. First we create an fsm_builder . It needs to know about our transition and our state enum classes. I’ve discussed this non-type template technique in the past here. I really like it. It is very robust and all the user needs to do is make sure their enums end with count , which is a best-practice anyhow.
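As a rough illustration of why the trailing count matters, here is a small sketch of the non-type template technique. The enum_table helper is hypothetical and only stands in for the idea: the enum type is a template parameter, count sizes the storage, and enum values are passed as non-type template arguments so that lookups are checked at compile time.

```cpp
#include <array>
#include <cstddef>

enum class state { intro, debug, release, paragraph, outro, count };

// Hypothetical helper (an assumption, not the fsm's code): a table with one
// slot per enum value, sized by the trailing `count` enumerator.
template <class Enum, class T>
struct enum_table {
    std::array<T, std::size_t(Enum::count)> data{};

    // The enum value is a non-type template parameter, so asking for
    // `count` itself is rejected at compile time.
    template <Enum E>
    constexpr const T& get() const {
        static_assert(E != Enum::count, "count is not a valid entry");
        return data[std::size_t(E)];
    }
};

// One slot per state: intro, debug, release, paragraph, outro.
constexpr enum_table<state, int> visits{ { 1, 0, 1, 1, 1 } };

static_assert(visits.get<state::intro>() == 1);
static_assert(visits.get<state::debug>() == 0);
```

The same trick generalizes to any enum class that follows the count convention.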

With that builder, we create our transitions and events. The API is written in this “chaining” style to simplify usage. make_transition accepts a transition enum value and a state enum value. This associates each transition with the state it leads to. Triggering a transition that isn’t handled by a state will static_assert .

The events are self-explanatory. The fsm supports the standard on_enter , on_update , on_exit and on_enter_from . And the rare, yet infinitely powerful, on_exit_to . You must provide the machine with lambdas, as function pointers are runtime. The first argument of each of your event lambdas is a reference to the state machine itself. This is quite useful with runtime machines, and as you’ll see, with constexpr machines as well. After that, you can add whichever argument you wish, since we are storing your lambda types directly. The machines are very flexible that way.

Here is a rundown of the intro state execution path.

When we enter the state, we will print “Dear” (using static_assert ).

Then, depending on whether or not we are compiling in debug, we will trigger the do_debug transition or the do_release transition.

The on_update event will never be called, but if it were, it would return 0.

Finally, just to show an example, we use the on_exit_to event. If we exit to debug , we print “slow Visual Studio Compiler,”. If not, we print “relatively fast Visual Studio Compiler”.

Notice that the triggers must return their value. The next state is encoded in the return value and the machine needs this. Also, clang and gcc complain when a template member function is called on a dependent type without the template keyword, so I write .template .
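Here is a tiny sketch of that .template requirement in isolation. The toy_machine type is hypothetical and exists only to show the syntax. Inside a generic lambda the argument's type is dependent, so the parser needs to be told that trigger names a template.

```cpp
// Hypothetical stand-in for a state machine with a templated trigger.
struct toy_machine {
    template <int N>
    constexpr int trigger() const { return N; }
};

// `m` has a dependent type here. Without the `template` keyword, the `<`
// after `trigger` would be parsed as a less-than operator.
constexpr auto call_trigger = [](const auto& m) {
    return m.template trigger<42>();
};

static_assert(call_trigger(toy_machine{}) == 42);
```

When the object's type is known (not a template parameter or auto), the keyword is not needed, which is why calls on builder in the examples use plain member syntax.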

Finally, we create the state using the builder’s make_state . We pass our state::intro enum value in the template, and provide the transitions and events as input arguments. In that order.

Creating the other states is more of the same. We first make our transitions, then our events, and create states using those. Here they are.

    // debug
    auto debug_transitions
        = builder.make_transition<transition::do_paragraph, state::paragraph>();
    auto debug_events = builder
        .make_event<fsm_event::on_enter>([](auto& m) {
            static_assert(assert_val, "In debug mode,");
            return m.template trigger<transition::do_paragraph>();
        })
        .make_event<fsm_event::on_update>([](auto&) { return 1; });
    auto debug_state
        = builder.make_state<state::debug>(debug_transitions, debug_events);

    // release
    auto release_transitions
        = builder.make_transition<transition::do_paragraph, state::paragraph>();
    auto release_events = builder
        .make_event<fsm_event::on_enter>([](auto& m) {
            static_assert(assert_val, "In release mode,");
            return m.template trigger<transition::do_paragraph>();
        })
        .make_event<fsm_event::on_update>([](auto&) { return 2; });
    auto release_state
        = builder.make_state<state::release>(release_transitions, release_events);

    // paragraph
    auto par_transitions
        = builder.make_transition<transition::do_outro, state::outro>();
    auto par_events = builder
        .make_event<fsm_event::on_enter>([](auto& m) {
            static_assert(assert_val,
                    "We've been very critical of you in the "
                    "past.");
            return m.template trigger<transition::do_outro>();
        })
        .make_event<fsm_event::on_update>([](auto&) { return 3; })
        .make_event<fsm_event::on_exit>([](auto&) {
            static_assert(assert_val, "And we still are.");
        });
    auto par_state
        = builder.make_state<state::paragraph>(par_transitions, par_events);

    // outro
    auto outro_events = builder
        .make_event<fsm_event::on_enter>([](auto&) {
            static_assert(assert_val, "But look how you've grown!");
        })
        .make_event<fsm_event::on_update>([](auto&) { return 42; });
    auto outro_state
        = builder.make_state<state::outro>(builder.empty_t(), outro_events);

Nothing new here.

We define on_enter events and on_update events. The triggers occur on enter. Finally, the outro doesn’t trigger anything or go anywhere, which will stop the “execution” of the fsm. The builder has helpers empty_t() and empty_s() since you must provide the adequate arguments when calling make_state .

This is quite verbose, but hopefully not too cryptic. Now, let’s run this machine. Hold on to yer helmets!

    // Tada!
    auto machine = builder.make_machine(intro_state, debug_state,
            release_state, par_state, outro_state).init();

Yep, that’s it.

We create the state machine using make_machine and our previously created states. Then, we execute init() which will call on_enter of our starting state (the first state you pass to make_machine ). That will in turn execute its trigger, which will in turn proceed to the next state, which will execute its trigger, etc. Until we reach outro.

You don’t believe me? Let’s see what value our state machine holds.

    constexpr auto result = machine.update();
    static_assert(result == 42, "Wrong answer to life.");

Not only does this compile, but since we are now in our outro state, it correctly returns the answer to life : 42. Of course, since we executed the machine at compile-time, we can capture the resulting value as constexpr , and I guess do some very fancy things with it!

More Realistic Scenarios

This is great and all, but not very useful. In my unit tests, I have an example of a state machine that conditionally fills up a tuple at compile-time, according to its transitions. You can imagine building a plethora of objects at compile-time, dependent on some compile options provided by the user or the environment.

You can also pass on the state machine as a template argument, so you can use it to conditionally initialize structs’ static constexpr variables. For example.

    template <class Machine>
    struct my_obj {
        static constexpr auto my_conditional_variable = Machine::update();
    };

    // Then, fill a tuple with types who've been constructed conditionally.
    // Call triggers on the state machine to change value returned by .update().

Fsms are good at managing state. Compile-time branching is state. So, if your software has a mess of ifdefs and compile-time checks, this fsm should help you manage that. You can imagine a scenario where we are compiling cross-platform; windows, mac and linux. We have code that is generated dependent on the platform, in addition to a debug and release build type. Maybe we have some code that also checks c++ feature macros, etc. This becomes quite unmanageable in the long term.

You could define a state machine with 3 base states, windows, mac and linux. Then, define transitions to the code that needs to be generated. Lets say, an instantiate_renderer state. You can now override the behavior using on_enter_from and on_exit_to . Since the fsm doesn’t have conditional transitions (aka transition guards), you could use different states for debug and release, so on and so forth.

With a simple init() call, you can generate all your compile-time branch dependent code, and make sure the behavior is well thought out, predictable, clear to understand and static_asserts if you forgot to implement a required transition or state.

Caveats

I’ve already mentioned a few, but I feel the need to discuss the limitations of the design.

One big trade-off is compile-times, clearly. The APIs are also very verbose and not as visually pleasing as with runtime state machines. Though I’d say it is very reasonable considering what we are doing. Error messages, well… You know what you are getting into.

Since I’m unsure how useful this fsm is in reality, I’m wary of adding any more features to the machine. I do not want to overcomplicate the code for something that seems very niche at this point.

Another limitation is, whenever you call trigger inside your fsm, you must return the value. If you call trigger during an update, you will need to capture the result as it is your new “state”. For example :

    auto my_machine = builder.make_machine(...).init();

    // If updates call triggers, you must capture the result.
    auto my_new_machine = my_machine.update();

This makes the fsm much less useful overall, though I’ve found quite a few ways around this while writing examples and unit tests.
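To see why the result has to be captured, consider a toy machine whose current state is part of its type. This is only a sketch under that assumption, with hypothetical names, and not the real fsm. A transition cannot mutate the object, since the state is not stored in it, so the only way to "change state" is to return an object of a different type.

```cpp
#include <type_traits>

enum class state { off, on, count };
enum class transition { toggle, count };

// Hypothetical toy machine: the current state lives in the type itself.
template <state Current>
struct toy_machine {
    template <transition T>
    static constexpr auto trigger() {
        static_assert(T == transition::toggle, "unhandled transition");
        // Each branch returns a different type, which is fine because the
        // discarded branch does not take part in return type deduction.
        if constexpr (Current == state::off) {
            return toy_machine<state::on>{};
        } else {
            return toy_machine<state::off>{};
        }
    }

    static constexpr state current() { return Current; }
};

constexpr auto m0 = toy_machine<state::off>{};
constexpr auto m1 = m0.trigger<transition::toggle>(); // a different type

static_assert(!std::is_same_v<decltype(m0), decltype(m1)>);
static_assert(m0.current() == state::off); // the original is unchanged
static_assert(m1.current() == state::on);
```

Discarding the return value of trigger here would simply lose the transition.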

If you want to return a user value from your update events, while also calling trigger , you could return a tuple of both the trigger return value and a std::optional user value. Since the return types are also yours to customize, you can really design a state machine that fits your needs.
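A minimal sketch of that workaround could look like the following. The types next_machine and current_machine are hypothetical placeholders for whatever trigger returns and for the machine passed to the event. Only the shape of the return value matters here.

```cpp
#include <optional>
#include <tuple>

// Hypothetical stand-ins: the machine handed to the event, and the type a
// trigger call would return.
struct current_machine {};
struct next_machine {};

// Hypothetical update event returning both the "new machine" and an
// optional user value.
constexpr auto on_update = [](const auto& /*machine*/) {
    return std::make_tuple(next_machine{}, std::optional<int>{ 42 });
};

constexpr auto result = on_update(current_machine{});

// Element 0 is the new machine, element 1 is the user value.
static_assert(std::get<1>(result).has_value());
static_assert(*std::get<1>(result) == 42);
```

An update that has nothing to report could return std::nullopt in the second slot while keeping the same return type.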

The MSVC compiler is very relaxed compared to clang and gcc. It allows for static constexpr variables inside static constexpr functions. This isn’t “legal” C++, though I’ve read there is a proposal for such a thing. Until we get this feature, on clang and gcc, you will need to use a macro fea_event to pass on your events. The first argument is your event name, and the second is your event lambda. You later provide the event name to the make_event function. Like so.

    fea_event(intro_onenter, [](auto& machine) {
        static_assert(assert_val, "Dear");
        if constexpr (debug_build) {
            return machine.template trigger<transition::do_debug>();
        } else {
            return machine.template trigger<transition::do_release>();
        }
    });
    fea_event(intro_onupdate, [](auto&) { return 0; });
    fea_event(intro_onexitto_debug, [](auto&) {
        static_assert(assert_val, "slow Visual Studio Compiler,");
    });
    fea_event(intro_onexitto_release, [](auto&) {
        static_assert(assert_val, "relatively fast Visual Studio Compiler ;)");
    });

    auto intro_events = builder
        .make_event<fsm_event::on_enter>(intro_onenter)
        .make_event<fsm_event::on_update>(intro_onupdate)
        .make_event<fsm_event::on_exit_to, state::debug>(intro_onexitto_debug)
        .make_event<fsm_event::on_exit_to, state::release>(intro_onexitto_release);

This is the default behavior on MSVC as well, unless you define FEA_FSM_NO_EVENT_WRAPPER .

How It Works

Before I go into the details, it is mandatory you understand the concepts presented in this great read. If you do not understand Michael Park’s post, you will not understand what I am about to show. If you do however, then it’s smooth sailing from here till the end of this never ending wall of text.

Alright, let’s do this. I won’t go over every single line of code here, as you can find the full source online (links at the end). I just want to explain the major concepts.

The state machine uses a very simple container I dubbed tuple_map . Think of it as a compile-time std::map , where you can query objects using types instead of runtime values. The tuple map is simply 2 tuples. One with keys, known to us and with which we’ll do our searches, and one with values unknown to us (the lambdas). Here is a rough sketch of what it looks like, just to get an idea.

    struct tuple_map {
        // Search by template.
        template <class Key>
        static constexpr const auto& find() {
            constexpr size_t idx = tuple_idx_v<Key, keys_tup_t>;
            return std::get<idx>(_values);
        }

        // Does the map contain the key?
        template <class Key>
        static constexpr bool contains() {
            constexpr bool ret = tuple_contains_v<Key, keys_tup_t>;
            return ret;
        }

    private:
        // Used to find the index of the value inside the other tuple.
        std::tuple<...> _keys = /* details on how this is initialized are coming */;

        // Your values.
        std::tuple<...> _values = /*...*/;

        // This is the type used to find indexes and whether our map contains elements.
        using keys_tup_t = std::decay_t<decltype(_keys)>;
        using values_tup_t = std::decay_t<decltype(_values)>;
    };

tuple_idx_v and tuple_contains_v are simple type traits you can find on stackoverflow. They would be nice additions to the standard.
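For reference, one common way to write such traits is sketched below. This is an assumption about their shape and not necessarily the implementation used in the fsm. tuple_idx_v finds the index of the first matching type, and tuple_contains_v reports whether a type is present at all.

```cpp
#include <cstddef>
#include <tuple>
#include <type_traits>

// Index of the first occurrence of T in a std::tuple.
// Asking for a type that isn't in the tuple is a hard compile error.
template <class T, class Tuple>
struct tuple_idx;

// Found it: T is the first element.
template <class T, class... Rest>
struct tuple_idx<T, std::tuple<T, Rest...>>
        : std::integral_constant<std::size_t, 0> {};

// Not the first element: skip it and add one to the index.
template <class T, class First, class... Rest>
struct tuple_idx<T, std::tuple<First, Rest...>>
        : std::integral_constant<std::size_t,
                  1 + tuple_idx<T, std::tuple<Rest...>>::value> {};

template <class T, class Tuple>
inline constexpr std::size_t tuple_idx_v = tuple_idx<T, Tuple>::value;

// Does the tuple contain T?
template <class T, class Tuple>
struct tuple_contains;

template <class T, class... Ts>
struct tuple_contains<T, std::tuple<Ts...>>
        : std::disjunction<std::is_same<T, Ts>...> {};

template <class T, class Tuple>
inline constexpr bool tuple_contains_v = tuple_contains<T, Tuple>::value;

using keys = std::tuple<int, char, double>;
static_assert(tuple_idx_v<char, keys> == 1);
static_assert(tuple_contains_v<double, keys>);
static_assert(!tuple_contains_v<float, keys>);
```

With these two traits, a lookup by key type becomes "find the index in the keys tuple, then std::get that index from the values tuple".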

Next up, let’s dive into our “builders”. The state machine encodes all its information into templates. That is why I use chaining APIs. The builder will return a builder type, which can in turn return a new builder type with the appropriate information encoded in the templates. Here is a “pseudo-c++” example just to illustrate the idea.

    // Start here.
    struct builder {
        auto make_transition() {
            // Start with a parent of type void.
            return transition_builder<int, void>{};
        }
    };

    // Then, return a builder which can recursively return builders
    // with data encoded in the template type.
    template <class Whatever, class Parent>
    struct transition_builder {
        template <class NewType>
        auto make_transition() {
            // Recursively append the Parent information to the template.
            using parent_t = transition_builder<Whatever, Parent>;

            // And prepend the new information coming from this call to make_transition.
            return transition_builder<NewType, parent_t>{};
        }
    };
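The pseudo-code above can be turned into a small compilable toy. This sketch uses a hypothetical chain_builder that encodes integers instead of transitions, purely to show how each call in the chain nests the previous type as a parent, and how a static function can later walk that chain.

```cpp
#include <type_traits>

// Hypothetical toy builder: every call to add() returns a new type that
// wraps the previous builder type as its Parent.
template <int Value, class Parent>
struct chain_builder {
    template <int NewValue>
    static constexpr auto add() {
        return chain_builder<NewValue, chain_builder<Value, Parent>>{};
    }

    // Walk the parent chain and sum everything encoded so far.
    static constexpr int sum() {
        if constexpr (std::is_same_v<Parent, void>) {
            return Value; // end of the chain
        } else {
            return Value + Parent::sum();
        }
    }
};

constexpr auto b = chain_builder<1, void>{}.add<2>().add<3>();

// The whole call history is encoded in the type of `b`.
static_assert(std::is_same_v<std::decay_t<decltype(b)>,
        chain_builder<3, chain_builder<2, chain_builder<1, void>>>>);
static_assert(b.sum() == 6);
```

Note that b itself holds no data. Everything needed to compute the sum is in its type.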

Now, what if we were to encode in those templates the information we need to build keys for our tuple_map mentioned earlier? We could build up a struct which contains all the transitions and their destination states. All we would need is some function to make sense of all this. Let’s do just that.

First, we define a key which contains the enum types and their non-type templates. We need this as the tuple_map really doesn’t play friendly with non-type templates. This is what we’ll use later on to find transitions.

    template <class TransitionEnum, TransitionEnum FromTransition>
    struct fsm_transition_key {};

    // Usage example :
    // TransitionEnum is the user enum class of transitions.
    // FromTransition is a non-type template of that enum, for example transition::do_outro.
    StateEnum my_target_state
        = my_tuple_map.find<fsm_transition_key<TransitionEnum, FromTransition>>();

Now back to the transition_builder , we add an unpack() api which will unroll all of our gathered transitions (encoded in our template types) and return a tuple of that. This time, not pseudo-code.

    // We need, the type of the transition enum, the type of the state enum,
    // the transition enum value, the state enum value and the parent types.
    template <class TransitionEnum, class StateEnum, TransitionEnum FromT,
            StateEnum ToS, class Parent>
    struct fsm_transition_builder {
        // Creates a new transition, with key NewFromT transition and value NewToS state.
        template <TransitionEnum NewFromT, StateEnum NewToS>
        static constexpr auto make_transition() {
            // The recursively encoded parent type.
            // This is our "current" struct type.
            using parent_t = fsm_transition_builder<TransitionEnum, StateEnum,
                    FromT, ToS, Parent>;

            // Our new transition_builder, which contains all the previous information
            // and the new template parameters from this function call.
            return fsm_transition_builder<TransitionEnum, StateEnum, NewFromT,
                    NewToS, parent_t>{};
        }

        // Unroll all the templates into something useable to build a tuple_map of :
        // keys : fsm_transition_key<TransitionEnum, TransitionEnumValue>
        // values : StateEnumValue
        static constexpr auto unpack() {
            // We aren't done recursing.
            if constexpr (!std::is_same_v<Parent, void>) {
                // We output a tuple<tuple<key, value>>
                // No particular reason why I didn't use a tuple<pair<key, value>>
                // We will later on convert that into 2 tuples, one of keys, one of values.

                // tuple_cat our tuple<tuple<key, value>> with the parent's
                return std::tuple_cat(
                        std::make_tuple(std::tuple{
                                fsm_transition_key<TransitionEnum, FromT>{}, ToS }),
                        Parent::unpack());
            } else {
                // End the recursion.
                return std::make_tuple(std::tuple{
                        fsm_transition_key<TransitionEnum, FromT>{}, ToS });
            }
        }
    };

I tried to explain as best I can in the comments. Ultimately, we recurse through our types and build a tuple<tuple<key, value>> for later. That will later be split into a tuple<keys...> and a tuple<values...> used to build our tuple_map .

Static Constexpr

There is one last thing to go over, and that is really the key to making this whole thing compile-time. Everything needs to be static constexpr . The “member variables”, the functions, absolutely everything. That is why, as a user, you don’t really need to use constexpr when you create your transitions or your states. The objects don’t really exist. Everything is encoded in templates, through static constexpr .

For example, let’s take our fsm_transition_builder and initialize a real tuple_map with it. First, we need to convert the tuple<tuple<key, value>> into 2 separate tuples.

    template <class Builder>
    constexpr auto make_tuple_map() {
        // At compile-time, take the tuple of tuple coming from the
        // builder, iterate through it, grab the first elements of
        // nested tuples (the key) and put that in a new tuple, grab
        // the second elements and put that in another tuple.

        // Basically, go from tuple<tuple<key, value>...> to
        // tuple<tuple<key...>, tuple<values...>>
        // which is our map basically.

        struct keys_tup {
            static constexpr auto unpack() {
                constexpr size_t tup_size
                        = std::tuple_size_v<decltype(Builder::unpack())>;

                return detail::tuple_expander5000<tup_size>(
                        [](auto... Idxes) constexpr {
                            constexpr auto tup_of_tups = Builder::unpack();
                            // Gets all the keys.
                            return std::make_tuple(std::get<0>(
                                    std::get<decltype(Idxes)::value>(tup_of_tups))...);
                        });
            }
        };

        struct vals_tup {
            static constexpr auto unpack() {
                constexpr size_t tup_size
                        = std::tuple_size_v<decltype(Builder::unpack())>;

                return detail::tuple_expander5000<tup_size>(
                        [](auto... Idxes) constexpr {
                            constexpr auto tup_of_tups = Builder::unpack();
                            // Gets all the values.
                            return std::make_tuple(std::get<1>(
                                    std::get<decltype(Idxes)::value>(tup_of_tups))...);
                        });
            }
        };

        return tuple_map<keys_tup, vals_tup>{};
    }

Yikes!

OK here’s what to keep an eye out for. First off, we don’t even need to accept a variable as input. Since our fsm_transition_builder only has static constexpr functions, we only need to know its type. That is the magic described in mpark’s blog post and how you write truly compile-time only code.

std::apply didn’t work here as it resulted in a non-constexpr expression. Instead, I use the descriptively named tuple_expander5000 , which takes a non-type size_t and calls your lambda with some generated std::integral_constant<size_t, I>... . This way, we can use the underlying index value to access a compile-time index decltype(Idxes)::value . With that value, we std::get each nested tuple, and we either std::get<0> or std::get<1> to extract our desired key or value. Finally we rebuild the tuples.
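As a sketch of how such an expander could be written, here is a version built on std::index_sequence. The names tuple_expander and expand_impl, and the pairs stand-in for a builder, are hypothetical. The real tuple_expander5000 may differ. The point is that each index arrives as its own std::integral_constant type, so its value remains a constant expression inside the lambda.

```cpp
#include <cstddef>
#include <tuple>
#include <type_traits>
#include <utility>

namespace sketch {
template <class Func, std::size_t... I>
constexpr auto expand_impl(Func func, std::index_sequence<I...>) {
    // Pass each index as a distinct type rather than as a runtime value.
    return func(std::integral_constant<std::size_t, I>{}...);
}

// Calls func with integral_constant<size_t, 0> ... integral_constant<size_t, N-1>.
template <std::size_t N, class Func>
constexpr auto tuple_expander(Func func) {
    return expand_impl(func, std::make_index_sequence<N>{});
}
} // namespace sketch

// Hypothetical stand-in for a builder: unpack() yields tuple<tuple<key, value>...>.
struct pairs {
    static constexpr auto unpack() {
        return std::make_tuple(
                std::make_tuple('a', 1), std::make_tuple('b', 2));
    }
};

// Extract the second element of every nested tuple.
constexpr auto values = sketch::tuple_expander<2>([](auto... idxes) {
    constexpr auto tup_of_tups = pairs::unpack();
    return std::make_tuple(
            std::get<1>(std::get<decltype(idxes)::value>(tup_of_tups))...);
});

static_assert(std::get<0>(values) == 1 && std::get<1>(values) == 2);
```

Swapping std::get<1> for std::get<0> in the lambda would extract the keys instead.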

Coffee break.

Notice this expansion is encoded in a struct, inside a static constexpr auto unpack() function as well. I use local structs throughout the system, since they can encode the incoming templates in their unpack() functions. We also need to do this for the tuple_map , as it needs to have static constexpr “member” variables. Like so.

    // The real tuple_map.
    template <class KeysBuilder, class ValuesBuilder>
    struct tuple_map {
        /* ... */

    private:
        // Used to find the index of the value inside the other tuple.
        static constexpr auto _keys = KeysBuilder::unpack(); // <--- look here

        // Your values.
        static constexpr auto _values = ValuesBuilder::unpack(); // <--- and here

        using keys_tup_t = std::decay_t<decltype(_keys)>;
        using values_tup_t = std::decay_t<decltype(_values)>;
    };

Now it should start to all make sense.

To initialize static constexpr member variables, we cannot use a constructor. Not even a constexpr constructor (those would be much more useful if the compiler could figure out this has been initialized as constexpr ).

To initialize our compile-time objects, we must pack data into template types. Later, we need to unpack that data calling a static constexpr function, which only requires we know the type. Magic!

Almost The End

One final gripe and the necessary macro. Visual Studio happily compiles and executes the following.

    struct event_builder {
        template <class Func>
        static constexpr auto make_event(Func func) {
            // Store a static constexpr version of the passed in user lambda,
            // until we need to unpack it later to copy it into a tuple_map.
            static constexpr auto f = func; // <--- problematic line

            struct func_wrapper {
                // Return the user function.
                static constexpr auto unpack() {
                    return f;
                }
            };

            using parent_t = /* not important right now */;
            return fsm_event_builder</* things */, func_wrapper>{};
        }
    };

However, static constexpr variables inside static constexpr functions aren’t supported by standard c++. Even though I read about a constexpr function relaxation proposal, I couldn’t find its status. It has likely not been prioritized since use cases are few and far between. Anyhow, because of this language limitation we must provide a user macro which will wrap a user lambda into a static constexpr function.

    // Do some pasting to make the struct name unique.
    // Create a struct which "pastes" the user lambda f in a static constexpr function.
    // Creates an instance of this struct, named name.
    #define FEA_TOKENPASTE(x, y) x##y
    #define FEA_TOKENPASTE2(x, y) FEA_TOKENPASTE(x, y)
    #define fea_event(name, f) \
        struct FEA_TOKENPASTE2( \
                FEA_TOKENPASTE2(fea_event_builder_, name), __LINE__) { \
            using is_event_builder = int; \
            static constexpr auto unpack() { \
                return f; \
            } \
        } name

    // Usage
    fea_event(intro_onenter, [](auto&) { /* do something */ });
    auto intro_events = builder.make_event<fsm_event::on_enter>(intro_onenter);

    // The macro generates :
    struct mybrainismeltingatthispoint {
        static constexpr auto unpack() {
            return [](auto&) { /* do something */ };
        }
    } intro_onenter;

This macro pleases clang and gcc.
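To see why this wrapping is enough, here is a hand-written sketch of what the macro produces, followed by a consumer that only needs the wrapper's type. The names intro_onenter_wrapper and event_holder are hypothetical, and the lambda takes an int purely so there is a result to check.

```cpp
// What the macro expands to, written by hand: the lambda lives inside a
// static constexpr function, so no variable of lambda type has to be
// passed around.
struct intro_onenter_wrapper {
    static constexpr auto unpack() {
        return [](int x) { return x + 1; };
    }
};

// Hypothetical consumer: given only the wrapper *type*, it can materialize
// the lambda as a static constexpr member.
template <class Wrapper>
struct event_holder {
    static constexpr auto func = Wrapper::unpack();
};

static_assert(event_holder<intro_onenter_wrapper>::func(41) == 42);
```

This works in standard C++17 because a captureless closure type is a literal type and the lambda's call operator is implicitly constexpr.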

Conclusion

WTF is wrong with me!?

Ahem.

In conclusion, we’ve seen that where there’s a will and a c++17 compiler, you can pretty much accomplish whatever the hell you want. I think I’ve been through all the key points required to create a compile-time executable state machine in c++. If you understand all these template meta programming concepts well, sky is the limit! The full code is actually quite compact and readable, which goes to show how much the language has evolved.

Below is the github repo with my state machine collection, including this one. There are a few more examples in the unit tests. I only very recently made these public and they are far from final. I still need to go through and cleanup a lot of stuff (especially the state chart). Also, I haven’t finished the real runtime state machine that uses the tuple_map design. It is in the works.

Here is the full source code on github : constexpr_fsm.hpp

Here are some more examples : windows events and cross platform events

Thank you for reading, I hope you learned a thing or two!