Now that we have gone over what Agents are, let’s try to figure out what they are good for and under what circumstances people use them.

What are Agents good for?

There is no simple answer here, and anything I can come up with will be more of an opinion than an answer… but let’s take a stab at it.

The Agent abstraction is an extremely useful tool, especially when there is no sane framework to think about state and concurrency (as is the case on the JVM). In the BEAM, concurrency and state are at the center, embodied by processes that keep their own internal state and share nothing. But the abstractions provided in the BEAM are, as their names indicate, generic (e.g. gen_server, gen_statem, etc.), while the Agent abstraction serves a more defined, specific purpose.

A useful way to rephrase the question above would be: under what specific circumstances would you use an Agent instead of something else that either the JVM or the BEAM provides?

To try to answer this question we can look at how Agents are used in open-source Clojure (JVM) projects on GitHub. My investigation, which is not very scientific (to say the least), consisted of searching for the term “agent” in the Clojure files of the public repositories of some of the companies listed as using Clojure here (this link will take you to one of these searches). The results show very few examples of Agents in use.

From these very limited observations, my impression is that Agents are used mostly in tests, whenever some concurrency needs to be generated.

(If you know this is not the case, please let me know.)

Agents also sound a lot like BEAM processes: you can send actions (messages) to them, and they keep a state of their own (internal state) that gets changed based on those actions (the receive loop processes messages). But looks are deceiving. The crucial difference between an Agent and a process is the independence between state and control. With a process you need to send a message to get its state; if the process is busy for a long time, or its message queue is too long, we will be left waiting for the state until our message is handled. An Agent’s state, on the other hand, is available immediately.

Additionally, there is a section in “The Joy of Clojure” (a great book) that compares Agents with processes, which has an interesting point of view (although I don’t fully agree with everything).

Finally, even though Agent usage seems to be quite limited in the wild, Agents are one of a few Clojure concurrency tools and some people might rely on them. It could still prove fruitful to think about different implementation alternatives, to assess how much work each would involve and whether it’s possible to have something robust enough to be useful.

Implementation Alternatives

So let’s try to think of different ways we could implement Agents using what the BEAM provides.

State access should be independent of the processing of actions, as we saw in point 3 above (“State is always immediately available for reading”), so we should keep the state separate from whatever mechanism changes it. As mentioned before this can be easily done by keeping the Agents’ state in an ETS table.
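A minimal sketch of this separation, written in Python purely for illustration. A lock-guarded dict stands in for the ETS table and a thread stands in for whatever process applies the actions. The names agent_create, agent_deref and agent_apply are hypothetical and are not part of Clojerl.

```python
import threading

# Stand-in for an ETS table: agent id -> current state.
# (Assumption: a dict guarded by a lock; ETS needs no such lock.)
_table = {}
_table_lock = threading.Lock()

def agent_create(agent_id, initial):
    with _table_lock:
        _table[agent_id] = initial

def agent_deref(agent_id):
    """Read the state straight from the table, without asking any worker."""
    with _table_lock:
        return _table[agent_id]

def agent_apply(agent_id, action, *args):
    """Run one action and publish its result.

    The lock is held only while reading or writing the table, never while
    the action itself runs. Assumes a single writer per agent.
    """
    old = agent_deref(agent_id)
    new = action(old, *args)          # may take arbitrarily long
    with _table_lock:
        _table[agent_id] = new

# Demo: a slow action is in flight, yet the read returns at once.
gate = threading.Event()

def slow_inc(state):
    gate.wait()                       # simulates a long-running action
    return state + 1

agent_create("a", 0)
worker = threading.Thread(target=agent_apply, args=("a", slow_inc))
worker.start()
during = agent_deref("a")             # old state, no waiting on the worker
gate.set()                            # let the action finish
worker.join()
after = agent_deref("a")              # new state, published by the worker
```

The reader never talks to the worker, so a busy or blocked worker cannot delay a read.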

The next thing we need to think about is what will apply the actions to each Agent and how.

1. Single Process

A single process that applies all actions to all existing Agents. The approach is simple enough, but it has obvious limitations. One of them is that the single process would be a contention point for every action trying to modify an Agent, whether CPU-bound or blocking on IO, since anything that blocks this single process would hold up every other Agent’s actions.

2. One Process per Agent

Actions for each Agent would be independent, but there is no safeguard against creating too many Agents and thus making poor use of resources. Creating processes is extremely cheap, so the overhead would mostly be in the memory footprint. Both CPU-bound and blocking IO actions would be run by the same process, which is fine since actions must be applied sequentially to the Agent. The BEAM has a built-in feature where it uses an async thread pool for IO operations, to avoid blocking other processes. It’s possible to configure how many threads this pool has by tuning the VM when starting it.
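As a rough model of this option, here is a sketch in Python where one thread per Agent stands in for one BEAM process per Agent. The class and method names (Agent, send, deref, await_all, close) are assumptions made for the example. Recording a failed action in an error attribute is also a simplification, not a description of how Clojure Agents handle errors.

```python
import queue
import threading

class Agent:
    """One worker thread per agent; the thread stands in for a BEAM process."""

    _CLOSE = object()                 # sentinel that stops the worker

    def __init__(self, initial):
        self._state = initial
        self.error = None             # last exception raised by an action
        self._mailbox = queue.Queue()
        self._worker = threading.Thread(target=self._loop, daemon=True)
        self._worker.start()

    def _loop(self):
        while True:
            item = self._mailbox.get()
            try:
                if item is Agent._CLOSE:
                    return
                action, args = item
                # Actions are taken one at a time, in arrival order.
                self._state = action(self._state, *args)
            except Exception as exc:  # keep the worker alive on failure
                self.error = exc
            finally:
                self._mailbox.task_done()

    def send(self, action, *args):
        """Queue an action; returns immediately."""
        self._mailbox.put((action, args))

    def deref(self):
        """Read the current state without going through the mailbox."""
        return self._state

    def await_all(self):
        """Block until every action queued so far has been applied."""
        self._mailbox.join()

    def close(self):
        """Stop the worker; the explicit cleanup this option requires."""
        self._mailbox.put(Agent._CLOSE)
        self._worker.join()

log = Agent([])
for i in range(5):
    log.send(lambda state, x: state + [x], i)
log.await_all()
result = log.deref()
log.close()
```

Because each Agent has exactly one worker, sequential application comes for free. The cost is one worker per Agent and an explicit close.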

3. Process Pool

Having a limited number of processes to use would ensure more control over the usage of resources. All actions for a specific Agent should be sent to the same process in the pool so that they are applied sequentially (i.e. “at most one at any point in time”). This means that CPU-bound and blocking IO actions should use the same pool of processes if we want to ensure sequential processing. This would be an improvement over option 1, because there is no single point of contention, but there is still the issue that Agents whose actions end up in the same process from the pool may have to wait on each other.
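One way to pin every Agent to the same pool process is to hash its identifier into a fixed number of slots. The Python sketch below assumes string identifiers and a hypothetical pool of four workers. On the BEAM, erlang:phash2/2 could play the role that zlib.crc32 plays here.

```python
import zlib

POOL_SIZE = 4  # hypothetical pool size

def worker_index(agent_id: str, pool_size: int = POOL_SIZE) -> int:
    """Map an agent id to a fixed worker slot; the same id always gets the same slot."""
    return zlib.crc32(agent_id.encode("utf-8")) % pool_size

# Five agents over four workers: at least two of them must share a worker,
# and the actions of those two will queue up behind each other.
agents = ["a1", "a2", "a3", "a4", "a5"]
slots = {agent: worker_index(agent) for agent in agents}
```

The routing is stateless, which keeps the pool simple, but it is also what lets unrelated Agents end up waiting on the same worker.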

4. Dispatch Process & Process Pool

A dispatch process would keep track of the actions sent for each Agent. The dispatcher would then use a process from the pool for CPU-bound actions, or spawn a new process for blocking IO actions, while always ensuring that actions are applied sequentially to each Agent.

Trade-offs

The two main points that come up in the previous implementation options are (1) process contention and (2) resource management. The One Process per Agent and Dispatch Process & Process Pool options provide a good balance, but they involve some trade-offs:

One Process per Agent

The Agent can be treated as a resource and will therefore be cleaned up when it is “closed” or its owner process dies. This is the same approach used for file resource management in Erlang (for details check out the modules file, file_server and file_io_server).

The price we would be paying here is that the user needs to remember to close the Agent when they are done with it. Also, if the owner process dies, the Agent’s state will be lost. This could prove problematic in some cases. After all, files can be re-opened, but if the state of an Agent is deleted from memory, then it’s gone.

Dispatch Process & Process Pool

All Agents’ actions would go through a single process that does as little work as possible to dispatch each action to another process (either from the pool or spawned). This dispatch process must:

- Keep track of the current actions being run for each Agent.

- Collect any sent action and store it as pending if one is already being applied to the Agent.

- Apply pending actions once an Agent has no current action running.

- When the action is a send, ask a process from the pool; if it’s a send-off, create a process.
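The bookkeeping in the list above can be modelled as a small state machine. The Python sketch below covers only the decision of when an action may start and where it should run. The Dispatcher class, its method names and the "pool"/"spawn" labels are hypothetical, and the actual execution of actions is left out.

```python
from collections import defaultdict, deque

class Dispatcher:
    """Bookkeeping only: decides when an action may start and where it runs."""

    def __init__(self):
        self.running = set()               # agents with an action in flight
        self.pending = defaultdict(deque)  # agent id -> queued (action, kind)

    def submit(self, agent_id, action, kind="send"):
        """Return (action, target) to start now, or None if it was queued."""
        if agent_id in self.running:
            self.pending[agent_id].append((action, kind))
            return None
        self.running.add(agent_id)
        return (action, self._target(kind))

    def finished(self, agent_id):
        """Called when an agent's action completes; returns the next one, if any."""
        waiting = self.pending.get(agent_id)
        if waiting:
            action, kind = waiting.popleft()
            return (action, self._target(kind))
        # Nothing left to run: the agent becomes idle again.
        self.running.discard(agent_id)
        self.pending.pop(agent_id, None)
        return None

    @staticmethod
    def _target(kind):
        # send -> worker from the pool, send-off -> freshly spawned process
        return "pool" if kind == "send" else "spawn"

d = Dispatcher()
first = d.submit("a", "inc")                  # starts right away on the pool
second = d.submit("a", "fetch", "send-off")   # queued behind "inc"
other = d.submit("b", "inc")                  # a different agent is not blocked
next_a = d.finished("a")                      # "fetch" may now start, spawned
idle_a = d.finished("a")                      # nothing pending, "a" goes idle
```

Even this stripped-down model carries two data structures and three code paths, before any coordination with the pool is added.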

These are quite a lot of concerns for a process that needs to do little work, and there are other situations that will require coordination with the pool. For example, what happens when there are no workers available? Do we fail, or do we store the action as pending and subscribe for a notification when a worker becomes available? Probably the latter, but more situations of this sort may arise.

Also, if we are keeping the state of the Agents in an ETS table, some cleanup is still required once an Agent is no longer used (i.e. deleting its entry from the table). This means we would still need to require the user to “close” the Agent.

Both options seem promising. However, the One Process per Agent option is a lot simpler, and its main shortcoming (required cleanup) is also present in the Dispatch Process & Process Pool alternative (for further reasoning see the Appendix).

Conclusion

Porting Clojure constructs into the BEAM has seldom been trivial, but it has almost always been possible if we are willing to make some trade-offs.

After having postponed the implementation of Agents for so long, this was a good exercise to get to the bottom of it and sketch it out.

The (experimental) implementation using One Process per Agent can be found in the latest master branch of the Clojerl repository.

Please try it out!

Feedback is very welcome!

Enjoy. ❤️

Appendix

If the main problem with the Dispatch Process & Process Pool is that we need to clean up the ETS table, why not just keep the state somewhere else?