It is interesting, and a little surprising, how some fundamental things are missing from the Erlang/OTP libraries. Of course it is impossible to include everything in OTP, but some of the gaps are fairly obvious. In my opinion Ulf Wiger’s Gproc library is one such example: I use it all the time to give a process a name and refer to it by that name rather than by PID later on. This is easy to do with Gproc (without having to turn the process into a registered one), and the code quality of the project is superb (not surprising given who the author is). Hopefully Gproc will make it into OTP one day.

Another example is the resource discovery problem in an Erlang cluster. It is not easy: if your system is made up of multiple components, you more or less have to hardcode the names of the nodes that provide a service of a given type. And what if nodes are entering and leaving your cluster all the time? I don’t think OTP addresses this problem easily. Fortunately, Martin Logan showed a possible solution in the book “Erlang and OTP in Action”. Chapter 8 contains an example of a simple resource discovery protocol. Apart from the fact that “Erlang and OTP in Action” is a great book in its own right, it is worth buying for this chapter alone.

Martin implemented the initial idea, a few other people and I made some additions to it, and it is now available on GitHub as resource_discovery.

I think it is as useful as Ulf’s Gproc. The idea is simple: some nodes in your cluster provide services (e.g. a logger, a web server, a task worker), and other nodes need to consume those services (e.g. send a log message to one of 10 different loggers). This is where you need an automatic resource discovery mechanism, so that instead of picking a node name from some config file you can ask: give me a resource of type ‘logger’, or whatever. The system replies with a list of all resources of that type. A resource can be the name of a node or the PID of a process, it doesn’t matter. The important thing is that it is all dynamic: if you need extra task workers in your cluster, you add them and the resource discovery protocol learns that there are new nodes providing the ‘worker’ service. The same thing happens when services drop off: if one of the 10 loggers disappears from the cluster, it is purged from resource discovery automatically.
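To make the idea concrete, here is a toy, single-node model of such a registry: a map from a resource type to the resources that currently offer it. This is only a sketch of the concept, not how resource_discovery is implemented; the module name toy_registry and all of its functions are made up for illustration, and nothing here is distributed across nodes.

```erlang
-module(toy_registry).
-export([new/0, add/3, delete/3, get_all/2, get_one/2, count/2]).

%% Hypothetical illustration only: a map from a resource type (an atom
%% such as 'logger') to the list of resources (node names or pids)
%% currently offering that type.

new() ->
    #{}.

%% Register Resource under Type; duplicates are ignored.
add(Type, Resource, Registry) ->
    Existing = maps:get(Type, Registry, []),
    case lists:member(Resource, Existing) of
        true  -> Registry;
        false -> Registry#{Type => [Resource | Existing]}
    end.

%% Remove Resource from Type, e.g. when a logger leaves the cluster.
delete(Type, Resource, Registry) ->
    case lists:delete(Resource, maps:get(Type, Registry, [])) of
        []        -> maps:remove(Type, Registry);
        Remaining -> Registry#{Type => Remaining}
    end.

%% "Give me all resources of type Type."
get_all(Type, Registry) ->
    maps:get(Type, Registry, []).

%% "Give me a single resource of type Type."
get_one(Type, Registry) ->
    case get_all(Type, Registry) of
        []          -> {error, not_found};
        [First | _] -> {ok, First}
    end.

count(Type, Registry) ->
    length(get_all(Type, Registry)).
```

The real application adds the part that matters in a cluster: each node keeps such a table and exchanges it with the other nodes, so consumers never have to know node names in advance.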

If you decide to use it, add resource_discovery as a dependency of your project in rebar.config:

{deps, [
    {'resource_discovery', ".*",
     {git, "git@github.com:erlware/resource_discovery.git", "master"}}
]}.
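The entry above uses the older rebar syntax with a version regex. If you build with rebar3, the same dependency would look roughly like the fragment below. This is a sketch, assuming the repository is reachable over HTTPS at the same GitHub path and that its master branch builds under rebar3, which I have not verified.

```erlang
%% rebar.config, rebar3 syntax (assumed equivalent of the entry above)
{deps, [
    {resource_discovery,
     {git, "https://github.com/erlware/resource_discovery.git", {branch, "master"}}}
]}.
```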

Next you need to start the ‘resource_discovery’ application. I usually do it as part of the start/0 function in my _app.erl module:

-module(example_app).
-behaviour(application).

-define(APPS, [lager, resource_discovery, example]).

%% Application callbacks
-export([start/0, start/2, stop/1]).

%% ===================================================================
%% Application callbacks
%% ===================================================================

%% Start the dependencies and then the application itself,
%% printing each application name as we go.
start() ->
    [begin application:start(A), io:format("~p~n", [A]) end || A <- ?APPS].

start(_StartType, _StartArgs) ->
    lager:info("starting example on a node ~p", [node()]),
    example_sup:start_link().

stop(_State) ->
    ok.
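The start/0 function above ignores the result of application:start/1, so a dependency that fails to start goes unnoticed. One alternative is application:ensure_all_started/1, which starts an application together with everything listed under `applications` in its .app file and reports failures. The helper below is a minimal sketch; the module name example_boot is made up, and it assumes your example.app lists lager and resource_discovery as dependencies.

```erlang
-module(example_boot).
-export([start/1]).

%% Start App and all applications it depends on (as declared in its
%% .app file). Returns the list of applications started by this call,
%% which is empty if everything was already running.
start(App) ->
    case application:ensure_all_started(App) of
        {ok, Started} ->
            io:format("started: ~p~n", [Started]),
            {ok, Started};
        {error, Reason} ->
            {error, Reason}
    end.
```

With this in place, example_boot:start(example) would replace the list comprehension over ?APPS.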

Then, in the init/1 callback of the process which provides the service, you announce that you have a service of a given type by adding it to resource discovery:

resource_discovery:add_local_resource_tuple({worker, self()}),

You can also register your interest in a service of another type that some other resource in the cluster provides, and then trigger resource synchronization:

resource_discovery:add_target_resource_types([?LOGGER]),
resource_discovery:trade_resources(),

Here is a possible init function:

init([]) ->
    process_flag(trap_exit, true),
    lager:info("starting task server on: ~p", [node()]),
    %% announce via resource_discovery that we have available resource
    resource_discovery:add_local_resource_tuple({worker, self()}),
    %% add request for logger
    resource_discovery:add_target_resource_types([?LOGGER]),
    %% synch resources
    resource_discovery:trade_resources(),
    {ok, #state{}}.

Now, if you need to find the PID of a ‘worker’ resource on another node and use it (e.g. by sending it a message with a task), you can ask how many such resources exist, get all of them, or get a single one:

NofResource = resource_discovery:get_num_resource('worker'),
AllWorkers = resource_discovery:get_resources('worker'),
SingleWorker = resource_discovery:get_resource('worker')
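Once you have the list of workers, you still need to choose one and hand it a task. The sketch below picks a worker at random from a list of PIDs and sends it a message. It assumes the list is a plain list of PIDs, as you would get from get_resources when workers are registered with self(). The module name worker_pick and the {task, Task} message shape are made up for illustration.

```erlang
-module(worker_pick).
-export([pick/1, dispatch/2]).

%% Pick one element of a list uniformly at random.
pick([]) ->
    {error, no_resources};
pick(Resources) when is_list(Resources) ->
    Index = rand:uniform(length(Resources)),
    {ok, lists:nth(Index, Resources)}.

%% Send Task to a randomly chosen worker pid from Workers.
%% The {task, Task} message format is a hypothetical convention.
dispatch(Task, Workers) ->
    case pick(Workers) of
        {ok, Pid} when is_pid(Pid) ->
            Pid ! {task, Task},
            {ok, Pid};
        {ok, Other} ->
            {error, {not_a_pid, Other}};
        {error, _} = Error ->
            Error
    end.
```

A call such as worker_pick:dispatch(Task, AllWorkers) then spreads tasks across whichever workers are currently known, without any node names in the calling code.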

When your worker leaves the cluster, you can remove its resources from the shared resource registry:

terminate(_Reason, _State) ->
    %% make resource 'worker' unavailable for other clients
    resource_discovery:delete_local_resource_tuples(resource_discovery:get_local_resource_tuples()),
    resource_discovery:trade_resources(),
    lager:info("worker is shutting down on node ~p", [node()]),
    ok.

I find the resource_discovery app hugely useful and hope that somebody else feels the same.