Erlang (and Elixir) distribution without epmd

2016-10-26 by Magnus Henoch

When you deploy a distributed Erlang application, you’ll most likely have to answer the question “which ports need to be open in the firewall?”. Unless configured otherwise, the answer is:

port 4369 for epmd, the Erlang Port Mapper Daemon, and

an unpredictable high-numbered port for the Erlang node itself.

That answer usually doesn’t translate neatly into firewall rules. The usual solution is to use the kernel application environment variables inet_dist_listen_min and inet_dist_listen_max (described in the kernel documentation) to limit the distribution ports to a small port range, or even a single port.
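As a rough illustration, the same two variables can also be set in a configuration file instead of on the command line. The file name sys.config and the port range 9100 to 9105 are arbitrary choices for this sketch, not values from any real deployment:

```erlang
%% sys.config (hypothetical example values)
%% Restrict the distribution listener to ports 9100..9105.
[{kernel,
  [{inet_dist_listen_min, 9100},
   {inet_dist_listen_max, 9105}]}].
```

A node started with `erl -sname foo -config sys.config` would then pick its distribution port from that range, so the firewall only needs that range plus the epmd port.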

But if we’ve limited the Erlang node to a single port, do we really need a port mapper? There are a few potential disadvantages with running epmd:

We may want to run epmd under the same user as the Erlang node — but as we’ll see, it can be hard to guarantee that.

We may want to configure epmd to listen only on a certain interface — again, that relies on us being the first to start epmd on this host.

Anyone who can connect to epmd can list the local Erlang nodes, without any authentication.

Connections to epmd are not encrypted, so even if you use distribution over TLS, epmd connections are not protected, and potentially vulnerable to man-in-the-middle attacks.

While epmd is very stable, it can occasionally crash. In that case, any distributed Erlang nodes will keep running but will not reconnect to epmd when it comes up again. That means that existing distribution connections will work, but new connections to those nodes cannot be made. It’s possible to fix this situation without restarting the nodes; see Экстренная реанимация epmd (“Emergency resuscitation of epmd”, in Russian).

So let’s explore how epmd works, and what we can do to run an Erlang cluster without it.

How does epmd get started in the first place?

Let’s have a look at the code! epmd is started in the function start_epmd in erlexec.c. It is started unconditionally every time a distributed node is started. If an epmd instance is already running, the new epmd will fail to listen on port 4369 and exit silently.

In fact, that’s what will happen even if the existing epmd instance was started by another user. Any epmd instance is happy to serve Erlang nodes started by any user, so usually this doesn’t cause any problem.

So who can connect to epmd?

Anyone! By default, epmd listens on all available interfaces, and responds to queries about what nodes are present, and what ports they are listening on. Hearing this tends to make sysadmins slightly nervous.
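To see what that looks like in practice, here is a small sketch that asks the local epmd for its registered nodes using net_adm:names/0 (the command `epmd -names` prints the same information). The module name epmd_peek and its functions are made up for this illustration:

```erlang
-module(epmd_peek).

-export([describe/1, local_nodes/0]).

%% Turn the result of net_adm:names/0 into printable lines.
describe({ok, Nodes}) ->
    [lists:flatten(io_lib:format("~s listens on port ~p", [Name, Port]))
     || {Name, Port} <- Nodes];
describe({error, Reason}) ->
    [lists:flatten(io_lib:format("could not query epmd: ~p", [Reason]))].

%% Ask the epmd on the local host which nodes are registered.
%% No cookie or other credential is involved in this query.
local_nodes() ->
    describe(net_adm:names()).
```

net_adm:names/1 takes a host name, so the same query works against any remote host whose port 4369 is reachable.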

You can change that by manually starting epmd and specifying the -address option, or by setting the ERL_EPMD_ADDRESS environment variable before epmd gets started. This is described in the epmd documentation. That requires that the place where you do this is actually the first place where epmd gets started — otherwise, the existing epmd instance will keep running unperturbed.

Why do we need a port mapper daemon?

Clearly, epmd is just the middleman. Can we cut out the middleman?

We could make every Erlang node listen on a well-known port — perhaps use the port reserved for epmd, 4369, if we’re going to get rid of epmd. But that means that we can only run one Erlang node on each host (of course, for some use cases that might be enough).

So let’s specify some other port number. I mentioned inet_dist_listen_min and inet_dist_listen_max earlier. Those two variables define a port range, but if we set them to the same value, we narrow down the “range” to a single port:

    erl -sname foo \
        -kernel inet_dist_listen_min 4370 \
                inet_dist_listen_max 4370

That’s all well and good, but we’d also need a way to tell other nodes not to bother asking epmd about the port number, and just use this number instead. And if we have several nodes on the same host, we’d need some kind of configuration to specify the different port numbers for those nodes.

Let’s use something else

In Erlang/OTP 19.0, there are two new command line options:

When you specify -start_epmd false, Erlang won’t try to start epmd when starting a distributed node.

-epmd_module foo lets you specify a different module to use for node name registration and lookup, instead of the default erl_epmd.

Those are the building blocks we need!

I want to use a stateless scheme for this: since the connecting node already knows the name of the node it wants to connect to, I use that as the source of the port number. I pick a “base” port number: why not 4370, one port higher than epmd? Then I extract the number at the end of the “node” part of the node name, such that myapp3@foo.example.com becomes 3. Then I add that number to the base port number. As a result, I know a priori that the node myapp3@foo.example.com is listening on port 4373. If there is no number in the node name, I treat that as a zero. This means that the nodes myapp3 and myotherapp3 couldn’t run on the same host, but I’m ready to live with that. (Thanks to Luca Favatella for perfecting this idea.)

Let’s write a little module for that:

    -module(epmdless).

    -export([dist_port/1]).

    %% Return the port number to be used by a certain node.
    dist_port(Name) when is_atom(Name) ->
        dist_port(atom_to_list(Name));
    dist_port(Name) when is_list(Name) ->
        %% Figure out the base port.  If not specified using the
        %% inet_dist_base_port kernel environment variable, default to
        %% 4370, one above the epmd port.
        BasePort = application:get_env(kernel, inet_dist_base_port, 4370),

        %% Now, figure out our "offset" on top of the base port.  The
        %% offset is the integer just to the left of the @ sign in our node
        %% name.  If there is no such number, the offset is 0.
        %%
        %% Also handle the case when no hostname was specified.
        NodeName = re:replace(Name, "@.*$", ""),
        Offset =
            case re:run(NodeName, "[0-9]+$", [{capture, first, list}]) of
                nomatch ->
                    0;
                {match, [OffsetAsString]} ->
                    list_to_integer(OffsetAsString)
            end,

        BasePort + Offset.
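As a quick sanity check of the naming rule, the offset calculation can be exercised on its own, without touching the application environment. The module name epmdless_check and the function offset/1 are hypothetical helpers written for this sketch; they only mirror the regular expressions used above:

```erlang
-module(epmdless_check).

-export([offset/1]).

%% Compute only the offset part of the scheme: the digits at the very
%% end of the "node" part of the name, or 0 if there are none.
offset(Name) when is_atom(Name) ->
    offset(atom_to_list(Name));
offset(Name) when is_list(Name) ->
    %% Drop the host part, if any.
    NodePart = re:replace(Name, "@.*$", "", [{return, list}]),
    case re:run(NodePart, "[0-9]+$", [{capture, first, list}]) of
        nomatch ->
            0;
        {match, [Digits]} ->
            list_to_integer(Digits)
    end.
```

With the default base port of 4370, an offset of 3 for myapp3@foo.example.com gives port 4373, and a name without trailing digits gives 4370. Both myapp3 and myotherapp3 yield offset 3, which is exactly the collision mentioned earlier.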

And a module to use as the -epmd_module. One slight complication here is that 19.0 expects the module to export register_node/2, while from 19.1 onwards it’s register_node/3. Let’s include both functions to be sure:

    -module(epmdless_epmd_client).

    %% epmd_module callbacks
    -export([start_link/0,
             register_node/2,
             register_node/3,
             port_please/2,
             names/1]).

    %% The supervisor module erl_distribution tries to add us as a child
    %% process.  We don't need a child process, so return 'ignore'.
    start_link() ->
        ignore.

    register_node(_Name, _Port) ->
        %% This is where we would connect to epmd and tell it which port
        %% we're listening on, but since we're epmd-less, we don't do that.

        %% Need to return a "creation" number between 1 and 3.
        Creation = rand:uniform(3),
        {ok, Creation}.

    %% As of Erlang/OTP 19.1, register_node/3 is used instead of
    %% register_node/2, passing along the address family, 'inet_tcp' or
    %% 'inet6_tcp'.  This makes no difference for our purposes.
    register_node(Name, Port, _Family) ->
        register_node(Name, Port).

    port_please(Name, _IP) ->
        Port = epmdless:dist_port(Name),
        %% The distribution protocol version number has been 5 ever since
        %% Erlang/OTP R6.
        Version = 5,
        {port, Port, Version}.

    names(_Hostname) ->
        %% Since we don't have epmd, we don't really know what other nodes
        %% there are.
        {error, address}.

As you can see, most things are essentially stubbed out:

start_link/0 is invoked as this module is added as a child of the erl_distribution supervisor. We don’t actually need to start a process here, so we just return ignore.

The register_node function would normally connect to epmd and tell it what port number we use. In return, epmd would return a “creation” number. The “creation” number is an integer between 1 and 3. epmd keeps track of the creation number for each node name, and increments it whenever a node with a certain name reconnects. That means that it’s possible to distinguish e.g. pids from a previous “life” of a certain node. Since we don’t have epmd, we don’t have the benefit of it tracking the life span of the nodes. Let’s return a random number here, which has a 2 in 3 chance of being different from the previous “creation” number.

port_please/2 gets the IP address of the remote host in order to connect to its epmd, but we don’t care; we use our algorithm to figure out the port number. We also need to return a distribution protocol version number. It has been 5 ever since Erlang/OTP R6 (see the Distribution Protocol documentation), so that’s simple.

Finally, names/1 is called to list the Erlang nodes on a certain host. We have no way of knowing that, so let’s pretend that we couldn’t connect.

So far, so good — but we need a way to make sure that we’re listening on the right port. The best way I could think of is to write a new distribution protocol module, one that just sets the port number and then lets the real protocol module do its job:

    -module(epmdless_dist).

    -export([listen/1,
             select/1,
             accept/1,
             accept_connection/5,
             setup/5,
             close/1,
             childspecs/0]).

    listen(Name) ->
        %% Here we figure out what port we want to listen on.
        Port = epmdless:dist_port(Name),

        %% Set both "min" and "max" variables, to force the port number to
        %% this one.
        ok = application:set_env(kernel, inet_dist_listen_min, Port),
        ok = application:set_env(kernel, inet_dist_listen_max, Port),

        %% Finally run the real function!
        inet_tcp_dist:listen(Name).

    select(Node) ->
        inet_tcp_dist:select(Node).

    accept(Listen) ->
        inet_tcp_dist:accept(Listen).

    accept_connection(AcceptPid, Socket, MyNode, Allowed, SetupTime) ->
        inet_tcp_dist:accept_connection(AcceptPid, Socket, MyNode, Allowed, SetupTime).

    setup(Node, Type, MyNode, LongOrShortNames, SetupTime) ->
        inet_tcp_dist:setup(Node, Type, MyNode, LongOrShortNames, SetupTime).

    close(Listen) ->
        inet_tcp_dist:close(Listen).

    childspecs() ->
        inet_tcp_dist:childspecs().

Mostly stubs here; it’s just the listen/1 function that sets the inet_dist_listen_min and inet_dist_listen_max variables according to our node name, before passing control to the real module, inet_tcp_dist.

(Note that while inet_tcp_dist is the default module, it only provides unencrypted connections over IPv4. If you want to use IPv6, you would use inet6_tcp_dist, and if you want to use Erlang distribution over TLS, that would be inet_tls_dist or inet6_tls_dist. Adding that flexibility is left as an exercise for the reader.)
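One possible starting point for that exercise, as a rough and untested-in-production sketch: look up the real protocol module in the application environment and delegate to it. The key epmdless_real_dist and the module name epmdless_dist_select are invented for this illustration; the kernel application itself knows nothing about them. Only the lookup helper and one delegating callback are shown:

```erlang
-module(epmdless_dist_select).

-export([real_module/0, select/1]).

%% Pick the distribution module to delegate to.  The key
%% epmdless_real_dist is a hypothetical setting; when it is not set,
%% fall back to inet_tcp_dist, as in the module above.
real_module() ->
    application:get_env(kernel, epmdless_real_dist, inet_tcp_dist).

%% Delegate one callback to the chosen module.  The remaining
%% callbacks (listen/1, accept/1 and so on) would follow the same
%% pattern.
select(Node) ->
    Mod = real_module(),
    Mod:select(Node).
```

Under that assumption, a node could be pointed at IPv6 by passing something like `-kernel epmdless_real_dist inet6_tcp_dist` on the command line. The TLS modules need additional setup of their own, which this sketch does not cover.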

And we’re ready! Now we can start two nodes, foo1 and foo2, and have them connect to each other:

erl -proto_dist epmdless -start_epmd false -epmd_module epmdless_epmd_client -sname foo1

erl -proto_dist epmdless -start_epmd false -epmd_module epmdless_epmd_client -sname foo2

System working?

    (foo2@poki-sona-sin)1> net_adm:ping('foo1@poki-sona-sin').
    pong

Seems to be!

Once more, with Elixir!

Of course, since Erlang and Elixir run on the same virtual machine, there is nothing stopping us from doing all of this in Elixir instead.

In Elixir, we can put all the code in a single file, and the compiler will compile it into the different modules we require:

    # A module containing the function that determines the port number
    # based on a node name.
    defmodule Epmdless do
      def dist_port(name) when is_atom(name) do
        dist_port(Atom.to_string(name))
      end

      def dist_port(name) when is_list(name) do
        dist_port(List.to_string(name))
      end

      def dist_port(name) when is_binary(name) do
        # Figure out the base port.  If not specified using the
        # inet_dist_base_port kernel environment variable, default to
        # 4370, one above the epmd port.
        base_port = :application.get_env(:kernel, :inet_dist_base_port, 4370)

        # Now, figure out our "offset" on top of the base port.  The
        # offset is the integer just to the left of the @ sign in our node
        # name.  If there is no such number, the offset is 0.
        #
        # Also handle the case when no hostname was specified.
        node_name = Regex.replace(~r/@.*$/, name, "")

        offset =
          case Regex.run(~r/[0-9]+$/, node_name) do
            nil -> 0
            [offset_as_string] -> String.to_integer(offset_as_string)
          end

        base_port + offset
      end
    end

    defmodule Epmdless_dist do
      def listen(name) do
        # Here we figure out what port we want to listen on.
        port = Epmdless.dist_port(name)

        # Set both "min" and "max" variables, to force the port number to
        # this one.
        :ok = :application.set_env(:kernel, :inet_dist_listen_min, port)
        :ok = :application.set_env(:kernel, :inet_dist_listen_max, port)

        # Finally run the real function!
        :inet_tcp_dist.listen(name)
      end

      def select(node) do
        :inet_tcp_dist.select(node)
      end

      def accept(listen) do
        :inet_tcp_dist.accept(listen)
      end

      def accept_connection(accept_pid, socket, my_node, allowed, setup_time) do
        :inet_tcp_dist.accept_connection(accept_pid, socket, my_node, allowed, setup_time)
      end

      def setup(node, type, my_node, long_or_short_names, setup_time) do
        :inet_tcp_dist.setup(node, type, my_node, long_or_short_names, setup_time)
      end

      def close(listen) do
        :inet_tcp_dist.close(listen)
      end

      def childspecs do
        :inet_tcp_dist.childspecs()
      end
    end

    defmodule Epmdless_epmd_client do
      # erl_distribution wants us to start a worker process.  We don't
      # need one, though.
      def start_link do
        :ignore
      end

      # As of Erlang/OTP 19.1, register_node/3 is used instead of
      # register_node/2, passing along the address family, 'inet_tcp' or
      # 'inet6_tcp'.  This makes no difference for our purposes.
      def register_node(name, port, _family) do
        register_node(name, port)
      end

      def register_node(_name, _port) do
        # This is where we would connect to epmd and tell it which port
        # we're listening on, but since we're epmd-less, we don't do that.

        # Need to return a "creation" number between 1 and 3.
        creation = :rand.uniform(3)
        {:ok, creation}
      end

      def port_please(name, _ip) do
        port = Epmdless.dist_port(name)
        # The distribution protocol version number has been 5 ever since
        # Erlang/OTP R6.
        version = 5
        {:port, port, version}
      end

      def names(_hostname) do
        # Since we don't have epmd, we don't really know what other nodes
        # there are.
        {:error, :address}
      end
    end

When starting Elixir, we need to pass some of the parameters with --erl in order for them to make it through:

iex --erl "-proto_dist Elixir.Epmdless -start_epmd false -epmd_module Elixir.Epmdless_epmd_client" --sname foo3

Let’s try to ping the two Erlang nodes we started earlier:

    iex(foo3@poki-sona-sin)1> Node.ping :"foo1@poki-sona-sin"
    :pong
    iex(foo3@poki-sona-sin)2> Node.ping :"foo2@poki-sona-sin"
    :pong

All connected, and no epmd in sight!

Conclusion

This is just one possible scheme for Erlang distribution without epmd; I’m sure you can come up with something else that fits your requirements better. I hope the example code above proves useful as a guide!
