As part of some research I have been doing I wanted a sliding window data structure and thought it would be interesting to see how to implement custom data structures in Elixir.

Custom data structures in Elixir

Note: What follows is a very naive implementation, I was more focused on the interface and learning Elixir over performance but I would be very interested in more sophisticated implementations, so please comment.

A sized sliding window

The first data structure I looked at is a sized sliding Window. The idea here is that we set the Window to a fixed sized and any new items added once the sized as been reached push the oldest item out of the Window.

Since the data structure is very close to a FIFO queue, we will base our implementation on on the erlang queue data structure.

Lets start off by writing a few tests to define our interface. We are going to call our structure Window.Sized.

defmodule Window.SizedTest do use ExUnit.Case test "i can create a window" do w = %Window.Sized{ size: 5 } assert w.size == 5 end test "i can add items to window" do w = %Window.Sized{ size: 5 } |> Window.Sized.add(1) |> Window.Sized.add(2) |> Window.Sized.add(3) |> Window.Sized.add(4) |> Window.Sized.add(5) assert :queue.len(w.items) == 5 end test "a window slides" do w = %Window.Sized{ size: 5 } |> Window.Sized.add(1) |> Window.Sized.add(2) |> Window.Sized.add(3) |> Window.Sized.add(4) |> Window.Sized.add(5) |> Window.Sized.add(6) assert :queue.len(w.items) == 5 end end

So from experimenting with our tests we can see we can get by with just one function, add, but a user needs to know we are using a queue internally to process whats stored in the Window. This obviously needs to be improved and we will revisit it shortly.

defmodule Window.Sized do defstruct size: nil, items: :queue.new() def add(window = %Window.Sized{size: size, items: items}, item) do if :queue.len(items) == size do {_, q} = :queue.out_r(items) %{ window | items: :queue.in_r(item, q)} else %{ window | items: :queue.in_r(item, items)} end end end

A timed sliding window

A timed sliding Window works similarly to a sized Window, but rather than removing items once a size threshold has been reached we remove items based on a “time”.

To keep things flexible, the definition of “time” and the resolution that is used is up to the end user. Lets write some tests to tease out the API.

defmodule Window.TimedTest do use ExUnit.Case test "i can create a window" do w = %Window.Timed{ duration: 5 } assert w.duration == 5 end test "i can add items to window" do w = %Window.Timed{ duration: 5 } |> Window.Timed.add({1, 1}) |> Window.Timed.add({2, 1}) |> Window.Timed.add({3, 1}) |> Window.Timed.add({4, 1}) |> Window.Timed.add({5, 1}) assert :queue.len(w.items) == 5 end test "a window slides" do w = %Window.Timed{ duration: 5 } |> Window.Timed.add({0, 1}) |> Window.Timed.add({0, 1}) |> Window.Timed.add({3, 1}) |> Window.Timed.add({4, 1}) |> Window.Timed.add({5, 1}) |> Window.Timed.add({6, 1}) assert :queue.len(w.items) == 4 end end

Yet again we see we need only an add function (that has the same signature as our Window.Sized), and yet again, we need to exposing the internal queue.

defmodule Window.Timed do defstruct duration: nil, items: :queue.new() def add(window = %Window.Timed{duration: duration, items: items}, {time, item}) do start = time - duration left = Enum.take_while(:queue.to_list(items), fn({t, _}) -> t > start end) %{ window | items: :queue.in_r({time, item}, :queue.from_list(left))} end end

Review (and Refactor)

So far our data structures are very simple, but the API is cumbersome and a user needs to know we are using a queue internally to process the data. It would also be nice for our end user to not care which type of Window they are using when they are writing algorithms to process a Windows data. In an object oriented language we would normally use an interface or base class to solve this, but in Elixir we can use protocols.

First, lets create some tests to help us focus on our API.

defmodule WindowTest do use ExUnit.Case doctest Window test "i can create a sized window" do w = Window.sized(5) assert w.size == 5 end test "i can create a timed window" do w = Window.timed(30) assert w.duration == 30 end test "a sized window slides" do w = Window.sized(5) |> Window.add(1) |> Window.add(2) |> Window.add(3) |> Window.add(4) |> Window.add(5) |> Window.add(6) assert Enum.count(w) == 5 end test "a timed window slides" do w = Window.timed(5) |> Window.add({0, 1}) |> Window.add({0, 1}) |> Window.add({3, 1}) |> Window.add({4, 1}) |> Window.add({5, 1}) |> Window.add({6, 1}) assert Enum.sum(w) == 4 end end

Ok, so not only have we now hidden the Window type once it has been created, we have also decided to make a Window Enumerable. We could have added our own count, sum etc. methods to our Window module, but why bother when Enum provides these functions already, and behaving like an Enumerable means we can take advantage of many other third party Elixir libraries.

So now that we know what we want our Window API to look like, lets create our Windowable protocol.

defprotocol Windowable do def add(window, item) def items(window) end

Our Windowable protocol is very simple, and only requires a structure to implement two functions to participate. The first one of these, add, is obvious but we have also decided to require an items function. This will allow a structure to return its contents while hiding its internal implementation.

Next we will create our Window module, the module that actually works on Windowable structures.

defmodule Window do def sized(size) do %Window.Sized{ size: size } end def timed(duration) do %Window.Timed{ duration: duration } end def add(window, item) do Windowable.add(window, item) end end

Next, we need to go back and modify our structures so they behave as Windowable structures. We can also tidy up our unit tests but I will leave that as an exercise to the reader (you can see the tidied up tests in the github repo).

Lets create an implementation of Windowable for Window.Sized

defmodule Window.Sized do defstruct size: nil, items: :queue.new() end defimpl Windowable, for: Window.Sized do def add(window = %Window.Sized{size: size, items: items}, item) do if :queue.len(items) == size do {_, q} = :queue.out_r(items) %{ window | items: :queue.in_r(item, q)} else %{ window | items: :queue.in_r(item, items)} end end def items(window) do :queue.to_list(window.items) end end

and for Window.Timed.

defmodule Window.Timed do defstruct duration: nil, items: :queue.new() end defimpl Windowable, for: Window.Timed do def add(window = %Window.Timed{duration: duration, items: items}, {time, item}) do start = time - duration left = Enum.take_while(:queue.to_list(items), fn({t, _}) -> t > start end) %{ window | items: :queue.in_r({time, item}, :queue.from_list(left))} end def items(%Window.Timed{items: items}) do Enum.map(:queue.to_list(items), fn {_, i} -> i end) end end

and finally we need to create our Enumerable implementation for our two Window types

defimpl Enumerable, for: [Window.Sized, Window.Timed] do def count(window) do {:ok, length(Window.items(window))} end def member?(window, value) do {:ok, Enum.member?(Window.items(window), value)} end def reduce(_, {:halt, acc}, _fun), do: {:halted, acc} def reduce(window, {:suspend, acc}, fun), do: {:suspended, acc, &reduce(window, &1, fun)} def reduce(window = %Window.Sized{ items: items}, {:cont, acc}, fun) do if :queue.len(items) == 0 do {:done, acc} else h = :queue.head(items) reduce(%{ window | items: :queue.tail(items)}, fun.(h, acc), fun) end end def reduce(window = %Window.Timed{ items: items }, {:cont, acc}, fun) do if :queue.len(items) == 0 do {:done, acc} else {_, h} = :queue.head(items) reduce(%{ window | items: :queue.tail(items)}, fun.(h, acc), fun) end end end

Conclusion

For an Elixir newcomer there is quiet a bit to digest here and I recommend typing in the code yourself to fully understand all the pieces. We have seen how to create simple data structures in Elixir and how to create our own protocols to provide a unified API to these structures. We have also seen how to provide an implementation for the built in Enumerable protocol so our data structures can be used by many standard and third party Elixir functions.

Structures and protocols are very powerful abstraction and provide a great way for us to write reusable code with familiar APIs without the need for classes, interfaces or inheritance.