I work on a product that helps software teams track their work. Over a few years, we transitioned our implementation to React, and our programming paradigm changed from object-oriented to functional. A core feature of our product is an editor for work items, and our team was responsible for building the next version in React. This editor allows users to update fields on their work items such as name, owner, description, and the project to which the item is assigned. After many units of work, we slowly released the editor to our customers and started listening to feedback.

Figure: The new side panel editor

We discovered a performance issue that plagued customers with many projects in their workspace. A quick investigation revealed the root cause. Our system stored the project permissions as a list, and to find all the editable projects, the project picker used an O(n²) algorithm.
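A minimal sketch of what such a quadratic lookup can look like, assuming a permissions list with one record per project. The names `Project`, `ProjectPermission`, and `editableProjectsQuadratic` are hypothetical and are not taken from the product's code:

```typescript
// Hypothetical shapes, assumed for illustration only.
interface Project {
  id: number;
  name: string;
}

interface ProjectPermission {
  projectId: number;
  canEdit: boolean;
}

// For every project, scan the permissions list from the start.
// n projects times up to n permission records gives O(n²) comparisons.
function editableProjectsQuadratic(
  projects: Project[],
  permissionList: ProjectPermission[]
): Project[] {
  return projects.filter((project) =>
    permissionList.some(
      (permission) =>
        permission.projectId === project.id && permission.canEdit
    )
  );
}

const sampleProjects: Project[] = [
  { id: 1, name: "Website" },
  { id: 2, name: "Mobile app" },
  { id: 3, name: "Billing" },
];

const samplePermissionList: ProjectPermission[] = [
  { projectId: 1, canEdit: true },
  { projectId: 2, canEdit: false },
  { projectId: 3, canEdit: true },
];

const editableNames = editableProjectsQuadratic(
  sampleProjects,
  samplePermissionList
).map((project) => project.name);

console.log(editableNames); // the names "Website" and "Billing"
```

With a handful of projects the cost is invisible; it only shows up in workspaces where both lists are large.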

To fix the issue, we decided to store the project permissions as a map, which let the project picker find the editable projects with an O(n) algorithm.
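The map-based version can be sketched as follows. This is an illustration under assumptions, not the actual change: a plain object keyed by a numeric project id stands in for the map, and `indexByProjectId` and `editableProjectsLinear` are hypothetical names.

```typescript
// Hypothetical shapes, assumed for illustration only.
interface Project {
  id: number;
  name: string;
}

interface ProjectPermission {
  projectId: number;
  canEdit: boolean;
}

// The "map": permission records keyed by project id.
type PermissionsById = Record<number, ProjectPermission>;

// Build the index in a single pass over the list: O(n).
function indexByProjectId(
  permissionList: ProjectPermission[]
): PermissionsById {
  const byId: PermissionsById = {};
  permissionList.forEach((permission) => {
    byId[permission.projectId] = permission;
  });
  return byId;
}

// One keyed lookup per project, so the whole filter is O(n).
function editableProjectsLinear(
  projects: Project[],
  byId: PermissionsById
): Project[] {
  return projects.filter((project) => {
    const permission = byId[project.id];
    return permission !== undefined && permission.canEdit;
  });
}

const indexedPermissions = indexByProjectId([
  { projectId: 1, canEdit: true },
  { projectId: 2, canEdit: false },
  { projectId: 3, canEdit: true },
]);

const linearResult = editableProjectsLinear(
  [
    { id: 1, name: "Website" },
    { id: 2, name: "Mobile app" },
    { id: 3, name: "Billing" },
    { id: 4, name: "No permission record" },
  ],
  indexedPermissions
).map((project) => project.name);

console.log(linearResult); // the names "Website" and "Billing"
```

A project without a permission record is treated as not editable here; that is a choice made for the sketch.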

While the targeted fix was straightforward, there were side effects. The raw permissions data (stored as a list) was used in many different modules, and some consumers depended on the list interface.

Why was the code difficult to modify?

We wanted to change the representation of the permissions data from a list to a map, but because consumers of the data had direct access to the representation, any change to it could introduce bugs.
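To make the coupling concrete, here is a hypothetical consumer written directly against the list representation. The function name and record shape are assumptions for illustration:

```typescript
// Hypothetical record shape, assumed for illustration only.
interface ProjectPermission {
  projectId: number;
  canEdit: boolean;
}

// This consumer knows the data is an array and uses Array methods on it.
// If the data becomes an object keyed by project id, this function no longer
// type-checks, and in plain JavaScript it would fail at run time because
// a plain object has no filter method.
function countEditableProjects(rawPermissions: ProjectPermission[]): number {
  return rawPermissions.filter((permission) => permission.canEdit).length;
}

const listShapedPermissions: ProjectPermission[] = [
  { projectId: 1, canEdit: true },
  { projectId: 2, canEdit: false },
  { projectId: 3, canEdit: true },
];

console.log(countEditableProjects(listShapedPermissions)); // 2
```

Every consumer of this kind has to be found and updated when the representation changes.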

Object-oriented (OO) programmers use class visibility to abstract the representation of data when designing classes. Most OO languages have visibility features built in, so storing the representation of data as a private member of a class is idiomatic. In the OO world, we could imagine a Permissions class that keeps the permissions data in a private member and exposes only methods for querying it.
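A minimal sketch of such a class, under assumptions: the method names are hypothetical, and the class is called `ProjectPermissions` here only so that it does not clash with the browser's built-in `Permissions` type.

```typescript
// Hypothetical record shape, assumed for illustration only.
interface ProjectPermission {
  projectId: number;
  canEdit: boolean;
}

class ProjectPermissions {
  // Private: callers cannot tell whether this is a list, a map,
  // or something else, so it can change without affecting them.
  private readonly byProjectId: Record<number, ProjectPermission>;

  constructor(rawPermissions: ProjectPermission[]) {
    const index: Record<number, ProjectPermission> = {};
    rawPermissions.forEach((permission) => {
      index[permission.projectId] = permission;
    });
    this.byProjectId = index;
  }

  // Public query: the only way for callers to ask about a project.
  canEdit(projectId: number): boolean {
    const permission = this.byProjectId[projectId];
    return permission !== undefined && permission.canEdit;
  }

  // Public query: ids of all projects the user may edit.
  editableProjectIds(): number[] {
    return Object.keys(this.byProjectId)
      .map((key) => Number(key))
      .filter((projectId) => this.canEdit(projectId));
  }
}

const userPermissions = new ProjectPermissions([
  { projectId: 1, canEdit: true },
  { projectId: 2, canEdit: false },
]);

console.log(userPermissions.canEdit(1)); // true
console.log(userPermissions.canEdit(2)); // false
```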

If this class were passed around, then a change in the representation of the private permissions data would be isolated to the Permissions class.

The design principle behind class member visibility is called “information hiding”.

I formally learned about information hiding during graduate studies and later read the paper “On the Criteria To Be Used in Decomposing Systems into Modules” by D.L. Parnas (1972), which discusses the concept.

What is information hiding?

D.L. Parnas introduced the term information hiding in the 1970s and discussed its application to system design in [1] and [2]. The big idea is that each module should hide some design decisions from the rest of the system, especially decisions that would have cross-cutting effects if changed.

In [2], he proposes two contrasting system designs. The first is based on a flowchart, and the second is based on information hiding. After comparing the two, he describes specific changes that would be costly in the first and cheap in the second.

“Every module in the second decomposition is characterized by its knowledge of a design decision which it hides from all others. Its interface or definition was chosen to reveal as little as possible about its inner workings.” [2]

In addition to the general rule that each module should hide some design decisions from the rest of the system, he offers some more specific design criteria to use when decomposing systems into modules. The criterion most relevant to this discussion is:

“A data structure, its internal linkings, accessing procedures and modifying procedures are part of a single module.” [2]

How is this concept applied in the Functional Paradigm?

The story above sparked an interest in the application of information hiding to functional programs. Others share this interest. Daniel Kaplan asked a question about information hiding and functional programming on Stack Exchange [4], and by reading the discussion, I learned that the classic, often-quoted text “Structure and Interpretation of Computer Programs” (SICP) has much to say on the topic.

I work on a few code bases that use the functional paradigm. In these code bases, the data structures passed between modules are typically lists or maps instead of specialized objects. As mentioned in the foreword to SICP:

“It is better to have 100 functions operate on one data structure than to have 10 functions operate on 10 data structures.” [3, Foreword]

Without reading the text further, one could conclude that functional programs should pass around lists and maps, and because functional programs are so “powerful”, data abstractions are not necessary for modifiable code.

Passing around lists and maps does not always make code hard to modify. But as the story above shows, a list has a different interface than a map, and if modules depend on concrete representations of data, changes become difficult. It is prudent for a module to expose an abstract representation of the data it manages, so that changes to that representation are contained within the module.

Abelson and Sussman agree. Chapter 2 in SICP is dedicated to the topic.

“The general technique of isolating the parts of a program that deal with how data objects are represented from the parts of a program that deal with how data objects are used is a powerful design methodology called data abstraction. We will see how data abstraction makes programs much easier to design, maintain, and modify.” [3, Chapter 2]

I believe that data abstraction is one tactic under the umbrella of information hiding, and it is the specific tactic pertinent to this discussion. The following excerpts from SICP teach us how to use the data abstraction methodology:

“The basic idea of data abstraction is to structure the programs that are to use compound data objects so that they operate on “abstract data.” That is, our programs should use data in such a way as to make no assumptions about the data that are not strictly necessary for performing the task at hand. At the same time, a “concrete” data representation is defined independent of the programs that use the data. The interface between these two parts of our system will be a set of procedures, called selectors and constructors, that implement the abstract data in terms of the concrete representation.” [3, Chapter 2]

“Constraining the dependence on the representation to a few interface procedures helps us design programs as well as modify them, because it allows us to maintain the flexibility to consider alternate implementations.” [3, Chapter 2.1.2]

One way to apply the principles of information hiding to our functional programs is to export constructors and selectors. Constructors are responsible for creating data. Selectors are responsible for accessing data.

How about an example?

To recap, the opening story revealed one consequence of a design choice. The representation of the permissions data was “hard” to change because consumers depended on its concrete representation (in this case, a list) instead of an abstract interface. Applying the concept of constructors and selectors leads to a design in which a Permissions module owns the representation and exposes only a constructor and selectors.
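One way such a module could look is sketched below. It is a single-file illustration under assumptions: all names (`PermissionsData`, `createPermissions`, `canEditProject`, `editableProjectIds`) are hypothetical, and in a real code base only the three functions would be exported.

```typescript
// Hypothetical record shape for the raw data, assumed for illustration.
interface ProjectPermission {
  projectId: number;
  canEdit: boolean;
}

// Concrete representation. It would stay private to the module;
// consumers hold a PermissionsData value but never look inside it.
interface PermissionsData {
  byProjectId: Record<number, ProjectPermission>;
}

// Constructor: turns the raw list into the module's representation.
function createPermissions(
  rawPermissions: ProjectPermission[]
): PermissionsData {
  const byProjectId: Record<number, ProjectPermission> = {};
  rawPermissions.forEach((permission) => {
    byProjectId[permission.projectId] = permission;
  });
  return { byProjectId: byProjectId };
}

// Selector: may the user edit this project?
function canEditProject(
  permissions: PermissionsData,
  projectId: number
): boolean {
  const permission = permissions.byProjectId[projectId];
  return permission !== undefined && permission.canEdit;
}

// Selector: ids of all editable projects.
function editableProjectIds(permissions: PermissionsData): number[] {
  return Object.keys(permissions.byProjectId)
    .map((key) => Number(key))
    .filter((projectId) => canEditProject(permissions, projectId));
}

// A consumer only ever calls the constructor and the selectors.
const workspacePermissions = createPermissions([
  { projectId: 1, canEdit: true },
  { projectId: 2, canEdit: false },
  { projectId: 3, canEdit: true },
]);

console.log(canEditProject(workspacePermissions, 2)); // false
console.log(editableProjectIds(workspacePermissions)); // ids 1 and 3
```

If the representation later changes again, only the bodies of these three functions need to change; their signatures stay the same.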

By hiding the representation of the permissions data inside of the Permissions module and exposing an interface to consumers, all changes to the representation of the data are isolated to the Permissions module.

NOTE: If you are into Redux, data abstraction is idiomatic for Reducers. The payload from an action handled in a Reducer is constructed into module-specific state, then the Reducer state is accessed by selectors. Neat-o!

Figure: Data abstraction with Redux
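The Redux note can be sketched without the library itself, since a reducer is just a function from a state and an action to a new state. The action types, state shape, and selector name below are hypothetical:

```typescript
// Hypothetical record shape carried by the action payload.
interface ProjectPermission {
  projectId: number;
  canEdit: boolean;
}

// Module-specific state: the representation the reducer chooses.
interface PermissionsState {
  byProjectId: Record<number, ProjectPermission>;
}

type PermissionsAction =
  | { type: "permissions/loaded"; payload: ProjectPermission[] }
  | { type: "permissions/cleared" };

const initialPermissionsState: PermissionsState = { byProjectId: {} };

// The reducer plays the constructor role: it turns the raw payload
// into the module's own state shape.
function permissionsReducer(
  state: PermissionsState = initialPermissionsState,
  action: PermissionsAction
): PermissionsState {
  switch (action.type) {
    case "permissions/loaded": {
      const byProjectId: Record<number, ProjectPermission> = {};
      action.payload.forEach((permission) => {
        byProjectId[permission.projectId] = permission;
      });
      return { byProjectId: byProjectId };
    }
    case "permissions/cleared":
      return initialPermissionsState;
    default:
      return state;
  }
}

// Selector: consumers read the state only through functions like this.
function selectCanEditProject(
  state: PermissionsState,
  projectId: number
): boolean {
  const permission = state.byProjectId[projectId];
  return permission !== undefined && permission.canEdit;
}

const loadedState = permissionsReducer(undefined, {
  type: "permissions/loaded",
  payload: [
    { projectId: 1, canEdit: true },
    { projectId: 2, canEdit: false },
  ],
});

console.log(selectCanEditProject(loadedState, 1)); // true
console.log(selectCanEditProject(loadedState, 2)); // false
```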

Summary

Information Hiding is a software design principle that describes how to decompose software into modules so that programs are easier to design, maintain, and modify. The concept was published in the 1970s and is still a key design principle for modern software systems.

Data Abstraction is one aspect of Information Hiding, and when applied thoughtfully, can lead to higher quality software.

Although functional programs tend to use maps and lists as the primary values that flow through the system, it is prudent to consider the question, “What happens if the representation of this data needs to change?”

For the functional programmer, consider hiding your data behind constructors and selectors, so that the programs are easier to modify, maintain, and grow over time.

References

[1] D.L. Parnas, “Information Distribution Aspects of Design Methodology”

[2] D.L. Parnas, “On the Criteria To Be Used in Decomposing Systems into Modules” (1972)

[3] Abelson et al., “Structure and Interpretation of Computer Programs”

[4] “Does functional programming ignore the benefits gained from information hiding?”, question asked by Daniel Kaplan on Stack Exchange
