Privacy and Exposure, Gatekeepers and Privileged Consumers

Encapsulation and Information Hiding

Encapsulation and information hiding are well-known principles in object-oriented programming, common mechanisms for maintaining DRY code and enforcing the single responsibility principle. While these foundational elements give the programmer significant expressive power and are essential for writing software with anything more than a semblance of maintainability, they sometimes introduce restrictions that are ultimately antithetical to those goals. We, the writers of code, should therefore have a solid understanding not only of the standard use of these core object-oriented principles but also, and perhaps more importantly, of where they break down and how and when to work around them.

We will look at the ways in which we are able to subvert typical safety nets, and this exploration of various means for exposing otherwise private members will also serve as an introduction to some features of the Ruby programming language less frequently encountered in day-to-day practice. We will, ultimately, introduce additional safety to objects that traditionally have little, in an attempt to ensure that our newfound powers are used only for good. Let us begin by looking at the three levels of visibility available and how they are used via a concrete, storybook example.

Visibility or Protection

Ruby has three different levels of visibility, the two most common of which are public and private. Public members, as the name implies, are unequivocally accessible to any and all callers. Private members, on the other hand, much like Penelope, have extremely restricted access, described in the official documentation section on visibility as:

The third visibility is private . A private method may not be called with a receiver, not even self . If a private method is called with a receiver a NoMethodError will be raised.

While not necessarily clear from that description, what this means is that private members are only available in any context where they can be accessed without a receiver, specifically in an instance of a class or any of its subclasses. A separate document explains explicit receivers: in summary, as long as the dot syntax is not required, a private method may be called.
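The receiver rule can be seen in a minimal sketch. The Diary and PocketDiary classes are hypothetical and exist only for illustration. Note also that newer Ruby versions relax the rule for a literal self receiver, so the sketch uses an outside caller to trigger the error.

```ruby
# Hypothetical classes, used only to illustrate private visibility.
class Diary
  def read
    secret # implicit receiver, so the private call is allowed
  end

  private

  def secret
    :entry
  end
end

class PocketDiary < Diary
  def peek
    secret # subclasses may also call it without a receiver
  end
end

diary = Diary.new
diary.read           # => :entry
PocketDiary.new.peek # => :entry

begin
  diary.secret # explicit receiver from outside the object
rescue NoMethodError => error
  error.class # => NoMethodError
end
```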

What is strange about private access in Ruby is its similarity to protected access in most other object-oriented languages (e.g. C++, Java, C#), yet Ruby also has protected members. This leads to the question of what exactly that means, and, fortunately, we can fall back on the official documentation.

The second visibility is protected . When calling a protected method the sender must be a subclass of the receiver or the receiver must be a subclass of the sender. Otherwise a NoMethodError will be raised. Protected visibility is most frequently used to define == and other comparison methods where the author does not wish to expose an object's state to any caller and would like to restrict it only to inherited classes.

That definition and its accompanying example are opaque enough that they warrant a simplified example. Imagine we have two kinds of Mammal, Kitten and Puppy, both of which are capable of cuddling. In this sad scenario, however, affection between species is nonexistent, so kittens and puppies will never cuddle with each other. This can be accomplished with the use of protected methods.

    class Mammal
      def cuddle(other)
        [reaction, other.reaction]
      end
    end

    class Kitten < Mammal
      protected

      def reaction
        :purr
      end
    end

    class Puppy < Mammal
      protected

      def reaction
        :nuzzle
      end
    end

    >> gweeby, luxe, bella = Kitten.new, Kitten.new, Puppy.new
    >> gweeby.cuddle(luxe)
    => [:purr, :purr]
    >> gweeby.cuddle(bella)
    NoMethodError: protected method `reaction' called for #<Puppy:0x00005581ee417790>

As you can see, our two kittens, Gweeby and Luxe, will happily share their warmth, but Bella is disallowed from partaking. To put protected access in plain English, a protected method may be called with an explicit receiver only when the caller is itself an instance of the class in which the method is defined, or of one of its subclasses. In this example, both instances of Kitten can call reaction on each other, but not on an instance of Puppy, whose reaction method is defined in a class that kittens do not descend from.
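The documentation's remark about comparison methods can be illustrated with a short sketch. The Account class and its balance attribute are hypothetical, chosen only to show state that is comparable between instances yet hidden from everyone else.

```ruby
# Hypothetical class: the balance is hidden from outside callers,
# but other Account instances may read it to compare themselves.
class Account
  include Comparable

  def initialize(balance)
    @balance = balance
  end

  def <=>(other)
    balance <=> other.balance # allowed: the caller is also an Account
  end

  protected

  attr_reader :balance
end

small = Account.new(10)
large = Account.new(250)

small < large # => true

begin
  large.balance # outside callers are still rejected
rescue NoMethodError => error
  error.class # => NoMethodError
end
```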

Ancestry or Injection

In a world more worth living in, the animosity between dogs and cats may yet persist generally, but there can also be exceptions to the rule, such as when particular animals grew up in close proximity. At this point, we will look at how we can exploit the fact that Ruby violates the open/closed principle in order to realize this possibility.

The obvious approach, for many seasoned Ruby developers, would be to simply fall back on the facilities Ruby provides for accessing non-public members, namely the functionally equivalent approaches of instance_eval and send .

    >> bella.instance_eval { reaction }
    => :nuzzle
    >> bella.send(:reaction)
    => :nuzzle

While these approaches may be Good Enough™, we can do better by utilizing the class ancestry of our kittens and puppies. Inspecting the output from Module#ancestors, we can see that our two classes, as expected, are nearly identical.

    >> Kitten.ancestors
    => [Kitten, Mammal, Object, Kernel, BasicObject]
    >> Puppy.ancestors
    => [Puppy, Mammal, Object, Kernel, BasicObject]
    >> Kitten.ancestors.drop(1) == Puppy.ancestors.drop(1)
    => true

Including a module in a class actually modifies the ancestors of that class, thereby affecting the path of method lookup. For example, if we open the Puppy class using class_eval, we can include an anonymous module that subsequently appears in the list of ancestors.

    >> Puppy.class_eval { include Module.new }
    >> Puppy.ancestors
    => [Puppy, #<Module:0x000055c94e207ca0>, Mammal, Object, Kernel, BasicObject]

This little bit of knowledge is not particularly useful by itself, since method lookup will stop at the Puppy class, where the reaction method is still protected. This can be more easily visualized by noting that our kitten and puppy classes are leaves in the class tree, which can only be extended through subclassing, and that is not desirable in this instance. If only there were some way to inject a module before the class in which our protected method is defined. Enter the singleton class. A singleton class is a special, unique class that exists for each object. The easiest way, perhaps, to demonstrate how it operates is to again turn to our class ancestry, this time by using the Object#singleton_class method.

    >> gweeby.singleton_class.ancestors
    => [#<Class:#<Kitten:0x000055f164658910>>, Kitten, Mammal, Object, Kernel, BasicObject]
    >> bella.singleton_class.ancestors
    => [#<Class:#<Puppy:0x00005581ee417790>>, Puppy, Litter, Mammal, Object, Kernel, BasicObject]
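That each object really has its own singleton class can be checked with a small sketch. The Hamster class and the trick method are hypothetical names, introduced only for this illustration.

```ruby
# Hypothetical class used to show that singleton classes are per-object.
class Hamster; end

nibbles = Hamster.new
whiskers = Hamster.new

# Define a method on the singleton class of one object only.
nibbles.singleton_class.class_eval do
  def trick
    :roll_over
  end
end

nibbles.trick                                       # => :roll_over
whiskers.respond_to?(:trick)                        # => false
nibbles.singleton_class == whiskers.singleton_class # => false
nibbles.singleton_class.ancestors[1]                # => Hamster
```

As in the ancestor lists above, the per-object class always sits in front of the object's ordinary class in the lookup path.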

Now we can see a point of inflection where we can potentially inject a shared ancestor that will act as a gatekeeper of sorts; for example, consider the following module, called Litter to denote that mammals including it have had close familial relations since an early age.

    module Litter
      protected

      def reaction
        super
      end
    end

As is clear, this module does nothing other than proxy the call to reaction, but, combined with the ability to open the singleton class, this gives us the incredible ability to open protected methods to instances of certain other objects as we see fit. We can see, in the following example, that Gweeby and Bella will cuddle, based on their being part of the same litter, while Luxe and Bella will not.

    >> gweeby.singleton_class.class_eval { include Litter }
    >> bella.singleton_class.class_eval { include Litter }
    >> gweeby.cuddle(bella)
    => [:purr, :nuzzle]
    >> luxe.cuddle(bella)
    NoMethodError: protected method `reaction' called for #<Puppy:0x00005581ee417790>
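Including a module in an object's singleton class is also what Object#extend does, so the same gatekeeper can be written more briefly. The following is a self-contained sketch of that variation, using hypothetical Animal, Cat, Dog and Household names rather than the classes above.

```ruby
# Hypothetical stand-ins for Mammal, Kitten, Puppy and Litter.
class Animal
  def cuddle(other)
    [reaction, other.reaction]
  end
end

class Cat < Animal
  protected

  def reaction
    :purr
  end
end

class Dog < Animal
  protected

  def reaction
    :nuzzle
  end
end

module Household
  protected

  def reaction
    super # proxy to the class's own protected method
  end
end

tom, felix, rex = Cat.new, Cat.new, Dog.new

# extend adds the module to the singleton class of each object.
tom.extend(Household)
rex.extend(Household)

tom.singleton_class.ancestors[1] # => Household
tom.cuddle(rex)                  # => [:purr, :nuzzle]

begin
  felix.cuddle(rex) # felix is not part of the household
rescue NoMethodError => error
  error.class # => NoMethodError
end
```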

Asymmetry or Refinement

We now have the means of exposing methods between two collaborating objects that share an interface, in this case, the #reaction method. Let us imagine a world where kittens refuse to be cuddled unless they are the instigator in the interaction. Our injected gatekeepers cannot accommodate this asymmetrical relationship; we will need to use a different means of exposing our protected methods. Here, we can make use of a language feature first introduced in Ruby 2.0, the refinement.

This seldom-used feature gives us the ability to change class definitions within a local context, rather than globally like a typical monkey patch. The particular rules for this locality are fairly complicated, but, most importantly, the refined class is modified from the point at which the refining module is activated with using to the end of the enclosing class or module definition, or to the end of the file when used at the top level. The following example shows how we can refine the Puppy class to make its reaction method available in a Kitten instance.

    class Puppy
      protected

      def reaction
        :nuzzle
      end
    end

    module Instigator
      refine Puppy do
        public :reaction
      end
    end

    class Kitten
      using Instigator

      def cuddle(other)
        [reaction, other.reaction]
      end

      protected

      def reaction
        :purr
      end
    end

    >> Kitten.new.cuddle(Puppy.new)
    => [:purr, :nuzzle]

Now we have what may be termed a privileged consumer. The Kitten class may instigate interactions with the Puppy class, but no other object may do the same. This leads to some interesting consequences when we make use of these intricacies in a more practical example.
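The lexical scoping that makes a privileged consumer possible can be seen in isolation with a simpler refinement, one that adds a method rather than changing visibility. This is only a sketch; the Shouting module, the shout method and the Announcer class are hypothetical names.

```ruby
# Hypothetical refinement that adds a method to String.
module Shouting
  refine String do
    def shout
      upcase + "!"
    end
  end
end

class Announcer
  using Shouting # active from here to the end of this class body

  def announce(message)
    message.shout
  end
end

Announcer.new.announce("dinner") # => "DINNER!"
"dinner".respond_to?(:shout)     # => false
```

Only code written inside Announcer sees the refined String, which is the same property that lets Kitten, and nothing else, see a public Puppy#reaction.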

To Redact an Interface

This tale of mammalian intrigue has, thus far, provided us with an avenue for the exploration of the building blocks of member access in Ruby, but, as yet, has made no stride toward convincing us that this approach can be useful in the real world. Let us, then, consider a typical Rails application, with heavy use of ActiveRecord throughout. Much of this is inspired by the book Objects on Rails, by Avdi Grimm. Therein, Grimm presents a series of refactorings that, over time, encapsulate direct access to the database within the model layer, only exposing it to consumers via a well-defined interface. Here, we will take a slightly different approach, and use our newly minted concept of privileged consumers to bestow access upon certain other objects.

Imagine, further, if you will, an application with complex authorization requirements that cannot be simply defined in a single Ability class, a pattern popularized by the cancancan gem. Instead, we want to have a separate category of objects that mediate access to the underlying models. While we could simply decorate our models with objects that perform our unsafe operations, we want to be able to programmatically enforce this restriction.

We will first need to discuss protected class methods. It may seem obvious that a class method can be protected by simply defining it after the protected call, but that is not so.

    class Protected
      protected

      def self.show
        :protected
      end
    end

    >> Protected.show
    => :protected

Instead, we can make use of the fact that, in Ruby, a class is simply an instance of Class. Consequently, it too has a singleton class that is open for modification. The pattern for doing so inside the class definition is rather well known, namely the colloquial class << self syntax. In fact, the name of the method responsible for programmatically generating class methods is Object#define_singleton_method, a remnant of the days of yore (pre Ruby 1.9.1) when this pattern was necessary to metaprogram at the class level.

    class Protected
      class << self
        protected

        def show
          :protected
        end
      end
    end

    >> Protected.show
    Traceback (most recent call last):
            2: from /home/sonny/.rbenv/versions/2.5.0/bin/irb:11:in `<main>'
            1: from (irb):11
    NoMethodError (protected method `show' called for Protected:Class)
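To make the connection between class methods and the singleton class concrete, here is a small sketch showing that class << self and Object#define_singleton_method place methods in the same spot. The Greeter class and its method names are hypothetical.

```ruby
# Hypothetical class with two class methods defined in different ways.
class Greeter
  class << self
    def hello
      :hello
    end
  end

  define_singleton_method(:goodbye) { :goodbye }
end

Greeter.hello   # => :hello
Greeter.goodbye # => :goodbye

# Both methods live on the singleton class of Greeter.
Greeter.singleton_class.instance_methods(false).sort # => [:goodbye, :hello]
```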

With this in hand, the first step toward protecting the query interface of our objects will be to create a Redactor module that simply takes a class name and a list of methods to hide from the outside world via use of the protected method on the singleton class.

    module Redactor
      def redact!(klass, methods)
        klass.singleton_class.class_eval do
          methods.each { |method| protected method }
        end
      end
    end
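As a quick check outside of Rails, the same module can be exercised against a plain Ruby class. This is only a sketch: Ledger and LedgerGuard are hypothetical stand-ins for a model and the object doing the redacting, and the Redactor module is repeated so the example is self-contained.

```ruby
module Redactor
  def redact!(klass, methods)
    klass.singleton_class.class_eval do
      methods.each { |method| protected method }
    end
  end
end

# Hypothetical stand-in for an ActiveRecord model.
class Ledger
  def self.find(id)
    { id: id }
  end
end

# Hypothetical object that performs the redaction.
class LedgerGuard
  extend Redactor
end

Ledger.find(1) # still public at this point

LedgerGuard.redact!(Ledger, [:find])

begin
  Ledger.find(1) # now protected on the singleton class
rescue NoMethodError => error
  error.class # => NoMethodError
end

Ledger.send(:find, 1) # send ignores visibility, so this still works
```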

This module simply opens the singleton class of the class on which we want to redact the methods, and, using Module#class_eval, marks each method as protected. The next piece of the puzzle is to define our mediators, starting with the BaseMediator, which encapsulates all the complexity of redacting methods and refining them within the context of the appropriate subclasses.

    class BaseMediator
      extend Redactor

      # http://guides.rubyonrails.org/active_record_querying.html#retrieving-objects-from-the-database
      QUERY_INTERFACE_METHODS = %i(
        find create_with distinct eager_load extending from group having
        includes joins left_outer_joins limit lock none offset order preload
        readonly references reorder select where
      )

      def self.redact_and_refine!(subclass)
        model = subclass.name.gsub(/Mediator$/, '').safe_constantize
        redact!(model, QUERY_INTERFACE_METHODS)

        Module.new do
          refine model.singleton_class do
            QUERY_INTERFACE_METHODS.each { |method| public method }
          end
        end
      end
    end

The redact_and_refine! class macro has two primary responsibilities, as indicated by its name:

1. Redact the methods we do not want to have exposed.
2. Create a module with a refinement that makes them public again.

Normally, we would want to simplify this for the subclass by using Class#inherited, but, in this case, we are unable to do so, since trying to call Module#using from within a method results in the following error: RuntimeError: Module#using is not permitted in methods. We are, therefore, forced to have each subclass pull in its own refinement, as in the following CustomerMediator.

    class CustomerMediator < BaseMediator
      using redact_and_refine!(self)

      def self.load(id)
        Customer.find(id)
      end
    end

While not ideal, this is not an onerous task for significantly increased safety. Because the CustomerMediator is responsible for hiding the methods inside the Customer model from the outside world, we must first load it. Then, using that same class, we are able to access the Customer::find method, but when we try to do so directly, we are greeted by a NoMethodError remarking that the method is protected.

    >> CustomerMediator
    => CustomerMediator
    >> CustomerMediator.load(1)
      Customer Load (0.3ms)  SELECT "customers".* FROM "customers" WHERE "customers"."id" = $1 LIMIT $2  [["id", 1], ["LIMIT", 1]]
    => #<Customer:0x0000559d5f12a4d8 id: 1>
    >> Customer.find(1)
    NoMethodError: protected method `find' called for Customer(id: integer):Class
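The restriction that prevents us from hiding the using call inside an inherited hook can be reproduced in isolation. The Polite module and Host class below are hypothetical names used only for this sketch.

```ruby
# Hypothetical refinement; its content is irrelevant to the error.
module Polite
  refine String do
    def greet
      "Hello, #{self}"
    end
  end
end

class Host
  def self.activate
    using Polite # refinements cannot be activated from a method body
  end
end

message = begin
  Host.activate
rescue RuntimeError => error
  error.message
end

message # => "Module#using is not permitted in methods"
```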

The Wall of Least Surprise

In a perfect world, we would want to provide a better error message for consumers of our classes, rather than the generic default message. This desire to provide the developer with as easy an interface as possible is an example of the principle of least surprise. Attempts to add this feature, however, do not succeed, because of the dynamic dispatch semantics of Ruby. Take, for instance, the following modification on the existing Redactor module and BaseMediator class:

    module Redactor
      def redact!(klass, methods)
        methods.map do |method|
          redacted_method = "__#{method}_redacted".to_sym

          klass.singleton_class.class_eval do
            alias_method(redacted_method, method)
            protected redacted_method
          end

          klass.define_singleton_method(method) do |*args|
            begin
              public_send(redacted_method, *args)
            rescue NoMethodError => e
              raise e unless e.message =~ %r{^protected method `#{redacted_method}' called for #{klass.name}}
              raise SecurityError.new("#{klass.name} cannot be queried directly, please use #{klass.name}Mediator")
            end
          end

          redacted_method
        end
      end
    end

    class BaseMediator
      def self.redact_and_refine!(subclass)
        model = subclass.name.gsub(/Mediator$/, '').safe_constantize
        redacted_methods = redact!(model, QUERY_INTERFACE_METHODS)

        Module.new do
          refine model.singleton_class do
            redacted_methods.each { |method| public method }
          end
        end
      end
    end

The major difference here is that we alias the existing method we are redacting and redefine the original method to wrap a call to the alias, catching the default exception and raising a more helpful one. The BaseMediator then makes the aliased methods public, rather than the originals, which are already public wrappers. Where this fails is that we have no way to call the aliased method dynamically while respecting its refined visibility. The use of Object#public_send fails, since it does not honor the refinement and still treats the method as protected, as can be seen in the following.

    >> CustomerMediator.load(1)
    SecurityError: Customer cannot be queried directly, please use CustomerMediator

Attempting to use either Object#send or Object#method to call the method directly fails in the opposite way, since both ignore the visibility rules altogether. As such, we cannot provide a better message by wrapping the call to our redacted method with some exception handling.
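The difference between these dispatch methods can be seen without any refinements involved. In this sketch, Vault and combination are hypothetical names.

```ruby
# Hypothetical class with a protected class method.
class Vault
  class << self
    protected

    def combination
      1234
    end
  end
end

Vault.send(:combination)        # => 1234, visibility is ignored
Vault.method(:combination).call # => 1234, visibility is ignored

begin
  Vault.public_send(:combination) # visibility is enforced
rescue NoMethodError => error
  error.class # => NoMethodError
end
```

Neither behavior is what the wrapper needs: one enforces the original visibility, while the others enforce nothing at all.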

Another approach worthy of attempt is to inspect, via Object#public_methods and Object#protected_methods , which methods are available, but this also fails to produce the expected results. The following example leads us to a better understanding of what is happening behind the curtains when using refinements to modify method visibility.

    class Protected
      class << self
        protected

        def print
          :protected
        end
      end
    end

    module Publicize
      refine Protected.singleton_class do
        public :print
      end
    end

    class Public
      using Publicize

      def self.print
        Protected.public_methods.include?(:print)    # false
        Protected.protected_methods.include?(:print) # true
        Protected.print                              # works
        Protected.public_send(:print)                # fails
      end
    end

    Public.print

    Traceback (most recent call last):
            2: from protection.rb:29:in `<main>'
            1: from protection.rb:25:in `print'
    protection.rb:25:in `public_send': protected method `print' called for Protected:Class (NoMethodError)

When this script is run, the static call works, but the dynamic call fails. The Protected::print method, moreover, is listed as protected, not public, from within the class using the refinement! As a consequence of this final result, I cannot recommend using refinements for modifying method visibility when there is any intention to use dynamic dispatch or rely upon reflection. Having encountered this first hand, it is worth pointing out that this is, at least indirectly, mentioned in the documentation for refinements:

When using indirect method access such as Kernel#send , Kernel#method or Kernel#respond_to? refinements are not honored for the caller context during method lookup. This behavior may be changed in the future.

Since this leads to somewhat unexpected results, I have decided to start a conversation about changing this behavior, which can be found here .