Chalain
Jan. 12th, 2008 06:30 pm

Proxying vs. Demeter

A while back I wrote a post about dishonesty in programming, and someone recently asked a very good question. My answer was so long that it broke LiveJournal's comment size restrictions, so I decided to make it a full post. Your comments are welcome.



>> don't write a public accessor to a private member and tell me that member is still private



So if you need to add an object to a list what is your proposal? Because I can see fewer problems in:



a.getList().add(obj)



than in creating a new method for every operation in the list interface (for each private collection in the class!). Like:



a.addObjectToList(obj)



The accessors are meant to wrap the field so logic can be attached to the operation but in the end they are making private fields public.



Am I missing something here?



Short answer: Yes. In general, when you have conversations with an object, you should only ever have conversations with that object. If you're talking to something inside that object you're violating encapsulation. If you're talking to a parent's child object directly, the parent can no longer guard access to the child, and must guard itself against changes to the child. In other words, the child becomes a peer object and should either be treated as such or the access should be proxied.



Longer answer: Your concern is valid. If we proxied everything all the way down, every interface in every system would have 700 methods, and that gets messy. It doesn't "get messy real fast", however, but yes it can get out of hand if taken to an extreme. You have to find the right balance, but in my experience the balance is like 90/10 in favor of proxies. Classes and objects that exist solely to broker connections, for example, can violate this rule. Frameworks often violate this rule as well, but the hows and whys of this are a topic for a whole 'nother post.



I have a vital software design concern for you, however, and I say this not as a challenge to you but as hopefully food for thought. When you say "if you need to add an object to a list", you have already gotten into trouble. It's hard to see it in a generic example like this one, however. Let's use a concrete example to make things clearer.



Let's say class Customer has a List of Address objects, and you need to add a new address to a customer.



Consider the statement "Add a new address to a customer's known addresses" versus "Append a new address to a customer's address list". That second statement makes my teeth itch. Why do you need the address *list*? Why are you performing class-dependent operations on a member object of another object?



The proxies don't always have to look awful, either. Consider:



customer.addAddress(address);



I would also write removeAddress(address). For collections that have a lot of churn, I would add clearAddresses(). These mimic the standard List operations, but it is vitally important to see that we are not bound to a List object underneath.
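To make that concrete, here is a minimal Java sketch of what such a Customer might look like. The Address fields, the addressCount() helper, and the choice of ArrayList are assumptions made for illustration; only addAddress, removeAddress, and clearAddresses come from the text above.

```java
import java.util.ArrayList;
import java.util.Collection;

// Hypothetical stand-in for the post's Address type.
final class Address {
    private final String street;

    Address(String street) {
        this.street = street;
    }

    String street() {
        return street;
    }
}

class Customer {
    // Callers never see this field's type, so it could become a Set,
    // a Deque, or a lazily loaded structure without touching them.
    private final Collection<Address> addresses = new ArrayList<>();

    void addAddress(Address address) {
        addresses.add(address);
    }

    boolean removeAddress(Address address) {
        return addresses.remove(address);
    }

    void clearAddresses() {
        addresses.clear();
    }

    // Assumed helper, not from the post: callers ask the Customer, not the list.
    int addressCount() {
        return addresses.size();
    }
}

class ProxyDemo {
    public static void main(String[] args) {
        Customer customer = new Customer();
        Address home = new Address("1 Main St");
        customer.addAddress(home);
        customer.removeAddress(home);
        System.out.println(customer.addressCount());
    }
}
```

Nothing outside Customer mentions a List, which is the whole point: the three methods read like List operations, but the caller is only ever talking to the Customer.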



It is a balance; for trivial data you may not care and you just want to expose it. Also, objects that exist only to give you access to other objects--such as a caching object--also should not proxy.



Remember, though, that every time you reach through an object to a child object, you



1. Violate encapsulation

2. Tightly couple all three objects together

3. Cripple modularity/type independence

4. Deprive the parent class of the ability to guard, or even be aware of, the child data.



That last one is particularly scary: the parent can no longer guard the child data. If it needs to know the child object's state, it must ask the child for its state. They are no longer parent/child objects; they are peers.
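A small Java sketch may help show how the guard gets lost. Everything here is hypothetical (the LeakyCustomer name, the limit of two, addresses as plain strings); it only illustrates the a.getList().add(obj) style from the question.

```java
import java.util.ArrayList;
import java.util.List;

class LeakyCustomer {
    static final int MAX_ADDRESSES = 2; // hypothetical business rule

    private final List<String> addresses = new ArrayList<>();

    // The parent's guard: refuses to grow past the limit.
    void addAddress(String address) {
        if (addresses.size() >= MAX_ADDRESSES) {
            throw new IllegalStateException("address limit reached");
        }
        addresses.add(address);
    }

    // The "private" list escapes here, and the guard does not go with it.
    List<String> getAddresses() {
        return addresses;
    }
}

class LeakDemo {
    public static void main(String[] args) {
        LeakyCustomer customer = new LeakyCustomer();
        customer.addAddress("1 Main St");
        customer.addAddress("2 Oak Ave");
        // Reaching through the parent skips its check entirely.
        customer.getAddresses().add("3 Elm Rd");
        System.out.println(customer.getAddresses().size() > LeakyCustomer.MAX_ADDRESSES);
    }
}
```

The parent still has its check, but it only protects callers polite enough to use it. Once the list is out, the parent has to re-inspect the list to learn what state it is in.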



Two quick examples:



1. I worked on a Java project last year that broke this rule with abandon. One List object we were using everywhere suddenly needed to be changed to a Deque. Because it was accessed in over 70 places in the code as a List object, the refactoring took all afternoon rather than the 5 minutes it would have taken behind a proxy.



2. Same project. Because a List object was exposed, programmers got in the habit of grabbing the list and modifying it directly. Later I discovered that programmers were iterating over the list performing Load() operations on each child. Each load took 30-80ms, and some collections had 500 objects in them. This meant our app would hang for 15 to 40 seconds when you opened one form. I quickly refactored it so the parent object could guard the data with one single load, taking about 50-100ms. But then I discovered literally dozens of places where the code not only expected to be able to grab the list, but had intertwined its logic around the notion of a List on the other side of the parent object. Before I could refactor that collection, I had to rewrite about 3,000 lines of code throughout the application to get its dirty paws off my data--in such a way that the other areas of code could still do their jobs.
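As a rough illustration of the first example, here is what the "5 minutes behind the proxy" version of a List-to-Deque change might look like in Java. The TaskBoard class and its method names are invented for this sketch; the real project's classes are not described in the post.

```java
import java.util.ArrayDeque;
import java.util.Deque;

class TaskBoard {
    // Before the change this was (hypothetically):
    //     private final List<String> tasks = new ArrayList<>();
    // Only this field and the method bodies below had to change.
    private final Deque<String> tasks = new ArrayDeque<>();

    void addTask(String task) {
        tasks.addLast(task);
    }

    // New capability that motivated the switch to a double-ended queue.
    void addUrgentTask(String task) {
        tasks.addFirst(task);
    }

    // Returns null when there is nothing left to do.
    String nextTask() {
        return tasks.pollFirst();
    }

    int taskCount() {
        return tasks.size();
    }
}

class DequeDemo {
    public static void main(String[] args) {
        TaskBoard board = new TaskBoard();
        board.addTask("write report");
        board.addUrgentTask("fix outage");
        System.out.println(board.nextTask());
    }
}
```

Callers that only ever said addTask and nextTask compile unchanged after the swap. Had they held a List reference instead, each of those call sites would have needed a visit.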



Thank you for this comment. Before now I had thought that it wasn't a balance at all; that proxies were always preferable to violating encapsulation. But as I wrote this I realized that it is a tradeoff: when you proxy, you create type independence through encapsulation, but you do it at the expense of separating the business logic on either side of the interface, and there are times when seamless logic is more important than encapsulating data. I need to think more on this.



Thanks!

Current Mood: contemplative

Current Music: Yin Yang - Candice Pacheco

17 comments

From: darthparadox Date: January 13th, 2008 05:55 am (UTC)

Thank you for writing all of this down. There's nothing here that strikes me as unobvious (though I'd never specifically thought of a caller reaching through a parent object to the child object as binding all three together before), but seeing an effectively complete discussion of the basics of parent-child encapsulation is thought-provoking.



I'll be turning this over in my head for the next week or two at work, most likely.



Finally: You really need to consider that book of essays. This kind of stuff is why.

From: (Anonymous) Date: January 13th, 2008 05:00 pm (UTC)
Subject: If you have to proxy, perhaps your design

>> It doesn't "get messy real fast", however, but yes it can get out of hand if taken to an extreme. You have to find the right balance, but in my experience the balance is like 90/10 in favor of proxies.

Indeed. Another question to ask yourself if you find yourself writing lots of proxy methods: is your abstraction trying to do too much? In other words, break abstractions down into smaller, simpler abstractions each with a single purpose. I think that lots of proxy methods often indicate a complicated and/or confused abstraction.

-naasking

From: chalain Date: January 13th, 2008 07:14 pm (UTC)
Subject: Re: If you have to proxy, perhaps your design

Yes. This is the thought that sprang up at the end of my post: proxies create a schism in the business logic. If you have a lot of proxies (or violate the Law of Demeter often), perhaps your business logic needs to be revisited. You're trying to do too much low-level stuff from too high a place.

From: randytayler Date: January 13th, 2008 05:30 pm (UTC)

Get your dang Master's degree already. You're embarrassing the Establishment by not wielding a degree.

From: chalain Date: January 13th, 2008 08:43 pm (UTC)

Three problems with this.



1. Getting a Master's degree is a problem because I would have to actually finish my Bachelor's first.



2. Getting a Bachelor's degree is a problem because I would have to attend school to do it.



3. Attending school is a problem because I have Attention Defic--OH LOOK! PONIES!!



(no subject) - (Anonymous)

From: chalain Date: January 15th, 2008 01:32 am (UTC)

Are who riding bikes?

From: kazriko Date: January 14th, 2008 03:20 am (UTC)

The establishment deserves to be embarrassed. I've seen the students they graduate. ;)

From: (Anonymous) Date: January 13th, 2008 07:30 pm (UTC)
Subject: Interesting arguments

Yes, interesting arguments. I'm more concerned about the first two than the fourth. I mean, you can always create a property change listener. The thing is some frameworks (Hibernate comes to mind) and Java in general expect public accessors. This will break encapsulation right there. And once in that position it's a difficult proposition to enforce, isn't it? I mean, how can I convince one of my developers that the getter has to be there but that nobody should use it? Should we enforce immutable collections from getters? And what about getters that return simple beans?



When I first asked the question I expected some discussion about encapsulation. I'll have to think more about this as well. Maybe even blog (http://internna.blogspot.com) a little further!

From: chalain Date: January 13th, 2008 08:39 pm (UTC)
Subject: Re: Interesting arguments

>> I mean, you can always create a property change listener.



I actually cried out in dismay when I read this, because it felt like I was saying "this hurts, because XYZ" and you were saying, "yes, but rather than fix the hurt we can throw more hurt at it". :-)



On second thought, however, I think we're actually in agreement. I argued that when the parent class loses the ability to guard the child, they become peers; I failed to expand the logical followthrough of that thought: you should change your design so that they ARE peers.



>> The thing is some frameworks



This is that topic for a whole 'nother post, which I will summarize now and then probably never get around to writing: frameworks propagate this design flaw rampantly. Because a framework's entire object model is published, documented, well-established, and frozen, the first three of my concerns are inconsequential. I mean, getChildWindows is always going to return Collection<Window>. It is not only acceptable but expected that the caller will get that child window collection and do something collectiony with them.



My chief concern with this is that programmers write what they read; they see a clever design and ape it. At least, I know I do this. But there's nothing in the framework code that clearly states the precon{di,cep}tions that went into the code. And so, when we sit down and copy this pattern into code that is brand new, undocumented, not established, and completely fluid, we suddenly get bitten by the consequences of that design choice.



I have posted elsewhere that frameworks in general are at best a necessary evil, and no programmer should ever write framework code unless the actual deliverable of the project is that framework (as opposed to a product that would "clearly benefit" from a framework). I accept that this makes me a crazy old coot. :-)

From: weavejester Date: January 13th, 2008 09:59 pm (UTC)

Ideally you'd use an interface as your proxy, e.g.

Collection<Address> getAddresses() { return m_addresses; }

This would make switching between lists and queues simple.

Unfortunately, adding custom validations to this would be tricky to do in Java. But in a more flexible OOP language, this is probably a better solution.

From: chalain Date: January 13th, 2008 10:35 pm (UTC)

True, and this is definitely a step in the right direction in languages that support interface-based programming. I'm not a big fan of Java's implementation of interfaces but it IS a way to lock down the interactions between objects.

This delegation pattern, however, is still (IMO) quite flawed, unless the first object really is nothing more than a delegate. To illustrate, I think the function Customer.getAddresses():Collection<Address> is inappropriate design, but Cache.getAddresses():Collection<Address> makes perfect sense.

As long as you don't ever want to change the "get this object and manipulate it" pattern, you're fine; and using an interface then effectively gives you type-independence. For example if you later found you needed extra intelligence in the collection--such as shallow loading, or deferred write caching--you could write a custom collection template class that inherits from Collection and handles everything.

But you really have now made the address book a peer object to the Customer, and this all falls apart if any operation on the child class depends on an interaction with the parent--i.e. they really shouldn't be peers. For example, let's say that manipulating a customer's addresses is affected by some rules at the business logic level: retail customers can only have two addresses, but business customers can have five. In order to implement this via interfaces, you're going to need Address to notify Customer that it's being updated, and to ask permission to add the new address based on type. Or you're going to have to customize the address collection itself to know that it can only allow maxAddressCount entries.

To make address a peer class, it must stand alone and be unique in its own right. But suddenly it has this dependence on Customer or it has this weird logic that seems quite specific to a weird case. You can tell when you're in this position because the names get really wacky and convoluted. The first might be InterlockedCollection (with Customer implementing ICollectionEventListener) while the second might be a generic class called SizedCollection.

SizedCollection is one of those classes that really doesn't add much to the whole notion of a Collection, and I try to weed out "unfit specialization" like this whenever possible. This is a classic example of bizarro-world programming: we have created an entire class template to avoid TWO LINES of business logic code!

if (this.addresses.size() >= this.maxAddressCount) {
    throw new IllegalArgumentException("address limit reached");
}

ALL of these issues would go away if Customer simply proxied the request, and clients realized that if they want to add an Address to a Customer, they need to have a conversation with the Customer rather than Customer's address book.

Thank you for your comment. I think we agree that we should use delegation when delegation is appropriate and proxying when proxying is appropriate; if we disagree, it is mainly in how often and in which situations we find each appropriate.

HMMM, that's ANOTHER thought that hadn't occurred to me until you provoked it: My experience has been that many people tend to use delegation when they aren't sure, and those people are my target audience here, because I think that in almost all of those cases proxying would be better. It is true, however, that in statically-typed languages like Java and C++, proxying requires a lot more typing and boilerplate than delegation. And since a little delegation doesn't hurt, we use it perhaps more often than we should. I must think on this more.

Thanks!

From: (Anonymous) Date: January 14th, 2008 07:45 am (UTC)

I do a lot of GUI programming, and I can assure you, your proposal wouldn't work there. It would be a total waste of time to proxy all methods of all controls on a form. Of course you could see this as the exception, but it does prove the proxy method isn't efficient. You lose all the benefits object orientation has to offer. Another example: if you had 10 different objects all having an address list, you would have to copy/paste your safeguarding routines 10 times.



My solution would be, don't use the list class. Make your own list class that encapsulates the regular list. You can extend your own object with guards and events. For example the list could call the parent (or better yet, multiple listeners) requesting if it's OK to add the new item. Also, adding a smarter load to your list would have solved that performance problem you mentioned. And not only for that list, but for all lists. Or add a property BehaveAsDequeue.

From: chalain Date: January 14th, 2008 05:05 pm (UTC)

I'm gonna have to turn off anonymous posting here soon, so I can track you people down. :-) I would love to have a followup discussion with you, because I find this statement interesting. I'm coming from about 15 years of GUI programming, and I've seen it work over and over again, so we're clearly not on the same page yet. The thing is, I learned this principle late in life and spent years doing GUI programming where I would have agreed with you... but I cannot remember why I would have agreed with you, and I need to track you people down so I can ferret out what that thought pattern was. Ah well.



I think I see most of your point, though--see my note elsewhere in the comments here about frameworks. When you work within an established framework, you don't have to worry about the first three drawbacks of violating the Law of Demeter. Because you're never going to change the framework code and because you have the full framework clearly documented and mapped out, you don't need to worry about most of the design complications there.



As for the rest of the point... I need to make another post, I guess. You people are writing custom classes to replace existing classes, stacking listeners on top of child classes to make them peers (but leaving them as children) and architecting class hierarchies that make it impossible to operate the program in any other way than specified.... all of which could have simply been solved with two lines of business logic. Are you really that allergic to business logic?



At some point we have to stop ourselves and ask, "Does programming have to hurt this much?"

From: billgoates Date: January 14th, 2008 09:45 pm (UTC)

Yeah, sorry, normally anonymous posting still requires a name + email. I discovered I didn't leave a sign just after pressing post.



Let's try to explain it differently, then. When you have an address object, you will need all kinds of tests to validate the data. I can do these tests on the parent, but when I have multiple objects owning an address list, that validation code is needed multiple times. So the best place to include the validation code is in the address object. When a validation rule changes, you know exactly where to modify, and only have to modify it once.

And the same goes for list objects. All the problems you described arose because the list object isn't safe, smart and/or fast enough. So there are two choices: try to invent a workaround, or solve the problem. Your proposal is a workaround; I prefer to extend the platform.

Maybe the difference is that you use a framework (I didn't recognize one from the examples), whereas I try to stay away from off-the-shelf code. I think frameworks are overrated, bloated, too general purpose and a waste of time. Experience taught me that 3rd-party software rarely does everything it is required to do, and that finding the best framework, learning every detail of it, and getting it to play nicely with the rest of the code takes as much time as writing the necessary code yourself, or much more. The big difference is that my knowledge doesn't go obsolete within 5 years, and I don't have to worry about a vendor going out of business or planning a major rewrite, or have to solve or work around code that isn't mine.

But not everyone has this luxury; some are forced to work with their current employer's framework. And since it's best not to rewrite parts of it, you will end up following your proposal, but in my eyes it will always be a workaround.

I do have to admit I wasn't familiar with the Law of Demeter, so I did some reading up on it, and more than ever I am convinced it's an outdated law that should be forgotten and buried as quickly as possible.

It originates from 1987, long before OO became popular or anyone had heard of event-based objects. I saw lots of blogs promoting the law, but didn't see any valid reason to use it. I even saw someone suggesting a Demeter generator to overcome the disadvantage of writing a large number of wrappers. Talk about insanity. Anyone thinking that auto-generating pointless heaps of code will make their software more stable, safer or easier to maintain instead of the opposite, may have a free lifetime straightjacket.

The advantage of replacing existing classes is much more than saving 2 lines. Those 2 lines you mention are the forwarding call only. Validation code is still required; without it the forwarding doesn't serve any purpose other than to comply with some ancient law. And those 2 lines + validation are needed for every child method, multiplied every time this object is used. And after all that work you still have no guarantee that some 'smart' developer isn't accessing the child directly.



Encapsulating a list takes 5 maybe 10 minutes. And all you need after that is one global replace. From there on you can slowly upgrade your software where needed. And a list object is important enough, that even if you would take a full week off to fully customize it, it will pay back in the first medium size project.

From: chalain Date: January 14th, 2008 10:27 pm (UTC)

OH OH OH! I see the disconnect! Thank you for replying!



I think I have created in your mind an image where you can only ever have conversations with Customer objects, and if you need to talk to an Address object you must proxy it. This is not at all what I am trying to say.



We're both agreed that if you want to talk about customers, you need to have a conversation with a Customer object, right? If you want to talk about addresses, you need to have a conversation with an Address object--not a proxy through a Customer. I hope that clears things up a bit about what I am not saying. :-)



Now then, what I *AM* saying is that if you have a mailing address here, and you've already had the conversation with it where you set its data and it has validated itself, and now you want to add this address to a customer, that is a conversation with a Customer. Importantly, it is NOT a conversation with a Customer's address collection mechanism. I guess you could be talking exclusively about validating the collection mechanism itself, but that proxies easily; if Customer::addAddress() has no validation problems of its own, it simply returns whatever mAddressBook.addObject() returns.
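Here is a hedged Java sketch of that conversation, folding in the retail/business limits from the earlier comment. The CustomerType enum, the Address fields, and the exception message are assumptions for illustration; addAddress and the idea of returning whatever the underlying collection returns come from the discussion.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical encoding of the rule: retail customers get two addresses,
// business customers get five.
enum CustomerType {
    RETAIL(2),
    BUSINESS(5);

    final int maxAddresses;

    CustomerType(int maxAddresses) {
        this.maxAddresses = maxAddresses;
    }
}

// Hypothetical stand-in; assume the Address has already validated itself.
final class Address {
    private final String street;

    Address(String street) {
        this.street = street;
    }

    String street() {
        return street;
    }
}

class Customer {
    private final CustomerType type;
    private final List<Address> addressBook = new ArrayList<>();

    Customer(CustomerType type) {
        this.type = type;
    }

    // The two lines of business logic, then a plain forward to the collection.
    boolean addAddress(Address address) {
        if (addressBook.size() >= type.maxAddresses) {
            throw new IllegalArgumentException("address limit reached for " + type);
        }
        return addressBook.add(address);
    }
}

class LimitDemo {
    public static void main(String[] args) {
        Customer retail = new Customer(CustomerType.RETAIL);
        retail.addAddress(new Address("1 Main St"));
        retail.addAddress(new Address("2 Oak Ave"));
        try {
            retail.addAddress(new Address("3 Elm Rd"));
        } catch (IllegalArgumentException e) {
            System.out.println("rejected: " + e.getMessage());
        }
    }
}
```

The Address knows how to validate an address; the Customer knows how many it may hold. Neither needs a listener or a specialized collection class to cooperate, though where the limit check ultimately lives (here, or in a controller layer) is a separate design choice.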



From there, the rest of your post is essentially a continuation of that original misunderstanding. I generally agree with what you're saying with just a few exceptions and/or tweaks:



- I see the 2-line business logic as the smartest and best solution. If there's a business rule that says customers of type X can only have Y addresses, then some code in the controller subsystem--the place business logic should be stored--should check that condition and handle it accordingly. This is a hard point to make out of context if (and I believe this is the case) you and I are coming from very different viewpoints. Both of us have tight, compact solutions at either end of the system (static design vs. runtime behavior) and see solutions at the other end of the spectrum as wildly out of control. If we worked at the same shop you and I would have many entertaining lunches together, I am sure. :-)



- Re: Law of Demeter. Okay, I thought *I* was a crazy old coot. You, sir, are just plain nuts. And coming from a crazy old coot, that may be saying something. I'll accept for now that you don't see any value in it, but I implore you to dig deeper and try to find out why the Demeter crew thought it was such a good idea in the first place. (I also welcome your response here; I see adherence to the Law of Demeter as being equivalent to eschewing tightly-coupled, monolithic code. Given the viewpoint shifts we already had, I suspect that you do not see it this way.) I will say this: my adherence to the Law of Demeter has produced MUCH more decoupled, reusable, and testable code than just about any other single design principle.



- The Law of Demeter was inspired by Object-based programming, which was 20 years old by 1987. Programmers the world over had been shooting themselves in the foot in C for nearly a decade by then, only then it was reaching through the humble struct that was killing them.



- I find it fascinating that you detest frameworks but love modifying and customizing systems. In my mind that's like saying you love making bricks but people who build walls should be shot. :-) (Another viewpoint discussion, I am sure; obviously I can see there's a difference between trying to make every possible kind of brick so that masons don't have to, and making just enough brick to throw through a window.)



Anyway, thanks for writing back. Cheers!

From: billgoates Date: January 15th, 2008 02:17 am (UTC)

From what I have read so far, I believe Demeter wouldn't agree with your Address object. But that's not important; the reason I used the Address object was to show an obvious example to explain a less obvious one.



But I will stick to the list now. The max number of addresses was something I thought of while writing the previous post, but didn't include. It's simple enough: extend your own list with a MaxEntries property. The parent class has to initialize it, but the validation is in the list itself.

Not convincing enough? Let's make it a little more complicated then. You want to prevent double entries. To do so you will have to compare all fields in your address against all the fields of the other addresses. That's another 10 lines of code you will have to include in every AddToList proxy according to Demeter.

I could go on extending the list with things like multiple indexes, filters, searches, smart loading, etc. Every addition will add to the current and following projects at no extra price, whereas Demeter will be stuck copy/pasting the same basic rules.

In his defense, he didn't know about events, interfaces or callbacks, so you cannot blame him for being obsolete. Use any of those methods and you can still have the business-object-specific code in the parent object.



Your viewpoint about bricks is almost correct, but I don't want to shoot the wall builders; I want to shoot the other brick makers. Think of most new languages as comparable to a box of Meccano. It's supposed to build anything, but it's hard to use and everything created with it will look like a bunch of metal plates screwed to each other. Then you get 3rd-party additions: more boxes of Meccano. Not much changes; at best you need a few fewer screws. What we should have is Lego. Anyone can use it, and if you are crafty enough you can build entire cities that people from all over the world will want to see.

From: (Anonymous) Date: January 14th, 2008 03:24 pm (UTC)

Finally I've got some time to write a post of my own about the topic. Thank you very much for the insight!

It's available at http://internna.blogspot.com/2008/01/evilness-of-accessors.html