For better or worse, SOA (service-oriented architecture) continues to be the current industry fad. As SOA continues along the “hype curve” (a term I’m borrowing from Gartner), more and more people are starting to realize that SOA isn’t a silver bullet, and that it doesn’t actually replace n-tier client/server or object-orientation.

What will most likely happen over the next couple of years is that SOA will fall into the “pit of disillusionment” (part of the hype curve that I think of as the “pit of despair”), and many people will decide, as a result, that it is totally useless. This will happen, in no small part, because some organizations are investing way too much money in SOA now, while it is overly hyped – and they’ll feel betrayed when “reality” sets in.

After a period of disrepute, SOA may then rise to a “plateau of productivity”, where it will finally be used to solve the problems it is actually good at solving.

Some technologies don’t live through the “despair” part of the process. Sometimes the harsh light of reality is too bright, and the technology can’t hold up. Other times, a competing technology or concept hits the top of its hype curve, derailing a previous technology. Over the next few years, we’ll see whether SOA holds up to the despair.

This is a pattern Gartner has observed for virtually all technologies over many, many years. If you think about any technology introduced over the past 20 years or more, almost all of them have followed this pattern: over-hyping, over-reacting-to-reality and finally used-as-a-real-solution.

My colleague and mentor, David Chappell, recently blogged about some of the realities people are discovering as they actually move beyond the hype and try to apply SOA. It turns out, not surprisingly, that achieving real benefits in terms of reuse is much harder than the SOA evangelists would have anyone believe.

I think this is because SOA focuses on only one part of the problem: syntactic coupling. SOA, or at least service-oriented design and programming, is very much centered around rules for addressing and binding to services, and around clear definition of syntactic contracts for the API and message data sent to and from services.

And that’s all good! Minimizing coupling at the syntactic level is absolutely critical, and SOA has moved us forward in this space, picking up where EAI (enterprise application integration) left off in the 90s.

Unfortunately, syntactic coupling is the easy part. Semantic coupling is the harder part of the problem, and SOA does little or nothing to address this challenging issue.

Semantic coupling refers to the behavioral dependencies between components or services. There’s actual meaning to the interaction between a consumer and a service.

Every service implements some tangible behavior. A consumer calls the service, thus becoming coupled to that service, at both a syntactic and semantic level. At the syntactic level, the consumer must use the address, binding and contract defined by the service – all of which are forms of coupling. But the consumer also expects some specific behavior from the service – which is a form of semantic coupling.

And this is where things get very complex. The broader the expected behavior, the tighter the coupling.

As an example, a service that does something trivial, like adding two numbers, is relatively easy to replace with an equivalent. Such a service can even be enhanced to support other numeric data types with virtually no chance of breaking existing consumers. So the semantic coupling between a consumer and such a service is relatively light.

Another example is credit card verification. Obviously the internal implementation of this behavior is much more complex, but the external expectations of behavior remain very limited. Like adding two numbers, verifying a credit card is a behavior that accepts very little data, and returns a very simple result (yes/no).

Contrast this with many other possible business services, such as shipping an order, or generating manufacturing documentation. In these (quite common) scenarios, the service performs, or is expected to perform, a relatively broad set of behaviors. The result is a whole group of effects and side-effects – all of which should be considered as black-box effects by any caller. But the more a service does, the less “black-box” it can be to its callers, and the tighter the coupling.

And this leaves us in a serious quandary. There’s a high cost to calling a service. There’s a lot of overhead to creating a message, serializing it into text (XML), routing it through some communications stack onto the wire, getting the electrons across the wire through some protocol (probably TCP) and all the attendant hardware involved, picking it up off the wire on the server, routing it through another communications stack, deserializing the text (XML) back into a meaningful message and finally interpreting the message. Only then can the service actually act on the message to do real work.

Worse, that’s only half the story, because most people are creating synchronous request/response services, and so that whole overhead cost must be paid again to get the result back to the caller!
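To make that round trip concrete, here is a minimal sketch of what surrounds a trivial "add two numbers" call. It assumes nothing about any real SOA stack: the element names (AddRequest, AddResponse) and the function names are hypothetical, and the network hop is replaced by a plain function call, so the real cost would be higher than what this shows.

```python
import xml.etree.ElementTree as ET


def build_request(a: int, b: int) -> bytes:
    """Caller side: create the message and serialize it into XML text."""
    root = ET.Element("AddRequest")  # hypothetical message name
    ET.SubElement(root, "a").text = str(a)
    ET.SubElement(root, "b").text = str(b)
    return ET.tostring(root, encoding="utf-8")


def handle_request(wire: bytes) -> bytes:
    """Service side: deserialize, interpret, do the work, serialize the reply."""
    request = ET.fromstring(wire)
    # This single line is the only "real work" in the whole exchange.
    total = int(request.findtext("a")) + int(request.findtext("b"))
    response = ET.Element("AddResponse")  # hypothetical message name
    ET.SubElement(response, "sum").text = str(total)
    return ET.tostring(response, encoding="utf-8")


def call_add_service(a: int, b: int) -> int:
    """Synchronous request/response: every step is paid in both directions."""
    request_bytes = build_request(a, b)
    # A direct call stands in for both communications stacks and the wire.
    response_bytes = handle_request(request_bytes)
    return int(ET.fromstring(response_bytes).findtext("sum"))


print(call_add_service(2, 3))
```

Even with the network removed, one line of addition is wrapped in two serializations and two parses.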

Before going further, let me expand on this “overhead cost” concept to be more precise.

I worked for many years in manufacturing. In that industry there’s the concept of cost accounting – people make their living at tracking costs. They divide costs into overhead, setup and run (there are other models, but this one’s pretty standard).

To make this somewhat more clear, I’ll use the metaphor of baking cookies.

Overhead costs cover all the salaried people, the buildings, equipment and so forth: costs that are paid whether widgets are manufactured or not. When baking cookies, this is the cost of having a kitchen, a stove, electricity, natural gas, and of course the person doing the baking. In most homes these costs exist regardless of whether cookies are baked or not.

Setup costs are applied overhead. They are costs that are required to build a set of widgets, but they are only incurred when widgets are being manufactured. These costs include setting up machines, programming devices, getting organized, printing documents, etc. When baking cookies, this is the cost (in terms of time) of getting out the various ingredients, bowls, spoons and other implements. It is also the cost of cleaning up after the baking is done – all the washing, drying and putting-away-of-implements that follows. These costs are directly applied to the process, but are pretty much the same whether you bake one dozen or ten dozen cookies.

Run costs are those costs that are incurred on a per-widget basis to make a widget. This includes the hourly rate of the workers manning the assembly line, the materials that go into the widget and so forth. When baking cookies, this is the time spent by the baker, the cost of the flour, eggs and other ingredients consumed in the process. Ideally it would include the amount of electricity or natural gas used to run the stove as well. Obviously detailed run costs can be hard to determine in some cases!

When calculating the cost of your cookies, these three costs are added together. The run rate is easy, as it is per-cookie by definition. The setup rate is variable – the more cookies you make in a batch the lower the relative setup cost, and the fewer cookies the higher the relative setup cost. Overhead is typically aggregated – the annual overhead cost is known, and is divided by the number of cookies (and other things) made over a year’s time. Obviously there’s lots of wiggle room in this last number.
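The arithmetic can be sketched in a few lines. All of the figures below are made up for illustration, and the function name is my own. The only point is how the setup share shrinks as the batch grows while the run cost and the overhead share stay put.

```python
def unit_cost(run_per_unit: float,
              setup_per_batch: float,
              batch_size: int,
              annual_overhead: float,
              annual_units: int) -> float:
    """Per-cookie cost = run cost + share of setup + share of overhead."""
    setup_share = setup_per_batch / batch_size       # spread over one batch
    overhead_share = annual_overhead / annual_units  # spread over a year
    return run_per_unit + setup_share + overhead_share


# Hypothetical figures: 0.25 of ingredients per cookie, 6.00 of setup and
# cleanup per batch, 500.00 of annual overhead over 5000 cookies a year.
one_dozen = unit_cost(0.25, 6.00, 12, 500.00, 5000)
ten_dozen = unit_cost(0.25, 6.00, 120, 500.00, 5000)

print(f"one dozen: {one_dozen:.2f} per cookie")   # 0.25 + 0.50 + 0.10
print(f"ten dozen: {ten_dozen:.2f} per cookie")   # 0.25 + 0.05 + 0.10
```

With these assumed numbers the per-cookie cost drops from 0.85 to 0.40, and the whole difference comes from the setup share.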

For my purposes, in discussing services, the overhead rate isn’t all that meaningful. In our industry this is the cost of the IT staff, the servers, the server room, electricity and cooling and so forth.

But the setup rate and run rate become very meaningful when talking about services.

Calling a service, as I noted earlier, incurs a lot of overhead. In cost-accounting terms this is really the setup cost of the call, and it is relatively constant: you pay about the same whether you send 1 byte or 1024 bytes to or from the service.

The run rate is the actual work done by the service. Once the message is parsed and available to the service, then the service does real, valuable work. This is the run rate for the service.

In manufacturing it is always important to manage the overhead and setup costs – they are a “pure cost”. The run rate cost must also be managed, but it is directly applicable to a product, and so that cost can be factored into the price you charge. Perhaps more importantly, your competitors typically have a comparable run rate (materials and labor cost about the same), but the overhead can vary radically.

To switch industries just a bit, this is why Walmart does so well (and is so feared). They have managed their overhead and setup costs to such a degree that they actually do focus on reducing their run rate (in their case, the per-unit acquisition cost of items).

Coming back to services, we face the same issue. Typically we deal with this using intuition rather than thinking it through, but the core problem is very tangible.

Would you call a service to add two numbers? Of course not! The setup/overhead cost would outweigh the run cost to such a degree that this makes no sense at all.

Would you call a service to ship an order, with all the surrounding activities that implies? This makes much more sense. The setup/overhead cost becomes trivial when compared to the run cost for such a service.
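The same comparison can be put in rough numbers. This is only a sketch: the millisecond figures are invented for illustration rather than measured, and the function name is hypothetical. It assumes a synchronous request/response call, so the per-message setup is paid once in each direction.

```python
def setup_fraction(one_way_setup_ms: float,
                   run_ms: float,
                   synchronous: bool = True) -> float:
    """Fraction of a call's total cost that is setup rather than real work."""
    legs = 2 if synchronous else 1  # request and response each pay the setup
    setup_ms = legs * one_way_setup_ms
    return setup_ms / (setup_ms + run_ms)


# Assumed figures: 5 ms of messaging setup per direction for either service.
add_two_numbers = setup_fraction(one_way_setup_ms=5.0, run_ms=0.001)
ship_an_order = setup_fraction(one_way_setup_ms=5.0, run_ms=2000.0)

print(f"add two numbers: {add_two_numbers:.2%} of the call is setup")
print(f"ship an order:   {ship_an_order:.2%} of the call is setup")
```

Under these assumptions roughly 99.99% of the addition call is setup, against about 0.5% for shipping an order.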

And yet coupling has the opposite effect. Which of those services can be more loosely coupled? The addition service of course, because it performs a very narrow, discrete, composable behavior.

Do you even know what the ship-an-order service might do? Of course not, it is too big and vague. Will it trigger invoicing? Will it contact the customer? Will it print pick lists for inventory? Will it update the customer’s sales history?

I would hope it does all these things, but very few of us would be willing to blindly assume it does them. And so we are forced to treat ship-an-order as something other than a black box. At best it is gray, but probably downright clear. We’ll require that the service’s actual behaviors be documented. And then we’ll fill in the gaps for what it does not provide, or doesn’t provide in a way we like.

(Or, failing to get adequate documentation, we’ll experiment with the service, probing to find its effects and side-effects and limitations. And then we’ll fill in the gaps for the bits we don’t like. Sadly, this is the more common scenario…)

At this point we (the caller of the service) have become so coupled to the service that any change to the service will almost certainly require a change to our code. And at this point we’ve lost the primary goal/benefit of SOA.
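Here is a small sketch of how that happens. Everything in it is hypothetical: two versions of a ship-an-order service share exactly the same signature, and a list of strings stands in for the observable effects the service has on the rest of the business. The syntactic contract never changes, yet the caller breaks.

```python
from typing import Callable, List

# The syntactic contract, identical for both versions of the service.
ShipOrder = Callable[[str, List[str]], bool]


def ship_order_v1(order_id: str, effects: List[str]) -> bool:
    effects.append(f"shipped:{order_id}")
    effects.append(f"invoiced:{order_id}")  # side effect callers came to rely on
    return True


def ship_order_v2(order_id: str, effects: List[str]) -> bool:
    effects.append(f"shipped:{order_id}")
    # Invoicing was moved elsewhere; signature and return value are unchanged.
    return True


def consumer(ship: ShipOrder, order_id: str) -> bool:
    """A caller that learned, by documentation or probing, that shipping invoices."""
    effects: List[str] = []
    ok = ship(order_id, effects)
    # The semantic coupling: an expectation that appears in no contract.
    return ok and f"invoiced:{order_id}" in effects


print(consumer(ship_order_v1, "A100"))  # expectation met
print(consumer(ship_order_v2, "A100"))  # same contract, broken expectation
```

The first call prints True and the second prints False, even though no address, binding or message contract changed between versions.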

Why? How can this be, when we’re using all the blessed standards for SOA communication? Maybe we’re even using an Enterprise Service Bus, or BizTalk Server or whatever the latest cool technology might be. And yet this coupling occurs!

This is because I am describing semantic coupling. Yes, all the cool, whiz-bang SOA technologies help solve the syntactic coupling issues. But without a solution to the semantic, or behavioral, coupling it really doesn’t get us very far…

What’s even scarier is that the vision of the future portrayed by the SOA evangelists is one where we build services (systems) that aggregate other services together to provide higher-level functionality. Like assembling simple blocks into more complex creations, which in turn can be assembled into more complex creations or used as-is.

Except that each level of aggregation creates a service that provides broader behaviors – and by extension tighter coupling to any callers (though the setup vs run costs become more and more favorable at the top level).

To bring this (rather long) post to a close, I want to return to the beginning. SOA is heading down the steep slope into the pit of disillusionment. You can head this off for yourself and your organization by realizing ahead of time, right now, that SOA only addresses syntactic issues. You must address the much harder semantic issues yourself.

And the tools exist. They have for a long time. Good procedural design, use of flow charts, data flow diagrams, control diagrams, state diagrams: these are all very valid tools that can help you manage the semantic coupling. Unfortunately the majority of people with expertise in these tools are nearing retirement (or have retired) – but the tools and techniques are there if you can find some old, dusty books on procedural design. Just remember to include the setup/overhead cost vs run cost in your decisions on whether to make each procedure into a "service".