A string is sometimes not a string. Model it accordingly.

Recently, I was reviewing some code that looked like this:

public IHttpActionResult Get( string userName) { if ( string .IsNullOrWhiteSpace(userName)) return this .BadRequest( "Invalid user name." ); var user = this .repository.FindUser(userName.ToUpper()); return this .Ok(user); }

There was a few things with this that struck me as a bit odd; most notably the use of IsNullOrWhiteSpace. When I review code, IsNullOrWhiteSpace is one of the many things I look for, because most people use it incorrectly.

This made me ask the author of the code why he had chosen to use IsNullOrWhiteSpace, or, more specifically, what was wrong with a string with white space in it?

The answer wasn't what I expected, though. The answer was that it was a business rule that the user name can't be all white space.

You can't really argue about business logic.

In this case, the rule even seems quite reasonable, but I was just so ready to have a discussion about invariants (pre- and postconditions) that I didn't see that one coming. It got me thinking, though.

Where should business rules go? #

It seems like a reasonable business rule that a user name can't consist entirely of white space, but is it reasonable to put that business rule in a Controller? Does that mean that everywhere you have a user name string, you must remember to add the business rule to that code, in order to validate it? That sounds like the kind of duplication that actually hurts.

Shouldn't business rules go in a proper Domain Model?

This is where many programmers would start to write extension methods for strings, but that's just putting lipstick on a pig. What if you forget to call the appropriate extension method? What if a new developer on the team doesn't know about the appropriate extension method to use?

The root problem here is Primitive Obsession. Just because you can represent a value as a string, it doesn't mean that you always should.

The string data type literally can represent any text. A user name (in the example domain above) can not be any text - we've already established that.

Make it a type #

Instead of string, you can (and should) make user name a type. That type can encapsulate all the business rules in a single place, without violating DRY. This is what is meant by a Domain Model. In Domain-Driven Design terminology, a primitive like a string or a number can be turned into a Value Object. Jimmy Bogard already covered that ground years ago, but here's how I would define a UserName class:

public class UserName { private readonly string value; public UserName( string value) { if (value == null ) throw new ArgumentNullException ( "value" ); if (! UserName .IsValid(value)) throw new ArgumentException ( "Invalid value." , "value" ); this .value = value; } public static bool IsValid( string candidate) { if ( string .IsNullOrEmpty(candidate)) return false ; return candidate.Trim().ToUpper() == candidate; } public static bool TryParse( string candidate, out UserName userName) { userName = null ; if ( string .IsNullOrWhiteSpace(candidate)) return false ; userName = new UserName (candidate.Trim().ToUpper()); return true ; } public static implicit operator string ( UserName userName) { return userName.value; } public override string ToString() { return this .value.ToString(); } public override bool Equals( object obj) { var other = obj as UserName ; if (other == null ) return base .Equals(obj); return object .Equals( this .value, other.value); } public override int GetHashCode() { return this .value.GetHashCode(); } }

As you can tell, this class protects its invariants. In case you were wondering about the use of ToUpper, it turns out that there's also another business rule that states that user names are case-insensitive, and one of the ways you can implement that is by converting the value to upper case letters. All business rules pertaining to user names are now nicely encapsulated in this single class, so you don't need to remember where to apply them to strings.

If you want to know the underlying string, you can either invoke ToString, or take advantage of the implicit conversion from UserName to string. You can also compare two UserName instances, because the class overrides Equals. If you have a string, and want to convert it to a UserName, you can use TryParse.

The original code example above can be refactored to use the UserName class instead:

public IHttpActionResult Get( string candidate) { UserName userName; if (! UserName .TryParse(candidate, out userName)) return this .BadRequest( "Invalid user name." ); var user = this .repository.FindUser(userName); return this .Ok(user); }

This code has the same complexity as the original example, but now it's much clearer what's going on. You don't have to wonder about what looks like arbitrary rules; they're all nicely encapsulated in the UserName class.

Furthermore, as soon as you've left the not Object-Oriented boundary of your system, you can express the rest of your code in terms of the Domain Model; in this case, the UserName class. Here's the IUserRepository interface's Find method:

User FindUser( UserName userName);

As you can tell, it's expressed in terms of the Domain Model, so you can't accidentally pass it a string. From that point, since you're receiving a UserName instance, you know that it conforms to all business rules encapsulated in the UserName class.

Not only for OOP #

While I've used the term encapsulation once or twice here, this way of thinking is in no way limited to Object-Oriented Programming. Scott Wlaschin describes how to wrap primitives in meaningful types in F#. The motivation and the advantages you gain are the same in Functional F# as what I've described here.

Isn't this over-engineering? A 56 lines of code class instead of a string? Really? The answer to such a question is always context-dependent, but I rarely find that it is (over-engineering). You can create a Value Object like the above UserName class in less than half an hour - even if you use Test-Driven Development. When I created this example, I wrote 27 test cases distributed over five test methods in order to make sure that I hadn't done something stupid. It took me 15 minutes.

You may argue that 15 minutes is a lot, compared to the 0 minutes it would take you if you'd 'just' use a string. On the surface, that seems like a valid counter-argument, but perhaps you're forgetting that with a primitive string, you still need to write validation and business logic 'around' the string, and you have to remember to apply that logic consistently across your entire code base. My guess is that you'll spend more than 15 minutes on doing this, and troubleshooting defects that occur when someone forgets to apply one of those rules to a string in some other part of the code base.

Primitive values such as strings, integers, decimal numbers, etc. often represent concepts that are constrained in some ways; they're not just any string, any integer, or any decimal number. Ask yourself if extreme values (like the entire APPP manuscript, Int32.MinValue, and so on) are suitable for such variables. If that's not the case, consider introducing a Value Object instead.

If you want to learn more about Encapsulation, you can watch my Encapsulation and SOLID Pluralsight course.