Domain Integrity: What's the best way to check it? (guest post)

6 January 2014 by Vijay Sarin

Enforcing data integrity ensures the quality of the data in a system and contributes to bug-free applications. Two important steps in checking data integrity are identifying the valid values for a type and deciding how to enforce them.

Data integrity is well known to database designers, and it falls into these categories:

Entity integrity

Domain integrity

Referential integrity

User-defined integrity

In this post we will focus on domain integrity and discover how we can enforce it.

Let's take the following class as an example:

    public class Dog {
        int age;
        String color;

        public void setAge(int age) { this.age = age; }
        public int getAge() { return age; }

        public void setColor(String color) { this.color = color; }
        public String getColor() { return color; }
    }

Even though this class is basic, its design and implementation are not perfect. Here are two design weaknesses:

The age attribute is declared as an integer, which is too permissive: a valid age lies between 0 and some maximum, let's say 100.

The color is declared as a string, so a class user could write: dog.setColor("aaa");

"aaa" is not a color, and no check exists in the class implementation to prevent this assignment.

Let's discover some ways to check data integrity.

Many developers check the data in the GUI and think that the problem is solved. However, the data could be assigned from:

The GUI

Calculated values

XML files

A database

CSV files

Other sources...

Checking the data in the front end is therefore not sufficient. There are, however, cases where it is recommended to check there even if the check also exists elsewhere. For example, in web applications, front-end validation provides instantaneous feedback and also reduces server traffic.

Checking that the age attribute lies between a minimum and a maximum concerns the business logic of the class Dog, so it's better to enforce domain integrity inside the class. With this solution, even if the class is used in another context, the class user does not have to think about adding the check, which makes existing classes easier to reuse.

For the age attribute you can add the check in the setter, whose implementation becomes:

    public void setAge(int age) {
        if (age < 0 || age > 100)
            throw new IllegalArgumentException("not valid age");
        this.age = age;
    }

As for the color, where only a few values can be assigned, it's better to use an enum; in that case there is no need to add a check to the setter.

This solution is very simple to implement. However, the condition is checked at runtime, which adds overhead and could affect performance if the setter is invoked many times.

Another interesting alternative is design by contract. Here is the definition from its wiki page:

"Design by contract (DbC), also known as contract programming, programming by contract and design-by-contract programming, is an approach for designing software. It prescribes that software designers should define formal, precise and verifiable interface specifications for software components, which extend the ordinary definition of abstract data types with preconditions, postconditions and invariants. These specifications are referred to as "contracts", in accordance with a conceptual metaphor with the conditions and obligations of business contracts."

After reading this definition, we tend to conclude that DbC is the magic solution for implementing the domain integrity check. However, it is not as popular as we might expect, maybe because it is not easy to implement.

TDD has become popular, and many organisations have adopted it as a standard practice. TDD creates an automated test suite that helps detect and prevent issues, and with it the domain integrity is checked in the unit tests instead of in the class itself.

Unit testing is very popular when developing in Java or C#. That is less the case for C++, where the assert technique is mostly used to check domain integrity. The advantage of this solution is that the assert is executed only in debug mode, not in release mode.

Another alternative is to document the contract of the class and specify the constraints on each data type. It's not really a solution to the domain integrity check, since nothing prevents wrong values from being assigned, but it's good practice to inform class users of the data type constraints.

I think this is the case in many projects: no solution is adopted to enforce domain integrity, and if a bug related to a constraint violation appears, it is fixed and a new version is released.
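To make the in-class approach discussed above more concrete, here is a minimal sketch of the Dog class with both checks in place: the range check in the age setter and an enum for the color. The names Color, MIN_AGE and MAX_AGE, the list of colors, and the small main method are assumptions added for illustration; only the 0 to 100 range comes from the text.

```java
// Illustrative sketch only: Color, MIN_AGE, MAX_AGE and the color values
// are assumptions made for this example, not part of the original class.
class Dog {

    // The domain of the color attribute: nothing else can be assigned.
    enum Color { BLACK, WHITE, BROWN, GOLDEN }

    // Assumed domain bounds for the age ("let's say 100" in the text).
    static final int MIN_AGE = 0;
    static final int MAX_AGE = 100;

    private int age;
    private Color color;

    public void setAge(int age) {
        // Reject out-of-domain values before the field is modified.
        if (age < MIN_AGE || age > MAX_AGE) {
            throw new IllegalArgumentException(
                "age must be between " + MIN_AGE + " and " + MAX_AGE + ", got " + age);
        }
        this.age = age;
    }

    public int getAge() { return age; }

    // No explicit check here: the compiler only accepts a Color (or null).
    public void setColor(Color color) { this.color = color; }

    public Color getColor() { return color; }

    // Small demonstration of a valid and an invalid assignment.
    public static void main(String[] args) {
        Dog dog = new Dog();
        dog.setAge(7);
        dog.setColor(Color.BROWN);
        System.out.println(dog.getAge() + " " + dog.getColor());

        try {
            dog.setAge(150); // outside the assumed domain
        } catch (IllegalArgumentException e) {
            System.out.println("rejected: " + e.getMessage());
        }
    }
}
```

With this version, dog.setColor("aaa") no longer compiles, so the color check costs nothing at runtime, while the age check still runs on every call to the setter. Note that null can still be passed to setColor; whether that should be rejected is a separate design decision.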