Rethinking Data-Driven Applications

Changing how and what we develop

Photo by Urfan Hasanov on Unsplash

Today, the development of Data-Driven Applications involves a lot of tedious and error-prone work. We suggest an architecture that minimizes the boring work of building CRUD backend and frontend functionality, so that we can focus instead on providing real value to our customers and users. We propose an approach that kickstarts Data-Driven Application development by providing the “first 80%” of the work and, additionally, keeps the generated parts of the application up to date. The rest of the development work should not become any harder when using our approach.

What are Data-Driven Applications?

In our context, the core of a Data-Driven Application is data: its visualization and manipulation. We do not consider big data applications or purely analytical applications here; let us talk instead about “small” and “medium” data. Examples of such applications include:

Excel: When you are using Excel as a database (although in many cases you probably should not), this is a Data-Driven Application for us, as the main use cases are to show, modify, and visualize the data.

Admin Interfaces: Admin interfaces often give the user access to a lot of data in tabular format. Users can view and modify the data they are allowed to. Users can also trigger actions that affect the users of the app itself, such as sending emails or push notifications.

CRUD Applications: Many applications that are built with frameworks such as Ruby on Rails and friends are often “just” CRUD + something. This something can be, for example, processes needed to manipulate the data or access rights to parts of the data.

We think that most business applications, especially applications for master data management, are Data-Driven Applications. Often these applications are wrappers around a database that handle the translation of API calls (REST, OData, GraphQL, SOAP, …) to SQL or another query language. The backend consists mainly of CRUD code with some access limitations and processes attached to it.

How do we approach Data-Driven Applications today?

Often we start building such applications using MVC frameworks like Ruby on Rails, Java Spring, and so on. Most of them provide generators that automatically generate the model, view, and controller for a new entity that should live in the database. When building Single-Page Applications, the framework can generate API code (e.g. REST) instead of the view code.

When the structure of the data changes, we write a migration that applies these changes to the database. Then we make the necessary changes to our backend application, modifying the model, the controller, and the view if needed.

In this approach, the backend is the single source of truth for what the data looks like; the database and the frontend need to be adapted whenever the data structure changes.

Issues with our current approach

Given this approach, we can build new CRUD applications very fast. Every time I develop a new Rails application, I am astonished by how quickly the application becomes usable and already provides some value to the users.

The generator approach of Rails is very powerful, although the more I use Rails, the less I use the generators, as they provide less and less value as the application grows. When I want to add a field to a database table, I often need to touch many different files:

Create a new migration to adapt the database to the new requirements

Change the controller to add the field to the allowed parameters

Change the view to add the field to it

Some of these difficulties are already solved by using admin interfaces like rails_admin, but this only helps the administrators, not the normal users.

In the end, every change to the data model leads to changes all across the application, and there are no generators or other tools that help us get this job done faster.

What should we build instead?

We, as an industry, are still focused on solving today's issues and addressing today's requirements from the business, but we ignore the fact that in almost all cases the requirements will change after the product is done. To paraphrase the book “The Agile Architecture Revolution”: We have tried all the tricks to solve these problems. “Waterfall” architecture approaches take years from the requirement phase to a product, and when the product is ready, nobody needs it anymore. Then we tried iterative approaches to architecture with fast understand-design-develop-deploy cycles, but often this leads to poorly designed systems as we get it wrong over and over again. Development is not just a sequence of sprints; we need time to think about what we are actually building.

The conclusion of the authors is that, in the end, the issue does not lie in how we build systems, but in what we are building. Often we design systems to meet current business requirements but ignore the fact that the flexibility of a software system should be a non-negotiable meta-requirement. If we build systems that are inflexible to new business requirements, i.e. it takes hours or even days just to add a simple field to an application, maybe the development effort is not worthwhile and the business is just burning money.

We want to help people! We want to solve their problems with tech! Let us not introduce more new problems than absolutely necessary.

The objective of IT should be to enable the agility of the business (everything that is not IT). In today's world, it is critical to adapt to changes very quickly, and often IT plays its part in slow innovation cycles, although the unique selling point of IT was always “with IT, we are faster than the rest of the market”. We are missing this goal!

Time to adapt to new business requirements and changes in the environment is critical. The systems we are building should be flexible and adapt to new business needs with ease.

We often hear the following argument:

“Sorry, this is cloud software, we cannot adapt it to you, you should adapt yourself to the software”

In many cases, this is absolutely the wrong approach! Systems should adapt to business needs and not vice versa. IT should enable business users to get their job done, not hinder them.

Technical Requirements

Before we discuss our suggested approach, let us take a deeper dive into the technical requirements of such a flexible Data-Driven Application. Data, its structure, and its integration with the outside world are the core of such applications. Thus, from a technical perspective, we should focus on a clean way to represent this, meaning that there should be only a single source of truth for the data structure and the dependencies between data. The rest of the application should, wherever possible, adapt automatically to changes. Updating the single source of truth should result in an update of all relevant parts of the application, e.g. the database, the backend, and the frontend.
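To make this requirement concrete, here is a minimal TypeScript sketch. It is only an illustration under stated assumptions: the `EntitySpec` description, the three derivation functions, and the `customer` example are all hypothetical names, and the single source of truth is a plain object purely for demonstration. The point is that a field is added in exactly one place and the table definition, the permitted API parameters, and the table headers all follow.

```typescript
// Hypothetical description of one entity; every name here is illustrative.
type ColumnType = "text" | "integer" | "boolean";

interface ColumnSpec {
  name: string;
  type: ColumnType;
  required: boolean;
}

interface EntitySpec {
  table: string;
  columns: ColumnSpec[];
}

// Derive the database table definition from the description.
function toCreateTable(entity: EntitySpec): string {
  const cols = entity.columns.map(
    (c) => `${c.name} ${c.type}${c.required ? " NOT NULL" : ""}`
  );
  const all = ["id serial PRIMARY KEY", ...cols].join(", ");
  return `CREATE TABLE ${entity.table} (${all});`;
}

// Derive the parameters the API accepts (the "allowed parameters" of a controller).
function toPermittedParams(entity: EntitySpec): string[] {
  return entity.columns.map((c) => c.name);
}

// Derive the column headers a table view would show.
function toTableHeaders(entity: EntitySpec): string[] {
  return ["id", ...entity.columns.map((c) => c.name)];
}

const customer: EntitySpec = {
  table: "customer",
  columns: [
    { name: "name", type: "text", required: true },
    { name: "newsletter", type: "boolean", required: false },
  ],
};

// Adding a field is a single change to the description ...
const customerWithEmail: EntitySpec = {
  ...customer,
  columns: [...customer.columns, { name: "email", type: "text", required: true }],
};

// ... and every derived artifact picks it up without further edits.
console.log(toCreateTable(customerWithEmail));
console.log(toPermittedParams(customerWithEmail));
console.log(toTableHeaders(customerWithEmail));
```

Compare this with the Rails workflow described earlier, where the migration, the controller, and the view each have to be edited by hand for the same change.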

Our Suggested Approach

Kickstart app development by providing the “first 80%” automatically!

We suggest a classical three-tier web application approach, but our approach is applicable to many more kinds of application design. The overall architecture is shown in the following diagram:

The suggested overall architecture for flexible Data-Driven Applications

Instead of starting the development process at the backend we suggest a more functional approach to application design. We start thinking about data and their relations and implement them in a relational database. This database is the single source of truth for the data structures, the relations and also for the rules for authorization and data access permissions. Modern databases like PostgreSQL provide the right tools to get the job done.

The database is also responsible for the consistency of the data and for data validation. It is worth noting that in many cases the database will not and should not handle all business requirements. It should only be used for checks and guarantees that are easy to implement in a database. Often even this drastically reduces the development work in the backend.

The backend is a web API that is automatically generated from the database structure. This web API should be easy to extend with new functionality that the database cannot cope with. When the data structures in the database are updated, the API should update itself without any manual code changes. This way, we do not need to put any development effort into the first iterations of our application.
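The idea of an API that follows the database structure can be sketched as follows. This is a simplified illustration, not how any particular tool works: `TableInfo` is a hypothetical stand-in for the metadata a database catalog reports, and the REST-style paths and the `deriveEndpoints` function are assumptions made for this example. The endpoints are computed from the metadata alone, so a new column appears in the API as soon as the metadata is read again.

```typescript
// Hypothetical, simplified stand-in for what a database catalog reports.
interface ColumnInfo {
  name: string;
  isPrimaryKey: boolean;
}

interface TableInfo {
  name: string;
  columns: ColumnInfo[];
}

interface Endpoint {
  method: "GET" | "POST" | "PATCH" | "DELETE";
  path: string;
  bodyFields: string[];
}

// Build the CRUD endpoints for one table purely from its metadata.
function deriveEndpoints(table: TableInfo): Endpoint[] {
  const writable = table.columns
    .filter((c) => !c.isPrimaryKey)
    .map((c) => c.name);
  const collection = `/${table.name}`;
  const item = `/${table.name}/:id`;
  return [
    { method: "GET", path: collection, bodyFields: [] },
    { method: "GET", path: item, bodyFields: [] },
    { method: "POST", path: collection, bodyFields: writable },
    { method: "PATCH", path: item, bodyFields: writable },
    { method: "DELETE", path: item, bodyFields: [] },
  ];
}

const invoiceV1: TableInfo = {
  name: "invoice",
  columns: [
    { name: "id", isPrimaryKey: true },
    { name: "amount", isPrimaryKey: false },
  ],
};

// Simulate a schema change: a new column shows up in the catalog.
const invoiceV2: TableInfo = {
  ...invoiceV1,
  columns: [...invoiceV1.columns, { name: "due_date", isPrimaryKey: false }],
};

// The same function is run again; no handler code was touched.
const createV1 = deriveEndpoints(invoiceV1).find((e) => e.method === "POST");
const createV2 = deriveEndpoints(invoiceV2).find((e) => e.method === "POST");
console.log(createV1?.bodyFields);
console.log(createV2?.bodyFields);
```

A real backend would additionally need extension points for logic the database cannot express; in this sketch that would mean appending hand-written endpoints to the derived list.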

Given the API and the API documentation provided by the backend, the frontend is generated automatically and also stays up to date when the API changes. Often the frontend is the place where most customizations are necessary. Thus the frontend generator should be very flexible, and it should even be possible to build the frontend manually and only generate parts such as tables, forms, etc. Again, the focus should be on the generated parts keeping themselves up to date with the API.
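As a rough sketch of what “generated but customizable” could mean for a form, consider the following TypeScript example. It does not describe the tool the authors are building; `ApiField`, `FormControl`, `Overrides`, and `generateForm` are hypothetical names, and the type-to-widget mapping is an assumption. Defaults come from the API description, while manual overrides are kept separately and reapplied on every regeneration.

```typescript
// Hypothetical shape of one field as an API description might expose it.
interface ApiField {
  name: string;
  type: "string" | "number" | "boolean";
  required: boolean;
}

interface FormControl {
  field: string;
  label: string;
  widget: "text" | "number" | "checkbox" | "textarea";
  required: boolean;
}

// Manual customizations, keyed by field name; they survive regeneration.
type Overrides = Record<string, Partial<Omit<FormControl, "field">>>;

// Assumed default mapping from API types to form widgets.
const defaultWidget: Record<ApiField["type"], FormControl["widget"]> = {
  string: "text",
  number: "number",
  boolean: "checkbox",
};

function generateForm(fields: ApiField[], overrides: Overrides = {}): FormControl[] {
  return fields.map((f): FormControl => ({
    field: f.name,
    label: f.name.replace(/_/g, " "),
    widget: defaultWidget[f.type],
    required: f.required,
    // Anything specified by hand wins over the generated default.
    ...overrides[f.name],
  }));
}

const apiFields: ApiField[] = [
  { name: "title", type: "string", required: true },
  { name: "internal_notes", type: "string", required: false },
  { name: "published", type: "boolean", required: false },
];

const articleForm = generateForm(apiFields, {
  internal_notes: { widget: "textarea", label: "Notes" },
});
console.log(articleForm);
```

When the API gains a field, calling `generateForm` again adds a control for it with default settings, and the override for `internal_notes` stays in effect.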

Proposed Technology to Satisfy the Requirements

At the time of writing, we are not aware of any complete technical approach that satisfies all of these requirements. Our overall approach is to develop the following architecture for building such flexible Data-Driven Applications:

Architectural Diagram with proposed Technology

As the database, we propose PostgreSQL, as it is arguably the most powerful open-source database and provides many features needed for our Data-Driven Applications. It is easily extensible and provides data access rules at the column level and row level off the shelf. In the future, more databases should be supported.

Postgraphile is a tool that takes a connection to a PostgreSQL database as input and provides a GraphQL API. It is a very thin but easily extensible layer between the database and the frontend. Find an introduction to Postgraphile here. Postgraphile is a Node.js library and can be extended with JavaScript using hooks.

Finally, to the best of our knowledge, there is currently no easy way to automatically generate a frontend from a description of an API and keep it up to date with API changes. We are currently working on such a tool, which, on the one hand, can provide a frontend for an API without any configuration and, on the other hand, is fully extensible and can be integrated into existing applications.

Conclusion

Today we have discussed a new way of using old, battle-tested technology to reduce the iteration time when building Data-Driven Applications. This approach seems to make it feasible to fulfill the ever-changing requirements of our users with minimal development effort. We already use automation at many stages of application development, especially when it comes to DevOps, but very few developers are automating the actual development of new applications.

What do you think? Is the proposed approach a feasible way to reduce the pressure the business puts on us, the developers? Or is automating the development process even a threat to our jobs? We are happy to receive any feedback!