Monday, February 27, 2017

Applying the DRY principle

This post was previously published in Spanish as "Aplicación del principio DRY".


I have always been very clear that in software development it is very important to keep each business concept in one place only. It is also true that code duplication is a problem that we should try to avoid, or at least limit and remove systematically when needed.

But sometimes we follow the DRY principle blindly, without keeping in mind that every decision has a cost. In this post I will set out some points that can help us decide when we should remove duplication and when we can live with it.
  • We should differentiate between duplication of business concepts (rules, validations, flows, etc.) and duplication at the implementation level (for example, source code with small repeated fragments).
  • Duplication of business concepts and definitions should be avoided as much as possible.
  • Sometimes duplication at the code level is a hint that there is an abstraction waiting to be discovered, an emergent behavior of the system, or a pattern common to our application. It is important to minimize it, but it is not a drama if "some" amount of duplication remains. Be very careful, though, not to create a premature abstraction. In my experience, premature abstractions are much worse than duplication.
  • If you develop using TDD, it is better to duplicate code to reach green and, once you are in green and well covered by tests, refactor to eliminate the duplication.
  • Depending on the language (C++, Python, Java...), some kinds of duplication are expensive to eliminate or have no idiomatic solution. In these cases we should eliminate the duplication only when the result is easier to understand (not only for you, but for the whole team). When in doubt, readability must always prevail.
  • It is important to know that when we eliminate duplication we usually create a common class, a library, a method, or some other artifact that we can reference and use from several parts of our code. This creates a dependency between the client code and the reused code. Dependency is one of the strongest relationships between pieces of code and always carries a significant cost, so we should keep it in mind when evaluating whether or not to eliminate the duplication.
  • We should always depend on artifacts that are more stable than ourselves. That is, if we extract common code to a library but the API of this library changes all the time, this means that we have created a bad abstraction and the maintenance cost will increase a lot.
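
As an illustration of the first two points, here is a minimal sketch in Python. All the names (the free-shipping rule, the threshold value, the formatting functions) are hypothetical and made up for this example. The business rule lives in one place and both callers depend on it, while two formatting functions that merely happen to look alike are left duplicated.

```python
# Hypothetical business rule: it is defined once, and every caller depends on it.
FREE_SHIPPING_THRESHOLD = 50.0  # assumed value, for illustration only


def qualifies_for_free_shipping(order_total: float) -> bool:
    return order_total >= FREE_SHIPPING_THRESHOLD


def checkout_shipping_cost(order_total: float, base_cost: float = 4.99) -> float:
    # Uses the rule instead of repeating "order_total >= 50.0".
    return 0.0 if qualifies_for_free_shipping(order_total) else base_cost


def cart_banner(order_total: float) -> str:
    # A second caller of the same rule; if the rule changes, both change together.
    if qualifies_for_free_shipping(order_total):
        return "Free shipping applied"
    missing = FREE_SHIPPING_THRESHOLD - order_total
    return f"Add {missing:.2f} more for free shipping"


# Implementation-level duplication: these two functions look the same today,
# but they serve different documents and may evolve separately, so sharing
# them would add a dependency without protecting any business concept.
def format_invoice_line(name: str, amount: float) -> str:
    return f"{name:<20}{amount:>10.2f}"


def format_report_line(name: str, amount: float) -> str:
    return f"{name:<20}{amount:>10.2f}"
```

Whether the two formatting functions should eventually be merged depends on how they evolve; the sketch only shows that leaving them apart costs little, while duplicating the threshold check would not.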

In summary:

  • It is important not to duplicate business concepts (low-level duplication is less important).
  • We should always evaluate the danger of creating a premature abstraction.
  • We should weigh the cost of adding a new dependency against the maintainability gained by removing the duplication.
  • If readability is lost by eliminating duplication, we are doing it wrong.
  • It is better to follow a process of use, use, reuse, and only later create the abstraction (instead of creating the abstraction directly).
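
To make the danger of a premature abstraction more concrete, here is a small sketch in Python. The functions and their flags are hypothetical, invented only for this example. The first version removes all duplication by serving two callers through flags; the second keeps a little duplication and is arguably easier to read and to change.

```python
# Premature abstraction (hypothetical): one function serves two unrelated
# callers, so every new need becomes another flag and another branch.
def render(items, *, numbered=False, uppercase=False, separator="\n"):
    lines = []
    for i, item in enumerate(items, start=1):
        text = item.upper() if uppercase else item
        lines.append(f"{i}. {text}" if numbered else f"- {text}")
    return separator.join(lines)


# The same behavior with a small amount of duplication: each function says
# what it does, and neither depends on the other.
def render_menu(items):
    return "\n".join(f"{i}. {item}" for i, item in enumerate(items, start=1))


def render_tags(items):
    return ", ".join(f"- {item.upper()}" for item in items)
```

With only two uses, the shape of a good shared abstraction is still a guess. If a third, similar use appears, there is more evidence about what really is common, and the extraction can be done at that point.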

If you can't afford to wait before creating an abstraction, or you can't remove or change the abstraction once you detect that it is not the right one, you are creating a bigger problem, and more complexity, than you would by leaving a small amount of duplication.

Although the DRY principle may seem simple to understand, applying it, as always in software development, is neither simple nor systematic.


