eferro's random stuff: Eliminating Waste in Software Development

Translated from the original article in Spanish https://www.eferro.net/2024/04/eliminar-desperdicios-en-el-desarrollo.html

In our first post, we explored the origins and foundational principles of Lean Software Development. In the second, we introduced some basic concepts that I’ll use throughout this series. Now we’ll focus on the first of these principles: eliminating waste. To optimize our development processes and increase the value we deliver to our customers, it’s essential to identify and reduce the activities that don’t add value.

In this article, I’ll describe examples and practices I’ve applied in various agile teams over the years. Note that these practices and examples are specific to our context (product development with empowered teams) and that they often reinforce each other, so implementing them in isolation is not advisable. For example, adopting continuous deployment without an adequate automated test suite could do more harm than good.

Adapting Lean Manufacturing Principles to Software Development

In the original Lean Manufacturing, seven main types of waste were identified: Inventory, Extra Processing, Overproduction, Transportation, Waiting, Motion, and Defects. Mary and Tom Poppendieck, based on their extensive knowledge of Lean and software development, adapted these concepts to make them more relevant in this new context. For instance, they redefined "Inventory" as "Partially Done Work", "Extra Processing" as "Extra Process", "Overproduction" as "Extra Features", and "Transportation" as "Task Switching". They considered that the remaining types of waste retained their direct applicability to software development.

Identifying and Eliminating Waste

To eliminate waste, the first step is to train the team to identify what constitutes waste. In this regard, it’s important to:

Analyze from the Customer’s Perspective: We must always ask ourselves whether an activity adds value to the customer/user and if we can eliminate it without affecting their perception.
Foster a Culture of Constructive Criticism: Being critical of our actions and methods allows the team to periodically analyze its way of working to identify and eliminate waste.
Consider Long-Term Impact: It’s vital to distinguish between what might seem like waste in the short term but isn’t necessarily so in the medium or long term, always keeping customer/user satisfaction in mind.

It’s essential to classify the identified waste into two categories: waste that is necessary for specific reasons, such as regulations or laws, and waste that we can completely or partially eliminate without compromising customer/user satisfaction. For the first category, we should focus on understanding the reason behind each constraint so we can minimize the waste as much as possible. For the second, we should be more decisive and work systematically to eliminate it.

Partially Done Work

In Lean Manufacturing, the inventory of partially completed parts is physically visible and requires organization and, at times, maintenance. In contrast, in our context, "inventory" (code, knowledge, information, analysis, etc.) is not as visible but is just as costly.

The real value for the customer or user only arises when they access the new functionality or change. Often, even at that moment, the value remains uncertain until we receive feedback. Therefore, since true value is only realized at the final stage, it is crucial to shorten the time from idea conception to delivery. In other words, we must strive to reduce work in progress and decrease lead time. As Dan North puts it, the goal is to minimize the gap between the initial idea and the user's "thank you."
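The relationship between work in progress and lead time can be made concrete with Little's Law, which says that in a stable flow the average lead time equals the average WIP divided by the average throughput. The following is only a minimal sketch: the function name and the numbers are hypothetical and chosen for illustration, not measurements from a real team.

```python
def average_lead_time_days(avg_wip_items: float, throughput_items_per_day: float) -> float:
    """Little's Law for a stable flow: lead time = WIP / throughput."""
    if throughput_items_per_day <= 0:
        raise ValueError("throughput must be positive")
    return avg_wip_items / throughput_items_per_day


# Hypothetical numbers: same throughput, different amounts of work in progress.
for wip in (12, 6, 2):
    days = average_lead_time_days(wip, throughput_items_per_day=2)
    print(f"WIP={wip:>2} items -> average lead time of {days:.1f} days")
```

With throughput held constant, every item removed from the work in progress shortens the average wait for everything else, which is why limiting WIP is the first lever we reach for.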

These are the practices we use to eliminate partially done work:

Analyze/Prepare backlog work on demand: We keep a backlog that covers no more than a month of work, and if it grows beyond that, we remove initiatives. If something is important, it will resurface.
Radical vertical slicing: We slice at both the product and technical levels, which lets us deploy increments within a few hours or a day. This, of course, requires Continuous Delivery (CD).
Trunk-Based Development: We avoid the partially done work that accumulates in feature branches and the merge problems that come with it.
End-to-end management by our team: We handle deployments, validate quality, monitor the product, etc., avoiding wait times for other teams or specialists.
Immediate deployment of improvements: Both user improvements, which provide business feedback, and technical ones, which provide system feedback.

As shown in the team's Cumulative Flow Diagram, we maintain a minimal backlog and manage related tasks only when necessary and in the smallest possible quantity.

Extra Process: Simplification Toward Value

Aligning with the Agile Manifesto, Lean Software Development promotes measuring progress primarily by the software value delivered to the customer. In this framework, any element—such as excessive documentation, redundant processes, unnecessary meetings, or approvals—that does not directly contribute to value for the customer/user should be assessed for elimination.

Since adopting Agile principles, I have collaborated with teams to streamline our processes, discarding anything that does not generate value.

From that experience, I highlight two significant changes:

Transition from Scrum to Kanban: Scrum helped us initially, but we evolved toward Kanban to focus on continuous flow. This replaced long meetings and planning sessions with shorter, more focused ones, in line with Lean’s Just-In-Time model.
Elimination of Estimates: We prioritize small, continuous changes, allowing us to forgo traditional estimates. We still make high-level estimates for large initiatives but with a focus on minimizing risk and unnecessary time investment.

We have found that by working in small steps and preparing only what is immediately necessary, we minimize rework (failure demand) because we do not perform "speculative" work. This significantly simplifies backlog prioritization and management, allowing us to focus on essentials and save considerable effort.

In summary: Adopting a just-in-time approach to do only what is necessary has led us to a more efficient process, with less rework and more agile backlog and priority management.

While it is impossible to eliminate all the documentation or bureaucracy associated with safety regulations and certifications, it is possible to address these requirements creatively and avoid additional work. In our case, we have adopted Trunk-Based Development: every commit or push lists its co-authors and triggers an exhaustive series of tests. This approach not only satisfies auditors but is, in fact, more effective than traditional asynchronous reviews (feature branching plus PRs) with explicit approvals required before going to production.
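As an illustration of how co-authorship can be recorded in the commit itself, here is a minimal sketch that builds a commit message using the `Co-authored-by:` trailer convention recognized by hosting platforms such as GitHub. The helper function, the summary, and the people are hypothetical; this is not our actual tooling.

```python
def commit_message(summary: str, co_authors: list[tuple[str, str]]) -> str:
    """Build a commit message that credits everyone who worked on the change.

    Each co-author becomes a `Co-authored-by: Name <email>` trailer line.
    """
    trailers = [f"Co-authored-by: {name} <{email}>" for name, email in co_authors]
    # A blank line separates the summary from the trailer block.
    return "\n".join([summary, "", *trailers])


# Hypothetical people and addresses, for illustration only.
print(commit_message(
    "Add invoice export endpoint",
    [("Ana Example", "ana@example.com"), ("Luis Example", "luis@example.com")],
))
```

Because the trailers live in the repository history, the evidence that more than one person saw each change is produced as a side effect of normal work rather than as a separate approval step.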

Extra Features and Code

This, along with partially completed work, is the most significant waste we see in software development and product creation. Far too often, software developed over months ends up unused or avoided by users because it fails to meet their expectations. This is the WASTE in software development. As Mary Poppendieck said, "The biggest cause of failure in software-intensive systems is not technical failure; it's building the wrong thing."

In addition to applying Lean Software Development, we must use other techniques to truly understand our users' needs and uncover which problems are worth solving. Tools like Lean Product Management, Continuous Discovery, and Impact Mapping are essential in this process, though we won’t detail them in this series of articles.

Assuming we have identified a problem worth solving and that we have customers/users with a clear need, our goal is to solve this problem/need with the least amount of software possible and as quickly as we can. In our case, we use the following practices:

We adopt the Agile principle of “Simplicity—the art of maximizing the amount of work not done—is essential.”
We see software as a means, not an end, aiming to solve needs with as little software and technology as possible.
We focus on customer value, ensuring that every initiative and functionality aligns with the real needs of users.
We delay technical and product decisions as much as possible, increasing the chances of never having to implement them or at least not implementing them fully. We always aim for the minimum version that is sufficient.
We employ Outside-In TDD, which ensures we only write the minimum code necessary to implement the use case.
We follow the YAGNI principle (You Aren’t Gonna Need It), focusing on the functionality required now, avoiding speculative design or development.
When something we’ve developed stops being used or doesn’t fulfill its objective, we either remove it entirely or adapt it until it has a positive impact again.
We work in very small steps (less than 1.5 to 2 days each), presenting new increments to users to get quick feedback that lets us adapt and decide on the next steps. This often allows us to stop investing in a feature once it is “good enough” for the user, thus avoiding unnecessary development.

Task Switching

Frequent task switching can significantly disrupt a team’s productivity. Each switch forces us to rebuild our mental context and delays getting back into a state of “flow”. To minimize these switches, we apply several strategies:

Minimizing WIP: The most effective strategy to prevent frequent task switching is to reduce the team’s Work in Progress (WIP). We strive to focus on one or at most two initiatives simultaneously. Ensemble/mob programming is our preferred technique to limit WIP, as when the entire team focuses on a single task, internal interruptions are naturally eliminated.
Continuous and Synchronous Code Reviews: By working in ensemble/mob programming, we eliminate all task switches generated by asynchronous code reviews. See Code reviews (Synchronous and Asynchronous).
Vertical Slicing and Technical Slicing: By rigorously applying these techniques, we can work on truly small increments. This helps us maintain workflow continuity until completing and deploying an increment. Naturally, after each deployment, the possibility of task switching arises without the negative impact of doing so mid-increment.
Task Completion and Spikes: We ensure tasks can be completed from start to finish. If we see this isn’t possible, we conduct a spike (http://www.extremeprogramming.org/rules/spike.html) to eliminate uncertainty or look for other approaches that don’t require interruptions.
Pomodoro Technique: We use Pomodoros for periods of focused work and synchronized team breaks.
Quality at Every Level: High quality prevents interruptions caused by failures. We apply TDD, ensemble/mob programming, and other Extreme Programming practices to maintain it.
Operations/Support Rotations: For teams with support functions, we implement rotations, concentrating part of the team on emergent work and the rest on planned initiatives.

Waiting

When we analyze the product development process in depth, the most common finding is that every increment/idea/backlog item spends almost all of its time waiting. Waiting for answers to questions, waiting to analyze the problem more thoroughly, waiting for feedback on the design, waiting for architectural change approvals, waiting for certain specialists to be available, waiting for someone to approve the change, waiting for code reviews, waiting for the feature toggle to be activated, waiting to communicate the change... Waiting, waiting, waiting. Clearly, if we view the process from the customer/user's perspective, any type of waiting is simply waste.
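One way to make this visible is to compute flow efficiency: the share of an item's total lead time during which someone was actually working on it. The sketch below assumes we have recorded how long an item spent in each stage; the `Stage` type, the stage names, and the hours are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Stage:
    name: str
    hours: float
    waiting: bool  # True when nobody is actively working on the item


def flow_efficiency(stages: list[Stage]) -> float:
    """Share of the total lead time in which the item was actively worked on."""
    total = sum(s.hours for s in stages)
    if total == 0:
        raise ValueError("no time recorded")
    active = sum(s.hours for s in stages if not s.waiting)
    return active / total


# Hypothetical history of one backlog item (made-up numbers).
history = [
    Stage("waiting for analysis", 40, waiting=True),
    Stage("development", 6, waiting=False),
    Stage("waiting for code review", 16, waiting=True),
    Stage("review and fixes", 2, waiting=False),
    Stage("waiting for release approval", 16, waiting=True),
]
print(f"Flow efficiency: {flow_efficiency(history):.0%}")
```

In this made-up example only 8 of the 80 hours are active work, so making development faster would barely move the lead time; removing the queues would.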

To eliminate much of this waiting, here are some tactics that have worked for us in the past:

Assign the team end-to-end responsibility for going from problem definition to production and operation of the solution. If possible, even give them the freedom to find the problem worth solving. This means taking charge of product management, development, quality, deployment, and operations.
Even when the team is empowered, it sometimes lacks all the necessary skills. In these cases, we need to secure collaboration from a specialist, but always try to have the specialist help us improve our skills in that area instead of solving the problem for us. This won't cover all cases, but it will ensure that in simpler cases, we don't need to call on the specialist again.
On the other hand, the more multidisciplinary the team members are, the easier it will be to meet needs within the team itself. This doesn't mean everyone knows everything, but rather that we promote T-shaped skills (https://en.wikipedia.org/wiki/T-shaped_skills).
On a technical level, the way to systematically eliminate most waits is to move toward Continuous Delivery (CD), which typically involves placing a lot of emphasis on agile technical practices (TDD, CI, decoupling deployment from activation, etc.) and having very high confidence in our automated testing.
One of the most efficient ways of working, in terms of flow efficiency, is mob/ensemble programming, because all the available knowledge and skills are fully dedicated to the single initiative the mob/ensemble is working on.
There’s no point in releasing to production early if we then passively wait for customer/user feedback. It's much more efficient to seek that feedback proactively and to have instrumentation at the product and system levels to learn as quickly as possible.
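To illustrate what decoupling deployment from activation can look like, here is a minimal feature-toggle sketch. It assumes toggles are read from environment variables named `FEATURE_<NAME>`; that naming scheme, the functions, and the discount rule are hypothetical, and many teams would use a dedicated toggle service or configuration store instead.

```python
import os


def is_enabled(flag: str, env=os.environ) -> bool:
    """A feature is active only when FEATURE_<FLAG> is set to "on"."""
    return env.get(f"FEATURE_{flag.upper()}", "off").lower() == "on"


def checkout_total(prices: list[float], env=os.environ) -> float:
    total = sum(prices)
    # The new discount logic is already deployed, but it stays dormant
    # until someone flips the toggle; no new deployment is needed.
    if is_enabled("bulk_discount", env) and len(prices) >= 3:
        total *= 0.9
    return round(total, 2)


# Same code in production, two different toggle states.
print(checkout_total([10.0, 20.0, 30.0], env={}))
print(checkout_total([10.0, 20.0, 30.0], env={"FEATURE_BULK_DISCOUNT": "on"}))
```

Passing `env` explicitly keeps the toggle easy to test, and because the code path ships dark, nobody has to wait for a release window to deploy it or to turn it on.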

Motion

Another of the seven basic wastes considered by Lean Manufacturing is motion. In a factory, the movements an operator must make, whether between machines, to pick up materials, or to ask questions, are clearly waste. Lean Software Development keeps this category, even though in our domain this type of waste is less direct and obvious.

In the original book, motion refers to the effort required to reach the customer, obtain domain information, carry out hand-offs between specialties (ops, QA, security), etc.

Many of the tactics used to eliminate waiting are also valid for reducing the waste of motion, especially regarding hand-offs between specialties.

Additionally, the following strategies have proven useful for eliminating other kinds of motion:

Provide the team with direct access to the customer/end user or their closest representative. A practical solution could be to take on operations responsibility, so the team is directly exposed to the complaints and needs of customers/users.
Develop specific tools that allow direct and efficient access to necessary information, avoiding repeated processes (e.g., data extraction tools, observability tools, etc.).
Create information radiators so that it’s easy to visualize progress or relevant information without the need to actively search for it (visual management boards, automated notifications, etc.).

Defects

Lastly, Lean Software Development considers defects a significant source of waste. From my experience, I would say that defects are the second most important source of waste, after unnecessary work (extra features/code). That said, if you think about it, not building what the client needs could also be considered a specific type of defect :).

Every defect we introduce wastes not only the time spent writing the incorrect code but also the time spent fixing it, our credibility with the client/user, and all the effort expended between the creation of the problem and its resolution. Therefore, it is important not only to avoid introducing defects but also to find them as soon as possible, since the associated waste/cost grows rapidly the longer it takes to detect the problem.

With this in mind, the tactics and practices we usually use to minimize this waste are:

Minimizing as much as possible the code we need to develop to achieve the desired impact. As you know, less code implies fewer opportunities to make mistakes.
Using Outside-In TDD, starting with acceptance tests for the use case. This, by definition, generates the minimal amount of code possible, which is also well-tested from the outset.
This process does not cover all scenarios and issues, so it is also necessary to create certain end-to-end tests and have strategies for specific topics such as security analysis, load testing, performance testing, etc.
Another important point is testing the third-party components we use to avoid problems when updating versions or using them in non-standard ways. See Thin Infrastructure Wrappers.
With all the above points, we have a good starting point, but it is increasingly common to rely on third-party infrastructure and services (SaaS, clouds, etc.). In these cases, it is more essential than ever to use production testing tactics. After all, our clients/users don’t care what the source of the problem was; they only care about the impact it had.
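As a rough sketch of the thin-wrapper idea mentioned in the list above: the rest of the code depends only on a small class we own, and an integration test exercises the real component through that class, so a breaking change in a new version shows up in one place. Here `sqlite3` from the standard library merely stands in for any third-party component, and `KeyValueStore` is a hypothetical name.

```python
import sqlite3
from typing import Optional


class KeyValueStore:
    """Thin wrapper: the only place in the codebase that talks to the library."""

    def __init__(self, path: str = ":memory:") -> None:
        self._conn = sqlite3.connect(path)
        self._conn.execute(
            "CREATE TABLE IF NOT EXISTS kv (key TEXT PRIMARY KEY, value TEXT)"
        )

    def put(self, key: str, value: str) -> None:
        with self._conn:  # commits on success, rolls back on error
            self._conn.execute(
                "INSERT OR REPLACE INTO kv (key, value) VALUES (?, ?)", (key, value)
            )

    def get(self, key: str) -> Optional[str]:
        row = self._conn.execute(
            "SELECT value FROM kv WHERE key = ?", (key,)
        ).fetchone()
        return row[0] if row else None


def test_store_round_trip() -> None:
    """Integration test run against the real component, through the wrapper."""
    store = KeyValueStore()
    store.put("plan", "basic")
    store.put("plan", "premium")  # overwriting must keep a single value
    assert store.get("plan") == "premium"
    assert store.get("missing") is None


test_store_round_trip()
```

The wrapper exposes only the two operations the application needs, which keeps the test surface small and makes it cheap to swap the underlying component later.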

Conclusions

As can be seen in the tactics and practices we employ to minimize waste, many of them relate to having solid development practices (pair programming, TDD, BDD, continuous reviews, CD, CI, etc.) that allow us to develop sustainably. Others focus on avoiding unnecessary tasks as much as possible, concentrating on what the client truly values (which doesn’t always align with what they ask for), limiting the software to current needs, and working in very small steps so we can change direction or stop investing in something as soon as necessary.

Working in such small steps and adapting continuously allows us to streamline the required process: we need little backlog management if we have very little in it; there’s no need to coordinate different workflows if we’re all working on the same thing at the same time; there’s no need to structure communication or handoffs with other teams if we manage them ourselves. In the end, it’s about simplifying everything as much as possible to do only what is absolutely necessary, always focusing on what truly adds value. This obviously involves constantly questioning what we do and how we do it. Just because something was useful a couple of months ago doesn’t mean it still is.

It’s not as simple as it seems, as it requires deep engagement in our work (passion) and, at the same time, the ability to let go of what doesn’t add value (detachment). It’s about living focused on a sliding window of what adds value now, of what is useful to us in the present.

Remember that eliminating waste is just the first step on the path to Lean Software Development. In our next post, we’ll explore how to “Amplify Learning” to ensure the excellence of our products. See you soon!


Monday, November 18, 2024
