Friday, August 18, 2017

Good talks/podcasts (Purging the queue II)

Sunday, August 13, 2017

Books I have read lately and Reads In Progress (RIP)

Continuing with my process of recovering my previous reading habit (books-i-have-read-in-last-12-months these are the books that I read lately:

  • "Nonviolent Communication: A Language of Life"  Marshall B. Rosenberg Review
  • "The Power of Habit: Why We Do What We Do in Life and Business" Charles Duhigg
  • "Los 7 Habitos de la Gente Altamente Efectiva" Stephen R. Covey
  • "The Toyota Way" Jeffrey Liker
  • "Algorithms to Live By: The Computer Science of Human Decisions" Brian Christian, Tom Griffiths
  • "El Arte de Vivir: Meditación Vipassana tal y como la enseña S.N. Goenka [The Art of Living]" William Hart
  • "The Subtle Art of Not Giving a F*ck: A Counterintuitive Approach to Living a Good Life" Mark Manson
  • "Tribes: We Need You to Lead Us" Seth Godin
  • "The Five Dysfunctions of a Team: A Leadership Fable" Patrick M. Lencioni Review
  • "The Happiness Hypothesis: Finding Modern Truth in Ancient Wisdom" Jonathan Haidt
  • "Team of Teams: New Rules of Engagement for a Complex World" General Stanley McChrystal, Tantum Collins, David Silverman, Chris Fussell Review
  • "Focus: A Simplicity Manifesto in the Age of Distraction" Leo Babauta
  • "The Sales Bible: The Ultimate Sales Resource" Jeffrey Gitomer 
  • "The Ideal Team Player: How to Recognize and Cultivate the Three Essential Virtues: A Leadership Fable" Patrick M. Lencioni
  • "Ego Is the Enemy" Ryan Holiday
  • "Barking up the Wrong Tree: The Surprising Science Behind Why Everything You Know About Success Is (Mostly) Wrong" Eric Barker
  • "Scrum" Jeff Sutherland, JJ Sutherland

Read in progress (RIP) :)

  • "The Lean Product Playbook: How to Innovate with Minimum Viable Products and Rapid Customer Feedback" Dan Olsen
  • "This is Lean: Resolving the Efficiency Paradox" Niklas Modig, Par Ahlstrom
  • "Implementation Patterns" Kent Beck (2º reading)
  • "The real startup Book" v0.3 

Friday, August 11, 2017

Scalable design pattern sample (tech pill)

As a complement to the previous blog post Queues vs Distributed logs (tech pill) this blog post describes how to solve a concrete business problem using a distributed log and a queue.... I used this design several times in production with good results.

Any feedback or improvements of the design will be more than welcomed :)

The problem:

We have to implement a business process with the following characteristics:
  • The business process (Job1) can be divided into three different sequential phases (S1, S2, S3). For example, we can think about the generation and mailing of all the invoices for all the customer of one account manager.
  • The second phase (S2) can be divided in several (hundreds or thousands) individual and independent sub-jobs (S2.1, S2.2, ...). This sub-jobs can be executed/processed in any order. For example, generate and email one invoice. Each sub-job require one minute of process time.
  • The complete process should complete in less than 90 minutes without being affected by the number of sub-jobs of the second step (up to 10k sub-jobs).

We also need to:
  • Notify when each step starts and when each step finished.
  • Generate some statistics of each the process.
  • Access to the detailed status of the process at any moment.
The solution should have the following characteristics:
  • We can have several numbers of concurrent business processes of this type for each customer.
  • For the sub-jobs at step 2:
    • We have retries for the sub-job execution.
    • We can balance the time of the process and the cost.
    • We have horizontal scalability.

The Design:

After analyzing the requirements and make a fast web storming session we consider the following events:

If we need to include more information about the detailed state of the process we can also signal the starting point of each step using a StepXStarted event. But these Starting Events are not required because we already know when a step started (just when the previous step finished).


As we want to implement several functionalities for the same events and we want to have each one completely separated, we can use a distributed log / stream that allows us to design a simple solution to manage the workflow, calculate statistics, compute a detailed status.
The distributed log / stream and the corresponding services can scale using as partition/sharding key the job id or the customer id (assuming that each customer can generate several jobs at the same time).

The Step2 require additional design.
It can be subdivided in several individual jobs (S2.2, S2.2, ...), this jobs can be processed in any order and in parallel so we can have a queue for dispatching this subjobs to a pool workers that can be scaled out if needed.
We can balance the number of workers (and the corresponding cost) with the duration of the Step2 that we want. Each worker get a job description from the queue, execute the job, and include the result in an event (Step2SubjobCompleted) published in the distributed log / stream.

The Workflow Manager should implement a response for each event and generate corresponding events to make the job progress.

The responses for each event are:


  • Execute Step1
  • Publish Step1Completed


  • Split the Step2 in several subjobs (S2.1, S2.2, ...)
  • Store the identifiers of the jobs created (S2.1, S2.2, ...)
  • Send the subjobs to the queue of S2 subjobs


  • Mark the id of the job as no longer pending
  • Validate if we have jobs pending
  • if there is no more jobs pending, publish Step2Completed


  • Execute Step3
  • Publish Step3Completed


  • Do any garbage recollection needed
  • Publish Job1Completed


  • Nothing, Everything is already done  :)

For the same events, other functionalities as Statistics, reports or notifications, will define other reactions/implementation to process each event... Compute and store statistics, generate emails or push notification, create a view with detailed information on the status of the job, etc...

Notes and complementary designs:

  • The queue allows duplication so the workflow should be prepared to receive several events Step2SubjobCompleted for the same subjob.
  • We can have retries for the queue jobs, so we should include a mechanism to detect when we should stop waiting for Step2SubjobCompleted. For example, we can use a periodic tick event and use this event to decide if we should continue with the next step (for example marking a subjob as an error).
  • Is also possible to continue receiving Step2SubjobCompleted even if we are at Step3, the easy solution is to ignore this messages.
  • If we select the Job id as the sharding/partition key for the distributed log / stream we can easily scale out the number of workflow processors. We only need to add more stream shards and the corresponding new processes.
  • For the fault tolerance of the workflow we can store the events that we already processed and recover in case of failure from this events.

Related references (very recommended):

Sunday, July 30, 2017

Honey Badger Team - Visual History III - CD as driving force

After publishing Honey Badger Team visual history, @artolamola asked me on twitter if I used "continuous delivery" as the driving force for the transformation and why. The short answer is Yes, and in this blog post will try to explain the motivations behind.

For me, agile requires two things (see the-two-pillars-of-agile-software):
  • A healthy knowledge culture focused on people (collaboration, learning, respect, team work, creativity...) 
  • Looking for quality and technical excellence (for example XP practices are a great starting point).
To introducing the healthy knowledge culture, I introduce retrospectives and worked as a facilitator,  change agent and "Developer Whisperer".
In our profession knowledge is the bottleneck

For the technical excellence I think that we should follow this steps:
  • If there is a classic vicious cycle of bad quality or bad practices going on. We should stop and break this cycle immediately. The typical example consists in having technical debt, that generates bugs, that generate even more technical debt that ends in the complete collapse in few months.
  • Assume that we have the capacity and very good intentions.
  • Help to create a virtuous cycle to improve our technical quality day by day.
  • To create this virtuous cycle we need to define a clear goal to align all of our efforts.
Vicious Cycle of technical debt :)

In our case, and having in mind that we are a cloud native product, I thought that we can use Continuous Delivery as our team Goal

Why Continuous Delivery?

The answer is that I thought that we require a good practice that force us to do the thing right and doesn't allow us to hide our unprofessionalism or incompetence. And for this, and in the cloud, I think that in our case Continuous Delivery will put a lot of pressure on doing the things right.

Continuous Delivery implies:
  • Low cost and a smooth process for deployment that doesn't affect customers. This requires (or at least recommend) introducing DevOps, automation, rolling deploys and Continuous Integration.
  • Having good confidence in our product. This requires Continuous Integration, Automatic testing, and Monitoring for production.
  • Automatic testing requires integration tests and unit tests, and for sure, having unit tests put a lot of pressure on doing a good design (for example to have a hexagonal architecture or other architecture that decouple business logic from infrastructure).
So indirectly, trying to implement Continuous Delivery requires or recommend doing the following practices: DevOps, Infrastructure Automation, Zero down time deploys (for example rolling deploys), Continuous Integration (perhaps with trunk based development), Integration testing, Unit testing, Good Design (for example using hexagonal architecture).

Virtuous Cycle created when CD is the goal

Continuous Delivery doesn't allow you to hide bad technical quality and force you to introduce good technical practices and work very hard to have a healthy code base.

With this approach, we, as a team, improve a lot our technical knowledge, our code base and our practices. We have to improve many technical practices but we have already come a long way and I sincerely believe that we are going in the right direction. :)


  • If you use micro services, CD force you to think about independent deployability.
  • CD force you to learn how to make parallel changes (using feature toggles, database refactoring, etc.)
  • Some additional insights from @artolamola:
    • Above CD you can implement any agile method for team organization (scrum, kanban, etc.).
    • CD force you to think how to decompose each feature in vertical slices.


Some of these ideas come from:

Some context about how I understand agile:

The history of the Honey Badger team: