Alexis Hui - Improving Software Delivery and IT Organizational Maturity: 2011

Thursday, December 8, 2011

Five Step Illustrated Guide to Setup a Kanban System in an Enterprise Organization

We've been through a significant number of Kanban implementations and our approach has largely been based on the approach outlined in David J. Andersons Kanban book. During our travels, we find that we sometimes need to make certain adjustments to the standard steps and during our recent work with a large-scale organizational transformation where we are applying Kanban across the entire organization of 200+ people we have encountered some situations that you only run into when you are in a larger enterprise setting. I thought it would be useful to others for me to provide a real-world step by step illustration of how we have been approaching the problem. If your about to kick-off a Kanban adoption in an enterprise IT organization or in the midst of one and struggling, you may find this useful. It's a simple 5 step approach that has always produced good outcomes for us while respecting the pace of change a typical IT organization can absorb.

Step 1) Identify the various upstream and downstream groups with a special emphasis on external "support groups" for the group adopting Kanban

In our scenario, we were helping an IT Development Group adopt Kanban. To start, we need to first identify who are upstream parties that give us work, in this it's the clients/business themselves that have a PM/BA function and there are 5 of them. There is one downstream group, the Release Support Group that is responsible for packaging, and pushing code into staging and production. The final group to look at is other support groups which tends to be a key consideration in enterprise settings since most IT organizations have moved towards functional consolidation. For this particular group, they rely on a DBA Support Group, and a Testing Support Group in order for them to complete their work and pass it to the release group.

Step 2) Identify the work types that come in from the upstream provider and work with the IT Development Group to agree on a set of standard work types to setup a "Work Type Filter"

One of the lessons we learned during our work with a number of teams is the importance to treat the work type analysis as a capability and not a one-time event. Work type requests evolve over time and the team needs to develop the capability to conduct this activity and evolve a standard set of work types. I find thinking of this as implementing a Work Type Filter for the team helpful so they understand it's a capability and mechanism they need to operationalize. Work type analysis tends to be always be an interesting conversation with the IT Development Group in enterprise IT settings since the development teams will have other "responsibilities" besides writing code. They are often called upon by the client or other groups to provide two types of non-development services, Consulting, and Vendor Support / Management. Since functional consolidation tends to exist in most IT organizations I have seen, some type of centralized PM/BA function is often responsible for gathering requirements and providing estimates and plans. These groups lack the technical knowledge to complete these activities and as a result seek out the development groups for help. The second type of non-development request is often Vendor Support / Management. Business demand for solutions often tends to be non-stable and peaks near the end of the annual budgeting cycle as they feel the pressure to spend their budget which causes demand to exceed internal development capacity which in turn leads to the organization running to scale up with external vendors to take on the extra work. This will generate some work for the development group as the PMs will often ask the group to provide technical oversight or support to guide design, coding and help with code integration and promotions. I find it extremely helpful to separate these non-development requests from development work types since they often cause noise in the system and impact capacity in a different way. Turning back to our real-world example, here's what we ended up with:

Step 3) Map out the internal value stream, and identify where hand-offs and coordination with support groups occur in the process

Now that we understand the boundaries of our Kanban system we can start digging into the internal workings and understand how everything connects. Value stream mapping is often a useful tool for this, but I find keeping it simple works best. The important part of this step is to explicitly map out where the "interface" to the team is with other support groups and understand what information / value / artifacts is traded (i.e. what the interface contract should look like).

Step 4) Visualize all work in terms of the standard work types, let the system run to understand throughput and then set WIP limits with an emphasis on the input queue and external support group columns

At the end of step 3, we should have a kanban board and we can start onboarding all the work the IT Development Group is currently working on in terms of the standard work types we defined for the Work Type Filter. I find many teams struggle with setting WIP limits right away, and they need to use the Kanban system for 2-4 weeks before they have the right comfort level to have this discussion. At that point, the throughput metric should be tracked as this will provide a helpful frame of reference to set the initial guesstimate for the WIP limits. The key columns I focus on with the group is the input queue and external hand-off columns. The rationale behind focusing on these columns is they are the interfaces to the outside world and we want to start setting some expectations and commitments around these as the IT Development Group needs their help to deliver work. The input queue keeps them fed and the hand-off columns keep them running. I often leverage the throughput metric to orient the team around a WIP limit for the input queue based on the simple guideline of only take in what you can finish.

Step 5) Expose the queues to the clients and help setup and operationalize a prioritization framework as a Prioritization Filter

At this point, we expose the queues to the clients and start pushing prioritization of work to the customer. We also find in enterprise IT settings, providing only a next/this month queue is insufficient for their planning process. As a result we often provide a prioritized backlog mechanism with two replenishment cadences, an annual/quarterly one for strategic/budget planning a monthly one for just-in-time decisions. This is where we often see the most churn and is typically the most challenging part of the Kanban process. Many IT organizations are great at prioritizing within one group, but across organizational groups is a struggle despite attempts at elaborate and agonizing prioritization committees/boards. To help with this problem, the Kanban system arms us with two pieces of information, a common unit of currency (the standard work types) and delivery throughput. Leveraging these two pieces of information, we recommend working with the clients to analyze all their known demand, break it down using the Work Type Filter into our standard work types, and then calculating the gap between demand vs capacity in terms of throughput. Once the gap is known, there are two options, scaling up with more external vendors or prioritizing the work. In this scenario, scaling up was not an option due to the tight deadlines and the ramp up time needed to onboard new resources so we had to go with the prioritization option. To help with the prioritization problem, we recommend holding a workshop to align the requests with strategic priorities for the organization and then applying a FIFO policy to these requests while considering true fixed date items. This provides a workable interim solution as it forces the clients to focus on today with FIFO. The goal is to get to a set of demand channels based on what's ready to go now or soon and set % allocation against these channels. We then provide the client an opportunity to adjust the % allocation during the monthly replenishment meetings.

After this step, you should have a fully operational Kanban system. Now the hard but fun part remains which is evolving it and optimizing it for performance.

Monday, August 1, 2011

Maximizing the Benefits of Kanban as an Accelerator for Lean/Agile Transformations

One of the common approaches to many Lean/Agile Transformations is to start off with a set of "pilots". Typically these are selected based on the likelihood of success, priority and business risk. Often six months or so down the line after a few successful pilots the organization is eager to accelerate the process and attempts a larger scaling initiative to spread adoption. At this point the organization typically hits a wall as they run into a new set of challenges with the new teams/units being "transformed":

These teams/units are typically less enthusiastic about the change - pilots are typically "cherry-picked" and are often the most capable and enthusiastic of the change while the rest of the organization is not
Larger and more legacy systems/teams bring about different challenges the pilots did not face
Limited capacity and budget within the organization to support the larger scale adoption resulting in less time spent with the team/units
Executive deadline for the transformation looms closer, typically organizations are looking for organization wide results within a year or two

As a result, the transformation typically starts switching to a "big-bang" type of change as patience starts running out and new challenges to the transformation starts slowing down progress. This where organizations often get into trouble.

As an alternative to the piloting approach is a Kanban change management approach to transformations. The great thing about Kanban is that it allows an organization going through a transformation to adjust the pace of change from fast to slow depending on the tolerance of change within the organization. The only requirement is to ask each team/unit to follow a subset of the Kanban properties:

Visualize what you do today
Limit the amount of work in progress
Improve flow

With the Kanban approach to transformations, the organization can start "transforming" the entire organization on Day 1 without significantly disrupting business as usual while gaining the benefits of scaling out change to a larger group of teams/units. Another way to look at this is from the J-Curve model:

The J-Curve is the amount of disruption/pain an organization goes through when change happens. If you are starting/going through a Lean/Agile Transformation today, this approach is worth considering if you are facing a few of the challenges I discussed.

I'll blog more about the specifics for how we are applying the Kanban change approach in a current large-scale transformation in a later blog post.

Wednesday, March 23, 2011

Why co-located team rooms are important

One of the most valuable aspects of co-locating a team in a common area is that it eliminates a lot of the "us vs them" mentality that tends to happen when team members are working in different locations. I have observed that it is a fairly natural occurrence in teams that are distributed to be less collaborative and helpful to each other if they are separated by distance.

For example, here's an interesting email conversation trail from a previous project between two distributed teams (in the same building but different floors and opposite ends of the building):

Email 1 from Debugging Developer:

Hi Developer B,

I was wondering if processing transaction A is supposed to work for ActionTest. I’m using the following data with an error of “failed to execute called process”, can you advise, thanks.

Data used:

Some complex XML message

Response email from Developer A:

You sent a transaction per below. The changes were not deployed on the tActionTest. I will have to rebuild that deployable. (It was deployed for the end-to-end process, not the test webservice.)

I can't find your request in the logs to check the error.

Thanks,

Developer A

Email 2 from Debugging Developer:

Ok so I guess it shouldn’t work until its been rebuilt right?

Response email from Developer A:

Yes, I forgot about deploying the test service because I was focusing on the integration model.

Developer X just deployed the most recent ProcessingComponent changes on the test service, and this included my code updates, so the service should work now.

ProcessTransactionService fails this sample request as per below:

< Failed OrderDetail(s) : 9084694735 ErrServiceQueryFail

updateSubscriber_ERROR-1553

and more complex XML/>

Thanks,

Developer A

Email 3 from Debugging Developer:

Great, yeah I see that error now, middleware integration admin logs for ActionTest still times out. I’m not sure why the error occurs, is there any way that I can dig deeper?

Response email from Developer A:

Look at the ProcessTransactionService.

Email 4 from Debugging Developer:

Any idea what “Failed Tranasaction(s) : 1234567 (error=-65608)” means? I understand that transactions with FDR of 232 and IBC of 456 must be used, so I specified the following request:

Some complex XML...

Response email from Developer A:

Did you intend to address this to me? I don't know this. AquaLogicBus is calling the ProcessTransactionService (java). This error is returned from the ProcessTransactionService.

Thanks,

Developer A

Email 5 from Debugging Developer:

Yeah, I just thought you might know, Developer B might have a better idea.

Developer B, I’m sending the request below to AquaLogicBus and get the error: “Failed Tranasaction(s) : 1234567 (error=-65608)”, tn I used is: ,

What rules are there for Transaction A?

Thanks.

Response email from Developer B:

But the error is self descriptive...the transaction is not unqiue i.e. processed...you cannot process the transaction twice/more in ProcessTransactionService.

Email 6 from Debugging Developer:

I have yet to trained myself to understand (error=-65608)

I’ll try more numbers even though numbers such as 1234567 and more also do not work for Transaction A, these numbers have not been processed since they were verified by GetStatus of the ProcessTransactionService returning ErrGetStatus.

After a bit more back and forth, finally the Debugging Developer discovered the root cause of the problem. In total, this took 6 hours and ~10 emails. Think this is a good example of the communication effectiveness Alistair Cockburn talks about:

Friday, March 11, 2011

Kanban - Add colour to your work to help the team focus

One of the aspects I find valuable with Kanban, is the work visualization. However, sometimes I find that the work is so chaotic/iterative and collaborative in some portions of the process that the standard columns and rows in the kanban board don't provide enough flexibility. You'll see this when you have one work item that requires separate but related work from multiple team members due to their specialization.

In these situations, I avoid trying to fit the highly iterative work into columns and rows, and collapse them together into a single step on the board. The problem I run into when I do that, is it becomes hard for the team to figure out who is working on what and what work remains to be done on the work item.

To solve this problem, I apply colour to the work using coloured stickies that I "tag" on work items to denote what's be done on each particular work item.

In this example, the team pulls a work item (user story) into the "Create FitNesse Template and Automate" column of the board. To move an item through this part of the delivery process, requires the team to define detailed acceptance criteria in the language of the domain model, get it into a FitNesse page, and then automate the tests. This requires at least three specialized skillsets, a business analyst who understands the requirements that can help define the acceptance criteria, a developer that can verify that the acceptance criteria can be automated and develop the automation code if needed (fixtures etc.) and an architect/product owner that can review the acceptance criteria. Also, the work is not necessarily sequential and may be done in parallel or require multiple iterations.

To help visualize the work for the team, we applied coloured stickies that are tagged to each work item once something has been completed on it. For example, if the FitNesse template is done and data is defined than we tag it with a red sticky. If automation is completed / not required than it is tagged with a blue sticky and upon review from the architect/product owner we tag it as yellow.

A neat result of this process, is that each team member can immediately tell what item they should be working on. The business analyst just needs to look for work items that are missing red tags, which is a signal for them work. Similarly, blue for developers and yellow for the architects/product owners.