Software Development

An-Isotropic Scaling

In the vernacular of my former career as a metallurgist, 'scaling' is defined as 'the accumulation of unwanted material on solid surfaces to the detriment of function'.  Over the last year or two, I've been amused by the discussions around scaling in agile environments and the applicability of that definition.  There are many, although not all,  in the agile community who might define scaling as 'the accumulation of unwanted process and overhead on solid teams to the detriment of their function'.

In metallurgy, there is the notion of isotropic vs an-isotropic scaling.  That is, the difference between uniform scaling and non-uniform scaling.  It is natural to visualize scaling anything uniformly along all axes.  When we look at a picture, we think in terms of increasing or decreasing it uniformly in all directions, resulting in a larger (or smaller) version of itself.  Sometimes we use uniform scaling and fix the ratio of expansion or contraction along the axes to match the original and maintain a shape.  However, there are situations where the reason for scaling actually dictates non-uniform scaling; the bigger picture isn't the goal, rather increased utility or performance at a different size is the goal.  I believe that to be the case with scaling an agile organization.

Isotropic scaling of an existing system can very often lead to expansion (or contraction) in unnecessary areas and not enough expansion (or contraction) in others for the desired effect. 

Very often, expansion within an organization requires growth at different rates and in different directions.  This is precisely the definition of an-isotropic scaling.  Further, it’s quite possible that the expansion in one area actually requires a contraction in others for the system to be effective.  Rather than focusing on keeping the shape of the organization intact through uniform scaling, it might be helpful to recognize that non-uniform scaling, by definition, changes the shape of an organization.

Too often, expansion is seen as the solution to problems that would actually benefit from contraction.  A classic example is the notion that we need to increase the number of support teams to handle increasing numbers of customer support issues.  Using a systems thinking approach, the real solution to this problem is to uncover the cause of growth in customer support issues and address that, rather than expanding to handle the symptoms.

When we are scaling an organization, we need to identify what problem we are trying to solve.  Are we trying to coordinate existing and additional teams for some form of consistency (perhaps architectural or technological)?  Are we trying to increase the throughput of the organization in terms of different products/services delivered?  The specific solutions to those scaling issues may depend on the maturity of leadership, teams, products or portfolio management, and will likely require growth at different rates in each of those.  With mature products and teams, perhaps it is simply the portfolio management system that needs to scale up.  With mature leadership and products and services, perhaps it is only the teams that need to multiply.  Without taking into consideration the people and their relationships, it is very possible that process and overhead will be added to the detriment of an already well functioning part of the whole.  As many organizations consider people simply ‘human resources’ within a process, it is too easy for those organizations to ignore the human part of growth or contraction.

There are, of course, instances where isotropic growth is required.  When we are considering the increase in capacity for software delivery, there is often a desire to ‘hire developers’ to accomplish this.  In this case, the increase in capacity usually needs to be isotropic (perhaps with fixed axes): adding developers, testers, a Scrum Master and a Product Owner.  Adding developers alone doesn’t increase capacity; adding multi-discipline teams does.

An agile mindset keeps us focused on inspecting and adapting based on a clear understanding of what problem we are trying to solve.  Scaling an agile organization is no different; we need to keep focused on the problem we are trying to solve, rather than simply uniformly scaling a system that works in its current situation.

 

When you're up to your ass in alligators ...

Most teams use some sort of defect management tool which allows them (and other interested parties) to record defects along with meta-data about the defect such as severity and priority.  Severity is usually an objective value but priority is subjective.  For instance, severity is usually defined in terms like 'High - Results in crash or data loss', 'Medium - Work around available', or 'Low - Cosmetic' etc.  High severity defects are sometimes low priority and sometimes low severity defects are high priority.  For instance, the misspelling of a company name may be low severity but high priority while an application crash generated from a situation that is unlikely to be encountered by the customer may be low priority due to a cost/benefit assessment.  Team members can assign severity but usually only a Product Owner is responsible for assessing the value of addressing (or not) a defect.  This is because usually those people who are responsible for fixing defects are the same as those responsible for adding value via new functionality.  Product Owners need to be able to prioritize the value of fixing a defect against adding new functionality.

When your team is faced with 'draining the swamp' of legacy defects, they must face the need for effective defect prioritization.  The first step in addressing this issue is assessing whether the 'defects' are indeed defects (functionality that does not behave as previously agreed upon by all involved parties) or enhancements (behaviour that has not yet been designed/implemented/tested).  In my view, if the team did not agree that some behaviour was expected, designed, implemented and tested, the behaviour is an enhancement to the current functionality.  Once the enhancements have been distinguished from the true defects, those enhancements can be turned into Stories and prioritized just like any other Story which adds value to the application.  The remaining defects then need to be prioritized in terms of the value they prevent the application from maintaining.  Something of value used to work properly and now does not do so.  How important is that to the success criteria of the product and/or release?

In order to mitigate risk on a software development project, one of the principles of Scrum is that teams try to focus on delivering the next most valuable functionality while keeping the product potentially shippable.  We are to work on the next most valuable functionality in order to ensure that if we run out of money or time (and we will) we have created the most value for the money and time expended.  This should apply in the world of defects as well as enhancements.  Often the difficulty with doing so is that the number of defects in the various priority queues is so large that it is difficult to assess whether the team is working on the most valuable defects at any given time.  If 100 defects are denoted as High Priority but we can't address them all in one iteration, which ones shall we address to accrue the greatest value?

In most defect management tools there are usually priority choices like Must Fix, High, Medium and Low.  These classifications are perhaps arbitrary in that the only important thing about them is that each is related to the other by a higher or lower value.  However many 'buckets' exist, treating them as static is not an effective mechanism for executing against priority.  Prioritization is a subjective exercise and usually prone to changes based on business conditions and newly reported issues from the field.  To that end, the highest priority queues should be emptied by the end of any given iteration.  This means that Product Owners must be vigilant about either 'promoting' defects from Medium to High and from Low to Medium (which seems like busy work) or simply limiting the highest priority bucket to a queue size that the team is likely to be able to completely address.  The key here is always making it apparent to the team which defects are the most important to fix in any given iteration.  Very often I see queues of hundreds of High priority defects and tens of Low priority defects.  This is usually the exact reverse of what we'd like to see!  We are much better at managing smaller queues … for instance queues that we can see and contemplate in their entirety.
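As a minimal sketch of the 'limited bucket' idea, assume each defect carries a value rank assigned by the Product Owner and that the team has an agreed capacity for the iteration.  The names Defect, value_rank and fill_high_queue are hypothetical and are not taken from any particular defect management tool.

```python
from dataclasses import dataclass


@dataclass
class Defect:
    key: str
    value_rank: int  # assumption: 1 = most valuable to fix, set by the Product Owner


def fill_high_queue(defects, capacity):
    """Split defects into a capped High queue and a holding queue.

    The High queue never holds more defects than the team expects to be
    able to address completely in one iteration.
    """
    ordered = sorted(defects, key=lambda d: d.value_rank)
    return ordered[:capacity], ordered[capacity:]


defects = [Defect("D-3", 3), Defect("D-1", 1), Defect("D-4", 4), Defect("D-2", 2)]
high, rest = fill_high_queue(defects, capacity=2)
print([d.key for d in high])  # the defects to address this iteration
print([d.key for d in rest])  # everything else waits to be re-ranked
```

Re-running the split at the start of each iteration stands in for manually 'promoting' defects between buckets; the ranking itself remains a subjective Product Owner decision.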

In order to keep a product potentially shippable at the end of each iteration, some teams adopt a Task Priority list describing a working agreement about the team's default task priorities:
a) Fix any build/install issues (if we don't have a build/install, we can't test)
b) Fix any automated tests (if our tests are broken, we don't know what works and what doesn't)
c) Fix any regression defects (if we have open regression defects then we have likely regressed in value)
d) Fix any current iteration Story defects (standard practice to meet acceptance criteria)
e) Implement new Stories

a) and b) above certainly keep the product from being in a known potentially shippable state while c) keeps the product from maintaining a known value.  Issues associated with d) and e) above are about adding incremental value to already valuable software.  On large multi-team projects, distinguishing those issues keeping the product from being potentially shippable from issues of maintaining or adding value can help with queue size.  For instance Must Fix can incorporate a) and b) while c) can be distributed across High Medium and Low priorities.  Note that in this context, Must Fix is not associated with a value judgement, it is associated with a fairly common Definition of Done that teams use to help them keep a product in a known state.
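The working agreement above can be sketched as a simple ordering.  This is only an illustration: the TaskKind names and the (kind, title) tuples are assumptions made for the example, and the Must Fix mapping follows the suggestion that a) and b) are the items keeping the product from a known potentially shippable state.

```python
from enum import IntEnum


class TaskKind(IntEnum):
    BUILD_OR_INSTALL_ISSUE = 1  # a)
    BROKEN_AUTOMATED_TEST = 2   # b)
    REGRESSION_DEFECT = 3       # c)
    CURRENT_STORY_DEFECT = 4    # d)
    NEW_STORY = 5               # e)


# Must Fix covers only the kinds that block a known, potentially shippable state.
MUST_FIX = {TaskKind.BUILD_OR_INSTALL_ISSUE, TaskKind.BROKEN_AUTOMATED_TEST}


def is_must_fix(kind):
    return kind in MUST_FIX


def next_task(open_tasks):
    """Return the (kind, title) pair the working agreement says to pick up next."""
    return min(open_tasks, key=lambda task: task[0])


open_tasks = [
    (TaskKind.NEW_STORY, "Export report as PDF"),
    (TaskKind.REGRESSION_DEFECT, "Login fails after upgrade"),
    (TaskKind.BROKEN_AUTOMATED_TEST, "Nightly suite fails at setup"),
]
kind, title = next_task(open_tasks)
print(kind.name, "-", title, "- Must Fix" if is_must_fix(kind) else "")
```

Regression defects (c) are deliberately left out of MUST_FIX so that they can still be spread across High, Medium and Low by value.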

Of course the best solution to the problem of how to effectively drain the swamp is to prevent the swamp from forming in the first place.

Thoughts on 'Potentially Shippable'

Scrum calls for the delivery of a 'potentially shippable' product increment at the conclusion of each iteration.  The reason it is 'potentially shippable' (rather than simply 'shippable') is that ideally it should only be a pure business decision as to whether enough value has been accrued to warrant actually shipping.  Therefore, the functionality that is exposed to the user works as intended based on the implemented Stories/Acceptance Criteria which in turn presumes that the quality is fit for purpose.

The value of keeping software in a 'potentially shippable' state at regular intervals is twofold: a) a real/tangible indication of progress for use in making date vs scope decisions and b) the ability to garner meaningful feedback at regular intervals.

If the software is in a 'potentially shippable' state, progress towards the end-goal is based on working, tested workflows/functionality implemented in software and not based solely on overall task estimates.  If the software is in a 'potentially shippable' state, feedback from existing customers, potential customers, and internal stakeholders can be meaningful.  Otherwise feedback can be, at worst, invalid, and at best confusing.

One of the goals of iterative/incremental development is to minimize the difference between 'potentially shippable' and 'shippable'.  Ideally it is simply a business decision whether there is enough value to actually warrant shipping.  In practice, however, many teams have activities that they need to perform prior to actually releasing that they are unable to perform every iteration.  In some cases, the totality of all manual, automated and performance acceptance tests possible and/or necessary to execute and analyze in order to fully assess whether a given build is 'shippable' takes on the order of weeks to months.  In other cases, there is just too much legacy code which is not covered by automated testing to allow for the creation of something considered 'potentially shippable' within any given iteration.

With this in mind, teams need to be able to focus on those activities that they can accomplish inside an iteration which will best lead them to having confidence that the iteration backlog Stories work and that previously implemented workflows still function correctly.  Doing so will lead to a smaller gap between 'potentially shippable' and 'shippable'.  If a significant part of the cost of change in a code base is the uncertainty created by the change and our inability to validate (in a timely manner) that our workflows have not been inadvertently affected, then we should always be striving to minimize the time it takes to do that validation.  Validating more quickly leads to finding and fixing problems sooner and more cheaply.  Automate, automate, automate.

IDEALLY:

- acceptance criteria outline the circumstances under which each new workflow functions and the associated expected results
- if the acceptance criteria are met, and we have proved that the acceptance criteria for previously accepted workflows continue to be met, then we are potentially shippable.

PRACTICALLY:

- acceptance criteria for any given Story need to include regression tests for previously working functionality (either manual or some subset of long-running automated tests) which the team has assessed are likely to have been affected by the code changes necessary to complete the Story in question.

- alternatively, the Definition of Done can be altered to include a statement about the inclusion of relevant, focused regression tests which are either performed manually or are a subset of an existing long-running test automation suite.

- those manual regression tests then need to become an ongoing part of the automated test suite

The usual objection to this approach is that it means that teams apparently deliver less in an iteration.  This of course is a red herring as the teams were never actually delivering as much as they thought in an iteration because the regression testing necessary to deliver functionality was hidden in the 'stabilization/hardening' period prior to release.  Moving that regression testing forward moves teams closer to the ideal and should lead to shorter stabilization/hardening periods.

I'm often asked "How do we measure if we are 'potentially shippable'?"  My response to this is generally the same, "You're 'potentially shippable' if the software behaves the way you say it does."  Without some way to adequately describe (and ultimately test) this behaviour, it is difficult to know if you are 'potentially shippable'.  This behaviour is, of course, described in stories and their respective acceptance criteria.

Spike ... Do the Right Thing. [Redux on Aug. 30 post]

Doing the right thing is often context dependent.  I've had occasion with my current clients to discuss  the concept of Spikes and how they are treated by numerous teams.  I've realized that where a team currently resides in the continuum of learning iterative/incremental development has quite an effect on how I discuss this topic.  The current industry norm is to treat a Spike as a type of Story.  This has been reinforced by web-based tools limiting trackable items to Stories, Tasks or Defects.  I've seen novice teams struggle with this notion and here's why:

  • Summarily, the purpose of a Story is to describe some piece of functionality valuable to, and testable by, a customer.
  • The purpose of a Spike is to investigate the solution options and feasibility of addressing a particular problem space or Story.

Stories should have direct customer value as well as a size estimate (representing effort, uncertainty etc) from the team, while Spikes have no direct customer value and are generally time-boxed to no more than 2 days.

From this perspective, Spikes should not be called Stories because they violate the spirit of the principal purpose of a Story (to express something of value to a customer).  I think much of the confusion on this topic stems from our desire to name things based on how we manage them.  I think people would like to manage Spikes in the same manner they manage Stories but then assume that means the Spike should be called a type of Story.

I've seen novice teams want (naturally) to do things like give the 'Spike Story' a size and then directly equate that size to the length of the time-box.  This tends to negate the value of using Story Points for sizing Stories in a backlog.

Conversely, if a team treats a Spike as a Story and gives it a size of zero, that is inconsistent with the fact that they are spending time on it.  It also grates against those who look for Stories which we "get for free" (and therefore give a size of 0) as a result of some overlapping work on another Story.  These are subtleties which a mature team understands but a novice team, new to Stories, can find challenging.

So how about Spikes as Tasks?

I believe Spike activities have much more in common with Tasks than Stories.  However, there are inconsistencies and pitfalls here as well.  If a team treats a Spike as a Task inside a relevant Story for an iteration, then when the Task is complete the relevant Story should technically be removed from the iteration and returned to the release backlog (unless it is determined that the entire Story can be completed before the end of the iteration) for future prioritization along with at least some of the resultant child/replacement Stories.  Depending on the size of those resultant Stories (and the team's velocity) some of them may remain in the current iteration.  If we emphasize striving for completing all Stories initially planned in an iteration we end up having to make exceptions for Stories containing Spikes.  Another pitfall is that when a team employs "yesterday's weather" for their next iteration, they need to ignore the size of a Story containing the Spike task.  Again, these things may be easily comprehended by a mature team but perhaps are contentious for teams transitioning to Agile methods, inexperienced in the use of Stories, and looking for absolutes (and THAT is a subject for a future post!).

How about neither Story nor Task?

In my experience, Spikes are generally an extension of Backlog Grooming and as such could be considered release overhead.  Spikes are generally not included as part of the release backlog but they likely result in a set of more defined Stories which may be part of the release backlog.  Perhaps it would be better for some teams to simply reduce team member availability during an iteration to account for the Spike time-box?  After all, we do this for things like Backlog Grooming meetings.  The one drawback to this approach is that there would be no explicit/transparent way for the team to manage the time spent on the Spike activity.

So why isn't a Spike just a Spike?

While the notion of a Spike overlaps the notion of a Story in that both have a goal, and the notion of a Spike overlaps the notion of a Task in that both represent an activity, I think the differences between the 3 are significant enough to warrant maintaining/managing a separate entity - 'Spike'.  This is easily accomplished on a traditional index card wall, but is usually explicitly prevented when utilizing a web-based management tool.  I believe this is one of the mistakes we as a community made when we transitioned to using these tools.  When we were using index cards we simply called it a Spike and off we went.  We didn't have to choose to create that entity as a Story or a Task so there was less confusion.

Of course this is all just semantics ... Story, Task, Spike, Backlog Grooming ... what's the big deal?  The answer to this lies in the fact that we use words to communicate intent and without some consistency in the meanings of these words, it is difficult for people new to the concepts to keep it all straight.  The English language itself is full of inconsistencies and similarly, it is not until one is familiar with the patterns that one can grasp and remember the exceptions.

In the end what do we care about?

I really only care that the size of the release backlog is consistent with the team's current understanding of the backlog, and that the team's velocity actually represents the rate at which they may be able to reliably and sustainably address prioritized items in that backlog.  As Spikes occur, the resultant child/replacement stories will likely affect the cumulative size of the release backlog and the product owner can make the relevant business decisions with current information.
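To make that concrete, here is a small sketch of how a Spike's outcome might change the release picture.  The Story names, point sizes and velocity are invented for illustration, and dividing backlog size by velocity is only one rough way to express the relationship.

```python
import math


def replace_story(backlog, story, children):
    """Return a new backlog with `story` swapped for the Stories a Spike produced."""
    updated = {name: size for name, size in backlog.items() if name != story}
    updated.update(children)
    return updated


def iterations_remaining(backlog, velocity):
    """Rough forecast: total backlog size divided by velocity, rounded up."""
    return math.ceil(sum(backlog.values()) / velocity)


# Hypothetical release backlog, sized in Story Points.
backlog = {"Search": 8, "Checkout": 13, "Reporting": 5}
before = iterations_remaining(backlog, velocity=10)

# Assume a Spike on "Checkout" showed it is really three Stories.
backlog = replace_story(
    backlog,
    "Checkout",
    {"Checkout: cards": 8, "Checkout: invoices": 8, "Checkout: refunds": 5},
)
after = iterations_remaining(backlog, velocity=10)
print(before, after)
```

The Spike itself carries no points here; only the resultant child/replacement Stories change the size of the backlog that the Product Owner uses for business decisions.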

So depending on the individual team and their circumstances, any one of the 4 options might be the right thing.  If you're using a web-based tool (that limits your trackable items to Stories, Tasks and Defects) to manage your iterations and your team is relatively new to the use of Stories, consider treating Spike activities in the same way you treat other iteration overhead activities.

Golf, Poker and Software

Last week I heard an interview with a golf psychology coach who had recently added poker psychology coaching to his repertoire.  Golfers and poker players alike hire him to help them deal with the psychological aspects of their respective games.  Predominantly, he helps golfers avoid 'choking' and poker players avoid 'tilting'.

His approach in both cases is to narrow the skill gap a player experiences between their 'A' game and their off-game which they can experience under stress.  Players who must perform under pressure tend to revert back to what is known as procedural memory - actions which are performed when the brain is no longer functioning normally.  The skill level captured at the procedural memory level governs how you will react when stress shuts down normal functioning of the brain.

Essentially, the best players under pressure are those whose skills in their off-game are streets ahead of their competitors'.  These players achieve this gap reduction through repetition of the necessary skills until they become second nature and therefore part of procedural memory.  An example of something that moves to procedural memory is an activity like driving a car.  How many times have you arrived home and realized that you can't really remember the details of the trip?  You were using skills at the procedural memory level which allowed your mind to be elsewhere.

This is interesting to me due to the work I've been doing with the Dreyfus skills acquisition model for assessing the skill level of agile teams.  It struck me that software teams (and individuals) react under date-related pressures in a manner that is governed by similar forces - if quality agile techniques are not second nature to the team, they tend to revert back to old, less than desirable ways.  For most people this transition of skills to procedural memory takes disciplined, directed effort.  Sports and game players must practice endlessly before they are expected to perform during a game.  Software developers must practice as a normal course of action; it's always game day in software development.

The Bus

In Jim Collins' "Good to Great" he touched on great leaders taking steps to ensure that the wrong people were off the bus and that the right people were in the right seats on the bus.

In his follow up, "How the Mighty Fall", Collins explores how once-great companies disappear into oblivion and how some manage to turn their decline around.  Once again, the first step that leaders take in trying to resurrect a company involves the concept of 'the bus'.  Here's how you know whether the right people are on the bus:

  • The right people fit with the company's core values.

    • You don't figure out how to get people to share your values ... you HIRE them based on your shared values.



  • The right people don't need to be tightly managed.

    • The right people are self-motivated and self-disciplined.



  • The right people understand that they don't have 'jobs'; they have responsibilities

    • These people can articulate that "I am ultimately responsible for ...".



  • The right people fulfill their commitments.

    • They take commitments seriously and thus are careful not to over-commit.



  • The right people are passionate about the company and their work.

    • Nothing great happens without passion.



  • The right people display 'window' and 'mirror' maturity

    • When things go well they point out the window to others; when things don't go well they point to the mirror.




My experience has shown me that having the right people on the bus is the most critical part of creating and maintaining a successful software development organization.  Keeping the wrong people on the bus has enormous risks associated with it, and it is always more cost effective to remove people from the bus as soon as possible rather than keep them around for perceived short term gains.

It's ALWAYS been the problem!

When I started working full-time with Agile development techniques in 2004 I also had the good fortune to start working with the concepts of Pragmatic product management.  What quickly became apparent to me is that the two approaches in their respective fields were beautifully suited to one another.

By articulating the market problems, the nature of the people who have them, and the circumstances under which they experience those problems, Pragmatic product managers communicate to development teams what is really necessary to build great solutions; context.  We've all heard the stories of how the software industry's predilection for waterfall methodologies from the 1970's through the 1990's led to an inordinate amount of 'failed' projects.  Early proponents of agile development techniques ascribed those failures to the waterfall process itself, rightly arguing that performing development tasks (specifically testing) in phases and a lack of iterative feedback were the biggest causes of these 'failures'.  However, I would contend that just as large a reason for those 'failures' was product management's inability to communicate what was actually required for those projects to be successful.  Too much emphasis was placed on building to a specification which had often been created and disseminated without providing sufficient context to the people responsible for providing solutions.  Because of this lack of context there was no framework for negotiation with a customer or stakeholders and therefore much of what was built was not actually required to meet the customer's needs.  This resulted in unnecessary delays and cost overruns at a minimum and often the creation of ineffective solutions which quite rightly were considered 'failures' regardless of whether they met the specification.  Even if the solutions were built iteratively, the focus on the specification, rather than the problems in context, might very well have led to the same ineffective solutions.

Many software development companies do not have the luxury of having real-time access to actual customers.  Nor is the concept of building a product based on a single customer's requirements necessarily attractive.  Instead these companies rely on their product management organizations to represent the needs of customers within a market.  There is a hierarchy of proxy which is created as a result.  The customer is a proxy for the market and the product manager is a proxy for the market.  If the product manager is too busy performing strategic tasks and does not have time to fulfill his product ownership duties for the team then a proxy of a proxy is incurred with all the risk that entails.

All too often, agile projects rely on an internal 'product expert' to act as the team's Product Owner.  The danger of this is that the product expert understands the existing incarnation of the product very well but may not have the viewpoint of a customer who has issues with the product.  Properly implemented agile development techniques usually provide the transparent and tangible progress necessary to make informed business decisions.  However, without a clear sense of the relative value of the problems to be solved it is difficult for the team to understand and embrace those business decisions.  This is where the value of the MRD resides; providing the context necessary for a Product Owner to make sound decisions throughout the development process on behalf of the product manager.

In the end agile product ownership is a subset of all that a product manager is responsible for.  From a Pragmatic perspective, product ownership entails the 'Product Planning' tasks of the product manager: Market Requirements, Road Mapping, User Personae, User Scenarios, and Release Milestones.

The Pragmatic Market Requirements Document (MRD) focuses on market problems, the people who have them, and the situations under which they experience them (scenarios).  It also focuses on valuing/prioritizing those problems, people, and scenarios.  Hmmmm ... sounds suspiciously like a good start on a backlog of stories!  A story used in an Agile project usually takes the form of: A <type of user> needs to be able to <perform some action> in order to <get some value>.  This is essentially a statement of the circumstances under which a user experiences a problem.  I suppose technically it is an anti-problem as the real problem is the user's inability to get the required value by performing the desired action.  Nonetheless there is a direct correlation between the primary elements of a Pragmatic MRD and a product backlog used in an Agile project.  Each scenario is prioritized using the following additional context: impact, frequency, and criteria.  The value to the persona of solving the scenario, the frequency with which the persona experiences the scenario, and the value to the Product Manager/Business of solving the scenario, all provide valuable context necessary for establishing business priority.  Once these scenarios have been prioritized, the product manager has a solid starting point for a product backlog.  Further, the entire team has a solid understanding of what problems they are trying to solve and for whom, and what the relative value of those problems is to both the market and the business.  With this context, the team is much better equipped to be able to produce solutions appropriate for the target personae and any decisions around solution options/modifications can be made in the context of the user and the business value.
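One way to picture the step from prioritized scenarios to a starting backlog is the sketch below.  It is illustrative only: the 1-5 scales, the example scenarios and the choice to multiply the three factors into a single score are assumptions made for the example, not a formula prescribed by the Pragmatic framework.

```python
from dataclasses import dataclass


@dataclass
class Scenario:
    persona: str
    action: str
    value: str
    impact: int          # value to the persona of solving the scenario (assumed 1-5)
    frequency: int       # how often the persona experiences it (assumed 1-5)
    business_value: int  # value to the Product Manager/Business (assumed 1-5)

    def score(self):
        # Assumption: a simple product of the three factors.
        return self.impact * self.frequency * self.business_value

    def as_story(self):
        return f"A {self.persona} needs to be able to {self.action} in order to {self.value}."


scenarios = [
    Scenario("field technician", "log readings offline", "avoid re-entering data", 5, 2, 2),
    Scenario("lab manager", "approve results in bulk", "release reports on time", 3, 5, 2),
    Scenario("auditor", "export a change history", "verify compliance", 2, 1, 3),
]

# The prioritized scenarios become the initial ordering of the product backlog.
backlog = sorted(scenarios, key=lambda s: s.score(), reverse=True)
for scenario in backlog:
    print(scenario.score(), scenario.as_story())
```

Whatever scoring is used, the point is that each Story keeps its persona and value attached, so the team sees the context behind the ordering.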


The fifth element.

The most overlooked of the 12 agile principles described alongside the agile manifesto is perhaps the hardest to come by - building teams around motivated individuals.

Build projects around motivated individuals.
Give them the environment and support they need,
and trust them to get the job done.


There it is ... that term four words into the first sentence ... 'motivated'.

A very subjective word.  Some people are motivated by money, some people are motivated by the recognition of their peers, others by visions of climbing the corporate ladder.  What this term means in the context of agile principles though is something much more specific: 'motivated' by pride in their chosen profession and the desire to learn and improve.

All too often in large traditional organizations, software development team members simply have a job.  Occasionally they've simply been beaten into submission and can be resurrected by agile techniques.  Very often however, the people that survive for long periods in large organizations are those that are mediocre and are valued because they don't rock the boat.   These are the people who will, at best, go through the motions of implementing agile techniques while giving no thought to the principles and philosophy behind them.  At worst these people will simply figure out ways to 'game the system'.  Of course the ultimate responsibility for tolerating this behaviour lies with management.  Either way, having a motivated cast of characters is crucial to the success of an agile project.