Category Archives: agile

Do You Really Need Story Points?

To the Point

Instead of wasting time estimating story points, spend that precious time understanding the requirements. Figure out how to break them down into smaller and more valuable stories, and deliver less volume and higher impact.

Base all of your work and planning around small stories. Your predictability will increase. Your customers will get more frequent deliveries of truly valuable functionality. And you will have an overall happier and more engaged development team.

What do Story Points Provide Your Team?

If you are using story points:

  • Did you ever ask why?
  • Did you ever track the accuracy?
  • Or ever track the consistency of story points to any other metric?
  • Have you ever considered the effort required to guess at the story points?
  • What is the return on investment of the time spent generating a guess of dubious accuracy?
  • Do customers ask for Story Points?

The primary reason that Ron Jeffries and the XP team pivoted from estimating in ideal hours or ideal days was the confusion this practice caused. Business people understood the unit of measure (hours and days) well, but not the concept of “ideal days.” So turning to a made-up value and calling it “points” was a way to discourage the business folks from wondering how a three-(ideal)day effort required a week of calendar time.

Though Scrum does not dictate using Story Points, it seems everybody does so.

The Problems with Story Points

The problems I see with Story Points:

  • The fact that people use numbers gives an illusion of “we know what we are doing, we have a precise metric.” No matter if you pull out the “but we are using a Fibonacci sequence” – that just exaggerates the degree of imprecision.
  • No two teams – maybe, even, no two people – have the same Story Point “Ruler.”
    • Yes, I know we can pick a well-known story as a 1 pointer and do relative sizing…
    • And I know we can work to make all teams estimate story points the same way. Hogwash.
  • The activity rarely (never?) reduces the scope
  • Estimation is usually coupled with a capacity planning session, further deepening the impact of imprecision on the overall process
  • The ceremonies around estimation sessions often miss the point – we are here to understand what needs to be delivered to the user.
    • Instead, the Scrum Master diligently asks for Story Point estimates for the story as written in Jira (because the business is too busy)
      • They get (3,3,3,3,5,5,5) from the team members.
      • They then proceed to ask one of the low number people (a 3) to defend their score, and one with the high number (a 5) to defend their score.
      • Once again the dogmatic ceremonies deliver low-value results
    • Yes, I know there are anecdotes of well-run ceremonies
  • NEWS FLASH – you still need to break down the work into meaningful, smaller features. Or at least you should.

A No Story Point Strategy

I have a different strategy.

We deliver features (aka, stories) that customers value. No customer is asking for more story points.

It is better to expend the “story-pointing” energy in breaking down the stories into smaller features. Along the way, we even learn more about the actual needs of the users, and we can frequently find ways to produce more value for less effort by cutting or delaying scope. We do this by bringing the business/subject matter experts, QA, development, and UX, into the conversations. We collaborate to deliver the highest value issues for the least effort. We use story mapping to help defer scope to later releases (or never).

We track completed features delivered.

It turns out to be a pretty stable metric over time – if you maintain the discipline.

If you also track the features you want to build for a given version release (as shown in the graphic), you can get a pretty good handle on time ranges.
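As a rough illustration of getting time ranges from delivered features, here is a minimal Ruby sketch. The weekly counts, the remaining-feature count, and the method name `forecast_weeks` are all made-up assumptions for illustration, not data from a real project.

```ruby
# Hypothetical history: features completed in each of the last 8 weeks.
completed_per_week = [3, 4, 2, 4, 3, 5, 3, 4]

# Forecast a range of weeks needed to finish `remaining` features, using the
# fastest, average, and slowest observed weekly throughput.
# Assumption: weeks with zero deliveries (holidays, etc.) are ignored.
def forecast_weeks(completed_per_week, remaining)
  rates = completed_per_week.reject(&:zero?)
  raise ArgumentError, "need at least one non-zero week" if rates.empty?

  average = rates.sum.to_f / rates.size
  {
    optimistic:  (remaining.to_f / rates.max).ceil,
    likely:      (remaining / average).ceil,
    pessimistic: (remaining.to_f / rates.min).ceil
  }
end

forecast = forecast_weeks(completed_per_week, 30)
puts forecast
```

The point of reporting a range rather than a single date is that it only stays meaningful if the features are kept consistently small.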

(Ab)Using Jira Story Points

I have abused the Story Point field in Jira by setting it up as a signal to the team about simplicity:

  • 1 – super easy, maybe a day or two of work
  • 2 – A bit more complicated, could be a few days, maybe 4
  • 3 – these are never allowed to last long; they must be broken down! They can never be pulled into the “selected for development” column.
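The rule in that list can be written down as a tiny check. This is only a sketch: the `Issue` struct, the issue keys, and the summaries are hypothetical stand-ins, and nothing here talks to Jira.

```ruby
# Hypothetical issue records; in practice these would come from your tracker.
Issue = Struct.new(:key, :summary, :size)

MAX_PULLABLE_SIZE = 2 # 1 and 2 may be pulled; 3 must be broken down first

# An issue can be pulled into development only if it has been sized
# and is small enough.
def pullable?(issue)
  !issue.size.nil? && issue.size <= MAX_PULLABLE_SIZE
end

backlog = [
  Issue.new("QM-1", "Create a quote", 1),
  Issue.new("QM-2", "Clone a quote", 2),
  Issue.new("QM-3", "Quote management reporting", 3)
]

ready, needs_breakdown = backlog.partition { |issue| pullable?(issue) }
puts "Ready: #{ready.map(&:key).join(', ')}"
puts "Break down first: #{needs_breakdown.map(&:key).join(', ')}"
```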

The reason I put simplistic values in the field is to allow us to get some Jira reports that use Story Points.

For example, look how we can guess when the release date will be for the set of issues!

Freedom

Try out the strategy of breaking down all features into small, client-valued stories. Make all issues a 1 or a 2 pointer.

If you aren’t reducing (or delaying) scope along the way, you might be doing it wrong.

Remove the ceremonies around story points and base all of your predictions on issues delivered.

So You Wanna Try Agile?

Well, in a nutshell, agile is a state of mind.
Agile is relative.
Agile is not dogmatic.
Agile is pragmatic.

Agile is designed to reduce the gap in time between doing something, and seeing a result that you can “measure.”

Blend in Lean concepts.
Read The Goal.
Don’t do giant monolithic things.
Learn how to decompose your “features/expected outcomes” into bite-size chunks.
Don’t plan across a far horizon to the same degree.
What folks are tasked with doing today better be clear and have an unambiguous meaning of “DONE.”
What folks are tasked with doing in 2 months better be rather nebulous and broad.

Don’t be complacent and relax.
Question the value of deliverables that are needed to fulfill some process step for some other team.
Ask any downstream recipient what they truly need — and why.
Try new processes.
Reflect.
Agile takes constant effort and constant partial attention if you are doing it right.
Be holistic.

User Stories and FDD

FDD?

I bet you never heard of Feature-Driven Development, eh?

Well, Mike Cohn wrote this recent post:

Not Everything Needs to Be a User Story: Using FDD Features

Having worked with Peter Coad since the early 90s, and Jeff De Luca in the late 90s, I’ve been a fan of FDD and naturally turn to that style when “user stories” are not so user-centric. And yes, those are typically the minority items on our backlogs.

Software development is working through a prioritized to-do list. Most of the to-dos should be about addressing user needs. Call them user stories, call them features, maybe even call them requirements. Whatever works best to help you organize and communicate what needs to be built.

Another element of FDD is breaking down (or building up) the system into

  • Major Feature Sets (Quote Management), and their
    • Feature Sets (Clone Quotes, Create Quotation Documents), and their
      • Features (create a quote, edit a quote, archive a quote).

Major Feature Sets might loosely equate to epics 🙂
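A minimal Ruby sketch of that three-level breakdown, using the names from the example above. The text does not say which features belong to which feature set, so the "Maintain Quotes" set below is a hypothetical placeholder, as are the struct and method names.

```ruby
FeatureSet = Struct.new(:name, :features)
MajorFeatureSet = Struct.new(:name, :feature_sets)

quote_management = MajorFeatureSet.new(
  "Quote Management",
  [
    # "Maintain Quotes" is a hypothetical feature set, added only to give the
    # three example features a home.
    FeatureSet.new("Maintain Quotes", ["create a quote", "edit a quote", "archive a quote"]),
    FeatureSet.new("Clone Quotes", []),
    FeatureSet.new("Create Quotation Documents", [])
  ]
)

# Flatten the hierarchy into "major set / set / feature" lines, roughly what
# would end up on a prioritized to-do list.
def feature_paths(major)
  major.feature_sets.flat_map do |set|
    set.features.map { |feature| "#{major.name} / #{set.name} / #{feature}" }
  end
end

paths = feature_paths(quote_management)
puts paths
```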

One of the keys to successful software development is to combine the list of features with a domain model (and some UI mockups don’t hurt). The domain model need not be to the nth degree of UML detail. It just needs to describe clearly, in just enough detail, what your problem domain is all about. This eliminates the need to write all sorts of detail in the development issues, leaving that to the model. Then the feature list becomes more about the order in which we are building up various aspects of the product feature sets.

Thanks for paying a bit of homage to FDD. A blast from the past!

Measuring Effectiveness: The Software Industry’s Conundrum

On “Measuring Effectiveness”

I’ve been saying for seemingly decades, this is (one of?) our nascent software industry’s biggest conundrums (maybe even an enigma?).

Just how do you prove a team is “doing better?” Or that a new set of processes are “more effective?”

My usual reaction when asked for such proof:

“Ok, show me what you are tracking today for the team’s performance over the past few years, so we have a baseline.”

Crickets…

But seriously, wouldn’t it be awesome if we could measure something? Anything?

For my money:

Delivering useful features on top of a reasonably quality architecture, with good acceptance and unit test coverage, for a good price, is success.

Every real engineering discipline has metrics (I think, anyway, it sounds good — my kids would say I am “Insta-facting™”).

If we were painting office building interiors, or paving a highway, we could certainly develop a new process or a new tool and quantitatively estimate the expected ROI, and then prove the actual ROI after the fact. All day long.

In engineering a new piece of hardware, we could use costing analysis, and MTBF to get an idea on the relative merits of one design over another.

We would even get a weird side benefit — being relatively capable at providing estimates.

In software, I posit this dilemma (it’s a story of two teams/processes):

Garden A:

  • Produces 15 bushels (on average) per month over the growing season
  • Is full of weeds
  • Does not have good soil management
  • Will experience exponential (or maybe not quite that dramatic) production drop off in ensuing years, requiring greater effort to keep the production going. Predictability will wane.
  • Costs $X per month to tend

Garden B:

  • Produces 15 bushels (on average) per month over the growing season
  • Is weed free and looks like a postcard
  • Uses raised bed techniques, compost, and has good soil management
  • Will experience consistent, predictable, production in ensuing years
  • Costs $Y per month to tend

I could make some assertions that $Y is a bit more costly than $X… Or not. Let’s assume more costly is the case for now.

To make it easier to grok, I am holding the output of the gardens constant. This is reflected by the exorbitant rise in cost in the weedy Garden A to keep producing the same bushels per month… (I could have held the team or expense constant, and allowed production to vary. Or, I could have tried to make it even more convoluted and let everything vary. Meh. Deal with this simple analogy!)

 

Year        1    2    3    4    5    6    7    8    9    10
Garden A  100  102  105  110  130  170  250  410  730  1600
Garden B  120  120  120  120  120  120  120  120  120   120

If we look at $X and $Y in years 1 through 10, we might see some numbers that would make us choose B over A.

But if we looked at just the current burn rate (or even through year 4), we might think Garden A is the one we want. (And we can hold our tongue about the weeds.)
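To make the comparison concrete, here is a small Ruby sketch that works through the table's numbers (the table gives no units, so none are assumed here). The variable and method names are mine.

```ruby
garden_a = [100, 102, 105, 110, 130, 170, 250, 410, 730, 1600]
garden_b = Array.new(10, 120)

# Running total of costs, year by year.
def running_total(costs)
  total = 0
  costs.map { |cost| total += cost }
end

cumulative_a = running_total(garden_a)
cumulative_b = running_total(garden_b)

# First year in which A costs more than B for that year alone.
first_pricier_year = garden_a.each_index.find { |i| garden_a[i] > garden_b[i] } + 1
# First year in which A's running total overtakes B's.
crossover_year = cumulative_a.each_index.find { |i| cumulative_a[i] > cumulative_b[i] } + 1

puts "A costs more per year starting in year #{first_pricier_year}"
puts "A's cumulative cost passes B's in year #{crossover_year}"
puts "10-year totals: A=#{cumulative_a.last}, B=#{cumulative_b.last}"
```

With these figures, A is the cheaper garden per year through year 4 and cheaper in total through year 6, which is exactly the window in which it looks like the smart choice.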

But most of the people asking these questions are at year 5-10 of Garden A, looking over at their neighbor’s Garden B and wanting a magic wand. The developers are in the same boat… Wishing they could be working on the cooler, younger, plot of Garden B.

What’s a business person/gold owner to do? After all, they can’t really even see the quality of the garden, they just see output. And cost. Over time. Unless they get their bonus and move on to the next project before anyone finds out the mess in Garden A. Of course, the new person coming into Garden A knows no different (unless they were fools and used to work in Garden B, and got snookered into changing jobs).

Scenario #2

Maybe we abandon Garden A, and start anew in a different plot of land every few years? Then it is cheaper over the long haul.

Year        1    2    3    4    5    6    7    8    9    10
Garden A  100  102  105  100  102  105  100  102  105   100
Garden B  120  120  120  120  120  120  120  120  120   120
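Summing the Scenario #2 rows shows how much room there is. The table does not include whatever it costs to move to a new plot, so the break-even figure below is a hypothetical extra, and the count of three restarts is read off the pattern in the Garden A row.

```ruby
restart_a = [100, 102, 105, 100, 102, 105, 100, 102, 105, 100]
garden_b  = Array.new(10, 120)

total_a = restart_a.sum
total_b = garden_b.sum
restarts = 3 # new plots started in years 4, 7, and 10

# Hypothetical: the per-move cost at which the two 10-year totals are equal.
break_even_move_cost = (total_b - total_a) / restarts.to_f

puts "Scenario #2 totals: A=#{total_a}, B=#{total_b}"
puts "Starting over stays cheaper only if each move costs less than #{break_even_move_cost.round(1)}"
```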

I think the reason it is so challenging to get all scientific about TQM, is that what we do is more along the lines of knowledge work and craftwork, compared to assembly line.

The missing piece is to quantify what we produce in software. Just how many features are in a bushel?

I submit: ask the customer what to measure. And maybe the best you can do is periodic surveys that measure satisfaction (sometimes known as revenue).

Matt Snyder (@msnyder) tweeted me a nice video: Metrics, Metrics, Everywhere – Coda Hale


uTest Interview

I “ran” into the nice folks at uTest, and they asked me a handful of questions.

I answered… in a shockingly (for me) succinct way.

1. As a long-time agile coach, you can probably tell right away if agile is going to be employed successfully within a company or organization. If you had to pick one quality or trait that’s required for agile success, what would it be? In other words, what’s the first thing you look for when beginning a coaching project?

[JK] Willingness to change. That’s all I ask. Be open-minded to trying things a different way.

2. Looking back, did you ever think the agile movement would grow to where it is today? What’s surprised you the most about agile’s course over the last decade? The good and the bad.

[JK] No. How could 4 measly bullet points cause so much ruckus?! The biggest problem I see is the co-opting of the term “agile.” That is, folks are doing agile in name only. They don’t really get the subtle nuances about what it means to be agile, and simply go through some motions and try to “do” agile. While learning by doing is a key technique for learning anything new, somehow, many people seem to just do a handful of activities without much reflection or introspection.

3. Fill in the blank: The most common agile mistake development teams make is ____.

[JK] Not thinking. Agile requires continuous use of thinking… Are we improving? Will this help? Should we stop doing this activity? Should we do more of this activity? It takes effort to avoid complacency, which is hard for most of us.

4. From what we read, much of the inspiration for the agile movement originated not in the software space, but rather in the production/manufacturing space. How often (if at all) do you consult on non-software projects and how does it change the way agile is applied?

[JK] I’m not so sure that is true of the original 17 co-authors. While many of the lean/kanban concepts popular today owe their theories to manufacturing/production processes, I don’t recall any of the original folks waxing eloquent about being inspired by some non-software gigs. But I could be wrong. I have consulted on manufacturing automation/tracking/planning processes – and for that, a lot of the agile techniques apply. But I mostly focus on software projects.

5. A previous guest of our blog once referred to the “victims of fake agile” – i.e. the people whose lives were ill-affected by the misapplication of agile. Is this similar to what you refer to as the “pseudo-master of Agile”? And in your opinion, what’s the biggest threat to an organization that adopts agile in a half-hearted manner?

[JK] There have been snake-oil salesmen since the dawn of mankind. If an organization is not able to hire good talent or good agile consultants, then a lot of damage can be done. Although, mostly, it would be the cost of lost opportunity going forward. That is, by not embracing and practicing “Real Agile ™,” the company wastes time, and time just might be money. However, as a counterpoint/cynical view… Most companies that do agile in name only are often large, matrix orgs, where the software is one small aspect of their business. Screwing up Walmart.com likely has very little impact versus someone screwing up their entire logistics system that moves products to stores.

6. We assume (and we could be wrong) that there was a healthy amount of debate amongst the authors of the Agile Manifesto. If so, what was the biggest point of contention within the group and how was it resolved?

[JK] I think the biggest area where we disagreed was the “how long is an iteration” – that is, how frequently should we expect tangible results? Many were at the two-week level, and others (Alistair) were at the 4+ weeks.

7. As a longtime agile coach, you’ve helped countless organizations achieve better results with the approach. We’re curious to know when it hasn’t worked so well. Have you ever advised a company or organization to forgo agile in favor of another method? If so, what were the circumstances?

[JK] Agile is ALWAYS the right answer. What can be wrong about doing better with a set of resources? What can be wrong with reducing the gap in time between taking some action, and getting some feedback? Agile is a state of mind.

8. As with any manifesto, people are bound to misinterpret or misapply the main tenets. If you had to single out one particular way that agile has been misinterpreted, what would it be?

[JK] The classic missteps are usually NO documentation and NO design work upfront.

9. As we’re sure you are aware, Agile has its fair share of detractors and skeptics – and they can be a very vocal bunch. Why do you think agile is so strongly disliked in some quarters? And what is the one argument against agile that irks you the most?

[JK] I don’t really care to change people’s minds. The Agile Manifesto is irrefutable, as it gets to the root of human nature in a software development context – and is analogous to the founding documents of the United States. Agile promotes strong, disciplined individual and team responsibility and continuous participation from the development “citizenry.” It is much easier to fall back on some process just because someone wrote it in a book than it is to use your brain. No particular argument irks me, because I don’t care what people with closed minds think. A fool with a tool is still a fool.

10. Fill in the blank: The key to a successful agile testing team is: ____

[JK] being totally involved, working the upstream part of the process in addition to downstream verification.

11. Have you stayed in touch with the other authors of the Agile Manifesto? And have you considered “getting the gang back together” to publish any other materials?

[JK] I “hang” with Ron, Chet, Bob, Alistair, and Martin mostly in cyberspace. We got together for the 10th-year anniversary (the only Agile Alliance conference I went to, as it was paid for). We talked about getting together sooner than in another 10 years because we had a great time. But who knows if it will come to fruition. Much like the USA’s Founding Fathers, they got together for a momentous occasion but then went their separate ways.

12. What’s Jon Kern doing when he’s not helping companies improve their development process?

[JK] I like to mountain climb, hike, drive my Audi on the track or autocross, and ski. But when I am not doing that, I am writing ruby and rails code using MongoDB, git, and the wonderful world of ruby gems. I wish this remarkable constellation of language and tools existed when I was doing C++ way back when!

Read more on:

Testing the Limits With Jon Kern, Agile Manifesto Co-Author

 


Easing New Developer Ramp-up Time

A recent healthcare start-up team grew from just my buddy (who moved on after 6 months or so) and me to a handful of developers/sub-contractors.

Here is how we tried to make it fairly efficient. We just started tracking what was needed, getting feedback as each new developer went through the process, and improving the instructions and process along the way. If something was missing, we added it. If something was not clear, the new developer could amend the wiki.

By the 3rd new developer, or so, we had it down to where they could get started and begin a legitimate issue in less than a half day — from getting set up to being able to commit and deploy a new “feature.”

There was a section at the top that shared a good team chat room session with a new remote developer:

Getting Started Chat Room Conversation

That was followed by the FAQ-like list of links:

One of the first reasons I wanted to make it easy for a new team member to get rolling, was so that our friend Max — who would be doing our QA from Russia — could get started. As we added the first couple of devs, we probably decreased the start-up time as follows:

As part of “Getting Started,” I would include a simple Jira issue that helped them ensure that everything was working and that they followed our dev process:

  • Git and Dev and database (MongoDB) environment obviously had to be set up
  • Access to Jira to assign themselves to the issue, and move it — Kanban style — to the In Progress state.
  • Commit the work and the passing tests
  • Drag the Jira issue to “Done”

 

Since Atlassian’s Confluence Wiki does a stellar job at versioning pages, I actually looked back to see how the page grew and morphed over time. It started out rather modestly (and empty):

After it grew a bit bloated:

It was successively refactored into its current state. Here is a snippet of the 70 versions that this page underwent from March 2011 through July 2012.

Wikis, like code, need to be tended to, nurtured, and refactored to provide the best value.

The Cost of Using Ruby’s Rescue as Logic

[notice]
If you use this sort of technique, you may want to read on.

node = nodes.first rescue return

[/notice]

 

[important]

Nov 2012 Update:

Though this post was about the performance cost of using a ‘rescue’ statement, there is a more insidious problem with the overall impact of such syntax. The pros and cons of using a rescue are well laid out in Avdi’s free RubyTapas: Inline Rescue

[/important]
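One way to see the non-performance problem: an inline rescue catches every StandardError, not just the nil case you had in mind. The sketch below is my own illustration, not code from the RubyTapas episode. It uses `rescue nil` instead of `rescue return` to keep the example short, and the method names are hypothetical.

```ruby
# Inline rescue: the typo (`frist`) raises NoMethodError, which is a
# StandardError, so it is silently swallowed and nil is returned.
def first_node_inline(nodes)
  nodes.frist rescue nil
end

# Explicit nil check: the same typo now surfaces as an exception.
def first_node_explicit(nodes)
  return nil if nodes.nil?
  nodes.frist
end

puts first_node_inline([1, 2, 3]).inspect # the bug is hidden

begin
  first_node_explicit([1, 2, 3])
rescue NoMethodError => e
  puts "Caught: #{e.class}" # the bug is visible
end
```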

Code like this:

unless nodes.nil?
  nodes.first
else
  return
end

Can be written using the seemingly more elegant approach with this ruby trick:

node = nodes.first rescue return

But then, that got me to thinking… In many languages I have used in the past (e.g., Java and C++), Exception handling is an expensive endeavor.

So, though the rescue solution works, I am thinking I should explore whether there are any pros/cons to allowing a “rescue” to act as logic. So I did just that…

Here are the two methods I benchmarked, one with “if” logic, and one with “rescue” logic:

def without_rescue(nodes)
  return nil if nodes.nil?
  node = nodes.first
end
def with_rescue(nodes)
  node = nodes.first rescue return
end

Using method_1, below, I got the following results looping 1 million times:

                  user     system      total        real
W/out rescue  0.520000   0.010000   0.530000 (  0.551359)
With rescue  22.490000   0.940000  23.430000 ( 26.487543)

Yikes. Obviously, rescue is an expensive choice by comparison!

But, if we look at just one or maybe 10 times, the difference is imperceptible.

Conclusion #1 (Normal Usage)

  • It doesn’t matter which method you choose to use if the logic is invoked infrequently.

Looking a bit Deeper

But being a curious engineer at heart, there’s more… The above results are based on the worst case, assuming nodes is always nil. If nodes is never nil, then the rescue block is never invoked, yielding this (rather obvious) timing, where the rescue technique (with less code) is faster:

                  user     system      total        real
W/out rescue  0.590000   0.000000   0.590000 (  0.601803)
With rescue   0.460000   0.000000   0.460000 (  0.461810)

However, what if nodes were only nil some percentage of the time? What does the shape of the performance curve look like? Linear? Exponential? Geometric progression? Well, it turns out that the response (see method_2, below) is linear (R² = 0.99668):

Rescue Logic is Expensive

Conclusion #2 (Large Data Set):

In this example use of over a million tests, the decision on whether you should use “rescue” as logic boils down to this:

  • If the condition is truly rare (like a real exception), then you can use rescue.
  • If the condition is going to occur 5% of the time or more, then do not use the rescue technique!

In general, it would seem that there is considerable cost to using rescue as pseudo logic over large data sets. Caveat emptor!
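If the only thing you are guarding against is nil, there are ways to keep the one-liner without involving exceptions at all. This is a sketch of two alternatives that are not part of the benchmark above: the safe navigation operator (`&.`, available since Ruby 2.3) and `Array()`. I have not timed either one here, and the method names are hypothetical.

```ruby
# Safe navigation: returns nil when nodes is nil, otherwise calls #first.
def first_with_safe_navigation(nodes)
  nodes&.first
end

# Array(nil) is [], and [].first is nil, so no nil check is needed.
def first_with_array_coercion(nodes)
  Array(nodes).first
end

puts first_with_safe_navigation(nil).inspect
puts first_with_safe_navigation([1, 2, 3]).inspect
puts first_with_array_coercion(nil).inspect
puts first_with_array_coercion([1, 2, 3]).inspect
```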

Sample Code:

My benchmarking code looked like this:

require 'benchmark'

include Benchmark

def without_rescue(nodes)
  return nil if nodes.nil?
  node = nodes.first
end

def with_rescue(nodes)
  node = nodes.first rescue return
end

TEST_COUNT = 1000000

def method_1
  [nil, [1,2,3]].each do |nodes|
    puts "nodes = #{nodes.inspect}"
    GC.start
    bm(12) do |test|
      test.report("W/out rescue") do
        TEST_COUNT.times do |n|
          without_rescue(nodes)
        end
      end
      test.report("With rescue") do
        TEST_COUNT.times do |n|
          with_rescue(nodes)
        end
      end
    end
  end
end

# Benchmark both methods while nodes is nil likely_pct percent of the time,
# stepping from 10% to 100%.
def method_2
  GC.start
  bm(18) do |test|
    nil_nodes = nil
    real_nodes = [1,2,3]
    likely_pct = 0
    10.times do
      likely_pct += 10
      test.report("#{likely_pct}% W/out rescue") do
        TEST_COUNT.times do
          # rand(100) is 0..99, so >= picks nil exactly likely_pct% of the time
          nodes = rand(100) >= likely_pct ? real_nodes : nil_nodes
          without_rescue(nodes)
        end
      end
      test.report("#{likely_pct}% With rescue") do
        TEST_COUNT.times do
          nodes = rand(100) >= likely_pct ? real_nodes : nil_nodes
          with_rescue(nodes)
        end
      end
    end
  end
end

method_1
method_2

Sample Output

nodes = nil
                  user     system      total        real
W/out rescue  0.520000   0.010000   0.530000 (  0.551359)
With rescue  22.490000   0.940000  23.430000 ( 26.487543)
nodes = [1, 2, 3]
                  user     system      total        real
W/out rescue  0.590000   0.000000   0.590000 (  0.601803)
With rescue   0.460000   0.000000   0.460000 (  0.461810)
                        user     system      total        real
10% W/out rescue    1.020000   0.000000   1.020000 (  1.087103)
10% With rescue     3.320000   0.120000   3.440000 (  3.825074)
20% W/out rescue    1.020000   0.000000   1.020000 (  1.036359)
20% With rescue     5.550000   0.200000   5.750000 (  6.158173)
30% W/out rescue    1.020000   0.010000   1.030000 (  1.105184)
30% With rescue     7.800000   0.300000   8.100000 (  8.827783)
40% W/out rescue    1.030000   0.010000   1.040000 (  1.090960)
40% With rescue    10.020000   0.400000  10.420000 ( 11.028588)
50% W/out rescue    1.020000   0.000000   1.020000 (  1.138765)
50% With rescue    12.210000   0.510000  12.720000 ( 14.080979)
60% W/out rescue    1.020000   0.000000   1.020000 (  1.051054)
60% With rescue    14.260000   0.590000  14.850000 ( 15.838733)
70% W/out rescue    1.020000   0.000000   1.020000 (  1.066648)
70% With rescue    16.510000   0.690000  17.200000 ( 18.229777)
80% W/out rescue    0.990000   0.010000   1.000000 (  1.099977)
80% With rescue    18.830000   0.800000  19.630000 ( 21.634664)
90% W/out rescue    0.980000   0.000000   0.980000 (  1.325569)
90% With rescue    21.150000   0.910000  22.060000 ( 25.112102)
100% W/out rescue   0.950000   0.000000   0.950000 (  0.963324)
100% With rescue   22.830000   0.940000  23.770000 ( 25.327054)
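The linearity can be checked directly against the “With rescue” rows above. This sketch fits a least-squares line to the user CPU times. The post does not say which column the original R² was computed from, so the result is not expected to match that figure exactly, and the method name `linear_fit` is my own.

```ruby
# Percent of calls where nodes was nil, and the "With rescue" user times (s).
pct_nil      = [10, 20, 30, 40, 50, 60, 70, 80, 90, 100]
user_seconds = [3.32, 5.55, 7.80, 10.02, 12.21, 14.26, 16.51, 18.83, 21.15, 22.83]

# Ordinary least-squares fit of y = slope * x + intercept, plus R squared.
def linear_fit(xs, ys)
  n = xs.size.to_f
  mean_x = xs.sum / n
  mean_y = ys.sum / n
  sxx = xs.sum { |x| (x - mean_x)**2 }
  syy = ys.sum { |y| (y - mean_y)**2 }
  sxy = xs.zip(ys).sum { |x, y| (x - mean_x) * (y - mean_y) }
  slope = sxy / sxx
  { slope: slope, intercept: mean_y - slope * mean_x, r_squared: sxy**2 / (sxx * syy) }
end

fit = linear_fit(pct_nil, user_seconds)
puts "slope:     #{fit[:slope].round(4)} s per percentage point"
puts "intercept: #{fit[:intercept].round(3)} s"
puts "R squared: #{fit[:r_squared].round(5)}"
```

The slope works out to roughly 0.22 seconds per percentage point over a million calls, which is the cost you pay for each extra percent of nils.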

Manual Cucumber Tests?

There was some discussion over on the cucumber list about manual testing.

Cucumber is great at BDD, but that doesn’t mean it is the only test technique we should use (preaching to the choir).

I have learned it is critical to understand where automated tests shine and where human testing is critical, and to not confuse the two.

As far as cuking manual tests, keeping the tests in one place seems like a good advantage (as described in Tim Walker’s cucum-bumbler wiki <g>).

The cucumber “ask” method looks interesting. Maybe your testers could use the output to the console as-is, or you could (re-)write your own method to store the results somewhere else or output them differently.
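As a sketch of the “store the results somewhere else” idea, here is a small plain-Ruby helper that collects answers and can dump them as JSON. `ManualCheckLog` and its methods are hypothetical and not part of Cucumber. In a step definition, the answer string would come from whatever the manual tester typed.

```ruby
require "json"
require "time"

# Hypothetical helper: collects manual-test answers so they can be stored
# somewhere other than the console.
class ManualCheckLog
  attr_reader :entries

  def initialize
    @entries = []
  end

  # Records one question/answer pair and returns true if the answer was "yes".
  def record(question, answer, at: Time.now)
    text = answer.to_s.strip
    passed = text.match?(/\Ayes\z/i)
    @entries << { question: question, answer: text, passed: passed, at: at.utc.iso8601 }
    passed
  end

  # Serializes all recorded entries as pretty-printed JSON.
  def dump
    JSON.pretty_generate(@entries)
  end
end

log = ManualCheckLog.new
log.record("Does the UI have that awesome look? [Yes/No]", "Yes\n")
puts log.dump
```

Writing the JSON to a file or posting it to a tracker is left out on purpose, since where the results should live depends on your team.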

From the cucumber code (cucumber-1.1.4/lib/cucumber/runtime/user_interface.rb):

# Suspends execution and prompts +question+ to the console (STDOUT).
# An operator (manual tester) can then enter a line of text and hit
# <ENTER>. The entered text is returned, and both +question+ and
# the result is added to the output using #puts.
# ...
def ask(question, timeout_seconds)
...

Sample Feature:

    ...
Scenario: View Users Listing
  Given I login as "Admin"
  When I view the list of users
  Then I should check the aesthetics

Step definition:

Then /^I should check the aesthetics$/ do
  ask("#{7.chr}Does the UI have that awesome look? [Yes/No]", 10).chomp.should =~ /yes/i
end

The output to the console looks like this:

Thanks for the pointer, Matt!

[notice]NOTE: it doesn’t play well with running guard/spork.[/notice]
The question pops up over in the guard terminal 🙁

Of course, if you are running a suite of manual tests, you probably don’t need to worry about the Rails stack being sluggish :-p

    Spork server for RSpec, Cucumber successfully started
    Running tests with args ["features/user.feature", "--tags", "@wip:3", "--wip", "--no-profile"]...
    Does the UI have that awesome look? [Yes/No]
    Yes
    ERROR: Unknown command Yes
    Done.