BDD With Cucumber

I have enjoyed learning and using Cucumber for BDD and acceptance-level testing, and you can read some of my Cucumber posts.

Motivation

In all my years of dealing with software development, Cucumber was the first time I ever saw a “closed loop” feedback system for ensuring that my English-like business requirements were actually implemented in code. Cucumber’s Gherkin Domain-Specific Language for specifying behavior as Given-When-Then is very easy to just start using with the business. It doesn’t require anything special to get folks to grok the simple syntax. (Similarly, it isn’t hard to get business folks to grok object models of their business domain, if you stick to simple aspects of UML.) Before Cucumber, acceptance testing was typically some combination of manual or “recorded” techniques to check that a set of requirements (Word documents, 3×5 index cards, or a spreadsheet) was covered by a set of test cases. There was no regression-test-like assurance that the functionality was still in the code.

The amazing thing about Cucumber is that it helps me do software development the way I like to do it, which may not be how you work. I like to combine object models, UI mockups, and features into a holistic agile process. The features should be small, actionable bits of desired functionality: perfect for describing in a Cucumber feature! The “demand-pull” system of writing a failing Cucumber test, and the “outside-in” approach of dropping down from a failing cuke into TDDing out the code, is just fantastic.

Using Cucumber, I get so much “bang for the buck” — a.k.a., rewards:

I write features with the business folks in a mildly “techy” way that business folks can buy into
Cukes help define behavior and indicate what “done” looks like
Cukes often speak the language of the business by allowing a set of example scenarios, and by using tables — very natural for many problem domains
Writing a failing cuke and then implementing the minimum to get it to pass helps speed up development by encouraging short, crisp, commits
Having BDD cukes drive my TDD rspecs also ensures I write just enough in unit-test land, and no more
After the cuke passes, I know I have built what we needed, and no more
As a bonus, I am left with a set of Acceptance Tests and a set of documentation

Now Aslak Hellesoy will likely cringe a bit when I say how much I think Cucumber is a really cool acceptance test tool — which is actually not its primary purpose, it is just a great side effect.

There are naysayers of using Cucumber. But like any tool, you have to know how to use it. A fool with a tool is still a fool.

A Handful of Tips

As I was helping a client using the Cucumber front-end (a.k.a. “Gherkin” — a domain-specific language, or DSL) on a C# project using SpecFlow, I wanted to capture some of their ah-ha’s.

Create Data “Close” to the Test

For folks new to building up a suite of tests, especially those used to working in SQL, it is tempting to think that building up one shared set of test data to run all the tests against is a good idea. While it would work, I advise keeping the data creation in close proximity to the test that uses the data. Otherwise, when you add a new feature that needs new data, you have to take an extra step to go add that data to some other file.

Therefore, create the data in the Given of each test, or maybe in the background should a few tests benefit from the same setup.

It’s also important to set up and tear down the database around your test runs, using something like Database Cleaner or rolling your own DDL scripts.
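
As a rough sketch of that idea in plain Ruby: the `ProductStore` class and the `run_scenario` helper below are made up for illustration. The store stands in for a real database table, and the helper plays the role that before/after hooks play in a test run. Each scenario starts from an empty store, creates exactly the data it needs, and leaves nothing behind.

```ruby
# Hypothetical in-memory stand-in for a database table.
class ProductStore
  def initialize
    @rows = []
  end

  def insert(row)
    @rows << row
  end

  def count
    @rows.size
  end

  def clear
    @rows.clear
  end
end

STORE = ProductStore.new

# Mimics before/after hooks: start clean, run the scenario, clean up again.
# The block's result is returned; the ensure clause only does the cleanup.
def run_scenario
  STORE.clear
  yield STORE
ensure
  STORE.clear
end

# The "Given" lives inside the scenario that needs it.
first = run_scenario do |store|
  4.times { |i| store.insert(name: "Product #{i + 1}") }
  store.count
end

# The next scenario sees none of the first scenario's data.
second = run_scenario do |store|
  store.insert(name: "Only one")
  store.count
end

puts first   # 4
puts second  # 1
```

A real project would wipe real tables rather than an array, but the shape is the same: no scenario depends on data that some other file or some other scenario happened to create.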

Create Simple, Uncoupled Tests

A pair was building a feature to show that the user could search by a certain term, and that the proper record(s) showed up in the list of results. They had a UI mockup as guidance. They were spending time in the step definition to ensure that the text was appearing in the proper row and the proper column of the table. While there may be some instances where such detailed testing might be warranted, in general:

Just test that the expected values are present in the page text
Trust the developer to build the proper UI
Trust the QA folks to verify the proper visual aspects (data in proper columns)
Don’t waste a lot of time toiling in the step definition file

The less coupled your steps are to the actual implementation, the less brittle your tests will be.
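
Here is a minimal sketch of what "just test that the expected values are present" can look like. The helper name `missing_from_page` is hypothetical, and a plain string stands in for whatever text your browser driver hands back. Note that it knows nothing about rows, columns, or markup.

```ruby
# Hypothetical helper: returns the expected values that do NOT appear
# anywhere in the page text. An empty result means everything was found.
def missing_from_page(page_text, expected_values)
  expected_values.reject { |value| page_text.include?(value) }
end

# A plain string standing in for the rendered search results page.
page_text = "Search results\nArtisan Bread 2 loaves\nHard Tack 25 cakes"

p missing_from_page(page_text, ["Artisan Bread", "Hard Tack"])  # []
p missing_from_page(page_text, ["Sourdough"])                   # ["Sourdough"]
```

If the table later gains a column or the layout changes, a check like this keeps passing as long as the right records still show up.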

Don’t Repeat Yourself

One of the more fascinating ah-ha moments I had when first learning to write Cucumber tests was that the effort lessened as I built up more tests! That is, for the first feature, I had to write all of the step definition code. But subsequent features often re-used one or more existing step definitions. Counterintuitive, until you stop to think about it; then it makes perfect sense.

A developer I was pairing with wrote a feature with mildly different wording, and proceeded to add a method to the step definition file with that slightly different wording. Before he got too far, I asked him to compare this new wording with the existing other feature. Once he saw the duplication, we embarked on using the same given. Funny enough, we were rewarded with discovering the other feature using this given was more tightly coupled to the implementation than was necessary (see the above tip). We refactored the step so that it served both features and reduced brittleness.

If you pay attention to using consistent wording (or have a tool like RubyMine with good manners), you will be rewarded with less development time and less test code to maintain.
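
When two wordings really do mean the same thing, one pattern can serve both. The sketch below is plain Ruby with an invented pattern and helper name (`PRODUCTS_STEP`, `product_count_from`). It only shows the matching idea, not a full step definition file. It assumes the text being matched is the step without its Given/When/Then keyword.

```ruby
# One hypothetical pattern that accepts a few phrasings of the same Given.
PRODUCTS_STEP = /\A(?:I have|there are) (\d+) products?\z/

# Returns the product count if the step text matches, otherwise nil.
def product_count_from(step_text)
  match = PRODUCTS_STEP.match(step_text)
  match && match[1].to_i
end

p product_count_from("I have 4 products")     # 4
p product_count_from("there are 4 products")  # 4
p product_count_from("I own 4 products")      # nil
```

The `nil` case is the interesting one: it is the moment to ask whether the feature needs a new step, or whether the wording should simply be brought in line with an existing one.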

Keep Your Tests Narrow

It is not uncommon to wonder about how to create data for cukes…

Keep it high level?
```
Given I have 4 products
```
Or explicitly define every detail of the data; e.g., using a table

As a broad answer to this question, it all depends on what behavior you are specifying. There is no single, right way.

My usual rule of thumb is to limit the setup to just those parts of the problem domain “object graph” that you care to test in the current scenario. That helps draw attention to the most “narrow” bits of the system that are under test. Otherwise, if you continue to build everything from scratch for all scenarios, it is sometimes hard to see the purpose of the test because it is lost in all of the setup code. Sometimes you care about the details in the parent object, sometimes you want to test the sum of the parts, and other times you are focused only on a small aspect of the object model.

Bottom line, do not add more detail to the cukes than you are testing. It is both wasteful and misleading.
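
One way to keep the setup narrow is a small factory with sensible defaults, so a scenario names only the attributes it cares about. This is a hand-rolled sketch, not any particular factory library's API; `DEFAULT_RECIPE`, `build_recipe`, and the attribute names are all invented for illustration.

```ruby
# Hypothetical defaults for a recipe. A scenario overrides only what it tests.
DEFAULT_RECIPE = {
  name: "Any Recipe",
  yields: "1 loaf",
  prep: "30 min",
  ingredients: [].freeze
}.freeze

# Returns a new hash; the defaults themselves are never modified.
def build_recipe(overrides = {})
  DEFAULT_RECIPE.merge(overrides)
end

# A scenario about resizing cares about the yield, not the prep time.
recipe = build_recipe(name: "Hard Tack", yields: "25 cakes")

puts recipe[:name]    # Hard Tack
puts recipe[:yields]  # 25 cakes
puts recipe[:prep]    # 30 min
```

Whatever the scenario does not mention falls back to a default, which keeps the feature file focused on the behavior under test.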

Examples

Suppose we are making bread, and we have a set of recipes. Each recipe has a set of ingredients. When we go to make 5 loaves of bread, the list of ingredients is properly resized and displayed so we can mix the dough and make a new batch of bread.

My first scenario might be detailed around setting up the core recipe/ingredient relationship. Though it borders on appearing like “too much unnecessary detail,” it packs a good deal of crucial information into a “small” test:

```
Scenario: Create a recipe
  Given The following recipe is entered:
  |Name  | Artisan Bread       |
  |Yield | 2 1-lb oblong loaves|
  |Prep  | 30 min              |
  |Proof | 3 hours             |
  |Baking| 35 min. at 450 deg F|
  And It includes the following ingredients:
  |Ingredient       |Quantity|
  |White Bread Flour|6 cups  |
  |Whole Wheat Flour|0.5 cups|
  |Granulated Yeast |1.5 Tbsp|
  |Coarse (sea) salt|1.5 Tbsp|
  |Water, 100 deg   |3 cups  |
  When the recipe is saved
  Then the recipe should be listed with its ingredients
```

So the above scenario gets us focused on the basic bits of the one-to-many association part of the model: Recipe ----*-> Ingredients, and the desired behavior of what a recipe display contains.

This cuke would drive me to create rspec tests (TDD) to flesh out the core Recipe and Ingredient models. Then I would be able to return to the cuke to flesh out the basic visual behavior that is being specified:

a form to enter the attributes as listed for the recipe
the ability to add each ingredient “detail”

Though I might have been tempted to tackle the ingredient unit of measure, I would have instead treated the “quantity” as a simple string.
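
A first pass at those models might look something like the sketch below. The class, attribute, and method names (`Recipe`, `Ingredient`, `yields`, `listing`) are guesses for illustration, and there is deliberately no persistence or UI here. The quantity is just a string, as described above.

```ruby
# Hypothetical minimal models for the first scenario.
# Quantity stays a plain string for now (e.g. "6 cups").
Ingredient = Struct.new(:name, :quantity)

class Recipe
  attr_reader :name, :yields, :ingredients

  def initialize(name:, yields:)
    @name = name
    @yields = yields
    @ingredients = []
  end

  # One recipe has many ingredients. Returns self so calls can be chained.
  def add_ingredient(name, quantity)
    @ingredients << Ingredient.new(name, quantity)
    self
  end

  # Lines that a "recipe should be listed with its ingredients" check could use.
  def listing
    ["#{name} (#{yields})"] + ingredients.map { |i| "#{i.quantity} #{i.name}" }
  end
end

bread = Recipe.new(name: "Artisan Bread", yields: "2 1-lb oblong loaves")
bread.add_ingredient("White Bread Flour", "6 cups")
     .add_ingredient("Water, 100 deg", "3 cups")

puts bread.listing
# Artisan Bread (2 1-lb oblong loaves)
# 6 cups White Bread Flour
# 3 cups Water, 100 deg
```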

Now suppose the next scenario we tackled was the ability to “resize” the recipe to make some multiple of the base quantity? Do we need to repeat the same sort of setup as we did above? Or can we get away with defining less information? That is, using something like FactoryGirl, we could predefine some basic recipe data, if nothing specific is needed.

```
Scenario: Resize the batch to triple the yield
  Given a "Hard Tack" recipe that yields "25 cakes"
  And with the following ingredients:
  |Ingredient | Quantity  |
  |Flour      | 4 cups    |
  |Water      | 2 cups    |
  |Salt       | 4 tsp     |
  When the recipe is resized by 3 times
  Then the following ingredients should be listed
  |Ingredient | Quantity  |
  |Flour      | 12 cups   |
  |Water      |  6 cups   |
  |Salt       | 12 tsp    |
  And the recipe should yield "75 cakes"
```

While the difference is subtle, the point to the above scenario is to not clutter it with the core recipe elements as were visible in the first scenario. Just the name and the yield quantity are sufficient, as the focus here is about “resizing” — that is, dealing with units of measure. So now this simple cuke will force us to treat quantity as a combination of a numeric value and a unit of measure. Maybe we TDD out a new “Quantity” class? And it would seem that we need to TDD out the concept of “resize” for a Recipe instance… Again, this cuke packs a punch — a lot of key info in a very tiny amount of specified behavior.
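
To make that concrete, here is one possible shape for a “Quantity” class, sketched in plain Ruby. The names and the parsing rule (a number, a space, then the unit) are assumptions for illustration. It does no unit conversion, so 12 tsp stays 12 tsp.

```ruby
# Hypothetical value object: a numeric amount plus a unit of measure.
class Quantity
  attr_reader :amount, :unit

  # Parses strings such as "4 cups" or "1.5 Tbsp".
  # Everything after the first space is treated as the unit.
  def self.parse(text)
    amount, unit = text.strip.split(" ", 2)
    new(Rational(amount), unit)
  end

  def initialize(amount, unit)
    @amount = amount
    @unit = unit
  end

  # Returns a new Quantity; the original is left untouched.
  def resize(factor)
    Quantity.new(amount * factor, unit)
  end

  # Whole numbers print without a decimal point ("12 cups", not "12.0 cups").
  def to_s
    shown = amount.denominator == 1 ? amount.to_i : amount.to_f
    "#{shown} #{unit}"
  end
end

# The hard tack ingredients and yield from the scenario, tripled.
["4 cups", "2 cups", "4 tsp", "25 cakes"].each do |text|
  puts Quantity.parse(text).resize(3)
end
# 12 cups
# 6 cups
# 12 tsp
# 75 cakes
```

Using `Rational` for the amount is a design choice to keep values like 0.5 and 1.5 exact when multiplied. A “resize” on a Recipe could then simply resize each ingredient's quantity and the yield.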

For a more extreme example of “narrowing” the focus, let’s look at another scenario. This time, the business tells us they want users to be able to rate the recipes.

```
Scenario: Users can 'Like' a recipe only one time
  Given a "Swedish Hard Tack" recipe
  When Steve likes the recipe
  Then the Like Count should increment by 1
  When Steve likes the recipe a second time
  Then the Like Count should increment by 0
```

So here you can see there is no reason whatsoever to care about the details of the recipe. In fact, I could have left off the recipe name, but sometimes I like to keep some sense of the domain visible. Had I written the Given to show all of the detailed data, it would have obscured the meaning of the scenario, as this scenario would then look like the other scenarios.

```
Scenario: (BULKY VERSION!) Users can 'Like' a recipe only one time (DO NOT USE)
  Given I have the following recipe:
  |Name  | Artisan Bread       |
  |Yield | 2 1-lb oblong loaves|
  |Prep  | 30 min              |
  |Proof | 3 hours             |
  |Baking| 35 min. at 450 deg F|
  And with the following ingredients:
  |Ingredient       |Quantity|
  |White Bread Flour|6 cups  |
  |Whole Wheat Flour|0.5 cups|
  |Granulated Yeast |1.5 Tbsp|
  |Coarse (sea) salt|1.5 Tbsp|
  |Water, 100 deg   |3 cups  |
  When I Like the recipe
  Then I should see the Like Count increment by one
  ...
```

Again, this is a subtle point. But making it easier for readers of the feature file to see differences without having to “strain” can improve understanding and reduce potential for errors.
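
Going back to the narrow “Like” scenario for a moment: the behavior it pins down is small enough to sketch in a few lines of Ruby. The class and method names here (`LikableRecipe`, `like`, `like_count`) are invented for illustration. Notice that nothing in it needs ingredients, yields, or baking times, which is exactly why the narrow scenario did not mention them.

```ruby
require "set"

# Hypothetical model: a recipe remembers who liked it,
# so a repeat like from the same user changes nothing.
class LikableRecipe
  def initialize(name)
    @name = name
    @likers = Set.new
  end

  # Returns how much the count changed: 1 for a first like, 0 for a repeat.
  def like(user)
    @likers.add?(user) ? 1 : 0
  end

  def like_count
    @likers.size
  end
end

recipe = LikableRecipe.new("Swedish Hard Tack")
puts recipe.like("Steve")  # 1
puts recipe.like("Steve")  # 0
puts recipe.like_count     # 1
```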

Jon Kern's ramblings on software development