Keep your Cukes Narrow | Technical Debt

It is not uncommon to wonder about how to create data for cukes…

Keep it high level?
```
Given I have 4 products
```
Or explicitly define the data, e.g., using a table?

As a broad answer to this question, it all depends on what behavior you are specifying. There is no single, right way.

My usual rule of thumb is to limit the setup to just those parts of the problem domain “object graph” that you care to test in the current scenario. That helps draw attention to the most “narrow” bits of the system that are under test. Otherwise, if you continue to build everything from scratch for all scenarios, it is sometimes hard to see the purpose of the test because it is lost in all of the setup code. Sometimes you care about the details in the parent object, sometimes you want to test the sum of the parts, and other times you are focused only on a small aspect of the object model.

Examples

Suppose we are making bread, and we have a set of recipes. Each recipe has a set of ingredients. When we go to make 5 loaves of bread, the list of ingredients is resized and displayed so we can mix the dough and make a new batch of bread.

My first scenario might be detailed around setting up the core recipe/ingredient relationship:

Scenario: View a recipe
  Given I have the following recipe:
  |Name  | Artisan Bread        |
  |Yield | 2 1-lb oblong loaves |
  |Prep  | 30 min               |
  |Proof | 3 hours              |
  |Baking| 35 min. at 450 deg F |
  And with the following ingredients:
  |Ingredient       |Quantity|
  |White Bread Flour|6 cups  |
  |Whole Wheat Flour|0.5 cups|
  |Granulated Yeast |1.5 Tbsp|
  |Coarse (sea) salt|1.5 Tbsp|
  |Water, 100 deg   |3 cups  |
  When I view the recipe
  Then I should see the recipe

So the above scenario gets us focused on the basic bits of the Recipe ----*-> Ingredients part of the model, and the desired behavior of what a recipe display looks like. Note: on the “Then” I chose not to focus on how it is displayed. I could have said something like “…with the recipe info at the top, followed by a list of ingredients.”
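As a rough sketch of what sits behind those two tables, the plain-Ruby snippet below turns the key/value recipe table into a hash and the ingredients table into one hash per row. The names `RecipeCard`, `rows_to_hash`, and `rows_to_hashes` are made up for illustration; in a real step definition, Cucumber's Ruby data tables provide comparable conversions (`rows_hash` and `hashes`).

```ruby
# Raw cells, row by row, as they appear in the scenario's two tables.
RECIPE_ROWS = [
  ["Name",   "Artisan Bread"],
  ["Yield",  "2 1-lb oblong loaves"],
  ["Prep",   "30 min"],
  ["Proof",  "3 hours"],
  ["Baking", "35 min. at 450 deg F"]
].freeze

INGREDIENT_ROWS = [
  ["Ingredient",        "Quantity"],
  ["White Bread Flour", "6 cups"],
  ["Whole Wheat Flour", "0.5 cups"],
  ["Granulated Yeast",  "1.5 Tbsp"],
  ["Coarse (sea) salt", "1.5 Tbsp"],
  ["Water, 100 deg",    "3 cups"]
].freeze

# Two-column key/value table -> a single hash.
def rows_to_hash(rows)
  rows.to_h
end

# Table with a header row -> one hash per data row.
def rows_to_hashes(rows)
  header, *body = rows
  body.map { |cells| header.zip(cells).to_h }
end

# Hypothetical stand-in for the Recipe ----*-> Ingredients part of the model.
RecipeCard = Struct.new(:details, :ingredients)

card = RecipeCard.new(rows_to_hash(RECIPE_ROWS), rows_to_hashes(INGREDIENT_ROWS))
puts card.details["Yield"]               # 2 1-lb oblong loaves
puts card.ingredients.first["Quantity"]  # 6 cups
```

Every cell in the scenario ends up in the object, which is the point of this first, detailed scenario.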

Now suppose the next scenario we tackle is the ability to “resize” the recipe to make some multiple of the base quantity. Do we need to repeat the same sort of setup as we did above? Or can we get away with defining less information? That is, using something like FactoryGirl, we could predefine some basic recipe data, if nothing specific is needed.

Scenario: Resize the batch to triple the yield
  Given I have a "Hard Tack" recipe that yields "25 cakes"
  And with the following ingredients:
  |Ingredient | Quantity  |
  |Flour      | 4 cups    |
  |Water      | 2 cups    |
  |Salt       | 4 tsp     |
  When I resize the recipe to 3 times the size
  Then I should see the following ingredients
  |Ingredient | Quantity  |
  |Flour      | 12 cups   |
  |Water      |  6 cups   |
  |Salt       | 12 tsp    |
  And a yield of "75 cakes"

While the difference is subtle, the point of the above scenario is to not clutter it with the core recipe elements that were visible in the first scenario.
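To make the idea concrete, here is a minimal plain-Ruby sketch of the resize scenario. It is not FactoryGirl: `build_recipe` is a hypothetical helper that stands in for a factory by merging the values the scenario cares about over some defaults, and `Recipe`, `Ingredient`, and `ingredient` are likewise assumptions for illustration.

```ruby
# An ingredient keeps its amount as a Rational so "0.5 cups" scales exactly.
Ingredient = Struct.new(:name, :amount, :unit) do
  def scaled(factor)
    Ingredient.new(name, amount * factor, unit)
  end

  def to_s
    shown = amount.denominator == 1 ? amount.numerator : amount.to_f
    "#{shown} #{unit}"
  end
end

# `yield` is a Ruby keyword, hence yield_count / yield_unit.
Recipe = Struct.new(:name, :yield_count, :yield_unit, :prep, :ingredients,
                    keyword_init: true) do
  def resize(factor)
    Recipe.new(name: name, yield_count: yield_count * factor,
               yield_unit: yield_unit, prep: prep,
               ingredients: ingredients.map { |i| i.scaled(factor) })
  end
end

# Values nobody cares about in this scenario (assumed defaults).
DEFAULT_RECIPE = {
  name: "Some Recipe", yield_count: 1, yield_unit: "loaf",
  prep: "10 min", ingredients: []
}.freeze

# Hypothetical factory-style helper: defaults plus per-scenario overrides.
def build_recipe(**overrides)
  Recipe.new(**DEFAULT_RECIPE.merge(overrides))
end

# Parses a quantity such as "4 cups" into an Ingredient.
def ingredient(name, quantity)
  amount, unit = quantity.split(" ", 2)
  Ingredient.new(name, Rational(amount), unit)
end

hard_tack = build_recipe(
  name: "Hard Tack", yield_count: 25, yield_unit: "cakes",
  ingredients: [ingredient("Flour", "4 cups"),
                ingredient("Water", "2 cups"),
                ingredient("Salt", "4 tsp")]
)

tripled = hard_tack.resize(3)
tripled.ingredients.each { |i| puts "#{i.name}: #{i}" }  # Flour: 12 cups, ...
puts "#{tripled.yield_count} #{tripled.yield_unit}"       # 75 cakes
```

Only the name, yield, and ingredients are spelled out; prep time comes from the defaults, just as it is left out of the scenario.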

For a more extreme example of “narrowing” the focus, let's look at another scenario. This time, the business tells us they want users to be able to rate the recipes.

Scenario: Users can 'Like' a recipe
  Given I have a "Swedish Hard Tack" recipe
  When I Like the recipe
  Then I should see the Like Count increment by one
  And I should not be able to Like the recipe a second time

So here you can see there is no reason whatsoever to care about the details of the recipe. In fact, I could have left off the recipe name, but sometimes I like to keep some sense of the domain visible. Had I written the Given to show all of the detailed data, it would have obscured the meaning of the scenario, as this scenario would then look like the other scenarios.
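A small Ruby sketch shows how little of the recipe this behavior touches. `LikeableRecipe` and its methods are hypothetical names, and the rule of one Like per user is an assumption drawn from the scenario's last step.

```ruby
require "set"

# Only the parts of a recipe that the Like behavior needs.
class LikeableRecipe
  attr_reader :name

  def initialize(name = "Any Recipe")
    @name = name
    @liked_by = Set.new
  end

  def like_count
    @liked_by.size
  end

  # Returns true if the Like was recorded, false if this user already liked it.
  def like(user)
    !@liked_by.add?(user).nil?
  end
end

recipe = LikeableRecipe.new("Swedish Hard Tack")
puts recipe.like("jon")   # true
puts recipe.like("jon")   # false
puts recipe.like_count    # 1
```

No ingredients, yield, or baking times appear anywhere, and even the name could fall back to its default.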

Scenario: (BULKY VERSION!) Users can 'Like' a recipe
  Given I have the following recipe:
  |Name  | Artisan Bread        |
  |Yield | 2 1-lb oblong loaves |
  |Prep  | 30 min               |
  |Proof | 3 hours              |
  |Baking| 35 min. at 450 deg F |
  And with the following ingredients:
  |Ingredient       |Quantity|
  |White Bread Flour|6 cups  |
  |Whole Wheat Flour|0.5 cups|
  |Granulated Yeast |1.5 Tbsp|
  |Coarse (sea) salt|1.5 Tbsp|
  |Water, 100 deg   |3 cups  |
  When I Like the recipe
  Then I should see the Like Count increment by one
  And I should not be able to Like the recipe a second time

Again, this is a subtle point. But making it easier for readers of the feature file to see differences without having to “strain” can improve understanding and reduce potential for errors.

For more on Cucumber, peruse my other posts on the topic, and see this “Crib Sheet.”

2 thoughts on “Keep your Cukes Narrow”

Titus Fortner February 18, 2013 at 1:20 pm

You mention that there is no single, right way, but I don’t understand why you would ever want to include most of the data you put in some of those examples. If the point is to verify that your system under test behaves in a certain way, and your implementation is abstracted anyway, does it ever matter to the scenario what the ingredients are? When it comes down to it, who cares which recipe, which ingredients, even what multiple, and is it expected that someone (presumably a business type) is going to run through and do the math to verify the numbers in the Given and Then are correct?

I would write your second scenario as:

Scenario: Resize the batch to increase the yield
Given I have a Hard Tack recipe
When I resize the recipe to 3 times the size
Then the recipe should yield 3 times the original recipe

Actually, I would probably do a scenario outline to vary the numbers for edge cases (1, 2 1/2, etc) that are desired to be tested, and maybe multiple recipes with different numbers of ingredients.

Putting data in a yaml, or Factory Girl (I presume, though I haven’t used it) seems much preferred to ever putting it in your test, and possibly having to repeat it for similar but different tests.

It sounds like this is along the lines of your point in this post, but it also appears like you are allowing for use cases that should include the data, and I’m not sure in which situations that would be desirable.

I’d greatly appreciate any additional insight you have on these things based on your experience.

Thanks,
Titus

jon (Post author) February 18, 2013 at 9:02 pm

I have also written condensed (hidden) scenarios as you suggest in the re-write. It buries the details quite well — which is sometimes what is needed.

Many times I include the data to be very explicit about ensuring the scenario tells enough of the story such that it is clear and unambiguous. When I combine the detailed scenario with a domain model and a simple UI sketch, you pretty much have everything you need to set about and develop the feature.

Other times, I might bury the details when either it is not that critical to show explicitly, or if it is likely to be a bit brittle (that is, the implementation may be likely to change). This is especially true when you are creating a new object, and then end up on a view that shows you that object or a list of objects.

The default data in Factory Girl is used when you just want an instance or two, and care little about the values (of course, you can selectively override individual properties as needed).
