How to Structure a Scalable And Maintainable Acceptance Test Suite - codecentric AG Blog
Robot Framework Tutorial
Part I: Robot Framework Tutorial – Overview
Part II: Robot Framework – A complete example
Part III: Robot Framework IDE
Part IV: How to Structure a Scalable And Maintainable Acceptance Test Suite
Part V: Robot Framework Tutorial – Writing Keyword Libraries in Java
Part VI: Robot Framework Tutorial – Loops, Conditional Execution and more
Part VII: Robot Framework – Testing Windows Applications
Appendix A: Robot Framework – Compact Sheet
You started to write automated acceptance tests so that you don’t need to retest all the results from earlier sprints at the end of every sprint. Great, so did we. After a while of successful test automation, the tests start to look like a big ball of mud instead of a cleanly designed test suite. Darn, same for us. Where did we go wrong? Over time, we established a few patterns and best practices that lead to a scalable and maintainable test infrastructure, which I would like to present in this post.
I will only consider the structure of the tests themselves, neglecting all considerations regarding the execution (logging, parallelisation) or test hardware. It was mentioned a couple of times already: we are using the Robot Framework for automating our acceptance tests, which is why some of the presented solutions will be specific to the Robot Framework. But most will be applicable to other test frameworks: users of FitNesse, Cucumber, Concordion, etc., don’t stop reading here 🙂 Enough preamble, let’s get going:
Recently, there have been many good articles that answer the question of what makes a good, single acceptance test: for example, there is Gojko Adzic’s recommendable blog post “Anatomy of a good acceptance test”, or Dale Emery’s article “Writing Maintainable Automated Acceptance Tests” (PDF), in which he illustrates his ideas with Robot Framework code (Uncle Bob shows the same with FitNesse). Well, a single, excellently designed acceptance test won’t stay alone for long, but how do you write a maintainable test suite? When I say test suite, I don’t only refer to the sum of all tests, but include everything that is necessary to run the tests (additional artefacts, libraries, frameworks, etc.). To start the discussion, let’s get into the mood with a quote from Dale’s article:
The need to change tests comes from two directions: changes in requirements and changes in the system’s implementation. Either kind of change can break any number of automated tests. If the tests become out of sync with either the requirements or the implementation, people stop running the tests or stop trusting the results. To get the tests back in sync, we must change the tests to adapt to the new requirements or the new implementation.
The necessity for changing the test suite results from changes to the requirements and from changes in the implementation. And what is true for a single test is even more true for a whole test suite. But how can we build a test suite that can be adapted to all those changes as well as possible? Apparently, there are parts in the test suite that are more stable or more variable than others.
Stable parts are, for example, the test framework itself and its accompanying libraries. Those will not be swapped out easily, but kept constant if possible. Equally constant are the test cases themselves, as long as there is no business reason to change them. What should be possible without problems is adding new tests independently of the existing ones. So where are the variable parts in the test suite, when the tests (at the very top) and the platform (at the very bottom) are equally stable? The following figure should make that transparent; it describes the structure of a scalable and maintainable test suite. Each layer will be explained in more detail below the picture. Roughly, the answer to the question about the variable parts is this: when I have a system that should be stable at both ends, but overall flexible, then the variable parts need to be in the middle.
The following discusses the layered structure. The different colors depict the different kinds of files in the robot framework:
- red: test cases and test suites
- green: Robot Framework resource files. Resources contain keywords, which can be called from test cases or from other resources. Simply put, a keyword is similar to a method, and a resource is a collection of little helper methods.
- blue: elements of the Robot Framework itself
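To make the keyword/method analogy concrete, here is a minimal sketch of a resource file. The file name, the keyword name, and the locator are invented for illustration; `Page Should Contain` is a keyword from the SeleniumLibrary:

```robotframework
*** Keywords ***
# common_keywords.txt - a hypothetical resource file.
# The keyword below acts like a small helper method with one parameter.
Should Be Logged In As
    [Arguments]    ${username}
    Page Should Contain    Logged in as ${username}
```

A test case (or another resource) that imports this resource can then simply call `Should Be Logged In As    alice`.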
On the left-hand side of the picture, you will find the degree of stability or variability for every layer. At the top and at the bottom, there is a maximum of stability, while towards the middle the stability decreases until there is a maximum of variability. This means that elements in the middle layers are affected by changes most frequently, while elements at the top and bottom are relatively constant over time.
The layers explained in detail:
The test cases themselves should only be touched when requirements change. Our preferred way to describe expected behaviour in Robot Framework is the “Given / When / Then” style, enriched with example tables. This means that if there is no change in the business requirements, there should be no technical reason to even touch the test cases.
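As a sketch of that style (the test and keyword names are invented for illustration): Robot Framework ignores the Given/When/Then prefixes when matching keyword names, and the `[Template]` setting turns a test into an example table whose rows are passed to the template keyword:

```robotframework
*** Test Cases ***
Registered User Can Log In
    Given a user "alice" is registered
    When she logs in with password "secret"
    Then the welcome page is shown

Invalid Logins Are Rejected
    [Template]    Login Should Fail
    # username       password
    alice            wrong-password
    unknown-user     secret
```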
Test cases should be grouped into functionally related test suites. In the beginning, we grouped our test cases by the user story during which they were created. In retrospect, we realized that finding a specific test took longer and longer once there were several user stories on similar themes. We found it better to tag each test case with its user story and to group the tests by theme into test suites.
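In practice this means one suite file per theme (say, a hypothetical `login_tests.txt`), with the user story recorded as a tag; all test, keyword, and story names below are made up:

```robotframework
*** Test Cases ***
# grouped by theme: every test in this suite deals with logging in
Registered User Can Log In
    [Tags]    story-4711
    Log In As    alice    secret
    Welcome Page Should Be Shown

Password Reset Sends An Email
    [Tags]    story-4712
    Request Password Reset For    alice
    Reset Mail Should Be Sent To    alice@example.com
```

All tests belonging to one user story can still be run together via the tag, e.g. with `pybot --include story-4711 login_tests.txt`.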
The next layer I have named “imports”. Its purpose is to map test cases to the resources which contain the keywords needed to automate the acceptance tests. Since the tests should not be altered for technical reasons, but keywords are continuously refactored and moved around, we need something to map one to the other. Since test suites contain test cases on similar themes, they tend to need the same imports, so there is one import resource for every test suite. And every import resource is used only by test cases from the same test suite.
The import resources don’t contain any keywords per se. Nevertheless, they are well suited for “one-off keywords” that are only needed in a particular test suite and don’t make sense in any of the resources one layer below.
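Sketched with hypothetical file names, the test suite imports exactly one resource, which in turn pulls in the resources of the layers below (`Delete All Cookies` is a SeleniumLibrary keyword):

```robotframework
*** Settings ***
# login_tests_imports.txt - the import resource for the login suite;
# the suite file itself only contains "Resource    login_tests_imports.txt"
Resource    pages/login_page.txt
Resource    pages/welcome_page.txt

*** Keywords ***
# a "one-off" keyword that only this suite needs
Clear All Failed Login Attempts
    Delete All Cookies
```

When keywords are later moved between the underlying resources, only this one file has to be adjusted, not the test cases.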
The layer with the highest change frequency contains aggregating objects: when testing web UIs, an established pattern is the “Page Object” (here are more examples for page objects with Selenium itself or with JBehave). Abstracting from the concept of a web page, this layer is where classic software design and engineering happens: how should the code be structured to be flexible and maintainable? Page objects are one possibility when you deal with websites. In general, those aggregating objects can be business or domain objects, services, or similar.
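A page object expressed as a resource file could look like the following sketch; the file names, locators, and the `${BASE URL}` variable are assumptions, while the low-level keywords (`Go To`, `Input Text`, `Click Button`, `Title Should Be`) come from the SeleniumLibrary:

```robotframework
*** Settings ***
# pages/login_page.txt - a page object for the login page
Resource    ../adapters/selenium_adapter.txt

*** Keywords ***
Open Login Page
    Go To              ${BASE URL}/login
    Title Should Be    Login

Log In As
    [Arguments]     ${username}    ${password}
    Input Text      id=username    ${username}
    Input Text      id=password    ${password}
    Click Button    id=login_button
```

If the login page is redesigned, only this resource changes; the test cases calling `Log In As` stay untouched.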
We added a layer of library adapters after discussing the topic “Multiple initialization of a Library” on the robotframework-users mailing list (a big thanks goes out to Pekka, Janne and the other “robots” for providing quick and sound help for even the weirdest questions). The conclusion of the discussion was that libraries should be imported via a resource, to ensure that only one instance of that library is active in the system. After introducing the library adapters, they proved helpful for other reasons as well: when a library is initialized with start parameters (like the SeleniumLibrary, for example), there is now only one place to keep those parameters (DRY). Additionally, the library underneath can be extended with new functionality, transparently to the test cases using it, by adding keywords to the adapter.
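A hypothetical `selenium_adapter.txt` would then be the only place where the SeleniumLibrary is imported and configured (the parameter value and the extending keyword are illustrative):

```robotframework
*** Settings ***
# adapters/selenium_adapter.txt - the single point of library import;
# the start parameter (here: a timeout of 10 seconds) lives only here (DRY)
Library    SeleniumLibrary    10

*** Keywords ***
# the adapter can transparently extend the library with new keywords
Click Button And Wait For
    [Arguments]    ${locator}    ${expected_text}
    Click Button    ${locator}
    Wait Until Page Contains    ${expected_text}
```

Every page object resource imports this adapter instead of the library itself, so only one library instance is ever active.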
The platform includes all installed components of the Robot Framework and the available libraries.
Finally, I can state that our way of structuring the test suite has had a profoundly positive impact. On the other hand, we continue to learn new things, and I am sure that new learnings will influence the structure and design of our test cases and suites even further. I hope I succeeded in presenting our knowledge in a way that other projects can benefit from it.
What are your experiences regarding the maintainability of test suites? Will other technologies and frameworks lead to other test suite structures? Are there flaws or problems in the presented solutions which we have not spotted yet? I’m looking forward to your comments.