Why Cucumber and the Gherkin Language should not be used for Software Testing

By Gonzalo Isaza

Lately, a technology called Cucumber, closely tied to the “Gherkin language”, has become a popular choice for writing test automation code, and it is apparently in use at many companies. In this post I explain why, in my personal opinion, it is a bad choice for a QA team in any industry and should be avoided. Other technologies offer more code reuse and reduce development time significantly, well beyond anything Cucumber has to offer.

One of the benefits attributed to using the “Gherkin language” for testing is that it describes the software’s behavior in plain English (or another natural language). While this may be true, it is a very poor argument for adopting such a technology for writing tests. Programming is a specialized skill which allows engineers to produce well-written software. It is complex, and engineers go through years of training to become professionals. To do their work efficiently they learn a complex jargon, math, algorithms, computer languages, etc. Just as I cannot follow the jargon of economics, architecture, or brain surgery, I should not expect people unfamiliar with programming to understand what programmers produce or express in technical terms (e.g. “This algorithm has O(n log n) performance”). Architects, pilots, economists and people in other professions are not asked to give up technical terms so the lay can understand. The idea is ludicrous, and I don’t know why it was adopted in programming. QA departments should produce test planning documents which are in plain English and give non-technical people an understanding of what will be done. That is fine. But this should not be tied to coding. My dentist explains the procedure he or she is planning to perform on my teeth in plain English so I can understand it. That does not mean he or she will avoid technical terms with dental assistants while working, or leave them out of dental history files. That would be unreasonable.

OK, I hope that by now I have sold you a bit on why this makes no sense. Now I will explain, based on Computer Science principles, why this is a huge mistake, and how adopting it gives up simple advances offered by modern programming languages.

Cucumber technology maps every sentence to a method which performs an action. Let’s look at a hypothetical example (I numbered the lines in order to refer to them easily as I go on). This example looks just like a Gherkin Language test file:

1. Given my ice cream shop is today offering chocolate, vanilla, and mint flavors
2. When a customer comes to the store
3. And asks for a chocolate ice cream cone
4. Then the attendant should supply it

Each line will trigger the execution of a single method. Let’s say line 1 fills a list with the flavors the store offers for the test, line 2 triggers a method which generates a customer, and so on. According to Cucumber followers, this is great because methods can be reused by simply reusing the sentence. I could create a new test, identical to this one but with an extra line 3a which says “And prefers a waffle cone”. All I need to do is write the code for my new 3a Gherkin sentence.
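The sentence-to-method mapping can be sketched in a few lines of Python. This is a toy registry, not Cucumber’s actual implementation: the names `step`, `run`, `STEPS` and `TestData`, and the step patterns themselves, are hypothetical and exist only to illustrate the idea.

```python
import re

STEPS = []  # hypothetical registry of (compiled pattern, function) pairs


class TestData:
    """Hypothetical shared bag that every step reads from and writes to."""
    flavors = []
    customer = None
    order = None
    served = None


def step(pattern):
    """Register a function as the handler for sentences matching `pattern`."""
    def decorator(func):
        STEPS.append((re.compile(pattern), func))
        return func
    return decorator


@step(r"my ice cream shop is today offering (.+) flavors")
def set_flavors(text):
    # The captured argument is one string; pull the flavor words out of it.
    TestData.flavors = [w for w in re.findall(r"[a-z]+", text) if w != "and"]


@step(r"a customer comes to the store")
def create_customer():
    TestData.customer = {"name": "test customer"}


@step(r"asks for a (\w+) ice cream cone")
def place_order(flavor):
    TestData.order = flavor


@step(r"the attendant should supply it")
def check_supplied():
    assert TestData.customer is not None
    assert TestData.order in TestData.flavors
    TestData.served = TestData.order


def run(scenario):
    """Execute each line by calling the first registered step that matches it."""
    for line in scenario.strip().splitlines():
        # Drop the leading keyword (Given/When/And/Then).
        sentence = line.strip().split(" ", 1)[1]
        for pattern, func in STEPS:
            match = pattern.fullmatch(sentence)
            if match:
                func(*match.groups())
                break
        else:
            raise LookupError(f"no step matches: {sentence!r}")


run("""
Given my ice cream shop is today offering chocolate, vanilla, and mint flavors
When a customer comes to the store
And asks for a chocolate ice cream cone
Then the attendant should supply it
""")
```

Note that none of the step functions returns anything or receives anything from the previous step. Everything travels through the shared `TestData` class.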

All this may sound beautiful, but it’s not. The problem is that each method which depends on data will pick it up from some “pre-arranged location”. The flavors will, for example, be in TestData.IceCreamFlavors, which is a list of flavors. Step 4 will pick up the requested flavor, maybe the list of flavors, and the customer data. And TestData is most likely just a bag of objects where everything is thrown in for reuse. Step 3 needs a customer, so it will pick it up from TestData.theCustomer in order to prepare the order, ask for the credit card, and whatever else you want to imagine. If you do not see a problem with this, let’s create a new test where I need 2 customers. Step 2 was written to create one customer and leave it in TestData.theCustomer for others to use. I may have 50 tests written already, and now I need 2 customers for some new test. The usual solution is to define TestData.secondCustomer and move on (until you need 3 customers). Or maybe the code is planned better and defines TestData.customerList, to which it adds as many customers as it needs. None of this would be necessary without Cucumber: the test would just invoke the method to create a customer 3 times and receive the data back. The point I’m trying to make is that having “pre-arranged” locations for data is a very bad practice. I will make an analogy with the evolution of programming languages to make it even clearer why.
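A minimal Python sketch of the difference, under the assumption that the step leaves its result in a single shared slot. `Customer`, `TestData.the_customer` and both functions are hypothetical names made up for this illustration.

```python
from dataclasses import dataclass
from itertools import count

_ids = count(1)  # simple source of unique customer ids


@dataclass
class Customer:
    customer_id: int
    name: str


class TestData:
    """Hypothetical shared bag with one slot for 'the' customer."""
    the_customer = None


def step_customer_comes_to_store(name):
    # Step style: no return value, the result is left in the pre-arranged slot.
    TestData.the_customer = Customer(next(_ids), name)


def create_customer(name):
    # Plain style: the caller receives the object and decides where it lives.
    return Customer(next(_ids), name)


# Step style: the first customer is overwritten as soon as the step runs again.
step_customer_comes_to_store("Ana")
step_customer_comes_to_store("Ben")
only_survivor = TestData.the_customer

# Plain style: three independent customers, no shared location required.
ana, ben, cleo = (create_customer(n) for n in ("Ana", "Ben", "Cleo"))
```

With the plain function, needing a second or third customer changes nothing in the existing tests. With the shared slot, every test that reads `TestData.the_customer` has to be reconsidered.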

In the early days (the 80s and 90s), the programming language of choice was C. C++ was still young, and C#, like many other languages, did not exist yet. The C language did not have classes, just a group of C files with methods (or functions, as they were called back then) which got compiled to do things. Whenever common data was needed, a variable was defined outside the method, and then all the methods could see it (a global variable). A lot of discipline had to be enforced when naming and using this shared data, so that the hundreds and hundreds of methods making up a program would not accidentally misuse a variable or inadvertently change a data format. A mishap with a global variable could affect many things: when the writer of method X changed something in a global variable, he or she could break many other functions which somehow depended on this data. Object oriented programming came to the rescue with many benefits, one of which is encapsulation. In object oriented programming, I can have a customer A with all its information and a customer B with all its information, and both are independent. Data is tightly sealed in each object, except for the data the object wants to expose. And objects can be passed around from method to method without having to live in a “pre-arranged location”. Today the use of global variables has, for the most part, disappeared. Experience proved that this practice was very hard to maintain and heavily prone to failure. Global data sharing was bad in the past, it was abandoned, and there is no reason why it should suddenly become OK now.

Coming back to the topic: the use of Cucumber and Gherkin destroys the benefits of encapsulation and takes coding back to the C language days. But it makes things even worse: at least in the C language, methods could return values to be used by a caller. In Cucumber there is no way to pass data from one sentence to another. It has to be placed in a pre-arranged place, just like global variables in C. This is why the use of Cucumber is a 30-year step back in the discipline of code writing. But wait, there is more…

Yet another problem of Cucumber is that arguments are, for the most part, received in methods as strings and need to be parsed. I say “for the most part” because some products have ways to convert some strings to numbers, enums and maybe a few other types. These conversions are registered separately from the steps (Cucumber calls them parameter types, SpecFlow calls them step argument transformations), and guess what: they are global (just like global variables in the C language). I won’t even go into the risks of this.
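A small Python sketch of what “arguments arrive as strings” means in practice. The sentence, the pattern and `order_total` are hypothetical; the point is only that a regular expression capture is always text, whatever the method really needs.

```python
import re

# Hypothetical step pattern with two captured arguments.
PATTERN = re.compile(r"the customer orders (\d+) scoops at (\d+\.\d{2}) each")


def order_total(scoops: int, price: float) -> float:
    """Typed function the test actually wants to call."""
    return scoops * price


match = PATTERN.fullmatch("the customer orders 3 scoops at 1.50 each")
raw_scoops, raw_price = match.groups()  # both captures are str

# Passing the raw captures straight through fails at run time.
try:
    order_total(raw_scoops, raw_price)
    failed = False
except TypeError:
    failed = True

# Extra conversion code that the step has to carry.
total = order_total(int(raw_scoops), float(raw_price))
```

The conversion lines are trivial here, but they have to be written, and kept correct, for every argument of every step.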

Among others, these are the problems this technology will bring to an organization which adopts it:

· Test development time will increase. Why?

o There is an extra layer (the Gherkin syntax file), plus the work of connecting each sentence to the code. Is it better, cheaper or faster to develop code with (n + 1) layers than with n layers, when the extra layer is unnecessary? Certainly not. It can’t be cheaper. It’s more work.

o The Gherkin syntax layer deals with regular expressions. Writing and matching regular expressions properly is time consuming because they are hard to read. Understanding “what’s wrong with my regular expression” takes time. This is true even for advanced users of regular expressions. Regular expressions are cryptic (e.g. (?:([AB]+)\s*)\w*([1-9]+[0-9]*) ). I’m not saying they are not understandable. I’m saying that figuring out what one does is tricky and takes time even if you are proficient in their use. If you want to see a really complex regular expression which matches only multiples of 3, check https://www.quaxio.com/triple .

o An accidental change in a Gherkin file may disconnect a sentence from the regular expression which binds it to the code, and the compiler will not notice. You will find out in the nightly test report (or not!).

o The fact that many arguments arrive as strings deprives the engineer of the benefit of type checking. Strongly typed languages were created for a reason. Extra code will need to be written to convert these strings to their proper type for methods to execute. Yet more work.

· Since writing tests is more time consuming, it is very hard to write and execute a 3-line test to check something quickly, which can easily be done using other technologies (e.g. NUnit or the default test adapter in Visual Studio).

· Every method has to pick up data from, or leave data in, a “pre-arranged” location. In Computer Science this is called a side effect: a method changes something external to itself during its execution. With Cucumber, every method either produces a side effect or relies on a side effect produced by a previous step. The use of side effects is strongly discouraged in modern object oriented languages (if you don’t believe me, look it up in Wikipedia). And if you are using a language like Scala, which is gaining a lot of popularity, then side effects move from being a huge mistake to being a humongous one. Side effects are an abomination for functional languages. One of the purposes of functional languages is to avoid them completely (if such a thing were possible). Therefore using Cucumber with Scala is, well… now you know.

· Once your testing source code base becomes huge, your system will reach an untouchable-unmaintainable state. So much global data will be in use by so many different methods, that any minor change can potentially break many things. You will reach the state of “the thing is working. Just don’t touch it”, and by then, you will think about rewriting everything, and starting the cycle again. Avoid a full costly cycle. Use better tools now, and make informed decisions! I don’t tell you this due to any psychic powers. That’s what happened with a lot of the code written in “C”.
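The sub-point above about an accidental change in a Gherkin file can be sketched in Python as follows. The pattern and `find_unmatched` are hypothetical. Python has no compile step, but the point carries over to compiled languages: the sentence is just text, so nothing checks it against the pattern until the scenario is actually run.

```python
import re

# Hypothetical step pattern bound to a step method.
STEP = re.compile(r"asks for a (\w+) ice cream cone")


def find_unmatched(scenario_lines):
    """Return the sentences for which the step pattern does not match."""
    unmatched = []
    for line in scenario_lines:
        sentence = line.split(" ", 1)[1]  # drop Given/When/And/Then
        if STEP.fullmatch(sentence) is None:
            unmatched.append(sentence)
    return unmatched


original = ["And asks for a chocolate ice cream cone"]
edited = ["And asks for a chocolate icecream cone"]  # accidental edit: one space lost

ok = find_unmatched(original)
broken = find_unmatched(edited)
```

The edited sentence still reads fine to a human reviewer, yet it is no longer connected to any code.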

Conclusion: It is my opinion that the Gherkin language and the Cucumber technology should not be used for software testing (or any programming task, for that matter). Would you ask a brain surgeon to describe procedures only in a form of English anyone can understand, and to operate through this interface, denying the doctor complicated but helpful technology just so others can follow along? I certainly would not. Why, then, demand this of skilled Computer Science professionals in the code they write?

But I will end on a positive note: there are very good technologies which allow very powerful test coding, really reduce the cost of writing code, and increase code reuse in a structured, well-architected way. If you are looking for tools, I highly recommend looking at NUnit (https://www.nunit.org) and applying sound principles of software design. While the NUnit documentation may not be the easiest to understand, and some things are a bit frustrating during the ramp-up process, it’s a very powerful tool. I’m sure there are other test adapter options as well. Among the benefits of NUnit (this is not the topic of this post, so consider it a bonus) you will find:

· Creation of tests is extremely easy

· Visual Studio supports it

· Versatile and powerful data driven testing capabilities

· Powerful use of attributes to mark method parameters which can be very handy in data reuse.

· Powerful use of attributes which easily allows preparing data required for a test.

· It’s now in version 3.x
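To give a feel for what data driven testing looks like without a sentence layer, here is a rough Python analogue using the standard `unittest` module. It is not NUnit and does not show NUnit’s attributes. `can_serve`, the test class and the case table are hypothetical.

```python
import unittest


def can_serve(flavor, flavors_on_offer):
    """Hypothetical function under test: is the requested flavor available?"""
    return flavor in flavors_on_offer


class IceCreamShopTests(unittest.TestCase):
    # Each row is one case: (requested flavor, flavors on offer, expected result).
    CASES = [
        ("chocolate", ["chocolate", "vanilla", "mint"], True),
        ("mint", ["chocolate", "vanilla", "mint"], True),
        ("strawberry", ["chocolate", "vanilla", "mint"], False),
    ]

    def test_can_serve(self):
        for flavor, on_offer, expected in self.CASES:
            # subTest reports each failing row separately.
            with self.subTest(flavor=flavor):
                self.assertEqual(can_serve(flavor, on_offer), expected)


def run_tests():
    """Run the test case programmatically and return the unittest result object."""
    suite = unittest.defaultTestLoader.loadTestsFromTestCase(IceCreamShopTests)
    return unittest.TextTestRunner(verbosity=0).run(suite)
```

Adding a case means adding one row of typed data. No sentence, no regular expression and no shared location is involved.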

Moving away from coding simplicity will cost time, resources, and hence money. Moving towards coding simplicity will do the opposite, and although this is obvious, in many cases it is not done. If you are looking for simplicity and for lowering your software testing cost, my advice for you today is “leave cucumbers for salads. OK, and maybe for wrinkle therapy over somebody’s eyes”.

Just like shipping code, a test code base and a good testing framework require good design and a solid code architecture, based on a sound understanding of Computer Science principles. Design reviews and code planning for your QA team should be a common business practice. Unfortunately, they are not.

Gonzalo Isaza StackOverflow alias: DDRider62 github alias: giposse
