Beyond Data Science - Unit testing

What is Unit testing:

In everyone’s journey of being a Software Developer, whether it’s in the initial phase while learning or in the stage while you are actually working as a Developer, we all come across something called Unit Testing. And I am pretty sure it’s the last thing that excites us.

In this post, I am going to tell you how Unit Testing may not be a superpower, but with a few simple observations it might change your software development experience. It can also add tremendously to your arsenal of tools in the journey of being a Software Engineer.

So what is Unit testing exactly?

Unit Testing is nothing but making sure that — as a Developer — whatever you write, works as expected before packaging and shipping it out for QA to test.

Intuitively, one can view a unit as the smallest testable part of an application. In procedural programming, a unit could be an entire module, but it is more commonly an individual function or procedure.

A unit test is typically defined as an isolated test case that consists of the following components:

  • a so-called “fixture” (e.g., a function, a class or class method that you want to test)
  • an expected outcome (e.g., the expected return value of a function)
  • the actual outcome (e.g., the actual return value of a function call)
  • a verification message (e.g., a report that displays whether the actual return value matches the expected return value or not)

Why Unit testing?

I am sure, just like me, some people in the beginning of their coding days, must’ve thought the idea of writing extra code to make sure the entire development pipeline works is boring and highly unnecessary. Frankly speaking, it’s not. Unit Testing is simple and powerful and might even save your ass or even better, your holidays.

 

The basics of Unit Testing

How to do Unit testing?

In this post, I’ll explain Unit Testing using a simple unit testing framework in Python called Pytest which is a really handy and effective way to test code written in Python.

Example:

Let’s say you get the task of writing a module that, when given an arbitrary number, returns whether the number is a multiple of 5 or not. Seems easy, right? Let’s go ahead and write the Python code to complete this task.

 

Python module that checks whether a number is a multiple of 5 or not.
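A minimal sketch of such a first attempt, assuming the function is called isMultipleOfFive (the name used later in this post) and that it only checks the remainder of a division by 5, might look like this:

```python
def isMultipleOfFive(n):
    """Return True if n is divisible by 5 (first attempt, no special cases)."""
    # Assumed implementation: a plain remainder check
    return n % 5 == 0


print(isMultipleOfFive(25))  # True
print(isMultipleOfFive(7))   # False
```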

Mission Accomplished, right?

Now, let’s suppose you pushed this code to production and went on a holiday. You are on a beach sipping mocktails when you get a call from a DevOps guy saying your code is failing in production for some cases.

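Hint: Your code failed because someone passed 0 as a parameter. For this task we don’t want 0 to count as a multiple of 5 (strictly speaking, 0 is divisible by every non-zero integer, but the requirement here excludes it), yet your code says otherwise. (Try running the above function with 0 as a parameter.)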

Trying to salvage your holiday, this is what you tell the DevOps guys.

 

Works on my machine situation

The next day your manager finds out the only thing that has changed during the week is your code and guess what he says..

 

Manage-errrrrr

So you have to cancel what’s left of your trip and fix that shitty bug.

Now, what you would’ve liked is to get an assurance that whatever you write works as seamlessly as a Swiss Watch.

You can achieve this by writing a small script that tests your code automatically using the Pytest framework, so that you don’t get bugged by the DevOps guys on your peaceful holiday.

Before getting into how to do that exactly, let’s look at a statement in Python called an ‘assert’.

An ‘assert’ is a statement in Python which evaluates an expression which it hopes will be true and if it is false, it raises an exception of type AssertionError. Let’s see how it works with an example.

 

Example of an assert statement in python

The above code evaluates expressions line by line and if any expression is false, raises an AssertionError. In this case, 4 is NOT equal to 5 and hence the exception is thrown.
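A small sketch of this behaviour is shown here. The helper name run_asserts and the message text are made up for illustration, and the try/except is only there so the script can print what happened instead of stopping:

```python
def run_asserts():
    # Each assert is evaluated in order; the first false one raises
    assert 1 + 1 == 2                     # true, nothing happens
    assert 4 == 5, "4 is not equal to 5"  # false, raises AssertionError


try:
    run_asserts()
    outcome = "no error"
except AssertionError as exc:
    outcome = f"AssertionError: {exc}"

print(outcome)
```

The optional text after the comma becomes the message of the AssertionError, which makes a failing check easier to read in a report.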

Now, coming back to our multiples of 5 example. Let’s write a function that checks whether our code is working for different input values. After writing these checks, we realise that 0 should not be treated as a multiple of 5 here, so we expect isMultipleOfFive(0) to return False. But as shown below, the function returns True since we have not handled the case of the input being 0.

So the DevOps guys were not wrong it seems. Still, who gives a shit about them anyway.

 

Unit test report

The next day, you get back to the office and modify your code so that it handles 0 being passed. You re-run the test script and things fail again. Wow!
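The modified code might look roughly like this. It is a sketch that assumes the fix is an explicit check for 0 placed in front of the remainder test:

```python
def isMultipleOfFive(n):
    # Second attempt (assumed shape): reject 0 explicitly
    if n != 0:
        return n % 5 == 0
    return False


print(isMultipleOfFive(0))   # False, the zero case is now handled
print(isMultipleOfFive(10))  # True
```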

 

Unit test report

According to the unit test report, we face another problem here: Our code considers 5 as a factor of –10 (negative 10). For the sake of this example, let’s assume that we don’t want this to happen. We’d like to consider only positive numbers to be multiples of 5. In order to account for those cases, we need to make another small modification to our code by changing !=0 to >0 in the if-statement.

So let’s change the code to handle negative numbers and rerun the test cases.
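Under the assumptions of this example (only positive numbers count), a sketch of the final version together with its checks could be:

```python
def isMultipleOfFive(n):
    # Only positive numbers can count as multiples of 5 in this example
    if n > 0:
        return n % 5 == 0
    return False


def test_isMultipleOfFive():
    assert isMultipleOfFive(5) is True
    assert isMultipleOfFive(25) is True
    assert isMultipleOfFive(7) is False
    assert isMultipleOfFive(0) is False     # zero is excluded
    assert isMultipleOfFive(-10) is False   # negatives are excluded


test_isMultipleOfFive()  # raises nothing, so every check passed
```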

 

Unit test report

Phewww! That was a lot of code writing, fixing, testing, re-writing, fixing and testing again for a simple case of checking whether a given number is a multiple of 5 or not.

Now, imagine your manager walks up to you and says (Err… you know how they are)

 

A manager being a manager who knows nothing about SDLC

Let’s assume those features are things like pulling data out of a database, performing some operations, updating the database, writing logs, making some API calls and so on. Testing these functions manually and individually would be a stupid idea because there might be hundreds of functions, and situations where your code depends on someone else’s code. Unit Testing is a practical way to make sure your changes are not breaking the codebase: you write test code for whatever functions you have written and re-run it after every change.

Depending on what the function you have written does, and the test cases you want to cover, you can write a script that checks and evaluates those test cases.

As a rule of thumb, we can start off with very basic test cases like ones mentioned below:

  1. For a given input, what the function is expected to return
  2. What kind of data structure the function is supposed to return (to make sure other functions work, if they expect a specific data structure)
  3. Whether you are able to connect to the database and do the operations as expected
  4. Edge cases as we saw above
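The four kinds of checks above can be sketched as follows. The helper multiples_of_five and the table name are hypothetical, and an in-memory SQLite database stands in for whatever database your project actually uses:

```python
import sqlite3


def multiples_of_five(numbers):
    """Hypothetical helper: keep only the positive multiples of 5."""
    return [n for n in numbers if n > 0 and n % 5 == 0]


def test_expected_return_value():
    # 1. For a given input, what the function is expected to return
    assert multiples_of_five([1, 5, 10, 12]) == [5, 10]


def test_return_type():
    # 2. The data structure that callers rely on
    assert isinstance(multiples_of_five([5]), list)


def test_database_round_trip():
    # 3. Connect, write and read back (in-memory database as a stand-in)
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE numbers (value INTEGER)")
    conn.execute("INSERT INTO numbers VALUES (5)")
    rows = conn.execute("SELECT value FROM numbers").fetchall()
    conn.close()
    assert rows == [(5,)]


def test_edge_cases():
    # 4. Edge cases: empty input, zero and negative numbers
    assert multiples_of_five([]) == []
    assert multiples_of_five([0, -10]) == []
```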

Pytest:

All this was the basics of Unit Testing. But I haven’t mentioned the particulars of Pytest. Let’s fix that. Let’s talk about how Pytest works and see how to run some test cases.

Q. How to install Pytest?

Ans. We can install Pytest just like any other library using pip: ‘pip install pytest’.

 
Q. How to execute a test case?

Ans. Just write ‘pytest’ followed by the name of the file that contains the test cases for the code you have written, for example ‘pytest test_multiple.py’. (Note: You will need to import the modules you have written so that they are accessible by the test script.)

 
Q. How to execute multiple tests written in different files with a single command?

Ans. You can just write the command ‘pytest’ and it will run both the test scripts (test_module.py and test_multiple.py) in our directory Unittest.

 

Unit test Report

Pytest will run all files of the form test_*.py or *_test.py (filenames with a prefix or suffix as test) in the current directory and its sub-directories.
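To make the naming rule concrete, here is a small illustration of which file names match those two patterns. This is not how Pytest itself is implemented; the helper looks_like_test_file is made up for this post and only mimics the pattern check on the file name:

```python
from fnmatch import fnmatchcase


def looks_like_test_file(filename):
    # Mimics the two default file name patterns: test_*.py and *_test.py
    return fnmatchcase(filename, "test_*.py") or fnmatchcase(filename, "*_test.py")


for name in ["test_module.py", "multiple_test.py", "module.py"]:
    print(name, looks_like_test_file(name))
```

Only the first two names match, so a file called module.py would be skipped unless you pass it to the pytest command explicitly.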

For more information on Pytest, please refer to its documentation at https://docs.pytest.org/en/latest/

Wrapping up

Finding bugs and fixing them before delivery is imperative. It significantly reduces the cost to the company of simple errors, like a missing tab or colon, which can crash the entire application. The company could end up losing business just because you didn’t test the code properly.

 

Bug vs Cost

Conclusion:

 

Continuous Integration Cycle

Unit Testing always comes in handy to make sure the code is working as expected. Automated tests also help in the continuous integration stage shown above: after pushing your code to DEV, you can set up a test run that executes all the automated tests and notifies you and the entire team if something breaks. That, let’s agree, is way better than pushing to production and breaking things in a live environment.