Programming fundamentals - Part 07: Testing

The further we go, the more complexity we introduce into our programs. As our programs become longer and more elaborate, it gets harder to avoid making mistakes and to find and fix them afterwards. A good programmer isn't one who never makes mistakes; such programmers do not exist. Instead, a good programmer takes precautions to minimize the number of bugs and to make discovering them easy.

There are, roughly speaking, two ways to mitigate the impact of bugs: design software such that bugs are less likely, and test your software to shake out bugs that have already made their way into the code. We will explore the program design aspects in future chapters and tackle testing first, because testing will inevitably be involved in the process either way.

How software is tested

Software testing is simple in principle: you need to execute the program in different scenarios to exercise the code and to make sure the program behaves how you'd expect it to.

However, there are many different ways to approach this and different levels to the testing. Some testing might test the whole system, or only parts of it. The testing might be done with full awareness of the structure of the code or with no insight whatsoever. The testing might be routine validation of the program or attempts to find new and creative ways to break the program.

The simplest approach that people intuitively land on is to repeatedly start the program, plug in different inputs and then manually verify that the output of the program is what it should be. This approach doesn't require a whole lot of planning and generally exercises most of the code, but this kind of improvised testing is limited: it's fairly time-consuming, it's non-repeatable, and it may only test typical cases and overlook edge cases. You can improve on this methodology by writing down each test case and making sure to follow the instructions for each test to validate each scenario you have thought of.

However, humans are not the greatest at following precise instructions, especially if the same task needs to be repeated multiple times. You might want to test your software fairly regularly to make sure you didn't break things that used to work. This kind of regression testing is very important but also extremely boring.

Luckily, as programmers we have a tool that follows instructions with extreme levels of pedantry and without getting tired or bored. So, we can also express our tests as programs for the computer to execute.

Making assertions about our code

Suppose we have the following (broken) function that is supposed to sum the numbers in a list:

total = 0

function sumNumbers(list)
   local i = 0

   while i < #list do
      total = total + list[i]
      i = i + 1
   end
end

This code is intentionally written in a poor style. A more careful style would have prevented some of these bugs, but this version lets us discover several bugs along the way.

We can then write programs that use this function and see how it behaves. We could inspect the results by printing them, but this would require a human in the loop to read the output of the program and see if it matches with our expectations. So, let's not do that. Instead we can add assertions into our program that will test if a statement is true and fail our program if an assertion does not hold.

We will need to start out with some test cases, so let's come up with a few scenarios where we know how the summation function should behave. First of all, if we give it an empty list, the sum should be zero. If we give it one number, the sum should be that number. And if we give it a few example numbers, such as {1, 2, 3}, then we should be able to expect a sum of 6. Let's write a program with these assertions in place.

In Lua we write assertions with the assert() function. In its simplest form it takes one argument, the condition we expect to hold. If the condition is true, the program keeps running. If it is false (or nil), the program is halted and an error is raised.
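assert() also accepts an optional second argument, a message that is reported when the assertion fails. The following is a minimal sketch of that behavior. The names ok and err and the message texts are only illustrations, and pcall() is used here just so the example can show a failure without halting the whole program:

```lua
-- A passing assertion lets the program continue as normal.
assert(1 + 1 == 2, "basic arithmetic should work")

-- pcall() runs a function in protected mode: instead of halting the
-- program, a failing assertion makes pcall() return false and the error.
local ok, err = pcall(function()
   assert(1 + 1 == 3, "expected 1 + 1 to equal 3")
end)

print(ok)   -- false
print(err)  -- contains the message that was passed to assert()
```

A descriptive message can make a failure easier to understand than the default "assertion failed!" text.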

total = 0

function sumNumbers(list)
   local i = 0

   while i < #list do
      total = total + list[i]
      i = i + 1
   end
end

assert(sumNumbers({}) == 0) 
assert(sumNumbers({1}) == 1) 
assert(sumNumbers({1, 2, 3}) == 6) 
print("All tests successful!")

This program fails with an error that should look something like:

lua: stdin:12: assertion failed!
stack traceback:
	[C]: in function 'assert'
	stdin:12: in main chunk
	[C]: in ?

This means that our assertion on line 12 failed. We apparently didn't get a zero back when we ran the function with an empty list. Luckily, this is easy to solve: we simply forgot to return the total at the end of the function.

total = 0

function sumNumbers(list)
   local i = 0

   while i < #list do
      total = total + list[i]
      i = i + 1
   end

   return total
end

assert(sumNumbers({}) == 0) 
assert(sumNumbers({1}) == 1) 
assert(sumNumbers({1, 2, 3}) == 6) 
print("All tests successful!")

However, running this reveals another issue, this time caught by Lua inside the sumNumbers() function: we tried to perform arithmetic on a nil value. This is an easy mistake to make if you have used other programming languages, and it stems from our i variable. We are trying to index into the list starting at index 0, but in Lua list indices start at 1, so list[0] is nil. Another quick fix gets us out of that mess but into another:

total = 0

function sumNumbers(list)
   local i = 1

   while i < #list do
      total = total + list[i]
      i = i + 1
   end

   return total
end

assert(sumNumbers({}) == 0) 
assert(sumNumbers({1}) == 1) 
assert(sumNumbers({1, 2, 3}) == 6) 
print("All tests successful!")

Now the second assertion is failing. If we run sumNumbers({1}) separately we will quickly see why: we are getting a zero back, even though we expected to get back a one. This is due to another common mistake that sneaks into a lot of code: an off-by-one error. We are using the less-than operator when checking whether the index variable i is within the length of the list. This skips the last element of the list, so we end up with the sum of every number except the last one. This is easily solved by using the less-than-or-equal (<=) operator.

total = 0

function sumNumbers(list)
   local i = 1

   while i <= #list do
      total = total + list[i]
      i = i + 1
   end

   return total
end

assert(sumNumbers({}) == 0) 
assert(sumNumbers({1}) == 1) 
assert(sumNumbers({1, 2, 3}) == 6) 
print("All tests successful!")

Now we are at the final assertion and it seems that, even though the function works for the previous cases, we still aren't getting back the value we'd expect. In fact, more worryingly, if we run the single-number test a second time, we fail that test too. The problem is our total variable, which we accidentally put outside of the function as a global variable. Therefore the value of total is not reset between calls to sumNumbers() and each call adds to the totals of the previous calls. This is easily solved by making total a local variable inside the function, and now our program completes happily.

function sumNumbers(list)
   local total = 0
   local i = 1

   while i <= #list do
      total = total + list[i]
      i = i + 1
   end

   return total
end

assert(sumNumbers({}) == 0) 
assert(sumNumbers({1}) == 1) 
assert(sumNumbers({1, 2, 3}) == 6) 
print("All tests successful!")

All tests successful!
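The global-variable bug only surfaced because our tests happened to call sumNumbers() more than once. A hedged sketch of a test that checks for this directly by summing the same list twice is shown here. The function is repeated so that the example runs on its own, and the names first and second are only for illustration:

```lua
function sumNumbers(list)
   local total = 0
   local i = 1

   while i <= #list do
      total = total + list[i]
      i = i + 1
   end

   return total
end

-- Calling the function twice with the same input must give the same
-- answer. With the earlier global total, the second call returned 12.
local first = sumNumbers({1, 2, 3})
local second = sumNumbers({1, 2, 3})

assert(first == 6)
assert(second == 6)
print("Repeated calls agree: " .. first .. " and " .. second)
```

Tests like this one are cheap to write and guard against state leaking from one call to the next.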

Structuring and triggering tests

Setting up our tests as regular top-level statements that get executed whenever our program is run is one way to set up test execution, but it's not necessarily desirable. Even automated tests may take some time, so we might not want to run them over and over, especially if we already know they will complete successfully.

Secondly, a flat list of assertions is difficult to parse when we get a test failure. You have to look at the line number Lua gives you and find the test case on that line to find out what went wrong. And it might also be difficult to figure out what a particular assertion is supposed to test.

A simple way to alleviate both problems a little bit is to structure our tests and test suite a bit differently. If we put each of our tests inside a function, then Lua will spit out that function name if our assertion for that test fails. This makes it easier to see at a glance which test failed. Secondly, we can also put the test execution inside a function that will simply call the other test functions in sequence. Then when we want to run the tests, we just need to call that main test function.

function testSumOfEmptyListIsZero()
   assert(sumNumbers({}) == 0) 
end

function testSumOneNumberIsTheNumber()
   assert(sumNumbers({1}) == 1) 
end
  
function testSumOfExampleNumbers()
   assert(sumNumbers({1, 2, 3}) == 6) 
end

function runTests()
   testSumOfEmptyListIsZero()
   testSumOneNumberIsTheNumber()
   testSumOfExampleNumbers()
end

Where the runTests() function is triggered is up to you. One way is to load all of the code into the interactive Lua REPL and simply execute the function from there. Alternatively, your program could call the function at the start, and you can comment out that line with the -- comment prefix when you don't want to run the tests again.

This way of running tests still isn't perfect and has a few issues: tests are always run in a specific order, and we have to remember to add new tests into the runTests() function in order to make sure they get executed. Our test suite will also stop when a failing test is encountered, so we might need to run the test suite multiple times to find and fix all of the failing tests.
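Some of these issues can be softened with plain Lua before reaching for anything more elaborate. The following is a rough sketch, not a replacement for a real framework. The tests table, the runAllTests() function and the PASS/FAIL output format are all made up for this example, and sumNumbers() is repeated so the block runs on its own:

```lua
function sumNumbers(list)
   local total = 0
   local i = 1

   while i <= #list do
      total = total + list[i]
      i = i + 1
   end

   return total
end

-- Tests live in one table, so adding a test is just adding an entry.
local tests = {
   sumOfEmptyListIsZero = function()
      assert(sumNumbers({}) == 0)
   end,
   sumOfOneNumberIsTheNumber = function()
      assert(sumNumbers({1}) == 1)
   end,
   sumOfExampleNumbers = function()
      assert(sumNumbers({1, 2, 3}) == 6)
   end,
}

-- Runs every test in the table and returns the number of failures.
-- pcall() catches a failing assertion so the remaining tests still run.
function runAllTests(testTable)
   local failures = 0

   for name, test in pairs(testTable) do
      local ok, err = pcall(test)
      if ok then
         print("PASS " .. name)
      else
         failures = failures + 1
         print("FAIL " .. name .. ": " .. tostring(err))
      end
   end

   return failures
end

local failures = runAllTests(tests)
print(failures .. " test(s) failed")
```

Because pairs() visits the entries in no particular order, tests written this way should not rely on running in a fixed sequence.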

In most projects, a proper testing framework is used, which provides all of that missing functionality. One such framework for Lua is LuaUnit, which you could add to your project as a third-party library. We'll discuss third-party libraries alongside splitting programs into multiple files in the next chapter.

But, the most important part is that now you know at least one way to write code that tests other code. At the end of the day, automated tests are just functions that call other code and check if the output they get back matches with the pre-defined expectations. The other things a proper testing framework might do are mostly just conveniences, albeit very useful conveniences.

When to test and what to test

The question of who should be testing what and when testing should take place has no universal consensus. Different methods have been tried with varying results, but a specific approach has not established itself as clearly better than all others.

Some teams prefer to write the implementations first and then write tests, either manual or automated, that verify that the implementation is working. This might be combined with less formal testing, where the program is quickly iterated upon and tested manually without stopping to write a formal test suite until possibly later towards the end of the project. This can be a very reasonable approach for software where the cost of bugs is very low, such as video games.

Other teams prefer the opposite: tests are written first and only then the implementation is written to satisfy the tests. This can take the form of test-driven development, where the programmer alternates between writing tests and implementation code and always proceeds in small increments. This helps build a very comprehensive test suite as you tackle the problem in small chunks, where each chunk gets immediately tested.
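As a rough illustration of one such small increment, here is a sketch of a single test-first cycle. The largestNumber() function and its tests are hypothetical and exist only for this example:

```lua
-- Step 1: write the tests first. They describe what we want from a
-- hypothetical largestNumber() function that does not exist yet.
function testLargestOfOneNumberIsTheNumber()
   assert(largestNumber({3}) == 3)
end

function testLargestOfSeveralNumbers()
   assert(largestNumber({1, 5, 2}) == 5)
end

-- Step 2: write just enough implementation to make the tests pass.
function largestNumber(list)
   local largest = list[1]
   local i = 2

   while i <= #list do
      if list[i] > largest then
         largest = list[i]
      end
      i = i + 1
   end

   return largest
end

-- Step 3: run the tests, then repeat the cycle with the next test.
testLargestOfOneNumberIsTheNumber()
testLargestOfSeveralNumbers()
print("All tests successful!")
```

In practice you would run the tests once before step 2 to see them fail, which confirms that they really exercise the new function.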

Some teams even have varied practices when it comes to test automation. Some problems are difficult to test, for example user interfaces or complex simulations, so sometimes they are easier to test by hand. However, even user interfaces are possible to test and complex simulations might be possible to break down into simpler, testable parts, so I don't think test automation should be discarded as an option too easily.

In general, my advice would be to test often and automate as much as possible. And when your program breaks in a new way, try to create a new test case that tests a similar scenario, in order to prevent that problem from happening again. When it comes to test-first or code-first, I think the choice is largely up to you. I believe in test-driven development and I use it on work projects, but I also recognize that it isn't always necessary. I recommend you try both approaches and also consider the risks involved in the code you are writing. If the code is not going to be doing anything very important, you can probably be a bit more lax and get away with rudimentary testing. But if someone's money or life depends on your code working well, then you should explore the different domains of software testing more thoroughly and assert your expectations regularly in the form of comprehensive, automated test suites.

Also, always remember: testing might reveal the presence of bugs, but it cannot show the absence of bugs. This doesn't mean it's useless, far from it, but you should understand that in many cases there is realistically no way to exhaustively express all scenarios your code may encounter in the form of traditional tests. In some cases it might be possible to formally prove the absence of bugs, but in most cases we as programmers do not go to those lengths.

Conclusion

So, now that we have software testing in our toolbox, we can worry slightly less about the growing complexity of the problems we are tackling. In the coming chapters we will look into the software design part of the complexity equation: by splitting our code into components, we can make sure our software remains manageable even as our programs grow from hundreds of lines to thousands. Testing can then be employed to make sure each component behaves correctly, thus hopefully resulting in systems that work well as a whole.

I hope you have found the information thus far useful and I hope to see you in the future chapters!