06. Testing

Introduction to testing concepts and automated unit testing.

You are getting the first edition of all these pages. Please let me know if you find an error!

Testing is integral to all forms of engineering. Software developers often write as much test code as they do product code! This set of labs introduces testing concepts and automated testing.

1 - Assertions

The building block of testing.

Software testing is both a manual and an automated effort.

Manual testing is when a tester (or user) enters values into the user interface and checks the behavior of the system.

Automated testing is where test code is used to check the results of the main product code. Automated testing is an essential part of program verification, which is an evaluation that software is behaving as specified and is free from errors.

Automated testing is a necessity in real systems with thousands of lines of code and many complex features. Manual testing is simply infeasible to do thoroughly.

Code that verifies code?

Automated testing in this case means writing code. Developers and testers write code and scripts that execute and test some other code.

Exercise

  1. Create a directory named testing-lab in your seng-201/ directory.
  2. Download sample.py and put it in the testing-lab/ directory.
  3. Open it in Visual Studio Code, and run it.

The function calls in the __main__ section of code are a semi-automated test. The calls are automated, but the verification is still manual – you, the developer, have to verify that the output is indeed correct.
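For reference, the __main__ section of sample.py presumably looks something like the sketch below (the exact calls in the downloaded file may differ; reverse_string() is one of its functions):

```python
# A sketch of a semi-automated test like the one in sample.py's
# __main__ section. The calls run automatically, but a human must
# read the output and judge it -- the verification is manual.
def reverse_string(s):
    return s[::-1]

if __name__ == "__main__":
    print(reverse_string("press"))   # developer checks this prints "sserp"
    print(reverse_string("alice"))   # developer checks this prints "ecila"
```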

To have automated testing, we need a programmatic indicator of correctness. Enter the assert statement.

The assert statement

Nearly all programming languages have an assert keyword. An assertion checks if a value is True or False. If True, it does nothing. If False, the assert throws a special type of exception. Assertions are commonly used in languages like C and Ada to verify that something is True before continuing execution.

In most modern languages, including Python, the assert is the basis of automated testing.

Exercise

Let’s explore the assert in Python.

  1. Create a new file named test_sample.py in the testing-lab/ directory. Edit the file in Visual Studio Code.
  2. Add the following code:
    test_sample.py
    
    assert True
    assert False
    print("Made it to the bottom.")
    
  3. Run test_sample.py. Notice the following.
    • assert True does not produce any output. The program simply continues.
    • assert False generates an exception. This is expected.
    • The print(...) statement did not execute because the exception generated by assert False crashed the program.
  4. Comment out the assert False line and run it again. The print(...) statement will execute.

This demonstrates the behavior of assert. Let’s add some more interesting assertions. Add the following lines to the bottom of test_sample.py:

test_sample.py

x = 2**5
assert x == 32
assert type("Bob") == str
y = 16
assert x-y==16 and type("Bob") == str and int("25") == 25
print("Made it to the bottom.")

The expressions after the assert keyword now use comparisons and boolean operators. This looks a bit more realistic. The assert can take any simple or complex Boolean expression, so long as it evaluates to True or False.

Quick Exercise: Change the operators or values in the expressions so they evaluate to False. Notice how the last assert can fail if any of the comparisons are false.

We’ll put our assertions to work testing program code in the next lab.

Knowledge check

  • Question: What two things are you trying to verify with program verification?
  • Question: Why do we need automated testing?
  • Question: What happens next if a Python program encounters the statement assert True?
  • Question: What happens next if a Python program encounters the statement assert False?
  • Question: What happens when the following executes: assert 16 == 2**4?
  • Question: What happens when the following executes? assert len('Bob') > 0 and 'Bob' == 'Alice'

2 - Unit testing

Using assertions to test a file.

Assertions are the basis of modern automated testing. Developers write test code in source files that are separate from the main program code. We have our program code in sample.py and the test code will be in test_sample.py. This is a common naming convention.

In practice, the test code will be kept in a separate directory from the program code.
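For example, a larger project might be laid out like this (a typical convention; exact directory names vary by team):

```
project/
├── src/
│   └── sample.py
└── tests/
    └── test_sample.py
```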

Testing sample.py

Now, let’s use our assert to test the correctness of the functions in sample.py.

  1. Comment out all the code in test_sample.py
  2. Add the line import sample. In Python, this makes the content of sample.py accessible to code in test_sample.py.1
  3. Now let’s convert those print(...) statements from sample.py into assert statements in test_sample.py. test_sample.py should now have the following:
    test_sample.py
    
    import sample  # We import the filename without the .py
    
    assert sample.palindrome_check("kayak")  # the function should return True, giving "assert True"
    assert sample.palindrome_check("Kayak")
    assert sample.palindrome_check("moose") is False  # the function should return False, giving "assert False is False", which is True
    
    assert sample.is_prime(1) is False
    assert sample.is_prime(2)
    assert sample.is_prime(8) is False
    
    assert sample.reverse_string("press") == "sserp"  # checking result for equality with expected
    assert sample.reverse_string("alice") == "ecila"
    assert sample.reverse_string("") == ""
    print("All assertions passed!")
    

Point 1: We access the functions in sample.py by calling, e.g., sample.palindrome_check(...). The prefix sample.X tells Python “go into the sample module and call the function named X.” We would get an error if we called only palindrome_check(...) because Python would be looking in the current running file, which has no such function defined in it.

Point 2: In Python, you should check if a value is True or False using is. The is operator returns a boolean. You could also type x == True or x == False. Either form will work, but is is preferred2.
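To see the difference, note that == treats “truthy” values like 1 as equal to True, while is checks for the actual boolean object. A quick sketch you can paste into a scratch file:

```python
x = int("1")            # the integer 1, not the boolean True
assert x == True        # passes: 1 compares equal to True
assert not (x is True)  # but x is not the actual object True

flag = "Bob" == "Bob"   # a genuine boolean result
assert flag is True     # `is` behaves as expected on real booleans
print("all assertions passed")
```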

Point 3: Remember that palindrome_check() and is_prime() return True/False themselves. We are simply verifying that they are returning the correct value. reverse_string() returns a string value, so we need to compare using == to an expected value.

Point 4: The program will crash with an AssertionError if any of the assert statements are False. Mess up one of the assertions to verify this.

Exercise

  1. Go to sample.py and define a function named power() that takes two parameters, x and y, and returns the computed result of x raised to the power y (x**y).
  2. Add assert statements to test_sample.py to verify your function behaves correctly.

Unit tests

The file test_sample.py is what software engineers call an automated unit test. Unit tests test an individual class or source file3. Unit tests are usually written by the same developer who wrote the program code.

Our automated unit test now calls functions and uses assert statements to verify that they are returning the expected results. If an assertion fails, the test fails.

What does it mean if a test fails? One of two things:

  1. There is something wrong in the program code. Maybe there is a logic error.
  2. The test code itself has a mistake in its logic.

Regardless, if a test fails, you need to figure out why. A good unit test will systematically exercise all the logic of the function or module under test. This can help uncover flaws in the program code. We will discuss strategies to do this in subsequent lessons.

We also need a way to run the test code and accumulate the results in a useful way. We will do this in the next lab.

Knowledge check

  • Question: Suppose you wanted to test a function named get_patient_priority(str) in hospital.py. What would you have to do to call the function from your test code?
  • Question: The expression given to an assert statement can be any expression (simple or complex) as long as it evaluates to _____ or _____.
  • Question: Who writes unit tests?
  • Question: The name for a test that tests an individual module is a ______ test.
  • Question: Why do you think we write separate assert statements for each function input, rather than one assert statement that calls the function multiple times with different inputs? That is, why not do assert sample.reverse_string("alice") == "ecila" and sample.reverse_string("") == ""?

  1. In Python parlance, a single file is called a module. You can create complicated modules that are collections of multiple source files. This is how many popular Python libraries like random work, as do third party libraries like pytorch and keras used for machine learning. It is a way to bundle functions and classes for convenient use in source code. ↩︎

  2. If you are dying to know the difference between x is False and x == False: with ==, some other values compare equal to True and False. For example, 0 == False and 1 == True are both True (try it). Empty values such as [] are falsy (bool([]) is False), even though [] == False itself evaluates to False. But only the object False is False, and only True is True. ↩︎

  3. The unit is usually a single class. However, in our case, there is no class, but a collection of functions in a file. Some people treat a file as a unit. But a file can have multiple classes in it. The definition of a unit is a bit fuzzy, but usually refers to either a class or a single file. ↩︎

3 - Structuring test code

Organizing the test code has benefits, just like organizing program code.

Limitations to the current approach

In the previous lab, we gathered our assert statements into a test file that can be run. If the test file runs to completion, our tests have passed. If it fails with an AssertionError, we know that a test has failed and something is wrong (either with the program code or the test code itself). We have the beginnings of automated unit testing.

Our current goal

What we have so far is a good start, but we have two things to improve upon:

  1. Currently, the test file stops at the first failing assert, so we only learn about one failure per run. Ideally, we would like to know if multiple test cases are failing.
  2. We would like to collect our test results in a human-friendly format. I run the test, I get a summary of passes and fails.

We can accomplish both of these things. First, we need to organize our test cases in our test file. Second, we will need help from developer tools.

Current state

Here is our sample.py file:

sample.py

def palindrome_check(s):
    cleaned_str = ''.join(s.lower()) 
    return cleaned_str == cleaned_str[::-1]

def is_prime(n):
    if n <= 1:
        return False
    for i in range(2, int(n**0.5) + 1):
        if n % i == 0:
            return False
    return True

def reverse_string(s):
    return s[::-1]

And here is the test code:

test_sample.py

import sample  # We import the filename without the .py

assert sample.palindrome_check("kayak")  # the function should return True, giving "assert True"
assert sample.palindrome_check("Kayak")
assert sample.palindrome_check("moose") is False  # the function should return False, giving "assert False is False", which is True

assert sample.is_prime(1) is False
assert sample.is_prime(2)
assert sample.is_prime(8) is False

assert sample.reverse_string("press") == "sserp"  # checking result for equality with expected
assert sample.reverse_string("alice") == "ecila"
assert sample.reverse_string("") == ""
print("All assertions passed!")

Remember, we use the naming convention test_<file>.py to identify the unit test for <file>.py.

Organizing test code into test cases

To meet our goal, we will first organize our assert statements into test cases, a term with both a conceptual and a literal definition:

  1. test case (concept): inputs and expected results developed for a particular objective, such as to exercise a particular program path or verify that a particular requirement is met. [Adapted from ISO/IEC/IEEE 24765].
  2. test case (literal): a test function within a test file.

Let’s start simple. Let’s move the assert statements that test each function into their own functions in the test file like so:

test_sample.py

import sample  # We import the filename without the .py

def test_palindrome():
    assert sample.palindrome_check("kayak")  # the function should return True, giving "assert True"
    assert sample.palindrome_check("Kayak")
    assert sample.palindrome_check("moose") is False  # the function should return False, giving "assert False is False", which is True

def test_is_prime():
    assert sample.is_prime(1) is False
    assert sample.is_prime(2)
    assert sample.is_prime(8) is False

def test_reverse():
    assert sample.reverse_string("press") == "sserp"  # checking result for equality with expected
    assert sample.reverse_string("alice") == "ecila"
    assert sample.reverse_string("") == ""

# run the test cases when executing the file
if __name__ == "__main__": 
    test_palindrome()
    test_is_prime()
    test_reverse()

We say now that each of test_palindrome(), test_is_prime(), and test_reverse() is a test case. We have three (3) test cases in one (1) unit test file.

Note the naming convention: all the test case functions begin with the string test_. This is a requirement of the developer tool in the next lab that will help us run multiple test cases even if one of them fails.

The block beginning with if __name__ == "__main__": allows us to run the tests by running the file. You should not see any output when you run the unit test because all of these assert statements should evaluate to True.

Diversifying our test cases

One test case for each function in your program code is where you should start. However, we often want more than one test case per program code function. Why?

Consider why we have multiple simple assert statements. Suppose we have the following valid assertion: assert sample.is_prime(1) is False and sample.is_prime(2). Now, suppose this assertion failed due to a bug in our program code. The bug could either be with the logic of dealing with the input 1 or 2. We put our checks in separate assert statements so we know precisely which input caused an error in the program code.

The same strategy applies when unit testing program code.
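To make this concrete, here is a sketch using a hypothetical buggy version of is_prime() (the n <= 0 check wrongly lets 1 through, so 1 is misclassified as prime):

```python
# Hypothetical buggy is_prime(): the check should be n <= 1.
def is_prime(n):
    if n <= 0:          # bug: 1 slips through and is called prime
        return False
    for i in range(2, int(n**0.5) + 1):
        if n % i == 0:
            return False
    return True

try:
    # The combined assert fails, but the failure alone does not say
    # whether the input 1 or the input 2 exposed the bug.
    assert is_prime(1) is False and is_prime(2)
except AssertionError:
    print("combined assert failed -- but for which input?")

# Separate asserts pinpoint it: the first would fail on its own...
# assert is_prime(1) is False   # fails: 1 is misclassified as prime
assert is_prime(2)              # ...while this one passes
```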

Program paths

A program path is a sequence of instructions (lines of code) that may be performed in the execution of a computer program. [ISO/IEC/IEEE 24765] Take a look at is_prime() in sample.py:

 5  def is_prime(n):
 6      if n <= 1:
 7          return False
 8      for i in range(2, int(n**0.5) + 1):
 9          if n % i == 0:
10              return False
11      return True

Program paths are formed by the unique sequence of instructions (lines of code) that may be executed. is_prime() has three unique program paths:

  1. Giving the input 1 executes lines 5, 6 and 7. This path (5,6,7) deals with special cases where our input is ≤ 1. One (1) itself is not prime, and neither are 0 or negative numbers by definition.
  2. Giving the input 4 executes lines 5, 6, 8, 9, and 10. This path (5,6,8,9,10) accounts for numbers > 1 that are not prime.
  3. Giving the input 5 will execute lines 5, 6, 8, 9 and 11. This path (5,6,8,9,11) accounts for numbers > 1 that are prime. The input 3 is a special case of this path: the range on line 8 is empty, so the loop body (line 9) never executes.
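One way to convince yourself of these paths is to add temporary print statements to a throwaway copy of the function (the line numbers in the comments refer to the listing above):

```python
# Throwaway, instrumented copy of is_prime() for tracing program paths.
def is_prime_traced(n):
    print("line 6: check n <= 1")
    if n <= 1:
        print("line 7: return False (path 1)")
        return False
    print("line 8: enter loop")
    for i in range(2, int(n**0.5) + 1):
        print(f"line 9: test divisor {i}")
        if n % i == 0:
            print("line 10: return False (path 2)")
            return False
    print("line 11: return True (path 3)")
    return True

is_prime_traced(1)   # prints path 1
is_prime_traced(4)   # prints path 2
is_prime_traced(5)   # prints path 3
```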

Path testing

Let’s group assert statements that test “a particular program path” or “a particular requirement” (see the test case definition) into separate test cases. Change test_is_prime() to the following:

test_sample.py

def test_is_prime():
    assert sample.is_prime(2)
    assert sample.is_prime(8) is False
    assert sample.is_prime(2719)
    assert sample.is_prime(2720) is False

def test_is_prime_special_cases():
    assert sample.is_prime(1) is False
    assert sample.is_prime(0) is False
    assert sample.is_prime(-1) is False

These test cases both verify is_prime() but examine different program paths.

test_is_prime_special_cases() tests path #1 (previous subsection). If it fails, we know something is wrong with the part of our algorithm that handles the special case of integers ≤ 1.

test_is_prime() tests paths #2 and #3. If it fails, we know something is wrong with the part of the algorithm that checks whether the input is divisible by a potential factor.

The ability to pinpoint where the algorithm is failing is very useful to the developer when they go to debug, especially when there are many test cases and hundreds of lines of program code.

Some functions only have one program path, and so one test case may be sufficient.

Your testing strategy

Writing separate test cases for each program path or requirement is a testing strategy. But it can be hard to identify all the program paths, or to know how many tests are “enough”.

For now, start with one test case per program function.

Then ask yourself, “are there sets of input where the program behaves differently than for other inputs?” If so, divide your test case to separate those input sets. In is_prime(), the program behaves differently if you give it inputs ≤ 1 vs. inputs > 1 that are prime vs. inputs > 1 that are not prime.

We will discuss how to analyze a program to create a good test strategy in future lessons, as well as quantify how good our tests are.

Exercise

Our test_is_prime() has lumped together the program paths where the number is prime and where it is not. Reorganize this test into two test cases: one for each program path. Write one test case asserting only prime numbers > 1, and the other only non-prime numbers > 1.

Knowledge check

  • Question: In test code, a single function is called what?
  • Question: How many program paths will a function with a single if-else statement have?
  • Question: What is a program path?
  • Question: Conceptually, what is a test case?
  • Question: Besides generally being more organized, why do software developers want to split up their tests into multiple test cases?
  • Question: Suppose you have a program file that defines the functions foo() and bar(). How many test cases should you have at a minimum in your test code? What should they be named?

4 - pytest

Use a test framework, pytest, to run tests and collect results.

Test frameworks

We have created a well-organized unit test in the previous lab. Our test code is looking good, but we still need to address two issues for it to be truly useful:

  1. We would like to know if multiple test cases are failing.
  2. We would like to collect our test results in a human-friendly format.

Automated test frameworks address these issues: they find and execute test code (often through naming conventions like test_*), capture assertion exceptions (test case failures), and generate summaries of which tests pass and fail.

Automated test frameworks are an integral part of modern software engineering.

Introducing pytest

We will use an automated test framework for Python called pytest. Test frameworks are language-specific. Java has JUnit, C++ has CPPUnit, JavaScript has multiple options, etc. Automated test frameworks exist for nearly every programming language and do largely the same things.

pytest is a library. Libraries are source code or compiled binaries that provide useful functions. They are almost always written in the same programming language as the program code. Professional software engineers use third-party libraries, often open source, to provide functions that they would otherwise have to write themselves.

In our case, we could write some try-except blocks to catch our assertion exceptions, create counters to track the number of tests passed or failed, and then print out the results. But why do that when we can use a library? No sense in reinventing the wheel.
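For a sense of what the library saves us, such a hand-rolled runner might look something like this sketch (the demo functions are hypothetical stand-ins; real test cases would be named test_*):

```python
# A minimal hand-rolled test runner: catch AssertionErrors, count
# passes and failures, and print a summary. pytest does all of this
# (and much more) for us.
def demo_pass():
    assert 2 + 2 == 4

def demo_fail():
    assert 2 + 2 == 5  # deliberately wrong

passed, failed = 0, 0
for case in (demo_pass, demo_fail):
    try:
        case()
        passed += 1
    except AssertionError:
        failed += 1
        print(f"FAILED: {case.__name__}")

print(f"{passed} passed, {failed} failed")
```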

Installing pytest with pip

We install pytest and another tool we will use later from the CLI. Follow the instructions for your setup:

If Python and pip are already set up on your machine (e.g., macOS or Windows), run:

pip3 install -U pytest pytest-cov

On Ubuntu/Debian Linux:

  1. Run these commands first from your Terminal:
    sudo apt update -y && sudo apt upgrade -y
    sudo apt install python3-pip python3-venv
    # Make sure your working directory is the directory the test files are in.
    python3 -m venv .venv  # This will create a subdirectory named .venv/
    
  2. Open Visual Studio Code in the working directory. It is essential that your testing-lab/ directory is the top level of Visual Studio Code.
  3. Press Ctrl+Shift+P or select View > Command Palette
  4. Search for “environment” and select Python: Create Environment…
  5. Select Venv
  6. Select Use Existing
  7. The integrated Terminal in Visual Studio Code should restart, and you should see a little (.venv) at the beginning of the command line. Contact the instructor if you do not.
  8. You will run all subsequent Terminal commands from the integrated Terminal in Visual Studio Code.
  9. From the integrated terminal, run
    pip install pytest pytest-cov

What is pip? It is basically the App Store for Python packages. A package contains one or more libraries or executable tools. pip was included when you installed Python on your computer. We will use pip again to install useful packages in future labs.

Running test code with pytest

You should have a testing-lab/ directory containing sample.py and test_sample.py. If not, grab the files from the previous lab. Change into the testing-lab/ directory so that it is the working directory in the terminal.

Run pytest test_sample.py in the terminal. You should see console output similar to the following:

collected 3 items                                  

test_sample.py ...                           [100%]

================ 3 passed in 0.01s =================

pytest scans your test file looking for functions that follow the naming convention test_<function_name> and “collects” them. I had three test case functions in my code, but you may have more or fewer, so your “collected” number may be different. Test case function names must start with test_ for pytest to run them.

pytest then calls each test case separately and checks to see if the test case throws an AssertionError. If so, the test case fails. If not, the test case passes.

Let’s introduce errors in our program code sample.py to show pytest collecting multiple test case failures, which is one of our improvements needed for automated unit testing.

Open sample.py and make the following changes:

def palindrome_check(s):
    # cleaned_str = ''.join(s.lower()) 
    cleaned_str = ''.join(s)  # this makes "Kayak" no longer a palindrome because of different case 
    return cleaned_str == cleaned_str[::-1]

def is_prime(n):
    # if n <= 1:
    if n <= 0:  # the algorithm will now say that 1 is prime, which is incorrect by definition
        return False
    for i in range(2, int(n**0.5) + 1):
        if n % i == 0:
            return False
    return True

Now run pytest test_sample.py again. Your output should now look something like this:

collected 3 items                                                                                                                                      

test_sample.py FF.                                                                                                                               [100%]

======================================================================= FAILURES =======================================================================
___________________________________________________________________ test_palindrome ____________________________________________________________________

    def test_palindrome():
        assert sample.palindrome_check("kayak")  # the function should return True, giving "assert True"
>       assert sample.palindrome_check("Kayak")
E       AssertionError: assert False
E        +  where False = <function palindrome_check at 0x1023494e0>('Kayak')
E        +    where <function palindrome_check at 0x1023494e0> = sample.palindrome_check

test_sample.py:5: AssertionError
____________________________________________________________________ test_is_prime _____________________________________________________________________

    def test_is_prime():
>       assert sample.is_prime(1) is False
E       assert True is False
E        +  where True = <function is_prime at 0x1023493a0>(1)
E        +    where <function is_prime at 0x1023493a0> = sample.is_prime

test_sample.py:9: AssertionError
=============================================================== short test summary info ================================================================
FAILED test_sample.py::test_palindrome - AssertionError: assert False
FAILED test_sample.py::test_is_prime - assert True is False
============================================================= 2 failed, 1 passed in 0.03s ==============================================================

We can see in the nice human-friendly summary at the end that 2 failed and 1 passed. The names of the test cases that failed are printed, as are the exact assert calls that failed.

Other ways of running pytest

  1. You can run pytest without giving it a target file. pytest will scan the working directory looking for files with the naming convention test_<file>.py. It will collect and run test cases from all test_<file>.py it finds.
  2. Try running pytest --tb=line to get a condensed version of the results if you find the output to be overwhelming.

Recap

We accomplished a couple significant things in this lab:

  1. We installed the pytest package using pip. Again, you only need to do this once.
  2. We ran pytest, which scans for files and functions named test_* and runs them.
  3. pytest collects test case successes and failures independently from one another, allowing us to get more information with each run of our test code.
  4. pytest displays a summary of the results in human-friendly format.
  5. All popular programming languages have a test framework. You will need to seek out one for the language you are working in.

Knowledge check

  • Question: The Python tool we run to install Python packages is called _______.
  • Question: For pytest to find and execute tests automatically, the test files and test cases must begin with __________.
  • Question: (True/False) You can have multiple assert statements in a single test case?
  • Question: Create a file called my_math.py (not math.py, which would clash with Python’s built-in math module) with the following function:
    def compute_factorial(n):
        if n < 0:
            return "Factorial is not defined for negative numbers."
        elif n == 0 or n == 1:
            return 1
        else:
            factorial = 1
            for i in range(2, n + 1):
                factorial *= i
            return factorial
    
    1. Create a test file.
    2. Implement one or more test cases that cover all program paths in the function.
    3. Use pytest to execute your test code.

5 - Testing for exceptions

How to test for expected exceptions.

Before you start

If necessary, fix up your sample.py so that all your test cases pass.

Testing for exceptions

Sometimes, the expected behavior of a function is that it throws an exception. How do we test for expected exceptions given an input?

Suppose we want reverse_string() to work only for strings containing the letters [a–z] and to throw an exception if the string contains any other characters. Change reverse_string() in sample.py to the following:

def reverse_string(s):
    if not s.isalpha():
        raise ValueError('letters a-z only')
    return s[::-1]

This is appropriate given the requirements of reverse_string(). It returns a reversed str input under normal circumstances, but raises an exception under abnormal circumstances, a.k.a., exceptional conditions from our problem statement structure.

“Raising” and “throwing” an exception are the same thing. You will hear both terms in practice. The keyword in Python is raise, and built-in Python exception names conventionally end with the string Error, e.g., ValueError or IndexError.
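A quick illustration of raising and catching (a hypothetical function, separate from sample.py):

```python
# Hypothetical example: raise a ValueError for invalid input and
# catch it with try/except.
def first_letter(s):
    if not s.isalpha():
        raise ValueError("letters a-z only")
    return s[0]

try:
    first_letter("abc123")   # not purely alphabetic, so it raises
except ValueError as e:
    print(e)                 # prints: letters a-z only

print(first_letter("kayak"))  # prints: k
```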

Exercise

  1. Define a new test case in test_sample.py named test_reverse_exception and add a call to sample.reverse_string with an input that will trigger the exception.
  2. Run pytest. You should see a test summary similar to the following:
================================= short test summary info =================================
FAILED test_sample.py::test_reverse - ValueError: letters a-z only
FAILED test_sample.py::test_reverse_exception - ValueError: letters a-z only
=============================== 2 failed, 2 passed in 0.06s ===============================

I have two test failures: the new test case I created, and the original test_reverse. This is because test_reverse in my code contains the call assert sample.reverse_string("") == "". The empty string is not alphabetic (s.isalpha() returns False for it), so an exception is correctly raised.

This is an important lesson: as program code evolves, so too might the test code. Move the assert sample.reverse_string('') to the test_reverse_exception test case where it logically belongs.

Your test cases for reverse_string should now look something like this:

def test_reverse():
    assert sample.reverse_string("press") == "sserp"  # checking result for equality with expected
    assert sample.reverse_string("alice") == "ecila"

def test_reverse_exception():
    sample.reverse_string("abc123")
    sample.reverse_string("")

Verifying expected exceptions with pytest

Our assert statements only check the return values of functions. pytest provides a convenient helper function to check if an exception was raised.

First, add the line import pytest to the top of your test code file test_sample.py.

Second, change test_reverse_exception to the following:

18  def test_reverse_exception():
19      with pytest.raises(ValueError):   # the pytest.raises comes from the imported pytest module
20          sample.reverse_string("abc123")
21      
22      with pytest.raises(ValueError) as err:  # we can optionally capture the exception in a variable
23          sample.reverse_string("")
24      assert str(err.value) == "letters a-z only"  # convert the exception to a str and verify the error message

A few things of note:

  • pytest.raises(...) requires that you specify the type of exception. In our case, we expect a ValueError to be raised.
  • We can optionally capture the exception itself. That’s what as err does on line 22. err is a variable (name it whatever you want) that captures the exception.
  • On line 24, we call str(err.value) to convert the captured exception to a string. That error message should be "letters a-z only", which comes from the line raise ValueError('letters a-z only') in sample.py.

This test case would fail if reverse_string() did not raise an exception.

Exercise

  1. Comment out the if-statement and exception raising lines in reverse_string() and rerun pytest. How does the pytest output for an expected exception differ from a failed assert?

Checking exception values

Checking the exception message is useful because we may want our function to raise a ValueError under different circumstances. For example, maybe we want to raise a ValueError for the empty string that says ‘string must not be empty’, and a different ValueError that says ‘letters a-z only’ for other invalid input.

Why would you want to raise two different ValueErrors? Because it tells the caller of reverse_string() what they did wrong and how to fix it. The rationale is similar to why we split our assert statements and test cases into multiple instances: to get more precise information.

Exercise

  1. Put the if-statement and exception raising back in reverse_string(). Add an if-statement at the beginning of the function to check if the input parameter is the empty string. If so, raise ValueError('string must not be empty'). Re-run pytest. What happens?
  2. Modify your test_reverse_string so that both with pytest.raises(...) calls capture the error as in line 22. Add/modify assert statements to verify that the appropriate error message is in the exception.

Recap

We accomplished a couple significant things in this lab:

  1. We installed the pytest package using pip. Again, you only need to do this once.
  2. We ran pytest, which scans for files and functions named test_* and runs them.
  3. pytest collects test case successes and failures independently from one another, allowing us to get more information with each run of our test code.
  4. pytest displays a summary of the results in human-friendly format.

Knowledge check

  • Question: (True/False) Raising and throwing exceptions are two different things.
  • Question: Why should you not test exception logic in the same test case where you test “normal” logic?
  • Write a code block using pytest that checks that the determine_priority(str) function correctly throws a TypeError when passed anything other than a string.
  • Question: What happens when running pytest and the program code raises an exception that you do not expect?

6 - Test coverage

Computing an objective measure of test quality.

You are getting the first edition of all these pages. Please let me know if you find an error!

Before you start

You must have completed the lab on Testing for exceptions.

Motivation

Software engineers need some measure of the quality of the tests they write, but test quality is not a simple thing to assess.

  • Does a good test find bugs? Hopefully, but also, we should be writing our code to not have bugs!
  • Do we count how many lines of test code we have? Is it more than source code? Maybe, but that doesn’t mean we are testing the right things.
  • Do our tests check independent things in the code? How can we determine that automatically if so?

Measuring test case quality is not straightforward, but there is one generally agreed-upon measure used as a baseline: test coverage.

Test coverage

Test coverage is a measure of how much of source code is executed when the tests run. There are three measures of “how much”:

  1. Line coverage or statement coverage is the percentage of source lines of code executed by your test cases. We do not include test code lines when counting the percentage of code.
  2. Branch coverage is the percentage of branch outcomes executed by your test cases, i.e., the True and False paths out of each decision point.
  3. Conditional coverage is the percentage of Boolean conditions evaluated by your test cases.

Consider the following (very poorly designed and implemented) code snippet:

def authorize(is_authenticated, user_id, caller):
    if is_authenticated is True or (user_id.startswith('admin') and caller == "privileged"):
        return True

Now consider the following test case:

def test_authorize():
    assert my_module.authorize(True, "bob", "privileged") is True

  • This test case has 100% line coverage because all lines of code are executed.
  • This test case has 50% branch coverage because only one branch is executed: the one where the if-statement evaluates to True.
  • This test case has 33% conditional coverage because only one Boolean condition is evaluated (is_authenticated is True); since it is True, or short-circuits and the other conditions, user_id.startswith('admin') and caller == "privileged", are never evaluated.

Line coverage is the least precise, and conditional coverage is the most precise.

Test coverage is computed over the union of all source lines, branches, and conditions executed by our test cases. So we can easily write additional test cases that, collectively, reach 100% statement, branch, and condition coverage.
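Continuing the authorize() example, the sketch below adds two test cases that cover the remaining branch and conditions. authorize() is reproduced inline so the example is self-contained; in your test file you would import it from its module instead:

```python
def authorize(is_authenticated, user_id, caller):
    if is_authenticated is True or (user_id.startswith('admin') and caller == "privileged"):
        return True

def test_authorize():
    # True branch via the first condition; `or` short-circuits,
    # so the parenthesized conditions are never evaluated
    assert authorize(True, "bob", "privileged") is True

def test_authorize_admin_caller():
    # True branch via the second and third conditions
    assert authorize(False, "admin_jane", "privileged") is True

def test_authorize_denied():
    # False branch: all three conditions are evaluated (the last one to False),
    # so the function falls through and implicitly returns None
    assert authorize(False, "admin_jane", "unprivileged") is None
```

Collectively, these three test cases reach 100% line, branch, and conditional coverage of authorize().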

You want to target 100% condition coverage, but achieving 100% of any coverage can be challenging in a real system. Exception handling and user interface code in complex systems can be hard to test for a variety of reasons.

In practice, most organizations aim for 100% line coverage as a target.

Using pytest-cov to compute test coverage

Most test frameworks, like pytest and JUnit (for Java), also have tools for computing test coverage. Manually computing these measures would be too tedious. These tools compute line coverage, but not always branch coverage, and almost never condition coverage because of the technical challenges of automating that calculation.

We installed the pytest-cov tool when we installed pytest. If you skipped that step, refer back to the instructions for installing pytest and pytest-cov.

Running pytest-cov

Run the following command from your Terminal in the directory with sample.py and test_sample.py from the previous labs.

pytest --cov . - This tells pytest to run tests in the current directory, ., and generate the coverage report. You should see something similar to the following:

============================================================= test session starts ==============================================================
platform darwin -- Python 3.12.2, pytest-8.3.3, pluggy-1.5.0
rootdir: /Users/laymanl/git/uncw-seng201/content/en/labs/testing/coverage
plugins: cov-5.0.0
collected 4 items                                                                                                                              

test_sample.py ....                                                                                                                      [100%]

---------- coverage: platform darwin, python 3.12.2-final-0 ----------
Name             Stmts   Miss  Cover
------------------------------------
sample.py           23      6    74%
test_sample.py      23      3    87%
------------------------------------
TOTAL               46      9    80%


============================================================== 4 passed in 0.03s ===============================================================

pytest executes your tests as well, so if any tests fail, you will see that output as well. Note that failing tests can lower your test coverage!

  • The general format for the command is pytest --cov <target_directory>
  • To get branch coverage, run the command pytest --cov --cov-branch <target_directory>

Generating a coverage report

You can also generate an HTML report with pytest --cov --cov-branch --cov-report=html <target_directory>. This will create a folder named htmlcov/ in the working directory. Open the htmlcov/index.html file in a web browser, and you will see an interactive report that shows you which lines are and are not covered.

A sample coverage report viewable in a web browser

Knowledge check

  • Test coverage is a measure of how much _________________ is executed when the __________________ runs.
  • Explain the difference between branch coverage and conditional coverage.
  • Give an example of a function and a test case where you have 100% branch coverage but <100% conditional coverage.
  • (True/False) Branch coverage is more precise than statement coverage.