How should unit tests be documented?

Question:

I’m trying to improve the number and quality of tests in my Python projects. One of the difficulties I’ve encountered as the number of tests increases is knowing what each test does and how it’s supposed to help spot problems. I know that part of keeping track of tests is better unit test names (which has been addressed elsewhere), but I’m also interested in understanding how documentation and unit testing go together.

How can unit tests be documented to improve their utility when those tests fail in the future? Specifically, what makes a good unit test docstring?

I’d appreciate both descriptive answers and examples of unit tests with excellent documentation. Though I’m working exclusively with Python, I’m open to practices from other languages.

Asked By: ddbeck


Answers:

I document most of my unit tests with the method name alone:

testInitializeSetsUpChessBoardCorrectly()
testSuccessfulPromotionAddsCorrectPiece()

For almost 100% of my test cases, this clearly explains what the unit test is validating and that’s all I use. However, in a few of the more complicated test cases, I’ll add a few comments throughout the method to explain what several lines are doing.

I’ve seen a tool before (I believe it was for Ruby) that would generate documentation files by parsing the names of all the test cases in a project, but I don’t recall the name. If you had test cases for a chess Queen class:

testCanMoveStraightUpWhenNotBlocked()
testCanMoveStraightLeftWhenNotBlocked()

the tool would generate an HTML doc with contents something like this:

Queen requirements:
 - can move straight up when not blocked.
 - can move straight left when not blocked.
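
A rough sketch of that idea in Python (the helper name and the camelCase “test” prefix convention here are assumptions, not the actual tool’s behavior):

import re

def test_name_to_sentence(name):
    # Drop the 'test' prefix, split the camelCase name into words,
    # and join them into a lowercase sentence.
    words = re.findall(r"[A-Z][a-z]*|\d+", name[len("test"):])
    return " ".join(word.lower() for word in words) + "."

print(test_name_to_sentence("testCanMoveStraightUpWhenNotBlocked"))
# can move straight up when not blocked.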
Answered By: Kaleb Brasee

The name of the test method should describe exactly what you are testing. The documentation should say what makes the test fail.
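
For example, a minimal unittest sketch (the ChessBoard stand-in is invented here so the example runs):

import unittest

class ChessBoard:
    # Minimal stand-in for illustration; not a real implementation.
    def __init__(self):
        self.back_rank = ["rook", "knight", "bishop", "queen",
                          "king", "bishop", "knight", "rook"]

class ChessBoardTests(unittest.TestCase):
    def test_initialize_sets_up_back_rank(self):
        """Fails when ChessBoard() places the wrong piece (or no piece)
        on any back-rank square."""
        board = ChessBoard()
        self.assertEqual(
            ["rook", "knight", "bishop", "queen",
             "king", "bishop", "knight", "rook"],
            board.back_rank)

The name says what is being tested; the docstring says what makes it fail.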

Answered By: Bill the Lizard

You should use a combination of descriptive method names and docstrings. A good way to do this is to include the basic procedure and the verification steps in the docstring. Then, if you run these tests from a testing framework that automates running them and collecting results, you can have the framework log the contents of each test method’s docstring along with its stdout and stderr.

Here’s a basic example:

import unittest

class SimpleTestCase(unittest.TestCase):
    def testSomething(self):
        """ Procedure:
            1. Print something
            2. Print something else
            ---------
            Verification:
            3. Verify no errors occurred
        """
        print("something")
        print("something else")

Having the procedure with the test makes it much easier to figure out what the test is doing, and including the docstring in the test output makes it much easier to work out what went wrong when going through the results later. The previous place I worked did something like this, and it worked out very well when failures occurred; we ran the unit tests automatically on every check-in, using CruiseControl.
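
With unittest, one way to get the docstring into the output is a custom result class. A minimal sketch (this assumes the first line of the docstring is enough; shortDescription() returns exactly that, or None if there is no docstring):

import unittest

class DocstringTestResult(unittest.TextTestResult):
    def addFailure(self, test, err):
        super().addFailure(test, err)
        # Log the first line of the failing test's docstring.
        doc = test.shortDescription()
        if doc:
            self.stream.writeln("Docstring: " + doc)

if __name__ == "__main__":
    unittest.main(testRunner=unittest.TextTestRunner(
        resultclass=DocstringTestResult, verbosity=2))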

Answered By: Chris Lacasse

Perhaps the issue isn’t how best to write test docstrings, but how to write the tests themselves? Refactoring tests so that they’re self-documenting can go a long way, and your docstrings won’t go stale when the code changes.

There are a few things you can do to make the tests clearer:

  • clear & descriptive test method names (already mentioned)
  • test body should be clear and concise (self-documenting)
  • abstract away complicated setup/teardown etc. into helper methods
  • more?

For example, if you have a test like this:

def test_widget_run_returns_0():
    widget = Widget(param1, param2, "another param")
    widget.set_option(True)
    widget.set_temp_dir("/tmp/widget_tmp")
    widget.destination_ip = "10.10.10.99"

    return_value = widget.run()

    assert return_value == 0
    assert widget.response == "My expected response"
    assert widget.errors is None

You might replace the setup statements with a method call:

def test_widget_run_returns_0():
    widget = create_basic_widget()
    return_value = widget.run()
    assert return_value == 0
    assert_basic_widget(widget)

def create_basic_widget():
    widget = Widget(param1, param2, "another param")
    widget.set_option(True)
    widget.set_temp_dir("/tmp/widget_tmp")
    widget.destination_ip = "10.10.10.99"
    return widget

def assert_basic_widget(widget):
    assert widget.response == "My expected response"
    assert widget.errors is None

Note that your test method is now composed of a series of method calls with intent-revealing names, a sort of DSL specific to your tests. Does a test like that still need documentation?

Another thing to note is that your test method now sits at a single level of abstraction. Someone reading it will see that the algorithm is:

  • creating a widget
  • calling run on the widget
  • asserting the code did what we expect

Their understanding of the test method is not muddied by the details of setting up the widget, which is one level of abstraction lower than the test method.

The first version of the test method follows the Inline Setup pattern. The second version follows Creation Method and Delegated Setup patterns.

Generally I’m against comments, except where they explain the “why” of the code. Reading Uncle Bob Martin’s Clean Code convinced me of this. There is a chapter on comments, and there is a chapter on testing. I recommend it.

For more on automated testing best practices, do check out xUnit Patterns.

Answered By: Mike Mazur

When the test fails (which should happen before it ever passes), you should be able to look at the error message and tell what’s wrong. That only happens if you plan it that way.

It’s entirely a matter of the naming of the test class, the test method, and the assert message. When a test fails and you can’t tell what’s wrong from those three clues, rename some things or break up some test classes.

That doesn’t happen if the name of the fixture is ClassXTests, the name of the test is TestMethodX, and the error message is “expected true, returned false”. That’s a sign of sloppy test writing.
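
For example, here is the difference in a minimal unittest sketch (the promote_pawn stub is invented so the example runs):

import unittest

def promote_pawn(square, choice):
    # Stub standing in for real game logic, for illustration only.
    return choice

class PromotionTests(unittest.TestCase):
    def test_successful_promotion_adds_correct_piece(self):
        piece = promote_pawn("e8", choice="queen")
        # Sloppy: fails with only "False is not true":
        #     self.assertTrue(piece == "queen")
        # Better: assertEqual plus a message names the expectation
        # and reports the actual value in the failure output:
        self.assertEqual("queen", piece,
                         "promoting the e8 pawn should add a queen")

if __name__ == "__main__":
    unittest.main()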

Most of the time you shouldn’t have to read the test or any comments to know what has happened.

Answered By: Tim Ottinger