Practical Test Improvement

Time to put some of the unit testing guidelines I described last time into practice. In this post, I’ll go through some concrete examples of improving the overall quality of a unit test suite.

The pmp-library will serve as a practical example: Its current test suite is reasonably comprehensive, amounting to a test coverage of around 96%. Most of the tests, however, have been written

  • in a rush,
  • without too much thought,
  • to cover as much code as possible, and
  • long after the original code was written.

Yours truly is partially responsible for that, so it’s only fair that I spend some time cleaning things up.

Before starting, recall the two fundamental qualities of a unit test:

  1. It runs fast.
  2. It helps to localize problems.

I’ll focus on performance first.

Assessing Performance

We can run the full pmp-library test suite through a simple call to

$ make test

This reports a run time of approximately 1.72 seconds on my Dell XPS 13 with an Intel Core i7-7560 CPU running at 2.4 GHz. That’s not too bad, but I’m sure we can do better. We’re currently using Google Test for unit testing. To get a more detailed picture, we can run the gtest_runner directly:

$ cd tests
$ ./gtest_runner

This will report individual run times for each test and test case1. Skimming through the results quickly reveals the worst offender:

...
[----------] 1 test from PointSetAlgorithmsTest
[ RUN      ] PointSetAlgorithmsTest.smoothing
[       OK ] PointSetAlgorithmsTest.smoothing (1270 ms)
[----------] 1 test from PointSetAlgorithmsTest (1270 ms total)
...

Turns out the biggest time eater right now is the PointSetAlgorithmsTest: this case alone takes up almost 75% of the run time of the entire test suite, yet it currently contains only a single test for point set smoothing using iterated Moving Least Squares projection. Given the limited scope of the test, it’s hardly justified that it takes up so much of our precious testing time.

Improving Performance

Although it is a bit embarrassing, here’s the PointSetAlgorithmsTest test case in all its glory:

class PointSetAlgorithmsTest : public ::testing::Test
{
public:
    PointSetAlgorithmsTest()
    {
        ps.read("pmp-data/xyz/armadillo_low.xyz");
    }
    PointSet ps;
};

TEST_F(PointSetAlgorithmsTest, smoothing)
{
    Scalar origBounds = ps.bounds().size();
    PointSetSmoothing pss(ps);
    pss.smooth();
    Scalar newBounds = ps.bounds().size();
    EXPECT_LT(newBounds, origBounds);
}

For now, I’m ignoring the fact that checking for shrinking model bounds is not really a meaningful test and concentrating on performance instead. The biggest performance impact comes from

  1. using an unnecessarily large model and
  2. loading the model from a file.

This not only hurts performance, it also directly violates two criteria of a good unit test: the test touches the file system, and it depends on the execution environment. This applies to other cases in the test suite as well, but I’ll stick to this one for now.

A faster and more self-contained way to obtain a test model is to generate a synthetic one on the fly. This also encourages concentrating on a model that is as simple as possible. For this particular case, generating a simple unit sphere is fully sufficient2. Here’s a little helper function for doing that:

#include <cmath>   // std::sqrt
#include <random>  // std::default_random_engine, std::normal_distribution

// Randomly generate nPoints on the unit sphere.
// Stores the resulting points and normals in ps.
void generateRandomSphere(size_t nPoints, PointSet& ps)
{
    ps.clear();

    // normal distribution random number generator
    std::default_random_engine generator;
    std::normal_distribution<double> distribution(0.0, 1.0);

    auto normals = ps.vertexProperty<Normal>("v:normal");

    // generate points and add to point set
    for (size_t i(0); i < nPoints; i++)
    {
        double x = distribution(generator);
        double y = distribution(generator);
        double z = distribution(generator);

        // reject if all zero
        if (x == 0 && y == 0 && z == 0)
        {
            i--;
            continue;
        }

        // normalize
        double mult = 1.0 / std::sqrt(x * x + y * y + z * z);
        Point p(x, y, z);
        p *= mult;

        // add point and normal
        auto v = ps.addVertex(p);
        normals[v] = p;
    }
}

Using this function to generate the test model and removing the unnecessary test fixture setup leads to a simple and self-contained test:

TEST(PointSetAlgorithmsTest, smoothing)
{
    PointSet ps;
    generateRandomSphere(1000, ps);
    Scalar origBounds = ps.bounds().size();

    PointSetSmoothing pss(ps);
    pss.smooth();

    Scalar newBounds = ps.bounds().size();
    EXPECT_LT(newBounds, origBounds);
}

This change brings the run time for this test down to a bare 5 ms:

...
[----------] 1 test from PointSetAlgorithmsTest
[ RUN      ] PointSetAlgorithmsTest.smoothing
[       OK ] PointSetAlgorithmsTest.smoothing (5 ms)
[----------] 1 test from PointSetAlgorithmsTest (5 ms total)
...

For the full test suite this change reduces the run time from 1.72s to 0.44s, almost a factor of 4. Good enough for now.

Improving Localization

Originally, I introduced PointSetAlgorithmsTest following the model of its counterpart for surface meshes, SurfaceMeshAlgorithmsTest, which contains the tests for all higher-level surface mesh processing algorithms in the library. This strategy obviously does not provide proper localization: if a test fails, it is not straightforward to find out which part of the code is causing trouble.

In retrospect, it would have been more coherent to have separate cases for each of the algorithms under test. After all, the classes and functions implementing the algorithms are our basic units of code.

The PointSetAlgorithmsTest only contains the smoothing test, so it is quickly refactored into PointSetSmoothingTest. In the case of SurfaceMeshAlgorithmsTest the situation is slightly more complex because there are more algorithms and scenarios to test. Still, refactoring the test case is worth the effort: it not only provides better localization but also offers an opportunity to clean up redundant code by using algorithm-specific test fixtures.

To give a concrete example: all of the curvature computation tests contained the following lines to load a mesh of a hemisphere:

mesh.clear();
mesh.read("pmp-data/off/hemisphere.off");
SurfaceCurvature curvature(mesh);
curvature.analyze(1);

Grouping all curvature tests in a SurfaceCurvatureTest case allows us to use a single test fixture for the setup:

class SurfaceCurvatureTest : public ::testing::Test
{
public:
    SurfaceCurvatureTest()
    {
        EXPECT_TRUE(mesh.read("pmp-data/off/hemisphere.off"));
        curvature = new SurfaceCurvature(mesh);
        curvature->analyze(1);
    }

    ~SurfaceCurvatureTest()
    {
        delete curvature;
    }

    SurfaceMesh mesh;
    SurfaceCurvature* curvature;
};

Obviously, the test case still relies on loading data from a file, which is bad, and I’m going to change that at some point.

Conclusion

By applying some basic guidelines for good and clean unit tests, the pmp-library test suite now runs lightning-fast and more properly covers individual units of code.

However, there’s still room for improvement. Especially when it comes to the concepts we are actually testing for, I’m far from satisfied with what we’ve got so far.

A whole different problem is how to properly test file I/O and file formats. While there seem to be some strategies out there, I’m not yet sure which is the right way to go for the pmp-library. In case you have a suggestion, I’d be glad to hear it.

  1. Note that Google Test uses slightly confusing terminology for what is a test and what is a test case: basically, a test is a single test and a test case is a group of tests. See the Google Test Primer for details. 

  2. In a first attempt, I used a small sphere point set floating around on my machine. Much to my surprise, this caused the test to fail. It turned out the file had all-zero normals, essentially making the algorithm do nothing. A perfect reminder to harden algorithms against degenerate inputs! 
