kdgregory.com
So You Think You're Covered?

One of the more disturbing trends in testing is a reliance (in some cases, a corporate mandate) on code coverage metrics. Don't get me wrong: I wholeheartedly support coverage tools, and use both Cobertura and Emma on a regular basis (Cobertura has a plugin for Maven, Emma has a plugin for Eclipse). But I don't rely on either of them to tell me how well my unit tests exercise my code.

Because coverage metrics lie. Perhaps no more than any other metric, and definitely less than some. But as with any metric, the numbers that you get out of your coverage tool are an indication of how well your code is being exercised, not an absolute statement. The rest of this article examines some of the ways that coverage tools lie, and what you can do about it.

How Coverage Tools Work

To understand why coverage tools lie, it's necessary first to know how
they work. Which turns out to be very simple: the coverage tool adds code
to the class to track execution, either via a custom classloader or as a
post-compilation step. For example, here's the bytecode for a simple “Hello, World” main method:

    public static void main(java.lang.String[]) throws java.lang.Exception;
      Code:
       0: getstatic #19; //Field java/lang/System.out:Ljava/io/PrintStream;
       3: ldc #25; //String Hello, World
       5: invokevirtual #27; //Method java/io/PrintStream.println:(Ljava/lang/String;)V
       8: return

And here's what it looks like after being instrumented by Emma:

    public static void main(java.lang.String[]) throws java.lang.Exception;
      Code:
       0: getstatic #36; //Field $VR4019:[[Z
       3: iconst_1
       4: aaload
       5: astore_1
       6: getstatic #2; //Field java/lang/System.out:Ljava/io/PrintStream;
       9: ldc #3; //String Hello, World
       11: invokevirtual #4; //Method java/io/PrintStream.println:(Ljava/lang/String;)V
       14: aload_1
       15: iconst_0
       16: iconst_1
       17: bastore
       18: return

As you can see, the bytecode has more than doubled in size. If you don't
read bytecode, what's happening is that Emma creates a static array of boolean flags (the $VR4019 field above), and the instrumented code sets a flag to true once the corresponding block has executed.

And this brings up the first issue with coverage tools: how granular is their coverage? Both Emma and Cobertura track coverage to the level of a “basic block” (a straight-line run of bytecode, delimited by branch instructions). In other words, they will tell you if you haven't exercised both parts of a ternary expression. This is a very good thing. Other coverage tools aren't quite so good: they provide coverage of lines, or methods, or (worst) classes. Know what level of coverage you get from your tools!

The Independent Path Problem

Consider the following piece of code. How many different paths are there through this code?
public static int testMe(int a, int b, int c)
{
    if (a > 5)
        c += a;
    if (b > 5)
        c += b;
    return c;
}
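If you build a truth table for the two if conditions, you'll find four paths, because each condition can be true or false independently of the other (each box shows the value returned):

              b <= 5      b > 5
    a <= 5    c           c + b
    a > 5     c + a       c + a + b
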
OK, so how many times do you have to call this method to get 100% coverage, even with the block-level coverage of Emma and Cobertura? Two.
public void test100PercentCoverage() throws Exception
{
    assertEquals(2, testMe(2, 2, 2));
    assertEquals(21, testMe(7, 7, 7));
}
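To make the gap between branch coverage and path coverage concrete, here is a minimal sketch. It is a hand-instrumented copy of testMe(), not what Emma or Cobertura actually generate: the class name PathCoverageSketch, the branchHit flags, and the pathsSeen set are all hypothetical bookkeeping invented for this illustration.

```java
import java.util.Set;
import java.util.TreeSet;

// Hypothetical, hand-instrumented copy of testMe(). Real tools rewrite
// bytecode; this sketch only imitates the flag-setting idea in source code.
class PathCoverageSketch
{
    // One flag per branch outcome: [0] a > 5, [1] a <= 5, [2] b > 5, [3] b <= 5
    static final boolean[] branchHit = new boolean[4];

    // Every combination of branch outcomes observed so far, e.g. "TF"
    static final Set<String> pathsSeen = new TreeSet<>();

    static int testMe(int a, int b, int c)
    {
        boolean first = a > 5;
        boolean second = b > 5;

        // Bookkeeping that a coverage tool would normally inject
        branchHit[first ? 0 : 1] = true;
        branchHit[second ? 2 : 3] = true;
        pathsSeen.add((first ? "T" : "F") + (second ? "T" : "F"));

        if (first)
            c += a;
        if (second)
            c += b;
        return c;
    }

    // True once every branch outcome has been seen at least once
    static boolean allBranchesHit()
    {
        for (boolean hit : branchHit)
            if (!hit)
                return false;
        return true;
    }

    public static void main(String[] args)
    {
        // The two calls from test100PercentCoverage()
        testMe(2, 2, 2);
        testMe(7, 7, 7);
        System.out.println("all branches hit: " + allBranchesHit());
        System.out.println("paths seen: " + pathsSeen.size() + " of 4");

        // The two remaining paths, which the coverage report never asks for
        testMe(7, 2, 2);
        testMe(2, 7, 2);
        System.out.println("paths seen: " + pathsSeen.size() + " of 4");
    }
}
```

After the first two calls every branch flag is set, yet only two of the four paths have been recorded; it takes the two mixed calls to see the rest.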
While this is a trivial example, every non-trivial program in existence exhibits the same trait: there are multiple independent paths through the code. This makes a mockery of any mandated coverage percentages: 100% reported coverage may not validate all possible paths; in fact, it almost certainly won't.

There are several approaches to mitigating this problem. One is to use truth (path) tables like the one shown above. Each box in the table should contain a test method, and empty boxes need more tests regardless of what your coverage tool says. Unfortunately, this technique quickly breaks down. Real applications rarely have simple two-dimensional path combinations, and once you get above three or four dimensions the number of paths is overwhelming.

A better approach is to refactor the code, typically using Extract Method to move the code inside the branch into an easily tested, linear method. That still leaves the branch intact, but it means that you can write simpler tests because you don't have to validate the branch and the code inside it. In fact, in some cases you can move the branch testing out of the realm of unit tests, and into the realm of acceptance tests.

One way to identify code that needs to be refactored is to calculate the cyclomatic complexity of the code under test. High complexity values indicate code with many independent paths, which consequently requires more effort to properly test. Cobertura has an edge here, as it reports cyclomatic complexity along with coverage numbers.

Uncovered Dependencies

Are you looking for full coverage of just your code, or of your code and its dependencies? Usually you consider libraries to be separate from your code, but is that a reasonable approach?

True story from a previous job: while running instrumented tests against our server (written in C++), we discovered memory leaks and access violations in code from a major commercial database vendor, as well as that from another division of our own company.
The former was met by “thank you, we'll investigate,” the latter by “you have no business poking around in that code?!?” While neither of these problems ultimately affected us, knowing of their existence meant that we could avoid code that triggered them.

Clearly, you can't justify writing tests for all the libraries that you use (although, if you're using open source, such tests would be welcomed). But the point remains: even if you reach 100% coverage of your own code, you won't necessarily be bug-free.

Exceptions

Coverage-driven testing tends to focus on the “happy path”
through the tested code: does it do the right thing given expected input.
Consider the following code, which expects a string with three colon-separated
components, such as “foo:bar:baz”, and returns the middle one:
public static String extractMiddle(String s)
{
    int idx1 = s.indexOf(':');
    int idx2 = s.indexOf(':', idx1 + 1);
    return s.substring(idx1 + 1, idx2);
}

public void testExtractMiddle() throws Exception
{
    assertEquals("bar", extractMiddle("foo:bar:baz"));
}
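To see what a single happy-path call leaves out, here is a minimal sketch that feeds extractMiddle() a few malformed inputs and reports what happens. The class name ExtractMiddleProbe, the describe() helper, and the choice of inputs are hypothetical, picked only for illustration; the method body is copied from above.

```java
// Hypothetical probe: feeds extractMiddle() inputs that a happy-path test skips
class ExtractMiddleProbe
{
    // Copied unchanged from the article
    static String extractMiddle(String s)
    {
        int idx1 = s.indexOf(':');
        int idx2 = s.indexOf(':', idx1 + 1);
        return s.substring(idx1 + 1, idx2);
    }

    // Returns either the result or the simple name of the exception thrown
    static String describe(String input)
    {
        try
        {
            return "returned \"" + extractMiddle(input) + "\"";
        }
        catch (RuntimeException ex)
        {
            return "threw " + ex.getClass().getSimpleName();
        }
    }

    public static void main(String[] args)
    {
        // Illustrative inputs: one well-formed, the rest malformed in some way
        String[] inputs = { "foo:bar:baz", null, "", "foo", "foo:bar", "::", "a:b:c:d" };
        for (String input : inputs)
            System.out.println(input + " -> " + describe(input));
    }
}
```

On a standard JDK, the null input fails with a NullPointerException and the strings with fewer than two colons fail with a StringIndexOutOfBoundsException, while "a:b:c:d" quietly returns "b" and ignores the extra component. None of these cases adds a single line to the coverage report.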
testExtractMiddle gives 100% coverage, but completely ignores cases where the
string is not in the expected form.

Missing Features

And that brings me to the final problem: a coverage tool can tell you to write more tests, but it can't tell you to write more mainline code. Or put another way, if a missing feature is never tested, you'll never know until the code reaches production.

What you need is some way to determine how well your tests cover the application's specifications. The test-driven-development contingent will respond that tests are the specifications: if there isn't a test for a feature, that feature doesn't exist.

Unfortunately, in a large application, where you may have thousands of unit tests, it's remarkably hard to identify missing features. Worse, you may think that you have a feature tested, when in fact it's only tested for a subset of the possible application states. Ultimately, this comes back to the Independent Path Problem, and there's no good solution to that problem save careful thought.

While test-as-specification may be appropriate on the level of a single class or set of interacting classes, it's not appropriate at the level of an application. And acceptance tests, while closer to a usable specification, do not have the granularity of a good unit test. They especially tend to miss boundary conditions and exceptional behavior.

Conclusion

As I said at the top of this article, I think coverage tools are great.
A big red block in the middle of your code is a strong incentive to write
more tests. But more important, from the perspective of quality code, is
to think about what your code is doing and put time into ensuring that
your tests fully exercise it, especially at the boundaries. Because
100% coverage isn't that important when you get a midnight phone call
asking why your code is throwing an exception.

For More Information

Coverage tools:
“Uncle Bob” Martin has several articles on code coverage, cyclomatic complexity, and (of course) test-driven development. I'll leave explanations of “CRAP” to him.

There aren't many examples in this article, and the ones that exist are meant for shock value rather than education. If you feel the need to shock someone, here they are:
I gave a presentation on this material at the Philadelphia Java Users Group in the fall of 2009. It doesn't say anything that you haven't already read, but it's one of the better-looking presentations that I've done. You can find the slides here.

Copyright © Keith D Gregory, all rights reserved