Given software where ...
- The system consists of a few subsystems
- Each subsystem consists of a few components
- Each component is implemented using many classes
... I like to write automated tests of each subsystem or component.
I don't write a test for each internal class of a component (except inasmuch as each class contributes to the component's public functionality and is therefore testable/tested from outside via the component's public API).
When I refactor a component's implementation (which I often do, as part of adding new functionality), I therefore don't need to alter any existing automated tests: the tests depend only on the component's public API, and public APIs are typically expanded rather than altered.
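To make the level of testing concrete, here is a hypothetical sketch (in Python for brevity, though my components are actually assemblies; the names are invented, not taken from my system). The test drives only the component's public facade, so internal classes can be refactored without touching it:

```python
# Hypothetical sketch: the test exercises only the component's public API.
# "OrderPricing" and its methods are invented names for illustration.

import unittest

from pricing import OrderPricing  # the component's public facade


class OrderPricingTests(unittest.TestCase):
    def test_quantity_discount_applied(self):
        # Internal classes (parsers, calculators, caches, ...) are exercised
        # indirectly through the facade and are never imported here, so they
        # can be renamed, split, or merged without breaking this test.
        pricing = OrderPricing(currency="USD")
        total = pricing.price(item="widget", quantity=100)
        self.assertEqual(total, 90.00)


if __name__ == "__main__":
    unittest.main()
```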
I think this policy contrasts with a document like Refactoring Test Code, which says things like ...
- "... unit testing ..."
- "... a test class for every class in the system ..."
- "... test code / production code ratio ... is ideally considered to approach a ratio of 1:1 ..."
... all of which I suppose I disagree with (or at least don't practice).
My question is, if you disagree with my policy, would you explain why? In what scenarios is this degree of testing insufficient?
In summary:
- Public interfaces are tested (and retested), and rarely change (they're added to but rarely altered)
- Internal APIs are hidden behind the public APIs, and can be changed without rewriting the test cases which test the public APIs
Footnote: some of my 'test cases' are actually implemented as data. For example, test cases for the UI consist of data files which contain various user inputs and the corresponding expected system outputs. Testing the system means having test code which reads each data file, replays the input into the system, and asserts that it gets the corresponding expected output.
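A rough sketch of what such a data-driven runner looks like (the file naming convention, directory layout, and the system's replay method are all hypothetical, invented for illustration):

```python
# Hypothetical sketch of the data-driven runner: each test case is a pair of
# files, one holding recorded user input and one holding the expected output.

import pathlib

CASE_DIR = pathlib.Path("testcases")  # invented directory layout


def run_data_driven_cases(system):
    for input_file in sorted(CASE_DIR.glob("*.input.txt")):
        expected_file = input_file.with_name(
            input_file.name.replace(".input.txt", ".expected.txt"))
        user_input = input_file.read_text()
        expected_output = expected_file.read_text()

        actual_output = system.replay(user_input)  # hypothetical API

        assert actual_output == expected_output, (
            f"{input_file.name}: output differs from recorded expectation")
```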
Although I rarely need to change test code (because public APIs are usually added to rather than changed), I do find that I sometimes (e.g. twice a week) need to change some existing data files. This can happen when I change the system output for the better (i.e. new functionality improves existing output), which may cause an existing test to 'fail' (because the test code only asserts that the output hasn't changed). To handle these cases I do the following (a rough code sketch follows the list):
- Rerun the automated test suite with a special run-time flag, which tells it not to assert the output but instead to capture the new output into a new directory
- Use a visual diff tool to see which output data files (i.e. which test cases) have changed, and to verify that these changes are good and as expected given the new functionality
- Update the existing tests by copying new output files from the new directory into the directory from which test cases are run (over-writing the old tests)
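Something like the following (again a hypothetical sketch; the environment-variable flag, directory names, and helper functions are invented, not my actual tooling):

```python
# Hypothetical sketch of the "capture" mode: instead of asserting, write the
# new output to a separate directory so it can be diffed and, if approved,
# copied over the old expected files.

import os
import pathlib
import shutil

CASE_DIR = pathlib.Path("testcases")
CAPTURE_DIR = pathlib.Path("captured")                   # new output lands here
CAPTURE_MODE = os.environ.get("CAPTURE_OUTPUT") == "1"   # the "special flag"


def check_or_capture(case_name, actual_output, expected_output):
    if CAPTURE_MODE:
        # Capture mode: record what the system produces now, without failing.
        CAPTURE_DIR.mkdir(exist_ok=True)
        (CAPTURE_DIR / f"{case_name}.expected.txt").write_text(actual_output)
    else:
        # Normal mode: assert that the output hasn't changed.
        assert actual_output == expected_output, f"{case_name}: output changed"


def approve_captured_outputs():
    # After reviewing the diffs, promote the captured files to become the
    # new expected outputs (over-writing the old ones).
    for captured in CAPTURE_DIR.glob("*.expected.txt"):
        shutil.copy(captured, CASE_DIR / captured.name)
```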
Footnote: by "component", I mean something like "one DLL" or "one assembly" ... something that's big enough to be visible on an architecture or a deployment diagram of the system, often implemented using dozens or 100 classes, and with a public API that consists of only about 1 or a handful of interfaces ... something that may be assigned to one team of developers (where a different component is assigned to a different team), and which will therefore according to Conway's Law having a relatively stable public API.
Footnote: The article Object-Oriented Testing: Myth and Reality says,
Myth: Black box testing is sufficient. If you do a careful job of test case design using the class interface or specification, you can be assured that the class has been fully exercised. White-box testing (looking at a method's implementation to design tests) violates the very concept of encapsulation.

Reality: OO structure matters, part II. Many studies have shown that black-box test suites thought to be excruciatingly thorough by developers only exercise from one-third to a half of the statements (let alone paths or states) in the implementation under test. There are three reasons for this. First, the inputs or states selected typically exercise normal paths, but don't force all possible paths/states. Second, black-box testing alone cannot reveal surprises. Suppose we've tested all of the specified behaviors of the system under test. To be confident there are no unspecified behaviors we need to know if any parts of the system have not been exercised by the black-box test suite. The only way this information can be obtained is by code instrumentation. Third, it is often difficult to exercise exception and error-handling without examination of the source code.
I should add that I'm doing white-box functional testing: I look at the code (the implementation) and write functional tests (which drive the public API) to exercise the various code branches (details of the feature's implementation).