Regression Testing: "What" to test and "When"
Regression testing is often seen as an area in which companies
hesitate to allocate resources. We often hear statements such as: "The
developer said the defect is fixed. Do we need to test it again?" And
the answer should be: "Well, the developer probably said the product
had no defects to begin with." The truth of the matter is, in today's
world of extremely complex devices and software applications, the
quantity and quality of regression testing performed on a product are
directly proportional to the commitment vendors have to their customer
base. This does not mean that the more regression testing, the better.
It simply means that we must make sure that regression testing is done
in the right amount and with the right approach.
The two main challenges presented by regression testing are:
1. What do we test?
2. When do we test it?
The purpose of this article is to outline a few techniques that will help
us answer these questions. The first point to consider is that
regression testing does not have to wait until the end of our testing
cycle. Much of the regression effort can be carried out in parallel
with all other testing activities. The supporting assumption for this
approach is:
"We do not wait until all testing is done to fix our defects."
Therefore, much of the regression effort can be accomplished long
before the end of the project, if the project is of reasonable length. If
our testing effort will only last one week, the following techniques may
have to be modified. However, it is unusual for a product to be tested
in such a short period of time. Furthermore, as you study the
techniques outlined below, you will see that as the project's length
increases, the benefits offered by these techniques also increase.
To answer the questions of what we should test and when, we will
begin with a simple suite of ten tests. In the real world, this suite would
obviously be much larger, and not necessarily static, meaning that the
number of tests can increase or decrease as the need arises. After our
first test run with the first beta (which we will call "Code Drop 1") of our
hypothetical software product, our matrix looks like this.
Test ID   Status CD 1   Defect Number
1         Pass
2         Fail          1
3         Fail          1
4         Pass
5         Pass
6         Pass
7         Fail          2
8         Fail          3
9         Fail          4
10        Pass
In the matrix above, we have cross-referenced the defects we found
with the tests that exposed them. As you can see, defect number 1 was
exposed by test 2, but it also occurred on test 3. Each of the
remaining failures exposed a unique defect.
As we prepare to execute our second test run (Code Drop 2), we must
decide which tests will be executed. The rules we will use only apply to
our regression effort. There are rules we can apply to the subset of
tests that have passed, in order to find out which ones we should re-
execute. However, that will be the topic of another article.
The fundamental question we must now ask is: "Have any of the
defects found been fixed?" Let us suppose that defects 1, 2, and 3
have, in fact, been reported as fixed by our developers. Let us also
suppose that three more tests have been added to our test suite. After
"Code Drop 2", our matrix looks as follows:
Test ID   Status CD 1   Defect Number   Status CD 2   Defect Number
1         Pass
2         Fail          1               Pass          1 - Fixed
3         Fail          1               Pass          1 - Fixed
4         Pass
5         Pass
6         Pass
7         Fail          2               Fail          2
8         Fail          3               Pass          3 - Fixed
9         Fail          4
10        Pass
11                                      Pass
12                                      Pass
13                                      Fail          (new)
A few key points to notice are:
- Of the tests that previously failed, only the tests associated
  with defects that were reported as fixed were executed. Test
  number 9, which exposed defect number 4, was not executed on
  Code Drop 2, because defect number 4 is not fixed.
- Defect number 1 is fixed, because tests 2 and 3 now pass.
- Test number 7 still fails. Therefore, the defect remains.
- Test number 13 is a new test, and it exposed a new defect.
- We chose not to execute tests that had passed on Code Drop 1.
  This may often not be the case, since turmoil in our code or
  the area's importance (such as a new feature, an improvement
  to an old feature, or a feature that is a key selling point of
  the product) may prompt us to re-execute these tests.
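The selection described in the points above can be sketched in Python. This is a minimal illustration, not a prescribed implementation: the function name `select_for_next_drop`, its parameters, and the dictionary layout are assumptions made for this example. The statuses and defect numbers come from the Code Drop 1 matrix.

```python
def select_for_next_drop(last_status, defect_of, fixed_defects,
                         new_tests=(), high_risk=()):
    """Return the sorted test IDs to execute on the next code drop.

    last_status   -- test ID -> "Pass" or "Fail" from that test's latest run
    defect_of     -- failing test ID -> defect number it is linked to
    fixed_defects -- defect numbers the developers report as fixed
    new_tests     -- tests that have never been executed
    high_risk     -- passed tests to re-run anyway (code turmoil, key feature)
    """
    selected = set(new_tests) | set(high_risk)
    for test_id, status in last_status.items():
        # A failed test is re-executed only if its defect is reported fixed.
        if status == "Fail" and defect_of.get(test_id) in fixed_defects:
            selected.add(test_id)
    return sorted(selected)


# Code Drop 1 results from the first matrix.
last_status = {1: "Pass", 2: "Fail", 3: "Fail", 4: "Pass", 5: "Pass",
               6: "Pass", 7: "Fail", 8: "Fail", 9: "Fail", 10: "Pass"}
defect_of = {2: 1, 3: 1, 7: 2, 8: 3, 9: 4}

cd2_plan = select_for_next_drop(last_status, defect_of,
                                fixed_defects={1, 2, 3},
                                new_tests=[11, 12, 13])
print(cd2_plan)  # [2, 3, 7, 8, 11, 12, 13]
```

Test 9 is left out because defect 4 has not been reported as fixed, and the tests that passed on Code Drop 1 are left out unless they are passed in through `high_risk`.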
This simple but efficient approach ensures that our matrix will never
look like the matrix below. To show the problem more clearly, we omit
the Defect # column after each code drop. We also consider Code Drop 5
to be our final regression pass.
Test ID   Status CD 1   Status CD 2   Status CD 3   Status CD 4   Status CD 5
1         Pass          Pass          Pass          Pass          Pass
2         Fail          Pass          Pass          Fail          Pass
3         Fail          Pass          Pass          Pass          Pass
4         Pass          Pass          Pass          Pass          Pass
5         Pass          Pass          Pass          Pass          Pass
6         Pass          Pass          Pass          Pass          Pass
7         Fail          Fail          Fail          Pass          Pass
8         Fail          Pass          Pass          Pass          Pass
9         Fail          Fail          Fail          Fail          Pass
10        Pass          Pass          Pass          Pass          Pass
11                      Pass          Pass          Pass          Pass
12                      Pass          Pass          Pass          Pass
13                      Fail          Fail          Fail          Fail
We will address tests 2, 7, and 9 later, but here are a few key points to
notice about this matrix:
- Why were tests 1, 4, 5, 6, 10, 11, and 12 executed up to five
  times? They passed every single time.
- Why were tests 3 and 8 executed up to five times? They first
  failed and were fixed. Did they need to be executed on every
  code drop after the failure?
- If test 13 failed, was the testing team erroneously told it had
  been fixed on each code drop? If not, why was it executed four
  times with the same result? We can also ask the question:
  "Why isn't it fixed?" But we will not concern ourselves with that
  issue, since we are only addressing the topic of regression.
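The questions above can be asked of any results matrix mechanically. The following Python sketch flags the tests that passed on every run and counts, for each test, how many times it was re-run after a failure only to fail again. The `audit` function and the data layout are assumptions made for this example. Because the matrix does not record when a defect was reported as fixed, the repeated failures are runs worth questioning, not proven waste.

```python
P, F = "Pass", "Fail"

# Status per code drop (CD 1 to CD 5) from the matrix above.
# None means the test did not exist yet on that code drop.
matrix = {
    1:  [P, P, P, P, P],
    2:  [F, P, P, F, P],
    3:  [F, P, P, P, P],
    4:  [P, P, P, P, P],
    5:  [P, P, P, P, P],
    6:  [P, P, P, P, P],
    7:  [F, F, F, P, P],
    8:  [F, P, P, P, P],
    9:  [F, F, F, F, P],
    10: [P, P, P, P, P],
    11: [None, P, P, P, P],
    12: [None, P, P, P, P],
    13: [None, F, F, F, F],
}


def audit(matrix):
    """Return (tests that always passed, {test: count of repeated failures})."""
    always_passed = []
    repeat_failures = {}
    for test_id, statuses in sorted(matrix.items()):
        runs = [s for s in statuses if s is not None]
        if all(s == P for s in runs):
            always_passed.append(test_id)
        # Count runs that failed directly after a previous failed run.
        repeats = sum(1 for prev, cur in zip(runs, runs[1:])
                      if prev == F and cur == F)
        if repeats:
            repeat_failures[test_id] = repeats
    return always_passed, repeat_failures


always_passed, repeat_failures = audit(matrix)
print(always_passed)    # [1, 4, 5, 6, 10, 11, 12]
print(repeat_failures)  # {7: 2, 9: 3, 13: 3}
```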
In conclusion, we will list some general rules we can apply to our
testing effort that will ensure our regression efforts are justified and
accurate. These rules are:
1. A test that has passed twice should be considered regressed,
that is, in need of no further regression runs, unless turmoil in
the code (or another reason previously stated, such as a
feature's importance) indicates otherwise. In other words, a test
should be executed more than twice only if changes to the code
in the area the test exercises (or the importance of the
particular feature) raise sufficient concern about the test's
state or the feature's condition.
2. A test that has failed once should not be re-executed unless
the developer informs the test team that the defect has been
fixed. This is the case for tests 7 and 9. They should not have
been re-executed until Code Drops 4 and 5 respectively.
3. We must implement accurate algorithms to find out which tests
that have already passed once should be re-executed, so that
we catch situations such as that of test number 2. This test
passed twice after its initial failure and then failed again on
Code Drop 4. As an additional note of caution: "When in
doubt, execute."
4. For tests that have already passed once, the second execution
should be reserved for the final regression pass, unless turmoil
in the code indicates otherwise, or unless we do not have
enough tests to execute. However, we must be careful.
Although it is true that this allows us to get some of the
regression effort out of the way earlier in the project, it may
limit our ability to find defects introduced later in the project.
5. The final regression pass should not consist of more than 30%
to 40% of the total number of tests in our suite. This subset
should be allocated using the following priorities:
a. All tests that have failed more than once. By this we
mean tests that failed, were reported as fixed by the
developer, and yet failed again, either immediately
after the fix or at some point during the remainder of
the testing effort.
b. All tests that failed once and then passed, once they
were reported as fixed.
c. All, or a carefully chosen subset of the tests that have
passed only once.
d. If there is still room to execute more tests, execute any
other tests that do not fit the criteria above but you feel
should nevertheless be executed.
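The priorities in rule 5 can be sketched as a small Python routine. This is only an illustration under stated assumptions: the names `priority` and `final_regression_pass`, the `cap_percent` parameter, and the sample `history` are hypothetical and do not come from the matrices above. The sketch also assumes rule 2 was followed, so a test with more than one failure must have failed again after being reported as fixed.

```python
def priority(statuses):
    """Map a test's run history to a rule 5 priority (lower runs first)."""
    fails = statuses.count("Fail")
    if fails > 1:
        return 0  # (a) failed more than once
    if fails == 1 and statuses[-1] == "Pass":
        return 1  # (b) failed once, then passed after the fix
    if fails == 0 and statuses.count("Pass") == 1:
        return 2  # (c) passed only once
    return None   # not selected by priorities (a) to (c)


def final_regression_pass(history, cap_percent=40, extras=()):
    """Pick tests for the final pass, capped at cap_percent of the suite."""
    budget = len(history) * cap_percent // 100
    ranked = sorted((priority(s), test_id)
                    for test_id, s in history.items()
                    if priority(s) is not None)
    ordered = [test_id for _, test_id in ranked]
    # (d) tests chosen by the tester's judgment go last, without duplicates.
    ordered += [t for t in extras if t not in ordered]
    return ordered[:budget]


# Hypothetical run history before the final pass: test ID -> statuses in order.
history = {
    1: ["Pass"], 2: ["Fail", "Pass", "Fail"], 3: ["Fail", "Pass"],
    4: ["Pass"], 5: ["Pass", "Pass"], 6: ["Pass"], 7: ["Fail", "Fail"],
    8: ["Fail", "Pass"], 9: ["Fail"], 10: ["Pass"],
}

print(final_regression_pass(history, cap_percent=40))  # [2, 7, 3, 8]
```

Cutting the ranked list off at the budget is a crude stand-in for the "carefully chosen subset" in item c. In practice, the choice among tests of equal priority would depend on code turmoil and on the importance of each feature.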
These common sense rules will ensure that regression testing is done
smartly and in the right amount. In an ideal world, we would have the
time and the resources to test our product completely. Nevertheless,
today's world is a world of tight deadlines and even tighter budgets.
Wise resource expenditure today will ensure our ability to continue to
develop reliable products tomorrow.