Testing
Somnath Khan
Q: When do we do white box testing?
A: White box testing is done after the developers complete the coding of a function: they perform unit testing on that function to check whether it meets the requirement. After all the coding is complete, they integrate the parts and perform integration testing. All of this comes under white box testing, which is also called glass box testing or structural testing. Commonly the developers perform white box testing, but we never say testers will not perform it; if the developers are too busy, testers can also perform unit testing and integration testing.

Q: There are 3 mandatory fields and 3 optional fields. How many possible test cases can be written? How many positive test cases can be written?
A: It would help to know the field types; as it stands we can only write cases for optional versus mandatory fields. Briefly, we can write cases like the following:
1. Submit the page with the optional fields empty and verify whether messages are shown for the optional fields.
2. Each time, leave one of the optional fields empty, submit the page, and verify whether the validation messages are shown.
3. Submit the page with the mandatory fields empty and verify whether the validation messages are shown for the mandatory fields.
4. Each time, leave one of the mandatory fields empty, submit the page, and verify whether the validation messages are shown.

Dynamic Testing vs. Static Testing: Dynamic testing involves working with the software, giving input values and checking whether the output is as expected. These are the validation activities. Unit tests, integration tests, system tests, and acceptance tests are some of the dynamic testing methodologies. Static testing is a form of software testing where the software isn't actually executed; this is in contrast to dynamic testing. It is generally not detailed testing, but checks mainly the sanity of the code, algorithm, or document.
It is primarily syntax checking of the code and manual reading of the code or document to find errors. This type of testing can be used by the developer who wrote the code, in isolation. Code reviews, inspections, and walkthroughs are also used. From the black box testing point of view, static testing involves review of requirements or specifications. This is done with an eye toward completeness or appropriateness for the task at hand. This is the verification portion of Verification and Validation. Bugs discovered at this stage of development are less expensive to fix than those found later in the development cycle.

Prototype: A prototype is a working model that is functionally equivalent to a component of the product. In many instances the client has only a general view of what is expected from the software product. In such a scenario, where there is an absence of detailed information regarding the input to the system, the processing needs, and the output requirements, the prototyping model may be employed. This model reflects an attempt to increase the flexibility of the development process by allowing the client to interact and experiment with a working representation of the product. The development process only continues once the client is satisfied with the functioning of the prototype. At that stage the developer determines the specifications of the client's real needs.

V-Model: The V-model is a software development model which can be considered an extension of the waterfall model. Instead of moving down in a linear way, the process steps are bent upwards after the coding phase, to form the typical V shape. The V-model demonstrates the relationships between each phase of the development life cycle and its associated phase of testing.
Verification Phases

1. Requirements analysis: In this phase, the requirements of the proposed system are collected by analyzing the needs of the user(s). This phase is concerned with establishing what the ideal system has to perform; it does not determine how the software will be designed or built. Usually the users are interviewed and a document called the user requirements document is generated. The user requirements document will typically describe the system's functional, physical, interface, performance, data, and security requirements as expected by the user. It is the document the business analysts use to communicate their understanding of the system back to the users. The users carefully review this document, as it will serve as the guideline for the system designers in the system design phase. The user acceptance tests are designed in this phase.

2. System Design: System engineers analyze and understand the business of the proposed system by studying the user requirements document. They figure out possibilities and techniques by which the user requirements can be implemented. If any of the requirements are not feasible, the user is informed of the issue, a resolution is found, and the user requirements document is edited accordingly. The software specification document, which serves as a blueprint for the development phase, is generated. This document contains the general system organization, menu structures, data structures, etc. It may also hold example business scenarios, sample windows, and reports for better understanding. Other technical documentation, like entity diagrams and a data dictionary, will also be produced in this phase. The documents for system testing are prepared in this phase.

3. Architecture Design: This phase can also be called high-level design.
The baseline in selecting the architecture is that it should realize all of the requirements. The high-level design document typically consists of the list of modules, a brief description of the functionality of each module, their interface relationships, dependencies, database tables, architecture diagrams, technology details, etc. The integration test design is carried out in this phase.

4. Module Design: This phase can also be called low-level design. The designed system is broken up into smaller units or modules, and each of them is explained so that the programmer can start coding directly. The low-level design document or program specifications will contain a detailed functional logic of the module in pseudocode; database tables with all elements, including their type and size; all interface details with complete API references; all dependency issues; error message listings; and complete inputs and outputs for the module. The unit test design is developed in this stage.
A: Verification is a quality process that is used to evaluate whether a product, service, or system complies with a regulation, specification, or conditions imposed at the start of a development phase. Verification can be in development, scale-up, or production; it is often an internal process. Validation is the process of establishing documented evidence that provides a high degree of assurance that a product, service, or system accomplishes its intended requirements. This often involves acceptance and suitability with external customers.

Difference:
1. It is sometimes said that validation ensures that you built the right thing and verification ensures that you built it right. "Building the right thing" refers back to the user's needs, while "building it right" checks that the documented development process was followed. In some contexts, it is required to have written requirements for both, as well as formal procedures or protocols for determining compliance.
2. Verification takes place before validation, and not vice versa. Verification evaluates documents, plans, code, requirements, and specifications; validation evaluates the product itself. The inputs of verification are checklists, issue lists, walkthroughs, inspection meetings, and reviews. The input of validation is the actual testing of an actual product. The output of verification is a nearly perfect set of documents, plans, specifications, and requirements.
3. Verification asks "are we building it right?", i.e. we check whether we are following the right process. Validation asks "are we building the right thing?", i.e. we check whether we have developed the software as per the client's requirements.
4. Verification is done throughout the life cycle of the project/product, whereas validation comes in the testing phase of the application. Verification ensures that the application complies with standards and processes; this answers the question "Did we build the system in the right way?" E.g.: design reviews, code walkthroughs, and inspections. Validation ensures that the application is built as per the user's needs; this answers the question "Did we build the right system?" E.g.: unit testing, integration testing, system testing, and UAT.
A: The main difference between EP and BVA is that EP determines the number of test cases to be generated for a given scenario, whereas BVA determines the effectiveness of those generated test cases.

Equivalence partitioning: Equivalence partitioning determines the number of test cases for a given scenario. It is a black box testing technique with the following goals:
1. To reduce the number of test cases to a necessary minimum.
2. To select the right test cases to cover all possible scenarios.

EP is applied to the inputs of a tested component. The equivalence partitions are usually derived from the specification of the component's behavior. An input has certain ranges which are valid and other ranges which are invalid. This is best explained with the example of a function that has a parameter "month" of a date. The valid range for the month is 1 to 12, standing for January to December. This valid range is called a partition. In this example there are two further partitions of invalid ranges: the first invalid partition would be <= 0 and the second invalid partition would be >= 13.

    ... -2 -1 0        | 1 .............. 12 | 13 14 15 ...
    invalid partition 1 | valid partition     | invalid partition 2
The testing theory related to equivalence partitioning says that only one test case of each partition is needed to evaluate the behavior of the program for the related partition. In other words it is sufficient to select one test case out of each partition to check the behavior of the program. To use more or even all test cases of a partition will not find new faults in the program. The values within one partition are considered to be "equivalent". Thus the number of test cases can be reduced considerably.
Equivalence partitioning is no stand alone method to determine test cases. It has to be supplemented by boundary value analysis. Having determined the partitions of possible inputs the method of boundary value analysis has to be applied to select the most effective test cases out of these partitions.
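As a hedged sketch (Python; the function name in_valid_partition is illustrative, not from the text), the month example combines one representative value per partition with the boundary values 0, 1, 12, and 13 that boundary value analysis selects below:

```python
VALID_LOW, VALID_HIGH = 1, 12   # the valid "month" partition

def in_valid_partition(month):
    # One test value per partition suffices under equivalence
    # partitioning; all values inside a partition are "equivalent".
    return VALID_LOW <= month <= VALID_HIGH

# EP: one representative from each of the three partitions
ep_cases = [-5, 6, 20]                    # invalid-low, valid, invalid-high

# BVA: a "clean" and a "dirty" value at each partition boundary
bva_cases = [VALID_LOW - 1, VALID_LOW,    # 0 (dirty), 1 (clean)
             VALID_HIGH, VALID_HIGH + 1]  # 12 (clean), 13 (dirty)

print([in_valid_partition(m) for m in ep_cases])   # [False, True, False]
print([in_valid_partition(m) for m in bva_cases])  # [False, True, True, False]
```

Any value inside a partition would produce the same result as its representative, which is why three EP cases plus four boundary cases give as much information as testing every month value.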
Boundary Value Analysis: Boundary value analysis determines the effectiveness of test cases for a given scenario. To set up boundary value analysis test cases, the tester first has to determine which boundaries are at the interface of a software component. This is done by applying the equivalence partitioning technique; boundary value analysis and equivalence partitioning are inevitably linked together. For the example of the month, a date would have the following partitions:

    ... -2 -1 0        | 1 .............. 12 | 13 14 15 ...
    invalid partition 1 | valid partition     | invalid partition 2

By applying boundary value analysis we can select a test case on each side of the boundary between two partitions. In the above example this would be 0 and 1 for the lower boundary, as well as 12 and 13 for the upper boundary. Each of these pairs consists of a "clean" and a "dirty" test case. A "clean" test case should give a valid operation result of the program. A "dirty" test case should lead to a correct and specified input-error treatment, such as the limiting of values, the usage of a substitute value, or, in the case of a program with a user interface, a warning and a request to enter correct data. Boundary value analysis can yield six test cases: n-1, n, and n+1 for the upper limit, and n-1, n, and n+1 for the lower limit.

Statement testing: "A test case design technique for a component in which test cases are designed to execute statements" (BS 7925-1, British Computer Society Specialist Interest Group in Software Testing, BCS SIGIST). Statement testing is a structural or white box technique, because it is conducted with reference to the code. Statement testing comes under dynamic analysis. In an ideal world every statement of every component would be fully tested. However, in the real world this hardly ever happens.
In statement testing every possible statement is tested.
Compare this to branch testing, where each branch is tested to check that it can be traversed, whether it encounters a statement or not. Why? 1) To find faults. 2) To give credible information about the state of the component, on which business decisions can be taken.
Who? Ideally an independent person or body to the one that wrote the code; usually, however, it is conducted by the developer. Where? As close as possible to the point where the code was written. When? As part of component testing. How? For each test case, the following should be specified: 1) inputs to the component, 2) the statement to be executed, 3) the expected outcome. A metric, statement coverage, is available to measure how much of the code has been executed.

Q: What is the difference between path coverage and branch coverage?
A: Statement coverage measures the number of lines executed. Code that has 100% statement coverage could easily contain defects. Branch coverage measures the number of executed branches. A branch is an outcome of a decision, so an if statement, for example, has two branches (true and false). Code that has 100% branch coverage can still contain errors. Path coverage is, more aptly put, data path coverage: a measure of the number of paths that contain defined data. This metric is a combination of cyclomatic complexity and data-flow paths. The reason these metrics are combined is to provide a very robust level of coverage not seen in either branch or statement coverage.

What? "A test case design technique for a component in which test cases are designed to execute branch outcomes" (BS 7925-1, British Computer Society Specialist Interest Group in Software Testing, BCS SIGIST). "Branch outcome" in the above could also be "decision outcome". Branch testing is a structural or white box technique, because it is conducted with reference to the code. A decision is an executable statement that may transfer control to another statement. In an ideal world every branch of every component would be fully tested. However, in the real world this hardly ever happens. In branch/decision testing every possible branch is tested, regardless of the outcome. Compare this to statement testing, where if there is no statement on the branch, it is not tested.
Why? 1) To find faults. 2) To give credible information about the state of the component, on which business decisions can be taken. Who? Ideally an independent person or body to the one that wrote the code; usually, however, it is conducted by the developer. Where? As close as possible to the point where the code was written. When? As part of component testing. How? For each test case, the following should be specified: 1) inputs to the component, 2) the decision or branch to be executed, 3) the expected outcome. A metric, branch/decision coverage, is available to measure how much of the code has been executed. How is it measured? Two measurements are essential: the total number of statements, and the number to be tested by the test suite. Note that the two figures do not have to match: in non-safety-critical systems, it will not be cost effective to test the last nth statement on some obscure path. At the start of testing the test analyst should know the percentage that will be tested; as testing progresses he can see the percentage that has actually been tested.

    Statement Coverage = Executable Statements Tested / Total No. of Executable Statements
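The coverage ratio above is a one-line calculation; a hypothetical helper (Python), expressed as a percentage:

```python
def statement_coverage(executed, total):
    """Statement coverage as a percentage of executable statements."""
    if total <= 0:
        raise ValueError("total executable statements must be positive")
    return 100.0 * executed / total

# e.g. a component with 60 executable statements, 45 exercised by tests:
print(statement_coverage(45, 60))  # 75.0
```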
-----------------------------------------------------------------------------------------------------------
Comparison of Manual and Automated Testing
Manual Testing
- Manual testing is slow.
- Fewer tests can be performed; the tester must identify the most important few.
- Oracles do not have to be as comprehensive (they only need to address the subset of significant behaviors the tester has time to exercise).
- Oracles can be manual (there is time to visually review screen output during manual testing).
Automated Testing
- Automated testing is blazing fast (making manual/visual verification self-defeating).
- Millions of tests can be performed, resulting in a much larger percentage of the functionality being covered during testing.
- The test oracle must address all functionality exercised through automated testing (a much larger portion of possible behaviors).
- The high volume of tests prevents manual/visual assessment of the results of individual test cases, necessitating automation of the test oracle implementation.
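To illustrate the oracle point, here is a minimal sketch (Python; square and oracle are hypothetical stand-ins): at thousands of cases, only an automated oracle can judge pass/fail, since no human can visually review that many results.

```python
def square(x):
    return x * x   # implementation under test (hypothetical)

def oracle(x):
    # Independently computed expected result; the automated oracle
    # replaces visual inspection for every one of the 10,000 cases.
    return x ** 2

failures = [x for x in range(10_000) if square(x) != oracle(x)]
print(len(failures))  # 0
```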
Iterative and Incremental development is a cyclic software development process developed in response to the weaknesses of the waterfall model.

"Incremental" and "Iterative"
Incremental development is a scheduling and staging strategy in which the various parts of the system are developed at different times or rates and integrated as they are completed. It does not imply, require, nor preclude iterative development or waterfall development; both of those are rework strategies. The alternative to incremental development is to develop the entire system with a "big bang" integration. Iterative development is a rework scheduling strategy in which time is set aside to revise and improve parts of the system. It does not presuppose incremental development, but works very well with it. A typical difference is that the output from an increment is not necessarily subject to further refinement, and its testing or user feedback is not used as input for revising the plans or specifications of the successive increments. On the contrary, the output from an iteration is examined for modification, and especially for revising the targets of the successive iterations. The two terms were merged in practical use in the mid-1990s. The authors of the Unified Process (UP) and the Rational Unified Process (RUP) selected the term "iterative development", and "iterations", to generally mean any combination of incremental and iterative development. Most people saying "iterative development" mean that they do both incremental and iterative development. Some project teams get into trouble by doing only one and not the other without realizing it.
Life-Cycle
The basic idea behind iterative enhancement is to develop a software system incrementally, allowing the developer to take advantage of what was learned during the development of earlier, incremental, deliverable versions of the system. Learning comes from both the development and the use of the system, where possible. The key steps in the process are to start with a simple implementation of a subset of the software requirements and iteratively enhance the evolving sequence of versions until the full system is implemented. At each iteration, design modifications are made and new functional capabilities are added. The procedure itself consists of the initialization step, the iteration step, and the project control list. The initialization step creates a base version of the system. The goal of this initial implementation is to create a product to which the user can react. It should offer a sampling of the key aspects of the problem and provide a solution that is simple enough to understand and implement easily. To guide the iteration process, a project control list is created that contains a record of all tasks that need to be performed. It includes such items as new features to be implemented and areas of redesign of the existing solution. The control list is constantly revised as a result of the analysis phase. The iteration involves the redesign and implementation of a task from the project control list, and the analysis of the current version of the system. The goal of the design and implementation of any iteration is to be simple, straightforward, and modular, supporting redesign at that stage or as a task added to the project control list. The level of design detail is not dictated by the iterative approach. In a light-weight iterative project the code may represent the major source of documentation of the system; however, in a mission-critical iterative project a formal software design document may be used.
The analysis of an iteration is based upon user feedback and the program analysis facilities available. It involves analysis of the structure, modularity, usability, reliability, efficiency, and achievement of goals. The project control list is modified in light of the analysis results.
Any difficulty in design, coding and testing a modification should signal the need for redesign or re-coding. Modifications should fit easily into isolated and easy-to-find modules. If they do not, some redesign is needed. Modifications to tables should be especially easy to make. If any table modification is not quickly and easily done, redesign is indicated. Modifications should become easier to make as the iterations progress. If they are not, there is a basic problem such as a design flaw or a proliferation of patches. Patches should normally be allowed to exist for only one or two iterations. Patches may be necessary to avoid redesigning during an implementation phase. The existing implementation should be analysed frequently to determine how well it measures up to project goals. Program analysis facilities should be used whenever available to aid in the analysis of partial implementations. User reaction should be solicited and analysed for indications of deficiencies in the current implementation.
Waterfall development completes the project-wide work-products of each discipline in a single step before moving on to the next discipline in the next step. Business value is delivered all at once, and only at the very end of the project.
Iterative development slices the deliverable business value (system functionality) into iterations. In each iteration a slice of functionality is delivered through cross-discipline work, starting from the model/requirements through to testing/deployment. The unified process groups iterations into phases: inception, elaboration, construction, and transition. Inception identifies project scope, risks, and requirements (functional and non-functional) at a high level, but in enough detail that work can be estimated. Elaboration delivers a working architecture that mitigates the top risks and fulfills the non-functional requirements. Construction incrementally fills in the architecture with production-ready code produced from analysis, design, implementation, and testing of the functional requirements. Transition delivers the system into the production operating environment. Each of the phases may be divided into one or more iterations, which are usually time-boxed rather than feature-boxed. Architects and analysts work one iteration ahead of developers and testers to keep their work-product backlog full.
Characteristics
Using analysis and measurement as drivers of the enhancement process is one major difference between iterative enhancement and current agile software development. It provides support for determining the effectiveness of the processes and the quality of the product. It allows one to study, and therefore improve and tailor, the processes for the particular environment. This measurement and analysis activity can be added to existing agile development methods. The relative changes measured over the evolution of the system can be very informative, as they provide a basis for comparison, even if they are sometimes difficult to interpret in the absolute. For example, a vector of measures, m1, m2, ..., mn, can be defined to characterize various aspects of the product at some point in time, e.g., effort to date, changes, defects, logical, physical, and dynamic attributes, and environmental considerations.
Thus an observer can tell how product characteristics like size, complexity, coupling, and cohesion are increasing or decreasing over time. One can monitor the relative change of the various aspects of the product, or can provide bounds for the measures to signal potential problems and anomalies. Many utility software systems have been developed using this model, wherein the requirement is basically to provide the customer with some working model at an early stage of the development cycle. As new features are added, a new release is launched which has fewer bugs and more features than the previous release. Some typical examples of this kind of model are Yahoo Messenger, Azureus, CyberSitter, NetMeter, PC Security, LimeWire, and other P2P clients.
When you are writing test cases, whether they are unit tests, tests for test-driven development, regression tests, or any other kind, there are some functions you will find yourself writing each time. These include the basics of an application, a mechanism for launching tests, and a way to report results. You could write these functions each time you start testing a new feature, or you could write them once and leverage them every time. A test harness, at its most simple, is just that: a system to handle the elements you'll repeat each time you write a test. Think of it as scaffolding upon which you will build your tests.
A basic test harness will contain at least the following elements: application basics, test launching, and result reporting. It may also include a graphical user interface, logging, and more.

It should be noted that harnesses will generally be written for one language or runtime (C/C++, Java, .NET, etc.). It is hard to write a good harness that will work across runtimes.
To run a test, the first thing you need is an application. Each operating system has a different way to write an application. For example, on Windows, if you want any GUI, you need things like a message pump and a message loop. This handles the interaction with the operating system. The harness will include code to start the program, open any required files, select which cases are run, etc. The next thing you need is a way to actually launch the tests. Most of the time a test case is just a function or method. This function does all of the actual work of the test case: it calls the API in question, verifies the results, and informs the framework whether the case passed or failed. The test harness will provide a standardized way for test cases to advertise themselves to the system and an interface by which they will be called. In the simplest system, a C/C++ harness might advertise its cases by adding function pointers to a table. The harness may provide extra services to the test cases by allowing them to be called in a specified order or on independent threads.
The third basic pillar of a test harness is a way to inform the user of which cases pass and which fail. Harnesses provide a way for test cases to output messages and to report pass/fail results. In the most basic harness, this could be just a console window in which test cases print their own messages. Better than that is a system which automatically displays each test case name and its result. Usually there is a summary at the end of how many cases passed or failed; this could be textual or even a big red or green bar. More advanced systems have built-in logging: they provide a standardized way for the test cases to output trace messages informing the user of each important call as it is being made. The harness may simply log to a text file, but it may also provide a parsable format like XML or even interact directly with a database for result retention. At Microsoft, many groups write their own test harnesses which are specialized to support the unique needs of the organization. For example, my team uses a harness called Shell98 which has, among other things, support for device enumeration. Good examples of freely available test harnesses are the xUnit series, like cppUnit, nUnit, and jUnit. These are designed for unit testing and are not very feature-rich. I've used cppUnit, which is very basic, and nUnit, which is pretty slick. The xUnit harnesses do not do any logging and you do not have control over the order in which tests are run; they are intended for a user to run and visually inspect the results. The harness I use allows for scripting and outputs its results to a database.
In short, we can summarize it as follows. A test harness is:
- Used to exercise software which does not have a user interface (yet). (The user interface, or human-computer interface, is the aggregate of means by which people, the users, interact with the system: a particular machine, device, computer program, or other complex tool.)
- Used to run groups of automated tests or comparisons.
- Often custom-built.
- Used for simulators (where testing in the real environment would be too costly or dangerous).
This paper gives a complete description of code coverage analysis (test coverage analysis), a software testing technique.
Introduction:
Coverage analysis requires access to test program source code and often requires recompiling it with a special command. This paper discusses the details you should consider when planning to add coverage analysis to your test plan. Coverage analysis has certain strengths and weaknesses. You must choose from a range of measurement methods. You should establish a minimum percentage of coverage, to determine when to stop analyzing coverage. Coverage analysis is
one of many testing techniques; you should not rely on it alone.
Code coverage analysis is sometimes called test coverage analysis. The two terms are synonymous. The academic world more often uses the term "test coverage" while practitioners more often use "code coverage". Likewise, a coverage analyzer is sometimes called a coverage monitor. I prefer the practitioner terms.
This is especially true near the end of the product development time line, when the requirements specification is updated less frequently and the product itself begins to take over the role of the specification. The difference between functional and structural testing blurs near release time.
The Premise
The basic assumptions behind coverage analysis tell us about the strengths and limitations of this testing technique. Some fundamental assumptions are listed below.
Bugs relate to control flow, and you can expose bugs by varying the control flow. For example, a programmer wrote "if (c)" rather than "if (!c)". You can look for failures without knowing what failures might occur, and all tests are reliable, in that successful test runs imply program correctness. The tester understands what a correct version of the program would do and can identify differences from the correct behavior. Other assumptions include achievable specifications, no errors of omission, and no unreachable code.
Clearly, these assumptions do not always hold. Coverage analysis exposes some plausible bugs but does not come close to exposing all classes of bugs. Coverage analysis provides more benefit when applied to an application that makes a lot of decisions than to a data-centric application, such as a database application.
Basic Metrics
A large variety of coverage metrics exist. Here is a description of some fundamental metrics and their strengths and weaknesses.
Statement Coverage
This metric reports whether each executable statement is encountered. It is also known as line coverage, segment coverage, C1, and basic block coverage. Basic block coverage is the same as statement coverage except that the unit of code measured is each sequence of non-branching statements. I highly discourage using the non-descriptive name C1; people sometimes incorrectly use it to identify decision coverage, so the term has become ambiguous.
Summary
Software developers and testers commonly use statement coverage because of its simplicity and availability in object code instrumentation technology. Of all the structural coverage criteria, statement coverage is the weakest, indicating the fewest number of test cases. Bugs can easily occur in the cases that statement coverage cannot see. The most significant shortcoming of statement coverage is that it fails to measure whether you test simple if statements with a false decision outcome. Experts generally recommend using statement coverage only if nothing else is available. Any other metric is better.
Introduction
Statement coverage is a code coverage metric that tells you whether the flow of control reached every executable statement of source code at least once.
Attaining coverage on every source statement seems like a good objective. But statement coverage does not adequately take into account the fact that many statements (and many bugs) involve branching and decision-making. Statement coverage's insensitivity to control structures tends to contradict the assumption of code coverage testing itself: thorough testing requires exercising many combinations of branches and conditions. In particular, statement coverage does not call for testing simple if statements, logical operators, or do-while loops, as described under Specific Issues below.
Statement coverage has three characteristics that make it seem like a good coverage metric. Upon close inspection, they all become questionable. Statement coverage is:
1. Simple and fundamental
2. Measurable by object code instrumentation
3. Sensitive to the size of the code
Experts agree
A number of software testing books and papers give descriptions of statement coverage that range from "the weakest measure" to "not nearly enough".
Line coverage, basic block, and segment coverage are variations of statement coverage. They all have similar characteristics and this document applies equally to all of them, except where noted.
Specific Issues
Simple If-Statements
Statement coverage does not call for testing simple if statements. A simple if statement has no else-clause. To attain full statement coverage requires testing with the controlling decision true, but not with a false outcome. No source code exists for the false outcome, so statement coverage cannot measure it. If you only execute a simple if statement with the decision true, you are not testing the if statement itself. You could remove the if statement, leaving the body (that would otherwise execute conditionally), and your testing would give the same results. Since simple if statements occur frequently, this shortcoming presents a serious risk.
Without a test case that causes condition to evaluate false, statement coverage declares this code fully covered. In fact, if condition ever evaluates false, this code dereferences a null pointer.
Logical Operators
Statement coverage does not call for testing logical operators. In C++ and C these operators are &&, ||, and ?:. Statement coverage cannot distinguish the code separated by logical operators from the rest of the statement. Executing any part of the code in a statement causes statement coverage to declare the whole statement fully covered. When logical operators avoid unnecessary evaluation (by short circuit), statement coverage gives an inflated coverage measurement. This problem often occurs even when logical operators occur on different source code lines. Some compilers, such as Microsoft C++, only provide one debug line number for a decision, even if it spans multiple source lines.
Statement coverage declares this code fragment fully covered when condition is true. With condition false, the call to strcmp gets an invalid argument, a null pointer.
This program clearly anticipates three different errors. You can satisfy statement coverage with just one error, errno = EACCES. Statement coverage says that testing with this error is just as good as another. However, this code incorrectly initializes message for ENODEV twice, but does not initialize message for ENOENT. Testing with either of these errors exposes the problem, but statement coverage does not call for them.
The main loop termination decision, i <= sizeof(output), intends to prevent overflowing the output buffer. You can achieve full statement coverage without testing this condition. The overflow decision really ought to use operator < rather than operator <=, so a buffer overflow could occur post-release. You get full statement coverage of this code with any input string shorter than 100 characters, without exposing the bug.
Do-While Loops
Statement coverage does not call for testing iteration of do-while loops. Since do-while loops always execute at least once, statement coverage sees them as fully covered whether or not they repeat. If you only execute a do-while without repeating the loop, you are not testing the loop. You could remove the do-while, leaving the statements that would otherwise execute repetitively, and your testing would give the same results.
You can achieve full statement coverage without repeating this loop. Testing with a zero-length input string is sufficient for statement coverage. The problem is the programmer forgot to increment the index. Any non-zero-length input string causes an infinite loop.
Common Misconceptions
Simple and Fundamental
Statement coverage is the simplest structural coverage metric in that it calls for the least testing in order to achieve full coverage. Additionally, statement coverage is a fundamental metric in that most other structural coverage metrics include statement coverage. However, statement coverage is not the simplest metric to understand, and statement coverage is not fundamental to good testing.

Some coverage metrics other than statement coverage are fairly simple. Condition/decision coverage calls for exercising all decisions and logical conditions with both true and false outcomes. This metric is simple to understand and leads to more complete testing than statement coverage.

Testing experts often describe statement coverage as a basic or primary level of coverage. Most other structural coverage metrics subsume, or include, statement coverage. However, this only holds for full coverage, which rarely occurs in practice even with statement coverage. The difficulty of attaining additional coverage increases exponentially with all types of coverage. Rather than spend your time on the most difficult part of statement coverage, you make better progress using a more sensitive coverage metric that offers more test cases, some of which may require relatively little effort.

Even if you do achieve 100% statement coverage, you have not necessarily exercised all your object code, even though it appears you have exercised all your source code. The object code corresponding to branches is still vulnerable. Statement coverage may be the most basic metric, but it is not part of good testing.
Basic block coverage is the same as statement coverage except the unit of code measured is each sequence of non-branching statements; segment coverage is another name for basic block coverage. Basic block coverage is not sensitive to basic block length. Statement coverage's sensitivity to basic block length is not beneficial, since it comes at the expense of sensitivity to paths and test cases.
Code Examples
Sensitivity To Basic Block Length Example 1
The C++ if-else statement below contains a lot of code in the then-clause, but very little in the else-clause.
    if (condition) {
        // 99 statements
        statement1;
        statement2;
        ...
        statement99;
    } else {
        // 1 statement
        statement100;
    }
With condition true, you obtain 99% statement coverage. With a successful test, you might conclude that 99% of the code has no bugs. In the reverse scenario, with condition false, you obtain just 1% statement coverage. Statement coverage seems to measure the relative importance of the two test cases proportionately.
Several test cases exist for this code, all of which expose a bug; statement coverage indicates the number of bugs very poorly.
Simple if statement (null pointer dereference when condition is false):

    int* p = NULL;
    if (condition) {
        p = &variable;
        *p = 1;
    }
    *p = 0;

Logical operators (strcmp receives a null pointer when condition is false):

    const char* string2 = NULL;
    if (condition || strcmp(string1, string2) == 0)
        statement;

Error messages (message initialized twice for ENODEV, never for ENOENT):

    message[EACCES] = "Permission denied";
    message[ENODEV] = "No such device";
    message[ENODEV] = "No such file or directory"; // Oops, should be ENOENT
    switch (errno) {
    case EACCES:
    case ENODEV:
    case ENOENT:
        printf("%s\n", message[errno]);
        break;
    ...
    }

Loop termination (operator <= allows a buffer overflow):

    char output[100];
    for (int i = 0; i <= sizeof(output); i++) {
        output[i] = input[i];
        if (input[i] == '\0') {
            break;
        }
    }

Do-while loop (the index is never incremented):

    int i = 0;
    do {
        output[i] = input[i];
    } while (input[i] != '\0');
Test cases generally correlate more to decisions than to statements. You probably would not have 10 separate test cases for a sequence of 10 non-branching statements; you would have only one test case. For example, consider an if-else statement containing one statement in the then-clause and 99 statements in the else-clause. After exercising one of the two possible paths, statement coverage gives extreme results: either 1% or 99% coverage. Basic block coverage eliminates this problem.

One argument in favor of statement coverage over other metrics is that bugs are evenly distributed through code; therefore the percentage of executable statements covered reflects the percentage of faults discovered. However, one of our fundamental assumptions is that faults are related to control flow, not computations. Additionally, we could reasonably expect that programmers strive for a relatively constant ratio of branches to statements. In summary, this metric is affected more by computational statements than by decisions.
Decision Coverage
This metric reports whether boolean expressions tested in control structures (such as the if-statement and while-statement) evaluated to both true and false. The entire boolean expression is considered one true-or-false predicate regardless of whether it contains logical-and or logical-or operators. Additionally, this metric includes coverage of switch-statement cases, exception handlers, and interrupt handlers. Also known as: branch coverage, all-edges coverage, basis path coverage, C2, and decision-decision-path testing. "Basis path" testing selects paths that achieve decision coverage. I discourage using the undescriptive name C2 because of the confusion with the term C1.
Branch Testing is a method of program testing whereby you make sure that you test each branch of the program. Consider the following code:
    #include <iostream>
    using namespace std;

    int main() {
        float x;
        cout << "Enter a number ";
        cin >> x;
        if (x < 100)
            cout << "The number is less than 100" << endl;
        else
            cout << "The number is greater than or equal to 100" << endl;
        return 0;
    }
In order to properly test this program, I must provide at least 2 test cases, because there are two branches to the if statement: the branch that executes when the test is true, and the branch that executes when the test is false. So there are two routes, or paths, that execution can take from the start to the end of the program. I must therefore choose one test case in which the number entered is less than 100, and one test case in which the number entered is greater than or equal to 100.

This metric has the advantage of simplicity without the problems of statement coverage. A disadvantage is that this metric ignores branches within boolean expressions which occur due to short-circuit operators. For example, consider the following C/C++/Java code fragment:
    if (condition1 && (condition2 || function1()))
        statement1;
    else
        statement2;
This metric could consider the control structure completely exercised without a call to function1. The test expression is true when condition1 is true and condition2 is true, and the test expression is false when condition1 is false. In this instance, the short-circuit operators preclude a call to function1.
Condition Coverage
Condition coverage reports the true or false outcome of each boolean sub-expression, separated by logical-and and logical-or if they occur. Condition coverage measures the sub-expressions independently of each other. This metric is similar to decision coverage but has better sensitivity to the control flow. However, full condition coverage does not guarantee full decision coverage. For example, consider the following C++/Java fragment.
    bool f(bool e) { return false; }
    bool a[2] = { false, false };

    if (f(a && b)) ...
    if (a[int(a && b)]) ...
    if ((a && b) ? false : false) ...
All three of the if-statements above branch false regardless of the values of a and b. However if you exercise this code with a and b having all possible combinations of values, condition coverage reports full coverage.
Multiple Condition Coverage
Multiple condition coverage reports whether every possible combination of boolean sub-expressions occurs. To achieve full multiple condition coverage, the first condition below requires 6 test cases while the second requires 11, even though both conditions have the same number of operands and operators. The test cases are listed below.
a && b && (c || (d && e))
1. F
2. T F
3. T T F F
4. T T F T F
5. T T F T T
6. T T T

((a || b) && (c || d)) && e
1. F F
2. F T F F
3. F T F T F
4. F T F T T
5. F T T F
6. F T T T
7. T F F
8. T F T F
9. T F T T
10. T T F
11. T T T

(Each test case lists the values of the operands actually evaluated, in order; short-circuit evaluation skips the rest.)
As with condition coverage, multiple condition coverage does not include decision coverage. For languages without short-circuit operators, such as Visual Basic and Pascal, multiple condition coverage is effectively path coverage (described below) for logical expressions, with the same advantages and disadvantages. Consider the following Visual Basic code fragment.
If a And b Then ...
Multiple condition coverage requires four test cases, for each of the combinations of a and b both true and false. As with path coverage each additional logical operator doubles the number of test cases required.
Condition/Decision Coverage
Condition/decision coverage is a hybrid metric composed of the union of condition coverage and decision coverage. It has the advantage of simplicity without the shortcomings of its component metrics.
BullseyeCoverage
Path Coverage
This metric reports whether each of the possible paths in each function has been followed. A path is a unique sequence of branches from the function entry to the exit. Also known as predicate coverage. Predicate coverage views paths as possible combinations of logical conditions. Since loops introduce an unbounded number of paths, this metric considers only a limited number of looping possibilities. A large number of variations of this metric exist to cope with loops. Boundary-interior path testing considers two possibilities for loops: zero repetitions and more than zero repetitions. For do-while loops, the two possibilities are one iteration and more than one iteration.

Path coverage has the advantage of requiring very thorough testing. It also has two severe disadvantages. The first is that the number of paths is exponential in the number of branches. For example, a function containing 10 if-statements has 1024 paths to test; adding just one more if-statement doubles the count to 2048. The second disadvantage is that many paths are impossible to exercise due to relationships of data. For example, consider the following C/C++ code fragment:
    if (success)
        statement1;
    statement2;
    if (success)
        statement3;
Path coverage considers this fragment to contain 4 paths. In fact, only two are feasible: success=false and success=true. Researchers have invented many variations of path coverage to deal with the large number of paths. For example, n-length sub-path coverage reports whether you exercised each path of length n branches. Other variations include linear code sequence and jump (LCSAJ) coverage and data flow coverage.
Other Metrics
Here is a description of some variations of the fundamental metrics and some less commonly used metrics.
Function Coverage
This metric reports whether you invoked each function or procedure. It is useful during preliminary testing to assure at least some coverage in all areas of the software. Broad, shallow testing finds gross deficiencies in a test suite quickly.
Call Coverage
This metric reports whether you executed each function call. The hypothesis is that bugs commonly occur in interfaces between modules. Also known as call pair coverage.
Loop Coverage
This metric reports whether you executed each loop body zero times, exactly once, and more than once (consecutively). For do-while loops, loop coverage reports whether you executed the body exactly once, and more than once. The valuable aspect of this metric is determining whether while-loops and for-loops execute more than once, information not reported by other metrics. As far as I know, only GCT implements this metric.
Race Coverage
This metric reports whether multiple threads execute the same code at the same time. It helps detect failure to synchronize access to resources. It is useful for testing multithreaded programs such as in an operating system. As far as I know, only GCT implements this metric.
Relational Operator Coverage
Relational operator coverage reports whether boundary situations occur with relational operators (<, <=, >, >=). For example, given the decision a < b, relational operator coverage reports whether the situation a==b occurs. If a==b occurs and the program behaves correctly, you can assume the relational operator is not supposed to be <=. As far as I know, only GCT implements this metric.
Table Coverage
This metric indicates whether each entry in a particular array has been referenced. This is useful for programs that are controlled by a finite state machine.
Comparing Metrics
You can compare relative strengths when a stronger metric includes a weaker metric.
Decision coverage includes statement coverage, since exercising every branch must lead to exercising every statement. This relationship only holds when control flows uninterrupted to the end of all basic blocks. For example, a C/C++ function might never return to finish the calling basic block because it uses throw, abort, the exec family, exit, or longjmp.

Condition/decision coverage includes decision coverage and condition coverage (by definition).

Path coverage includes decision coverage.

Predicate coverage includes path coverage and multiple condition coverage, as well as most other metrics.

Academia says the stronger metric subsumes the weaker metric. Coverage metrics cannot be compared quantitatively.
Your highest level of testing productivity occurs when you find the most failures with the least effort. Effort is measured by the time required to create test cases, add them to your test suite and run them. It follows that you should use a coverage analysis strategy that increases coverage as fast as possible. This gives you the greatest probability of finding failures sooner rather than later. Figure 1 illustrates the coverage rates for high and low productivity. Figure 2 shows the corresponding failure discovery rates.
One strategy that usually increases coverage quickly is to first attain some coverage throughout the entire test program before striving for high coverage in any particular area. By briefly visiting each of the test program features, you are likely to find obvious or gross failures early. For example, suppose your application prints several types of documents, and a bug exists which completely prevents printing one (and only one) of the document types. If you first try printing one document of each type, you probably find this bug sooner than if you thoroughly test each document type one at a time by printing many documents of that type before moving on to the next type. The idea is to first look for failures that are easily found by minimal testing.

The sequence of coverage goals listed below illustrates a possible implementation of this strategy.
1. Invoke at least one function in 90% of the source files (or classes).
2. Invoke 90% of the functions.
3. Attain 90% condition/decision coverage in each function.
4. Attain 100% condition/decision coverage.
Notice we do not require 100% coverage in any of the initial goals. This allows you to defer testing the most difficult areas. This is crucial to maintaining high testing productivity: achieve maximum results with minimum effort.
Avoid using a weaker metric for an intermediate goal combined with a stronger metric for your release goal. Effectively, this allows the weaknesses in the weaker metric to decide which test cases to defer. Instead, use the stronger metric for all goals and allow the difficulty of the individual test cases help you decide whether to defer them.
Summary
Coverage analysis is a structural testing technique that helps eliminate gaps in a test suite. It helps most in the absence of a detailed, up-to-date requirements specification. Condition/decision coverage is the best general-purpose metric for C, C++, and Java. Setting an intermediate goal of 100% coverage (of any type) can impede testing productivity. Before releasing, strive for 80%-90% or more coverage of statements, branches, or conditions.