1.3 Fundamentals
1.3.3 Legal Foundations
Though testing cannot settle the question of program correctness, different testing methods continue to be developed. For example, there are specification-based testing methods and code-based testing methods. It is important to develop a theory to compare the power of different testing methods. Gourlay [19] put forward a theory to compare the power of testing methods based on their fault detection abilities.
A software system undergoes multiple test–fix–retest cycles until, ideally, no more faults are revealed. Faults are fixed by modifying the code or adding new code to the system, and at this stage there may be a need to design new test cases. When no more faults are revealed, one of two conclusions is possible: either there is no fault in the program or the tests could not reveal the faults. Since we have no way to know which is the case, it is useful to evaluate the adequacy of the test set. There is no need to evaluate the adequacy of tests so long as they reveal faults. Two practical ways of evaluating test adequacy are fault seeding and program mutation.
Finally, we discussed two limitations of testing. The first limitation is that testing cannot settle the question of program correctness. In other words, by testing a program with a proper subset of the input domain and observing no fault, we cannot conclude that there are no remaining faults in the program. The second limitation is that in several instances we do not know the expected output of a program. If for some inputs the expected output of a program is not known or cannot be determined within a reasonable amount of time, then the program is called nontestable [20].
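As an illustration of program mutation as an adequacy measure, the following minimal Python sketch scores two test sets against a handful of mutants. Everything in it is an assumption made for illustration: the program `max_of_two`, the three hand-written mutants, and the helper `mutation_score` are hypothetical and do not come from the text, and a real mutation tool would generate mutants automatically.

```python
def max_of_two(a, b):
    """Program under test (hypothetical): return the larger of two numbers."""
    return a if a >= b else b


# Hand-written mutants (hypothetical): each differs from the original by one small change.
def mutant_returns_min(a, b):
    return b if a >= b else a


def mutant_always_first(a, b):
    return a


def mutant_always_second(a, b):
    return b


MUTANTS = [mutant_returns_min, mutant_always_first, mutant_always_second]


def mutation_score(tests, original=max_of_two, mutants=MUTANTS):
    """Fraction of mutants killed by the test set.

    A mutant is killed if at least one test input makes its output
    differ from the output of the original program.
    """
    killed = 0
    for mutant in mutants:
        if any(mutant(a, b) != original(a, b) for a, b in tests):
            killed += 1
    return killed / len(mutants)


weak_tests = [(2, 1)]            # never exercises the case b > a
better_tests = [(2, 1), (1, 2)]  # exercises both orderings

weak_score = mutation_score(weak_tests)
better_score = mutation_score(better_tests)
```

Under these assumptions the single-input test set leaves `mutant_always_first` alive, which suggests the set is not adequate even though it reveals no fault in the original program; adding the input `(1, 2)` kills all three mutants.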
LITERATURE REVIEW
Weyuker and Ostrand [18] have shown by examples how to construct revealing subdomains from source code. Their main example is the well-known triangle classification problem. The triangle classification problem is as follows. Let us consider three positive integers A, B, and C. The problem is to find whether the given integers represent the sides of an equilateral triangle, the sides of a scalene right triangle, and so on.
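The triangle classification problem can be sketched in a few lines of Python. This is a minimal sketch under stated assumptions: the function name `classify_triangle` and the exact set of output classes are chosen here for illustration and are not taken from Weyuker and Ostrand's paper. The inputs at the end are simply one representative per output class; no claim is made that they form revealing subdomains in the authors' sense.

```python
def classify_triangle(a, b, c):
    """Classify three positive integers as the sides of a triangle.

    The output classes below are an assumption made for illustration.
    """
    if min(a, b, c) <= 0:
        raise ValueError("sides must be positive integers")
    x, y, z = sorted((a, b, c))      # x <= y <= z
    if x + y <= z:                   # triangle inequality fails
        return "not a triangle"
    if x == z:                       # all three sides equal
        return "equilateral"
    if x == y or y == z:             # exactly two sides equal
        return "isosceles"
    if x * x + y * y == z * z:       # Pythagorean triple
        return "scalene right"
    return "scalene"


# One representative input per output class.
representatives = {
    (3, 3, 3): "equilateral",
    (3, 3, 5): "isosceles",
    (3, 4, 5): "scalene right",
    (4, 5, 6): "scalene",
    (1, 2, 3): "not a triangle",
}
results = {sides: classify_triangle(*sides) for sides in representatives}
```

Partitioning the input domain by output class in this way is one plausible starting point for choosing test inputs; the paper's construction works from the source code itself.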
Weyuker [13] has introduced the notion of program inference to capture the notion of test data adequacy. Essentially, program inference refers to deriving a program from its specification and a sample of its input–output behavior. On the other hand, the testing process begins with a specification S and a program P and selects input–output pairs that characterize every aspect of the actual computations performed by the program and the intended computations performed by the specification. Thus, program testing and program inference are thought of as inverse processes. A test set T is said to be adequate if T contains sufficient data to infer the computations defined by both S and P. However, Weyuker [13] explains that such an adequacy criterion is not pragmatically usable. Rather, the criterion can at best be used as a guide. Considering the difficulty in using the criterion, Weyuker defines two weaker adequacy criteria, namely program adequate and specification adequate. A test set T is said to be program adequate if it contains sufficient data to infer the computations defined by P. Similarly, the test set T is said to be specification adequate if it contains sufficient data to infer the computations defined by S. It is suggested that, depending upon how test data are selected, one of the two criteria can be eased out. For example, if T is derived from S, then it is useful to evaluate whether T is program adequate. Since T is selected from S, T is expected to contain sufficient data to infer the computations defined by S, and there is no need to evaluate T's specification adequacy. Similarly, if T is derived from P, it is useful to evaluate whether T is specification adequate.
Students are encouraged to read the article by Stuart H. Zweben and John S. Gourlay entitled "On the Adequacy of Weyuker's Test Data Adequacy Axioms" [15]. The authors raise the issue of what makes an axiomatic system as well as what constitutes a proper axiom. Weyuker responds to the criticism at the end of the article. For students who have never seen such a professional interchange, the article is worth reading for this aspect alone. It must be read along with the article by Elaine Weyuker entitled "Axiomatizing Software Test Data Adequacy" [12].
Martin Davis and Elaine Weyuker [9] present an interesting notion of distance between programs to study the concept of test data adequacy. Specifically, they equate adequacy with the capability of a test set to successfully distinguish a program being tested from all programs that are sufficiently close to it and differ in input–output behavior from the given program.
Weyuker [12, 21] proposed a set of properties to evaluate test data adequacy criteria. Some examples of adequacy criteria are to (i) ensure coverage of all branches in the program being tested and (ii) ensure that boundary values of all input data have been selected for the program under test. Parrish and Zweben [11] formalized those properties and identified dependencies within the set. They formalized the adequacy properties with respect to criteria that do not make use of the specification of the program under test.
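The two example criteria, branch coverage and boundary value selection, can be made concrete with a small Python sketch. The program `grade`, its manual instrumentation, and the helpers `branch_adequate` and `boundary_values` are all hypothetical names introduced here for illustration; a real project would measure coverage with a tool rather than by hand.

```python
def grade(score, covered):
    """Hypothetical program under test.

    `covered` is a set that records every branch outcome taken,
    which stands in for a coverage tool in this sketch.
    """
    if score < 0 or score > 100:
        covered.add("d1_true")
        return "invalid"
    covered.add("d1_false")
    if score >= 50:
        covered.add("d2_true")
        return "pass"
    covered.add("d2_false")
    return "fail"


# Both outcomes of both decisions in `grade`.
ALL_BRANCHES = {"d1_true", "d1_false", "d2_true", "d2_false"}


def branch_adequate(tests):
    """Return True if the test inputs exercise every branch outcome of `grade`."""
    covered = set()
    for score in tests:
        grade(score, covered)
    return covered == ALL_BRANCHES


def boundary_values(lo, hi):
    """Values just below, at, and just above each end of the range [lo, hi]."""
    return [lo - 1, lo, lo + 1, hi - 1, hi, hi + 1]


incomplete = branch_adequate([10, 70])      # never takes the "invalid" branch
complete = branch_adequate([-5, 10, 70])    # takes every branch outcome
boundaries = boundary_values(0, 100)
```

Note that the two criteria are independent: in this sketch the boundary values of the range 0 to 100 happen to satisfy branch coverage, but they do not probe the internal threshold at 50, so satisfying one criterion says little about the other.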
Frankl and Weyuker [10] compared the relative fault-detecting ability of a number of structural testing techniques, namely, data flow testing, mutation testing, and a condition coverage technique, to branch testing. They showed that the former three techniques are better than branch testing according to two probabilistic measures.
A good survey on test adequacy is presented in an article by Hong Zhu, Patrick A. V. Hall, and John H. R. May entitled “Software Unit Test Coverage and Adequacy” [14]. In this article, various types of software test adequacy criteria proposed in the literature are surveyed followed by a summary of methods for comparison and assessment of adequacy criteria.
REFERENCES
1. R. Gupta, M. J. Harrold, and M. L. Soffa. An Approach to Regression Testing Using Slicing. Paper presented at the IEEE-CS International Conference on Software Maintenance, Orlando, FL, November 1992, pp. 299–308.
2. G. Rothermel and M. Harrold. Analyzing Regression Test Selection Techniques. IEEE Transactions on Software Engineering, August 1996, pp. 529–551.
3. V. R. Basili and R. W. Selby. Comparing the Effectiveness of Software Testing. IEEE Transactions on Software Engineering, December 1987, pp. 1278–1296.
4. W. E. Howden. Weak Mutation Testing and Completeness of Test Sets. IEEE Transactions on Software Engineering, July 1982, pp. 371–379.
5. D. S. Rosenblum and E. J. Weyuker. Using Coverage Information to Predict the Cost-effectiveness of Regression Testing Strategies. IEEE Transactions on Software Engineering, March 1997, pp. 146–156.
6. L. Baresi and M. Young. Test Oracles, Technical Report CIS-TR-01-02. University of Oregon, Department of Computer and Information Science, Eugene, OR, August 2002, pp. 1–55.
7. Q. Xie and A. M. Memon. Designing and Comparing Automated Test Oracles for GUI-Based Software Applications. ACM Transactions on Software Engineering and Methodology, February 2007, pp. 1–36.
8. G. Rothermel, R. Untch, C. Chu, and M. Harrold. Prioritizing Test Cases for Regression Testing. IEEE Transactions on Software Engineering, October 2001, pp. 929–948.
9. M. Davis and E. J. Weyuker. Metric Space-Based Test-Data Adequacy Criteria. Computer Journal, January 1988, pp. 17–24.
10. P. G. Frankl and E. J. Weyuker. Provable Improvements on Branch Testing. IEEE Transactions on Software Engineering, October 1993, pp. 962–975.
11. A. Parrish and S. H. Zweben. Analysis and Refinement of Software Test Data Adequacy Properties. IEEE Transactions on Software Engineering, June 1991, pp. 565–581.
12. E. J. Weyuker. Axiomatizing Software Test Data Adequacy. IEEE Transactions on Software Engineering, December 1986, pp. 1128–1138.
13. E. J. Weyuker. Assessing Test Data Adequacy through Program Inference. ACM Transactions on Programming Languages and Systems, October 1983, pp. 641–655.
14. H. Zhu, P. A. V. Hall, and J. H. R. May. Software Unit Test Coverage and Adequacy. ACM Computing Surveys, December 1997, pp. 366–427.
15. S. H. Zweben and J. S. Gourlay. On the Adequacy of Weyuker's Test Data Adequacy Axioms. IEEE Transactions on Software Engineering, April 1989, pp. 496–500.
16. E. W. Dijkstra. Notes on Structured Programming. In Structured Programming, O.-J. Dahl, E. W. Dijkstra, and C. A. R. Hoare, Eds. Academic, New York, 1972, pp. 1–81.
17. J. B. Goodenough and S. L. Gerhart. Toward a Theory of Test Data Selection. IEEE Transactions on Software Engineering, June 1975, pp. 26–37.
18. E. J. Weyuker and T. J. Ostrand. Theories of Program Testing and the Application of Revealing Subdomains. IEEE Transactions on Software Engineering, May 1980, pp. 236–246.
19. J. S. Gourlay. A Mathematical Framework for the Investigation of Testing. IEEE Transactions on Software Engineering, November 1983, pp. 686–709.
20. E. J. Weyuker. On Testing Non-Testable Programs. Computer Journal, Vol. 25, No. 4, 1982, pp. 465–470.
21. E. J. Weyuker. The Evaluation of Program-Based Software Test Data Adequacy Criteria. Communications of the ACM, June 1988, pp. 668–675.
Exercises
1. Explain the concept of an ideal test.
2. Explain the concept of a selection criterion in test design.
3. Explain the concepts of a valid and reliable criterion.
4. Explain five kinds of program faults.
5. What are the drawbacks of Goodenough and Gerhart's theory of program testing?
6. Explain the concepts of a uniformly ideal test as well as the concepts of uniformly valid and uniformly reliable criteria.
7. Explain how two test methods can be compared.
8. Explain the need for evaluating test adequacy.
9. Explain two practical methods for assessing test data adequacy.
10. Explain the concept of a nontestable program.