After demonstrating the effectiveness and usefulness of the proposed approach (see Figure 5.5) on the example program given in Figure 4.2, we pose the following research questions (RQ):
Table 5.2: Comparison of our proposed change-based cohesion metric with different existing approaches.
Comparison features       LCOM   TLCOM  RCI    CBMC   DRC    ACCo
Excluded special methods
  Constructor             No     No     Yes    Yes    Yes    No
  Destructor              No     No     No     Yes    Yes    Yes
  Access                  No     No     Yes    Yes    Yes    Yes
  Delegation              No     No     No     Yes    Yes    Yes
Briand's properties
  Property 1              No     No     Yes    Yes    Yes    Yes
  Property 2              Yes    Yes    Yes    Yes    Yes    Yes
  Property 3              No     No     Yes    No     Yes    Yes
  Property 4              Yes    Yes    Yes    Yes    Yes    Yes
Transitive dependency     No     Yes    No     No     Yes    Yes
Inheritance               No     No     No     No     No     Yes
Interface                 No     No     No     No     No     Yes
Polymorphism              No     No     No     No     No     Yes
Templates                 No     No     No     No     No     Yes
RQ1: Effectiveness. [...] quality (detection of faults) as compared to the original test suite for all the experimental programs?
RQ2: Usefulness. Is it feasible to generate the minimized test suite within acceptable time limits?
We maintain a change set that records the concurrent changes carried out on the program. The test cases for the input program are generated using the JUnit Eclipse plugin. To determine the fault detection capability of the test cases, the program was seeded with mutants, which we generated using MuClipse. MuClipse [183] is the Eclipse plugin version of µJava and provides two types of mutation operators: traditional mutation operators and class mutation operators. We have considered both types of mutation in our approach. Smith et al. [183] and Do et al. [60] have carried out extensive empirical studies and justified the use of mutants as a good proxy for real faults.
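As a minimal sketch of how fault detection capability can be quantified with seeded mutants, the fragment below computes the percentage of mutants killed by a test suite. The function name, the kill information, and the test and mutant identifiers are hypothetical illustrations; they are not data from the experiments and do not use any MuClipse API.

```python
def faults_detected_pct(kills, suite, mutants):
    """Percentage of seeded mutants killed by at least one test in `suite`."""
    killed = set()
    for test in suite:
        killed |= kills.get(test, set())
    killed &= set(mutants)  # ignore anything that is not a seeded mutant
    return 100.0 * len(killed) / len(mutants)


# Hypothetical kill information: test case -> set of mutants it kills.
kills = {
    "t1": {"m1", "m2"},
    "t2": {"m2", "m3"},
    "t3": {"m4"},
    "t4": set(),
}
mutants = ["m1", "m2", "m3", "m4", "m5"]

full = faults_detected_pct(kills, ["t1", "t2", "t3", "t4"], mutants)
reduced = faults_detected_pct(kills, ["t1", "t2"], mutants)
print(f"full suite: {full:.1f}%, reduced suite: {reduced:.1f}%")
```

Comparing the two values shows how much fault detection is lost when test cases are dropped from a suite, which is the quantity reported per program in the evaluation.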
Table 5.3: Test-suite minimization result of different programs.
Sl.  Programs       LOC   Avg. # of  Total #   # of     Selected test suite       Minimized test suite
No.                       affected   of test   mutants  % of        % of faults   % of        % of faults
                          nodes      cases              selected    detected      minimized   detected
                                                        test cases                test cases
1    Expt. Program  54    33         20        14       25          100           54.8        91.6
2    Calculator     75    51         15        42       46.7        94            57.1        90.8
3    Elevator       90    54         25        27       40          98            57.2        92.1
4    Stack          114   72         22        35       40.9        96            57.3        92.6
5    Sorting        130   86         16        43       31.3        89            51.5        92.6
6    BST            130   74         20        51       60          100           54.5        90
7    CrC            261   94         18        46       33.3        93            52.8        91.5
8    DLL            277   83         24        47       25          98            52.9        89.4
9    Notepad        300   68         17        17       47.1        89            51.8        86.3
10   ATM            900   97         33        39       36.4        97            54.2        91.8
11   Elevator spl   1046  105        15        53       66.7        97            54.2        91.3
12   Email spl      1233  98         18        18       61.1        100           50.3        94.8
13   GPL spl        1713  112        22        22       63.6        94            53.5        92.2
14   Jtopas         5400  241        16        28       56.3        92            59          88.2
15   Nanoxml        7646  544        14        32       50          95            50.4        91.6
Figure 5.6: Test suite minimization results for all the ten changes made to the program.
Figure 5.7: Fault detection results of the minimized test suite for all the ten changes made to the program.
5.3.1 RQ1: Effectiveness
To represent the minimization problems, we computed the affected statements with respect to every change made to the experimental programs and computed their affected component cohesion values as discussed in Section 5.2.4. We also theoretically validated our ACCo metric in Section 5.2.4. In Table 5.2, we compare the proposed ACCo metric with several existing metrics. Table 5.2 shows that LCOM [47], TLCOM [7] and CBMC [40] each fail at least one of the four basic properties of cohesion [31], whereas RCI [32], DRC [218] and ACCo satisfy all four. Among the three approaches that satisfy Briand's properties, only DRC and ACCo consider transitive dependency when computing cohesion. In addition to the transitive dependency among program parts, ACCo considers the impact of inheritance and other object-oriented features (such as interfaces, polymorphism, and templates) on the cohesion measurement. Thus, the ACCo metric covers more object-oriented features than DRC and gives a more complete cohesion result.
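The comparison above can be checked mechanically against Table 5.2. The following sketch encodes the relevant rows of the table as Boolean lists and filters the metrics by the features they support. The data structure and the helper `supporting` are our own illustrative choices, not part of the evaluation tooling.

```python
# Column order of Table 5.2.
METRICS = ["LCOM", "TLCOM", "RCI", "CBMC", "DRC", "ACCo"]

# Rows of Table 5.2 (True = "Yes", False = "No").
TABLE = {
    "Property 1": [False, False, True, True, True, True],
    "Property 2": [True, True, True, True, True, True],
    "Property 3": [False, False, True, False, True, True],
    "Property 4": [True, True, True, True, True, True],
    "Transitive dependency": [False, True, False, False, True, True],
    "Inheritance": [False, False, False, False, False, True],
    "Interface": [False, False, False, False, False, True],
    "Polymorphism": [False, False, False, False, False, True],
    "Templates": [False, False, False, False, False, True],
}


def supporting(features):
    """Metrics whose column is 'Yes' for every feature in `features`."""
    return [m for i, m in enumerate(METRICS)
            if all(TABLE[f][i] for f in features)]


briand = ["Property 1", "Property 2", "Property 3", "Property 4"]
oo_features = ["Inheritance", "Interface", "Polymorphism", "Templates"]

all_briand = supporting(briand)
with_transitive = supporting(briand + ["Transitive dependency"])
with_oo = supporting(briand + ["Transitive dependency"] + oo_features)

print(all_briand)       # metrics satisfying all four properties
print(with_transitive)  # ... that also handle transitive dependency
print(with_oo)          # ... that also handle the object-oriented features
```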
A total of ten changes are made to each program, and slices are computed for every change. The total number of computed slices for all the fifteen programs is therefore 150. These slices are used to assess the impact of each change and to select the regression test cases. Table 5.3 shows the initial percentage of the selected test cases and the result of seeded fault detection for every experimental program. Then, we computed and compared the effectiveness of the minimized test suite with that of the selected test suite. The percentage of minimized test cases and the percentage of faults detected by the minimized test suite are shown in Table 5.3. The proposed minimization approach achieved an overall test suite minimization of approximately 54% across the fifteen programs. It is evident from Table 5.3 that the minimized test suite revealed approximately 91% of the faults, compared to approximately 95% for the selected test suite, a loss of about four percentage points that we consider acceptable. Figure 5.6 shows the percentage of test-suite minimization achieved with respect to all the ten changes made to each program. These changes are summarized in Table 4.3. Figure 5.7 shows the percentage of faults detected by the minimized test suite with respect to all the ten changes made to the individual programs. Thus, our results confirm that the proposed test suite minimization approach is effective in minimizing the selected test suite while still revealing most of the faults.
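The overall figures quoted above can be reproduced directly from Table 5.3. The sketch below copies the four percentage columns of the table and averages them over the fifteen programs; the variable names are our own, and a plain unweighted mean is assumed.

```python
from statistics import mean

# Rows of Table 5.3: (program, % selected test cases, % faults detected by
# the selected suite, % minimized test cases, % faults detected by the
# minimized suite).
ROWS = [
    ("Expt. Program", 25.0, 100, 54.8, 91.6),
    ("Calculator", 46.7, 94, 57.1, 90.8),
    ("Elevator", 40.0, 98, 57.2, 92.1),
    ("Stack", 40.9, 96, 57.3, 92.6),
    ("Sorting", 31.3, 89, 51.5, 92.6),
    ("BST", 60.0, 100, 54.5, 90.0),
    ("CrC", 33.3, 93, 52.8, 91.5),
    ("DLL", 25.0, 98, 52.9, 89.4),
    ("Notepad", 47.1, 89, 51.8, 86.3),
    ("ATM", 36.4, 97, 54.2, 91.8),
    ("Elevator spl", 66.7, 97, 54.2, 91.3),
    ("Email spl", 61.1, 100, 50.3, 94.8),
    ("GPL spl", 63.6, 94, 53.5, 92.2),
    ("Jtopas", 56.3, 92, 59.0, 88.2),
    ("Nanoxml", 50.0, 95, 50.4, 91.6),
]

avg_minimized = mean(r[3] for r in ROWS)        # average % minimization
avg_faults_selected = mean(r[2] for r in ROWS)  # selected suite
avg_faults_minimized = mean(r[4] for r in ROWS) # minimized suite

print(f"minimization: {avg_minimized:.1f}%")
print(f"faults (selected suite): {avg_faults_selected:.1f}%")
print(f"faults (minimized suite): {avg_faults_minimized:.1f}%")
```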
Figure 5.8: Timing results of the minimized test suite for all the ten changes made to the program.
5.3.2 RQ2: Usefulness
This research question addresses whether the proposed approach can generate the minimized test suite in a reasonable amount of time. The usefulness of the proposed approach is shown in terms of the time taken to generate the minimized test suite. We observed that the proposed approach generates the minimized test suite in less than 1 second for all the changes made to the programs, provided that the selected test suite and its coverage information are available before computation. The timing results in Figure 5.8 show the time taken to minimize the test suite for every change made to the programs. This result includes the time to compute the slices, select the test cases hierarchically, compute the cohesion values with respect to the change impact analysis, and minimize the selected test suite. The percentage of minimization achieved is shown in Figure 5.6. We expect the timing results to improve once the different components of our proposed minimization framework shown in Figure 5.1 are fully integrated. Thus, the results show that the proposed approach is useful and scales better if the requisite test data is collected during the initial testing of the software.
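To illustrate why minimization is fast once coverage information is already available, the following sketch times a coverage-preserving reduction of a small test suite. Note that it uses a simple greedy set-cover heuristic as a stand-in; it is not the ILP-based, cohesion-driven approach evaluated here, and the coverage data and the name `greedy_minimize` are hypothetical.

```python
import time


def greedy_minimize(coverage):
    """Greedy set-cover heuristic (a stand-in, not the ILP-based approach):
    repeatedly keep the test that covers the most still-uncovered nodes."""
    uncovered = set().union(*coverage.values())
    kept = []
    while uncovered:
        # Iterate in sorted order so that ties are broken deterministically.
        best = max(sorted(coverage),
                   key=lambda t: len(coverage[t] & uncovered))
        kept.append(best)
        uncovered -= coverage[best]
    return kept


# Hypothetical, precomputed coverage: test case -> affected nodes it covers.
coverage = {
    "t1": {"n1", "n2", "n3"},
    "t2": {"n2", "n3"},
    "t3": {"n4"},
    "t4": {"n3", "n4"},
    "t5": {"n5"},
}

start = time.perf_counter()
minimized = greedy_minimize(coverage)
elapsed = time.perf_counter() - start

print(minimized, f"{elapsed:.6f} s")
```

The measured time covers only the minimization step; in the evaluation above, the reported times also include slicing, test selection, and cohesion computation.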
5.3.3 Threats to validity
Like other minimization techniques, the proposed approach is subject to some threats to its validity.
• The programs considered for experimentation represent various application domains. However, real industrial applications can be much larger and more complex than the chosen programs.
• Intermediate graph-based slicing techniques can suffer from scalability issues. To overcome this limitation to some extent, we restricted our regression test selection to the method level only. Hence, the selected test suite is much smaller than it would be at a coarser granularity of test case selection. As a result, this could have lessened the time of minimization.
• The proposed minimization problem is formulated based on the cohesion measure given in Section 5.2.4. However, other researchers have proposed various cohesion measures. Thus, the ILP problem may yield different results if the cohesion measures of other researchers are taken into consideration.
• The mutants generated by µJava may not always represent the real faults of industrial applications. Thus, to remain close to real faults, we asked our graduate and post-graduate students to seed the errors. This may have introduced some bias in seeding the errors. Therefore, we considered only those test cases that gave high coverage of these faults.
• Since minimization problems are NP-complete, we focused on a single criterion for minimizing our test suite. However, considering other criteria for minimization, such as the coupling measure of affected components, time for fault detection, or energy utilization of the test cases, may give some interesting results. Such multi-criteria minimization problems are not addressed in this chapter.
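For orientation, a generic single-criterion test suite minimization problem can be written as the following integer linear program. This is a textbook-style sketch under our own notation, not the formulation of Section 5.2: the symbols $x_i$, $a_{ij}$ and $c_i$ are assumptions introduced here, with $x_i = 1$ if test case $t_i$ is kept, $a_{ij} = 1$ if $t_i$ covers requirement $r_j$ (for example, an affected node), and $c_i$ a per-test cost.

```latex
\begin{align}
  \text{minimize}   \quad & \sum_{i=1}^{n} c_i \, x_i \\
  \text{subject to} \quad & \sum_{i=1}^{n} a_{ij} \, x_i \ge 1,
                            \qquad j = 1, \dots, m, \\
                          & x_i \in \{0, 1\}, \qquad i = 1, \dots, n.
\end{align}
```

With $c_i = 1$ for all $i$, the objective reduces to minimizing the number of retained test cases. A different choice of criterion changes only the costs $c_i$ or the coverage matrix $a_{ij}$, which is why a different cohesion measure or an additional criterion may lead to a different minimized suite.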