Anton Kotelyanskii
Gregory M. Kapfhammer
Test Suites
Automatic Generation
Confronting Challenges
Evaluation Strategies
Challenges
Importance
Replication
Rarity
EvoSuite: Amazing test suite generator
Uses a genetic algorithm
Input: A Java class
Output: A JUnit test suite
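To make the genetic algorithm idea concrete, here is a minimal toy sketch. It is not EvoSuite's implementation: the bit-string encoding, the count-the-ones fitness function, tournament selection, and the names `fitness` and `evolve` are all assumptions chosen for illustration. It only shows the general loop of selection, crossover, and mutation that knobs such as population size and crossover rate would control.

```python
import random

def fitness(bits):
    # Toy objective (assumption): the number of ones stands in for the
    # coverage a candidate test suite might achieve.
    return sum(bits)

def evolve(length=20, population_size=10, crossover_rate=0.75,
           generations=30, seed=0):
    rng = random.Random(seed)
    population = [[rng.randint(0, 1) for _ in range(length)]
                  for _ in range(population_size)]
    history = []  # best fitness observed at the start of each generation
    for _ in range(generations):
        population.sort(key=fitness, reverse=True)
        history.append(fitness(population[0]))
        next_gen = [population[0][:]]  # elitism: carry the best over unchanged
        while len(next_gen) < population_size:
            # Tournament selection: the better of two random individuals.
            p1 = max(rng.sample(population, 2), key=fitness)
            p2 = max(rng.sample(population, 2), key=fitness)
            child = p1[:]
            if rng.random() < crossover_rate:
                # Single-point crossover between the two parents.
                cut = rng.randrange(1, length)
                child = p1[:cut] + p2[cut:]
            # Mutation: flip one randomly chosen bit.
            i = rng.randrange(length)
            child[i] = 1 - child[i]
            next_gen.append(child)
        population = next_gen
    best = max(population, key=fitness)
    history.append(fitness(best))
    return best, history
```

Calling `evolve(seed=1)` returns the best individual and the per-generation best fitness; because the best individual is always carried over, that history never decreases.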
RSM: Response surface methodology
SPOT: Sequential parameter optimization toolbox
Successfully applied to many diverse problems!
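The core idea behind response surface methods can be sketched in one dimension: observe the response at a few parameter values, fit a second-order model, and move toward the model's minimum. The code below is only an illustration under that assumption. It is not SPOT and not the procedure used in the study, and the helper names `quadratic_through` and `suggest_next` are hypothetical.

```python
def quadratic_through(points):
    """Return (a, b, c) of y = a*x^2 + b*x + c through three (x, y) points."""
    (x1, y1), (x2, y2), (x3, y3) = points
    denom = (x1 - x2) * (x1 - x3) * (x2 - x3)
    a = (x3 * (y2 - y1) + x2 * (y1 - y3) + x1 * (y3 - y2)) / denom
    b = (x3**2 * (y1 - y2) + x2**2 * (y3 - y1) + x1**2 * (y2 - y3)) / denom
    c = (x2 * x3 * (x2 - x3) * y1
         + x3 * x1 * (x3 - x1) * y2
         + x1 * x2 * (x1 - x2) * y3) / denom
    return a, b, c

def suggest_next(points, low, high):
    """Suggest the parameter value that minimizes the fitted quadratic on [low, high]."""
    a, b, c = quadratic_through(points)
    if a > 0:
        # Convex surface: the vertex is the minimum; clamp it to the bounds.
        return min(max(-b / (2 * a), low), high)
    # Otherwise the minimum over the interval lies at one of the endpoints.
    model = lambda x: a * x * x + b * x + c
    return low if model(low) <= model(high) else high
```

For example, three observations of a lower-is-better response that happens to follow (x - 3)^2, such as `[(0, 9), (1, 4), (2, 1)]`, lead `suggest_next` to propose trying the value 3 next.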
Eight EvoSuite parameters
Ten projects from SF100
475 Java classes for subjects
100 trials after parameter tuning
Aiming to improve statement coverage
Parameter Name | Minimum | Maximum |
Population Size | 5 | 99 |
Chromosome Length | 5 | 99 |
Rank Bias | 1.01 | 1.99 |
Number of Mutations | 1 | 10 |
Max Initial Test Count | 1 | 10 |
Crossover Rate | 0.01 | 0.99 |
Constant Pool Use Probability | 0.01 | 0.99 |
Test Insertion Probability | 0.01 | 0.99 |
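The search space in the table can be written down as data, which makes it easy to draw candidate configurations. The sketch below uses plain random sampling, which is not what a sequential tuner does; it only illustrates the bounds. Treating parameters with whole-number bounds as integers is an assumption inferred from the table, and the names `PARAMETER_RANGES` and `sample_configuration` are hypothetical.

```python
import random

# (minimum, maximum, is_integer) per parameter, copied from the table.
# The is_integer flag is an assumption inferred from the form of the bounds.
PARAMETER_RANGES = {
    "Population Size": (5, 99, True),
    "Chromosome Length": (5, 99, True),
    "Rank Bias": (1.01, 1.99, False),
    "Number of Mutations": (1, 10, True),
    "Max Initial Test Count": (1, 10, True),
    "Crossover Rate": (0.01, 0.99, False),
    "Constant Pool Use Probability": (0.01, 0.99, False),
    "Test Insertion Probability": (0.01, 0.99, False),
}

def sample_configuration(rng):
    """Draw one configuration uniformly at random within the bounds."""
    config = {}
    for name, (low, high, is_integer) in PARAMETER_RANGES.items():
        if is_integer:
            config[name] = rng.randint(low, high)  # inclusive on both ends
        else:
            config[name] = round(rng.uniform(low, high), 2)
    return config
```

A call such as `sample_configuration(random.Random(42))` yields a dictionary with one value per parameter, each inside its listed range.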
184 days of computation time estimated
Cluster of 70 computers running for weeks
Identified 139 "easy" and 21 "hard" classes
Mann-Whitney U-test and Vargha-Delaney effect size
Category | Effect Size | p-value |
Results Across Trials and Classes | 0.5029 | 0.1045 |
Without "Easy" and "Hard" Classes | 0.5048 | 0.0314 |
Using lower-is-better inverse statement coverage
An effect size greater than 0.5 means that tuning is worse than the defaults
Testing shows we do not always reject the null hypothesis
Additional empirical results in the QSIC 2014 paper!
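The direction of the effect size is easier to see in code. The sketch below computes the Vargha-Delaney statistic from its standard definition: the probability that a value from the first sample exceeds a value from the second, with ties counted as one half. Two points are assumptions made for illustration: that inverse statement coverage means one minus coverage, and that the tuned results are passed as the first sample, which matches the reading that values above 0.5 mean tuning is worse. The sample values in the usage note are made up and are not data from the study.

```python
def inverse_coverage(coverage):
    # Assumption: "inverse statement coverage" is 1 - coverage, so lower is better.
    return 1.0 - coverage

def vargha_delaney_a12(first, second):
    """Probability that a value from `first` exceeds one from `second` (ties count half)."""
    greater = sum(1 for x in first for y in second if x > y)
    equal = sum(1 for x in first for y in second if x == y)
    return (greater + 0.5 * equal) / (len(first) * len(second))

def tuning_effect(tuned_coverage, default_coverage):
    """Effect size on inverse coverage; above 0.5 means the tuned scores are worse."""
    tuned = [inverse_coverage(c) for c in tuned_coverage]
    default = [inverse_coverage(c) for c in default_coverage]
    return vargha_delaney_a12(tuned, default)
```

With made-up inputs, `tuning_effect([0.80, 0.85, 0.90], [0.70, 0.75, 0.78])` gives 0.0 because every tuned run has higher coverage, while identical samples give 0.5, the value that indicates no difference.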
Tuning improved scores for 11 classes
Otherwise, same as or worse than defaults
A "soft floor" may exist for parameter tuning
Additional details in the QSIC 2014 paper!
Fundamental Challenges
Tremendous Confidence
Great Opportunities
Comprehensive Experiments
Conclusive Confirmation
For EvoSuite, Defaults = Tuned