Understanding Contextual Factors in Regression Testing Techniques
Abstract
The software regression testing techniques of test case reduction, selection, and prioritization are widely used and well researched in software development. They allow scarce testing resources to be used more efficiently on large projects, improving quality while reducing cost. Many data sources and techniques have been studied, yet software practitioners have no reliable way of choosing which data source or technique will be most appropriate for their project. This dissertation addresses this limitation. First, we introduce a conceptual framework for examining this area of research. Then, we perform a systematic literature review to understand the current state of the art. Next, we perform a family of empirical studies to further investigate the thesis. Finally, we provide guidance to practitioners and researchers. In our first empirical study, we showed that applying advanced data mining techniques to an industrial product can improve the effectiveness of regression testing techniques. In our next study, we expanded on that research by learning a classification model. This research showed that attributes such as complexity and historical failures were the most effective metrics, due to a high occurrence of random test failures in the product studied. Finally, we applied the lessons from the initial research and the systematic literature review to develop novel regression testing techniques based on the attributes of an industrial product, and we showed these new techniques to be effective. These novel approaches included predicting performance faults from test data and customizing regression testing techniques based on usage telemetry. Further, we provide guidance to practitioners and researchers based on the findings from our empirical studies and the literature review. This guidance will help practitioners and researchers more effectively employ and study regression testing techniques.
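As an illustration of the kind of classification-based prioritization the abstract describes, the sketch below trains a classifier on per-test attributes such as code complexity and historical failure counts and orders tests by predicted failure probability. The attribute names, synthetic data, and use of scikit-learn are assumptions for this sketch, not the dissertation's actual models or data.

```python
# Minimal sketch of learning a classification model to prioritize regression tests.
# Assumptions: hypothetical attributes (complexity, historical failures), synthetic
# labels, and scikit-learn; the dissertation's real techniques and data differ.
import numpy as np
from sklearn.ensemble import RandomForestClassifier

rng = np.random.default_rng(0)

# Hypothetical per-test-case attributes and a label indicating whether the test
# failed on the change under test.
n_tests = 500
complexity = rng.integers(1, 50, size=n_tests)
historical_failures = rng.integers(0, 20, size=n_tests)
X = np.column_stack([complexity, historical_failures])
y = (rng.random(n_tests) < (0.02 * historical_failures + 0.005 * complexity)).astype(int)

# Learn a model that predicts the probability each test will fail.
model = RandomForestClassifier(n_estimators=100, random_state=0)
model.fit(X, y)

# Prioritize: run tests in descending order of predicted failure probability.
failure_prob = model.predict_proba(X)[:, 1]
priority_order = np.argsort(-failure_prob)
print("First 10 tests to run:", priority_order[:10])
```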