Mark Grechanik, Ph.D., University of Texas at Austin
FOREPOST Project
© Copyright Mark Grechanik 2012
Summary

A goal of performance testing is to find situations in which applications unexpectedly exhibit worsened characteristics for certain combinations of input values. A fundamental question of performance testing is how to select a manageable subset of the input data that finds performance problems in applications automatically and quickly. We offer a novel solution for finding performance problems in applications automatically using black-box software testing. Our solution is an adaptive, feedback-directed learning testing system that learns rules from execution traces of applications and then uses these rules to select test input data automatically for these applications, finding more performance problems than exploratory random testing. We have implemented our solution and applied it to a medium-size application at a major insurance company and to an open-source application. Performance problems were found automatically and confirmed by experienced testers and developers. The entire code, experimental setup, and video can be obtained from here.

The Problem And Our Solution

A goal of performance testing is to find performance problems, situations in which an application under test (AUT) unexpectedly exhibits worsened characteristics for a specific workload. For example, effective test cases for load testing, a variant of performance testing, find situations where an AUT suffers from unexpectedly high response time or low throughput. Test engineers construct performance test cases, and these cases include actions (e.g., interacting with GUI objects or invoking methods of exposed interfaces) as well as input test data for the parameters of these methods or GUI objects. It is difficult to construct effective performance test cases that can find performance problems in a short period of time, since doing so requires test engineers to test many combinations of actions and data for nontrivial applications.

Depending on input values, an application can exhibit different behaviors with respect to resource consumption. Some of these behaviors involve intensive computations that are characteristic of performance problems. Naturally, testers want to summarize the behavior of an AUT concisely in terms of its inputs, so that they can select input data that lead to significantly increased resource consumption, thereby revealing performance problems. Unfortunately, finding proper rules that collectively describe properties of such input data is a highly creative process that requires a deep understanding of input domains.

Descriptive rules for selecting test input data play a significant role in software testing, where these rules approximate the functionality of an AUT. For example, a rule for an insurance application is that a customer poses a high insurance risk if that customer has one or more prior insurance fraud convictions and deadbolt locks are not installed on the customer's premises. Computing an insurance premium may consume more resources for a customer with a high-risk record that matches this rule than for a customer with an impeccable record, since processing the high-risk record involves executing multiple computationally expensive transactions against a database. Of course, this oversimplified rule only illustrates the idea. Even though real-world systems exhibit much more complex behavior, useful descriptive rules often enable testers to build effective performance-fault-revealing test cases.
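To make the notion of a descriptive rule concrete, here is a minimal sketch in Java of how such a rule might be represented and matched against input records. The types and fields (InsuranceRecord, fraudConvictions, hasDeadboltLocks) are hypothetical illustrations for this example, not part of FOREPOST's implementation.

    // A minimal sketch of a descriptive rule over test input data.
    // All names here are hypothetical illustrations, not FOREPOST's actual API.
    import java.util.List;
    import java.util.function.Predicate;

    final class InsuranceRecord {
        final int fraudConvictions;
        final boolean hasDeadboltLocks;
        InsuranceRecord(int fraudConvictions, boolean hasDeadboltLocks) {
            this.fraudConvictions = fraudConvictions;
            this.hasDeadboltLocks = hasDeadboltLocks;
        }
    }

    final class Rule {
        // A rule is a conjunction of predicates over input attributes that
        // approximates "this input triggers computationally intensive behavior."
        private final List<Predicate<InsuranceRecord>> clauses;
        Rule(List<Predicate<InsuranceRecord>> clauses) { this.clauses = clauses; }
        boolean matches(InsuranceRecord r) {
            return clauses.stream().allMatch(c -> c.test(r));
        }
    }

    class RuleExample {
        public static void main(String[] args) {
            // "High risk if one or more fraud convictions and no deadbolt locks."
            Rule highRisk = new Rule(List.of(
                r -> r.fraudConvictions >= 1,
                r -> !r.hasDeadboltLocks));
            System.out.println(highRisk.matches(new InsuranceRecord(2, false))); // true
            System.out.println(highRisk.matches(new InsuranceRecord(0, true)));  // false
        }
    }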
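The Summary above outlines FOREPOST's feedback loop: profile a batch of runs, learn rules that separate computationally intensive runs from the rest, and use those rules to pick the next batch of inputs (the description below gives the details). The following is a rough sketch of such a loop, reusing the Rule and InsuranceRecord types from the previous sketch; the Learner interface, the runAut function, and the median-split heuristic are simplifying assumptions of this illustration, not FOREPOST's actual design.

    // A rough sketch of a feedback-directed test driver in the spirit of FOREPOST.
    import java.util.ArrayList;
    import java.util.Arrays;
    import java.util.Collections;
    import java.util.List;
    import java.util.Random;
    import java.util.function.ToDoubleFunction;

    class FeedbackDirectedDriver {
        /** Stand-in for FOREPOST's machine learning step. */
        interface Learner {
            Rule learn(List<InsuranceRecord> slow, List<InsuranceRecord> fast);
        }

        /** Prefer inputs that match the current rule; random order otherwise. */
        static List<InsuranceRecord> nextBatch(List<InsuranceRecord> pool, Rule rule,
                                               int size, Random rnd) {
            List<InsuranceRecord> shuffled = new ArrayList<>(pool);
            Collections.shuffle(shuffled, rnd);
            List<InsuranceRecord> batch = new ArrayList<>();
            for (InsuranceRecord r : shuffled) {
                if (batch.size() == size) break;
                if (rule == null || rule.matches(r)) batch.add(r);
            }
            return batch;
        }

        /** Profile a batch, split runs around the median cost, learn, repeat. */
        static Rule drive(List<InsuranceRecord> pool, ToDoubleFunction<InsuranceRecord> runAut,
                          Learner learner, int iterations, int batchSize) {
            Random rnd = new Random();
            Rule rule = null;                                 // no rule yet: pure random testing
            for (int i = 0; i < iterations; i++) {
                List<InsuranceRecord> batch = nextBatch(pool, rule, batchSize, rnd);
                if (batch.isEmpty()) { rule = null; continue; } // rule too narrow: restart randomly
                double[] costs = new double[batch.size()];
                for (int j = 0; j < costs.length; j++) {
                    costs[j] = runAut.applyAsDouble(batch.get(j)); // execute and time the AUT
                }
                double[] sorted = costs.clone();
                Arrays.sort(sorted);
                double median = sorted[sorted.length / 2];
                List<InsuranceRecord> slow = new ArrayList<>();
                List<InsuranceRecord> fast = new ArrayList<>();
                for (int j = 0; j < costs.length; j++) {
                    (costs[j] > median ? slow : fast).add(batch.get(j));
                }
                rule = learner.learn(slow, fast);             // feedback: refine the rule
            }
            return rule;  // describes the input class that stresses the AUT the most
        }
    }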
We will demonstrate our novel solution, Feedback-ORiEnted PerfOrmance Software Testing (FOREPOST), for finding performance problems automatically by learning and using rules that describe classes of input data that lead to intensive computations. FOREPOST is an adaptive, feedback-directed learning testing system that learns rules from AUT execution traces and uses these learned rules to select test input data automatically, finding more performance problems in applications than exploratory random performance testing. FOREPOST combines runtime monitoring during a short period of testing with machine learning techniques and automated test scripts to reduce the large amount of performance-related information collected during AUT runs to a small number of descriptive rules that provide insight into the properties of test input data that lead to increased computational loads.

FOREPOST collects and utilizes execution traces of the AUT to learn rules that describe the computational intensity of the workload in terms of the properties of the input data. These rules are used automatically by the adaptive test script, in a feedback loop, to steer the execution of the AUT by selecting input data according to the learned rules. We know of no other testing approach that uses this idea to find performance problems in real-world applications. We also give a novel algorithm that identifies methods that lead to performance bottlenecks (or hot spots), phenomena where the performance of the AUT is limited by one or a few components.

We have implemented FOREPOST and applied it to an application at a major insurance company. Performance problems were found automatically in the insurance application and confirmed by experienced testers and developers who work at this company. After a fix was implemented, the performance of this application improved by approximately seven percent. We also applied FOREPOST to an open-source benchmark application, JPetStore. FOREPOST automatically found rules that steer the executions of JPetStore toward input data that increase the average execution time by an order of magnitude compared with exploratory random testing.

Downloads and Experimental Results

To reproduce the results of our experiments with FOREPOST, you need to download the following components. The experimental setup of the entire environment is available here as a virtual machine that runs in VMware Player. This is a compressed file of over 3.5 GB; you will need to log into Box.com to download it. The movie that shows how FOREPOST was used at a major insurance company is available here. My ICSE’12 presentation of FOREPOST is available here.

People

FOREPOST was created at the Advanced Research In Software Engineering (ARISE) lab at the Department of Computer Science of the University of Illinois at Chicago and at Accenture Technology Lab, where Mark Grechanik led a research team.

Mark Grechanik, Project Lead. Email: drmark[at]uic.edu
Qi Luo. Email: qluo[at]cs.wm.edu
Aswathy Nair, UIC; now at Bank of America. Email: nair.a.87[at]gmail.com
Denys Poshyvanyk. Email: denys[at]cs.wm.edu

Past members:
Chen Fu, Microsoft Corp. Email: chenfu[at]microsoft.com
Qing Xie, Accenture Ltd. Email: qing.xie[at]accenture.com