ContinuITy: Automated Performance Testing in Continuous Software Engineering

With the ongoing digital transformation, software performance, a major aspect of Quality of Service, is becoming increasingly important for the business success of modern companies. It is therefore crucial to consider performance in the testing phase, before delivery. Load testing, which evaluates a system's performance by simulating user behavior, is a commonly used and fundamental approach.

Load testing is commonly understood as testing the whole system under a variety of user behaviors. This results in long-running, highly resource-consuming tests that often require manual intervention. While this approach was already challenging in the past, it has become infeasible in modern software engineering due to new principles like DevOps, new processes like continuous integration and continuous delivery, and new architectures like microservices. Existing approaches address this challenge by automating the generation or execution of load tests. However, they do not cover the whole problem: in particular, they neither maintain load tests as systems and user behavior evolve nor support microservice architectures.

With ContinuITy, we aim to re-enable load testing in modern software engineering. Since continuous integration pipelines require automation, we target automation of the load testing process, building on the existing WESSBAS approach for workload recording and on BenchFlow for test execution. We evolve the recorded workload models as usage behavior and system interfaces change, so that the load tests derived from them remain realistic and executable at any time. Manual changes to the workload models are allowed and are retained during model evolution. To account for microservices, we will modularize the load tests and enrich them with contextual information. This approach minimizes the test overhead, since it enables the execution of small, microservice-grained load tests that focus on a specific context. In addition, composing context-specific load tests facilitates proactive load testing. To minimize the test deployment size as well, we introduce a new approach to performance stubbing. In summary, we aim at realistic, automated, and context-specific load testing that does not need significantly more resources than system-wide load testing but delays delivery significantly less and is thus usable in modern software engineering.
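
The idea of evolving a workload model while retaining manual changes can be illustrated with a minimal sketch. It is not the ContinuITy implementation. It assumes, for illustration only, that a workload model is a first-order Markov chain over request types, and that a manual change is a hand-edited row of transition probabilities. All names (estimate_transitions, evolve_model, the endpoint names) are hypothetical.

```python
from collections import Counter, defaultdict


def estimate_transitions(sessions):
    """Estimate first-order transition probabilities from recorded sessions.

    Each session is a list of request names in the order they were issued.
    """
    counts = defaultdict(Counter)
    for session in sessions:
        for src, dst in zip(session, session[1:]):
            counts[src][dst] += 1
    return {
        src: {dst: n / sum(dsts.values()) for dst, n in dsts.items()}
        for src, dsts in counts.items()
    }


def evolve_model(sessions, manual_overrides, valid_endpoints):
    """Rebuild the model from new sessions (hypothetical evolution step).

    - Requests to endpoints that no longer exist are dropped, so the
      resulting load test stays executable against the current interface.
    - Manually edited rows replace the estimated ones, provided that every
      endpoint they mention still exists.
    """
    filtered = [[r for r in s if r in valid_endpoints] for s in sessions]
    model = estimate_transitions(filtered)
    for src, row in manual_overrides.items():
        if src in valid_endpoints and all(d in valid_endpoints for d in row):
            model[src] = dict(row)
    return model


# Newly recorded sessions; "legacy_export" was removed from the interface.
sessions = [
    ["login", "search", "buy"],
    ["login", "search", "search", "legacy_export"],
]
manual_overrides = {"search": {"buy": 0.8, "search": 0.2}}
valid_endpoints = {"login", "search", "buy"}

model = evolve_model(sessions, manual_overrides, valid_endpoints)
print(model)
```

In this sketch the row for "login" is re-estimated from the new sessions, the row for "search" keeps its manual values, and the removed endpoint does not appear in the evolved model.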