Performance Testing Methodology

Introduction

Performance Testing is a software testing process used for testing the speed, response time, stability, reliability, scalability, and resource usage of a software application under a particular workload. The main purpose of performance testing is to identify and eliminate the performance bottlenecks in the software application.

Features and functionality are not the only concern for a software system. An application’s performance, including its response time, reliability, resource usage, and scalability, also matters. The goal of Performance Testing is not to find bugs but to eliminate performance bottlenecks.

Performance Testing is done to provide stakeholders with information about their application regarding speed, stability, and scalability. More importantly, Performance Testing uncovers what needs to be improved before the product goes to market. Without Performance Testing, the software is likely to suffer from issues such as running slowly when several users use it simultaneously, inconsistencies across different operating systems, and poor usability.

The performance testing team (PTF Team) is in charge of the performance testing lifecycle (PTLC).

PTLC contains the following stages:

Performance acceptance criteria
Test planning
System modelling
Test scripts development
Test execution
System tuning
Test result reporting

PTLC

Performance acceptance criteria

This includes goals and constraints for throughput, response times and resource allocation. It is also necessary to identify project success criteria outside of these goals and constraints. Testers should be empowered to set performance criteria and goals because the project specifications often do not include a wide enough variety of performance benchmarks. Sometimes there may be none at all. When possible, finding a similar application to compare against is a good way to set performance goals.
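As an illustration of how such goals and constraints might be checked against a test run, here is a minimal Python sketch. The class names, field names and threshold values are hypothetical and are not taken from the project's NFRs/SLAs, which are still to be defined.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class AcceptanceCriteria:
    """Hypothetical thresholds; real values would come from the NFRs/SLAs."""
    max_p95_response_ms: float
    min_throughput_tps: float
    max_cpu_percent: float


@dataclass(frozen=True)
class MeasuredResult:
    """Values measured during one test run."""
    p95_response_ms: float
    throughput_tps: float
    cpu_percent: float


def evaluate(criteria: AcceptanceCriteria, result: MeasuredResult) -> list[str]:
    """Return human-readable violations; an empty list means the run passed."""
    violations = []
    if result.p95_response_ms > criteria.max_p95_response_ms:
        violations.append(
            f"95th percentile response time {result.p95_response_ms} ms "
            f"exceeds {criteria.max_p95_response_ms} ms"
        )
    if result.throughput_tps < criteria.min_throughput_tps:
        violations.append(
            f"throughput {result.throughput_tps} tps "
            f"is below {criteria.min_throughput_tps} tps"
        )
    if result.cpu_percent > criteria.max_cpu_percent:
        violations.append(
            f"CPU usage {result.cpu_percent}% exceeds {criteria.max_cpu_percent}%"
        )
    return violations
```

Keeping the criteria in one explicit structure makes it easier to review them with stakeholders before any test is executed.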

TBD NFRs/SLAs

Test planning

Test planning is particularly important for performance testing due to the need for the allocation of test environments, test data, tools and human resources. In addition, this is the activity in which the scope of performance testing is established. During test planning, risk identification and risk analysis activities are completed and relevant information is updated in any test planning documentation (e.g., test plan, level test plan). Just as test planning is revisited and modified as needed, so are risks, risk levels and risk status modified to reflect changes in risk conditions.

PTF Team backlog:

https://folio-org.atlassian.net/secure/RapidBoard.jspa?rapidView=264&view=planning.nodetail&issueLimit=100

System modelling

Know your physical test environment, production environment and what testing tools are available. Understand the details of the hardware, software and network configurations used during testing before you begin the testing process. It will help testers create more efficient tests. It will also help identify possible challenges that testers may encounter during the performance testing procedures.

PROD Config

Test ENV #1 - ncp3 by AWS ECS

Test ENV #2 - ncp4 by AWS ECS

Database: PostgreSQL by AWS RDS

Queue Manager: Kafka by AWS MSK

Environment

Use the default UChicago dataset - 27M records
Other datasets and their sizes: check with the POs, depending on the workflow to test.
Run two environments: one with a profiler and the other without a profiler.

Test development

Determine how usage is likely to vary amongst end users and identify key scenarios to test for all possible use cases. It is necessary to simulate a variety of end users, plan performance test data and outline what metrics will be gathered.

In the implementation phase, performance test cases are ordered into performance test procedures. These performance test procedures should reflect the steps normally taken by the user and other functional activities that are to be covered during performance testing. A test implementation activity is establishing and/or resetting the test environment before each test execution. Since performance testing is typically data-driven, a process is needed to establish test data that is representative of actual production data in volume and type so that production use can be simulated.
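One way to express how usage varies amongst end users is a weighted scenario mix. The sketch below is illustrative only: the scenario names and weights are assumptions, not measured production ratios, and in practice the mix would be configured in the JMeter test plan rather than in Python.

```python
import random
from collections import Counter


def build_schedule(mix, total_iterations, seed=42):
    """Draw a reproducible sequence of scenario names according to their weights.

    mix: dict mapping scenario name -> relative weight (hypothetical values).
    """
    rng = random.Random(seed)  # fixed seed so that test runs are repeatable
    names = list(mix)
    weights = [mix[name] for name in names]
    return rng.choices(names, weights=weights, k=total_iterations)


if __name__ == "__main__":
    # Hypothetical mix: 40% check-in, 40% check-out, 20% search.
    example_mix = {"check-in": 0.4, "check-out": 0.4, "search": 0.2}
    print(Counter(build_schedule(example_mix, 1000)))
```

A fixed seed keeps the generated sequence identical between runs, which helps when results from different builds need to be compared.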

Global

The workflow test script should be created in JMeter as a *.jmx file.

All JMeter scripts are stored in Carrier > Artifacts as *.zip files.

Test data

Test data includes a broad range of data that needs to be specified for a performance test.

For workflow testing reasons, the following scripts could be used:

1. DB Refresh - checkin-checkout-db-restore.sql
2. DB Update (basic) - circ-data-load.sh
3. DB Update (custom, adds 3k item-level requests) - circ-data-load_item-level-requests.sh
4. Other

Types of Performance testing

Type	Mandatory/Optional	Description
Smoke Test aka Health check	M	Should be performed whenever the functionality of the application and the script needs to be checked. It can also be used as a warm-up test before the main testing step. The result is used to decide whether a build is stable enough to proceed with further testing.
Fixed Load Test	M	Load testing checks how a system functions under a large number of concurrent virtual users performing transactions over a certain period. A load test is the most regularly run test for checking the benchmark of the application and its components. It can be run with the load that is defined in the NFRs/SLAs.
Benchmark Testing	M	Benchmark testing is a type of software testing that produces a repeatable set of quantifiable results against which present and future software releases can be baselined or compared for specific functionality. It is used to compare the performance of a software or hardware system, also known as the SUT (System Under Test).
Capacity Testing	O	Should be performed to find the number of virtual users that the application supports in a stable state. The test can be performed as one of the first main tests and should also be performed after significant changes in the application or its configuration. As the business grows and users are added, the team should be aware of the system capacity so that the user experience is not impacted while meeting the growth objective.
Stress testing	O	Stress testing overloads the existing resources with excess jobs in an attempt to break the system down. The goal is to analyze post-crash reports to define the behaviour of the application after failure. The biggest challenge is to ensure that the system does not compromise the security of sensitive data after the failure. In a successful stress test, the system and all its components return to normal even after the most severe breakdown. Stress testing should be run occasionally to check the application’s stability under high load. It can be performed shortly after the code is complete or by special request. Stress testing has the following sub-types: (1) High Load Test: focuses on the behaviour and stability of the application during a certain period of time under high load (more than 100% of capacity). (2) Hammer Test: resembles a DDoS attack and helps to understand whether the application can withstand such an attack, how long it can work stably, and what kind of protection should be applied. Its main goal is to check whether the application avoids crashing under a huge number of requests (usually without delay between them). (3) Rush-hour Test: focuses on the ability of a system to respond correctly to sudden bursts of peak load and return afterwards to a steady state. (4) Failover Test: gauges whether a system can allocate extra resources when needed; the process amounts to exercising a backup system. It aims to verify that the system efficiently handles extra resources, such as additional CPU or servers, during a failure.
Scalability testing	O	Scalability testing is non-functional testing that measures the performance of a network or system when the number of user requests is scaled up or down. Its purpose is to ensure that an application can handle the projected increase in user traffic, data volume, transaction counts and frequency, etc. It tests the ability of the system, processes, and database to meet a growing need and lets you determine how your application scales with an increasing workload.
Volume testing	O?	Volume testing, also referred to as flood testing, subjects the software to a huge volume of data. It analyzes system performance as the volume of data in the database increases and identifies problems that are likely to occur with a large amount of data.
Endurance testing	O?	Checks the ability of a software product to continue to function over a long period of time, exercising its full range of use, without failing or causing failure. The purpose of endurance (stability) testing is to check whether the application will crash at any point in time.
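Several of the test types above differ mainly in how the number of virtual users changes over time. The following Python sketch illustrates three such load shapes under assumed parameters. The function names and all numbers are hypothetical; a real profile would be configured in the JMeter thread group settings.

```python
def fixed_load(t_min, users):
    """Fixed Load Test: the same number of virtual users for the whole run."""
    return users


def capacity_ramp(t_min, start_users, step_users, step_minutes):
    """Capacity Testing: add step_users every step_minutes until the system degrades."""
    completed_steps = int(t_min // step_minutes)
    return start_users + completed_steps * step_users


def rush_hour(t_min, base_users, peak_users, peak_start_min, peak_end_min):
    """Rush-hour Test: a sudden burst of peak load, then a return to the steady state."""
    if peak_start_min <= t_min < peak_end_min:
        return peak_users
    return base_users


if __name__ == "__main__":
    # Print the target number of virtual users at a few points in time.
    for minute in (0, 10, 15, 25):
        print(
            minute,
            fixed_load(minute, users=50),
            capacity_ramp(minute, start_users=10, step_users=10, step_minutes=10),
            rush_hour(minute, base_users=20, peak_users=200,
                      peak_start_min=10, peak_end_min=20),
        )
```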

Test execution

Test execution occurs when the performance test is conducted, often by using performance test tools. Test results are evaluated to determine if the system’s performance meets the requirements and other stated objectives. Any defects are reported.

Detailed information about workflow performance testing executions is described in Steps for testing process.

Requested performance testing

The ticket should contain clear, understandable requirements and acceptance criteria.

The ticket should be prioritized by PTF Team PO.

Implementation of the ticket goes through the PTLC.

Finally, the report should be shared with interested stakeholders.

Release performance testing

Regression benchmark testing for a new release includes the results for the current release candidate and a comparison with the previous release.

The regression pack covers all necessary modules and workflows.

The regression pack is an automated process.
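The comparison between a release candidate and the previous release could be automated along the following lines. This is a minimal sketch: the workflow names, the use of average response time as the compared value, and the 10% tolerance are assumptions made for illustration only.

```python
def compare_releases(previous, candidate, tolerance_pct=10.0):
    """Compare average response times (ms) per workflow between two releases.

    previous, candidate: dict mapping workflow name -> average response time in ms.
    Baseline values are assumed to be positive.
    Returns a dict mapping workflow name -> (change in percent, status).
    """
    report = {}
    for workflow, previous_ms in previous.items():
        if workflow not in candidate:
            # The candidate run did not cover this workflow.
            report[workflow] = (None, "missing")
            continue
        change_pct = (candidate[workflow] - previous_ms) / previous_ms * 100.0
        status = "regression" if change_pct > tolerance_pct else "ok"
        report[workflow] = (round(change_pct, 1), status)
    return report


if __name__ == "__main__":
    # Hypothetical numbers, not real measurements.
    previous_release = {"check-in": 500.0, "check-out": 800.0}
    release_candidate = {"check-in": 600.0, "check-out": 780.0}
    for name, (change, status) in compare_releases(previous_release, release_candidate).items():
        print(name, change, status)
```

A positive change means the candidate is slower than the previous release. The tolerance absorbs normal run-to-run variation and would need to be agreed with stakeholders.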

TBD

System tuning

Consolidate, analyze and share test results. Then fine-tune and test again to see whether performance improves or degrades. Since improvements generally grow smaller with each retest, stop when the bottleneck is caused by the CPU. At that point, you may consider increasing CPU power.
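The stopping rule for the tune-and-retest cycle can be sketched as a small decision function. The thresholds here are assumptions: treating 90% CPU usage as "CPU-bound" and requiring at least a 2% gain to justify another iteration are illustrative values, not project rules.

```python
def should_continue_tuning(previous_ms, current_ms, cpu_percent,
                           min_gain_pct=2.0, cpu_bound_percent=90.0):
    """Decide whether another tune-and-retest iteration is worthwhile.

    previous_ms, current_ms: response times before and after the latest tuning step.
    cpu_percent: CPU usage observed during the latest test run.
    """
    if cpu_percent >= cpu_bound_percent:
        # The CPU is the bottleneck: further software tuning is unlikely to help.
        return False
    gain_pct = (previous_ms - current_ms) / previous_ms * 100.0
    # Stop as well when the latest improvement is too small (assumed cut-off).
    return gain_pct >= min_gain_pct
```

For example, a step that reduces the response time from 1000 ms to 900 ms at 60% CPU usage would justify another iteration under these assumed thresholds, while the same gain at 95% CPU usage would not.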

Test result reporting

The following metrics should be used to analyze the results of server-side performance testing:

Application metrics:

Metric	Mandatory/Optional	Aggregation	Comments
Active Thread (VUsers)	M	Count
Response time	M	Average, median, 95th percentile. Optionally: minimum, maximum, standard deviation
Throughput (Transaction/sec)	M	Count
Hits/sec	O	Count
Response status code	O		Assertions
Failure rate	M	% Count
DB locks	M	Count
DB TOP waits	M	List
TBD
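For illustration, the response time, throughput and failure rate aggregations from the table can be computed from raw samples as in the Python sketch below. The input format (a list of elapsed time and success flag pairs) is an assumption, and the nearest-rank percentile used here is one common definition that is not necessarily the one applied by JMeter or Carrier.

```python
import math
import statistics


def nearest_rank_percentile(values, pct):
    """Return the pct-th percentile using the nearest-rank method."""
    ordered = sorted(values)
    rank = max(1, math.ceil(pct / 100.0 * len(ordered)))
    return ordered[rank - 1]


def summarize(samples, duration_sec):
    """Aggregate raw samples into the application metrics listed above.

    samples: list of (elapsed_ms, success) tuples, one per request (assumed format).
    duration_sec: wall-clock duration of the test run in seconds.
    """
    elapsed = [ms for ms, _ in samples]
    failures = sum(1 for _, ok in samples if not ok)
    return {
        "average_ms": statistics.mean(elapsed),
        "median_ms": statistics.median(elapsed),
        "p95_ms": nearest_rank_percentile(elapsed, 95),
        "throughput_tps": len(samples) / duration_sec,
        "failure_rate_pct": failures / len(samples) * 100.0,
    }
```

Reporting the median and 95th percentile alongside the average matters because a few very slow requests can distort the average while leaving the typical user experience unchanged.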

System metrics:

Metric	Mandatory/Optional	Aggregation	Comments
CPU used		%
Memory used		%
DB CPU used		%
DB Memory used		%

In some cases, new metrics will be introduced to cover missing areas (.NET metrics, Database metrics, etc).

For reporting, the following template can be used: PTF - Report Template.

WIKI Space: [Reporting] Performance Testing Reports