A few approaches to automating performance testing of DI

Currently, our DI does not work in a truly parallel mode. This means that all DI processes load all of their file chunks into Kafka and wait until the system handles those chunks in sequential order (FIFO: first in, first out).

For now, DI cannot import specific chunks from Kafka that belong to a selected job or user. The feature UXPROD-3471 ("OCLC single record import takes extended time to complete when large data import jobs are running") is not implemented yet.
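To illustrate the FIFO behaviour described above, here is a minimal toy model. It does not talk to Kafka or DI; the job names, chunk counts and the one-second-per-chunk cost are all hypothetical values chosen for illustration. It only shows why a small job submitted after a large one finishes last.

```python
from collections import deque


def fifo_completion_times(jobs, seconds_per_chunk=1.0):
    """Return {job_name: completion_time} when chunks are handled strictly FIFO.

    jobs: list of (job_name, chunk_count) tuples in submission order.
    seconds_per_chunk: assumed (hypothetical) constant handling time per chunk.
    """
    queue = deque()
    for name, chunk_count in jobs:
        # Each job puts all of its chunks into the queue up front.
        queue.extend([name] * chunk_count)

    clock = 0.0
    finished_at = {}
    while queue:
        name = queue.popleft()
        clock += seconds_per_chunk
        # Overwritten on every chunk, so the final value is the time
        # at which the job's last chunk was handled.
        finished_at[name] = clock
    return finished_at


# Hypothetical example: a single-record import submitted behind a large job.
times = fifo_completion_times([("large_import", 100), ("single_record", 1)])
print(times)
```

In this model the single-record job has to wait for all 100 chunks of the large job, which is the situation UXPROD-3471 describes.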

This means that we don't need to execute performance tests with a parallel load. We can add some parallel load to the mod-inventory module, but in real life this module is never loaded in a way that impacts import.

Based on the above, to automate DI performance testing we need:

Set up a Perf environment. This environment should run all the time and be shared between a few teams, but only the DevOps team should have permission to update and reconfigure it, on request. Such requests need to be handled very quickly: performance testing is very time-consuming, so downtime during changes (configuring or updating the system, changing parameters, changing the number of Kafka topics, etc.) should be kept as short as possible. Some configuration examples: LCP performance testing, Folijet - Lotus Snapshot Performance testing
Automatically create the job profiles for DI (usually the PTF - Create 2 and Updates Success - 1 profiles are applied). A Karate task can do this, with a Jenkins job to start it.
Start one or more DIs in automatic mode. We should not start a new DI without checking the status of the previous one: a DI takes a lot of time, so when a problem occurs we should analyze it before starting a new DI; otherwise we might lose time and hit the same error again. For automatic mode, we can also use a Karate task and a Jenkins job to start the DI, and another Jenkins job (or the same one) to check the status of the launched DI (Completed, not Completed, or Completed with Errors).
It is also recommended to have:
1. access to graphical analytics tools, so that we can check how much memory our modules use, along with CPU usage and open DB connections;
2. all logs collected in Kibana;
3. access to tools for analyzing application metrics and traces (telemetry).
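The status check from the third step above could be sketched as a polling loop. This is only a sketch under assumptions: `fetch_status` is a hypothetical callable that would wrap whatever request returns the current DI status, the polling interval and timeout are arbitrary values, and any status other than the two terminal ones named above (for example the made-up "In progress" used in the usage example) is treated as "not Completed".

```python
import time

# Terminal statuses as named in this document.
TERMINAL_STATUSES = {"Completed", "Completed with Errors"}


def wait_for_di(fetch_status, poll_seconds=30.0, timeout_seconds=6 * 3600,
                sleep=time.sleep, clock=time.monotonic):
    """Poll fetch_status() until the DI reaches a terminal status.

    fetch_status: hypothetical callable returning the current status string.
    sleep / clock: injectable so the loop can be exercised without waiting.
    Raises TimeoutError if no terminal status is seen before the deadline.
    """
    deadline = clock() + timeout_seconds
    while True:
        status = fetch_status()
        if status in TERMINAL_STATUSES:
            return status
        if clock() >= deadline:
            raise TimeoutError(f"DI still in status {status!r} after {timeout_seconds}s")
        sleep(poll_seconds)


# Usage with a fake status source instead of a real environment.
fake_statuses = iter(["In progress", "In progress", "Completed"])
final_status = wait_for_di(lambda: next(fake_statuses), sleep=lambda seconds: None)
print(final_status)
```

The timeout matters here because a hung DI would otherwise block the shared Perf environment indefinitely.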

I propose to try the following approach:

Create a 1st Karate task to log in and create two Job Profiles (create/update);
Create a 2nd Karate task to:
1. log in;
2. check that there is no active task:
  1. if there is none → this Karate task should start the DI and wait for it to finish:
    1. finished with "Completed" status → the next DI should be started.
    2. finished with "Completed with Errors" status → the Karate task should be stopped.
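The control flow of the 2nd task can be sketched independently of Karate. In the sketch below, `has_active_job`, `start_import` and `wait_until_finished` are hypothetical callables standing in for the real login and DI requests, and the file names are made up. Stopping when an active task is found is also an assumption, since the steps above only define the case where there is none.

```python
def run_imports_sequentially(files, has_active_job, start_import, wait_until_finished):
    """Run DIs one at a time, stopping at the first one that does not complete cleanly.

    All three callables are hypothetical placeholders:
      has_active_job()          -> True if some DI is still running
      start_import(file_name)   -> job identifier of the started DI
      wait_until_finished(job)  -> final status string of that DI
    Returns a list of (file_name, status) for the DIs that were actually run.
    """
    results = []
    for file_name in files:
        if has_active_job():
            # Assumption: never start a DI on top of a running one.
            break
        job_id = start_import(file_name)
        status = wait_until_finished(job_id)
        results.append((file_name, status))
        if status != "Completed":
            # "Completed with Errors" (or anything unexpected): stop for analysis.
            break
    return results


# Usage with fakes: the second file fails, so the third is never started.
outcomes = {
    "5K.mrc": "Completed",
    "10K.mrc": "Completed with Errors",
    "25K.mrc": "Completed",
}
results = run_imports_sequentially(
    files=list(outcomes),
    has_active_job=lambda: False,
    start_import=lambda file_name: file_name,  # the fake job id is the file name
    wait_until_finished=lambda job_id: outcomes[job_id],
)
print(results)
```

Returning the partial results makes it easy for the Jenkins job to report which file caused the run to stop.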

!!! One of the problems still to be solved is storage (around 1.5 GB) for the test files: 5K, 10K, 25K, 50K, 100K and 500K records.