A few approaches to automating DI performance testing

Currently, DI (Data Import) does not run in a truly parallel mode: every DI process loads all of its file chunks into Kafka and then waits while the system handles those chunks sequentially (FIFO, first in, first out).
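To make the consequence of FIFO handling concrete, here is a minimal sketch of a single consumer draining one queue. It is a toy model, not the actual DI or Kafka code; the job names and chunk counts are made up for illustration.

```python
from collections import deque


def fifo_completion_order(submissions):
    """Return chunk labels in the order a single FIFO consumer handles them.

    `submissions` is a list of (job_name, chunk_count) pairs in arrival order.
    Both the job names and the chunk counts are illustrative assumptions.
    """
    queue = deque()
    for job_name, chunk_count in submissions:
        for index in range(chunk_count):
            queue.append(f"{job_name}#{index}")

    handled = []
    while queue:
        # The oldest chunk always goes first, whichever job it belongs to.
        handled.append(queue.popleft())
    return handled


# A single-record import submitted after a large job is handled last.
print(fifo_completion_order([("large-job", 3), ("single-record", 1)]))
```

In this model the single-record chunk can never overtake the chunks of the large job, which is the behaviour described above.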

For now, DI cannot selectively import the chunks in Kafka that belong to a particular job or user. The feature UXPROD-3471 ("OCLC single record import takes extended time to complete when large data import jobs are running") is not implemented yet.

This means that we do not need to run performance tests with parallel DI load. We could add some parallel load to the mod-inventory module, but in real life this module is never loaded heavily enough to affect import.

Based on the above, automating DI performance testing requires the following:

  1. Set up a Perf environment. This environment will run all the time and be shared between several teams, but only the DevOps team should have permission to update and reconfigure it, on request. Such requests must be handled quickly: performance testing is very time-consuming, so downtime during changes (configuring or updating the system, changing parameters, changing the number of Kafka topics, etc.) should be kept to a minimum. Some configuration examples: LCP performance testing, Folijet - Lotus Snapshot Performance testing
  2. Automatically create the job profiles for DI (usually the PTF - Create 2 and Updates Success - 1 profiles are applied). A Karate task, started by a Jenkins job, can be used for this.
  3. Start one or more DIs in automatic mode. We should not start a new DI without checking the status of the previous one: a DI takes a long time, and if problems occur we should analyze them before starting a new DI; otherwise we might lose time and hit the same error again. For automatic mode, we can also use a Karate task and a Jenkins job to start the DI, and another Jenkins job (or the same one) to analyze the status of the launched DI (Completed, not Completed, or Completed with Errors). To automate the Update operation, we need direct access to the DB through a CLI or a JDBC connection to retrieve the latest Instance IDs (they are needed for the export operation).
  4. Start OCLC single record imports in automatic mode. To test "parallel" handling of OCLC import (a single record import should not wait until a large data import finishes), we need an additional Karate task and Jenkins job to start OCLC imports.
  5. It is recommended to have:
    1. access to graphical analytics tools, to check how much memory our modules use, as well as their CPU usage and open DB connections;
    2. all logs collected in Kibana;
    3. access to tools for analyzing application metrics and traces (telemetry).
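The status check from step 3 could be sketched as a small polling helper. This is only an illustration under assumptions: `fetch_status` is a hypothetical caller-supplied function that returns the status string of the launched DI, "RUNNING" is a made-up in-progress value, and the poll interval and timeout are arbitrary defaults. The real check would live in the Karate task or the Jenkins job.

```python
import time

# Terminal status names follow the text above.
TERMINAL_STATUSES = {"Completed", "Completed with Errors"}


def wait_for_di(fetch_status, poll_seconds=60, timeout_seconds=6 * 3600,
                sleep=time.sleep, clock=time.monotonic):
    """Poll fetch_status() until the DI reaches a terminal status.

    Returns the terminal status, or "not Completed" if the timeout expires.
    `sleep` and `clock` are injectable so the helper can be tested quickly.
    """
    deadline = clock() + timeout_seconds
    while True:
        status = fetch_status()
        if status in TERMINAL_STATUSES:
            return status
        if clock() >= deadline:
            return "not Completed"
        sleep(poll_seconds)


def may_start_next_di(status):
    """Only a clean completion allows the next DI to be started."""
    return status == "Completed"


# Simulated run: two polls while the DI is in progress, then completion.
statuses = iter(["RUNNING", "RUNNING", "Completed"])
result = wait_for_di(lambda: next(statuses), sleep=lambda seconds: None)
print(result, may_start_next_di(result))
```

Treating a timeout as "not Completed" keeps the three outcomes listed in step 3 distinct, so the job can stop for analysis instead of starting another long DI.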

PTF Data Import profiles

I propose trying the following approach:

  1. Create a first Karate task to log in and create two Job Profiles (create/update);
  2. Create a second Karate task to:
    1. log in;
    2. check that there is no active task:
      1. if true → this Karate task should start the DI and wait for it to finish:
        1. finished with "Completed" status → the next DI should be started;
        2. finished with "Completed with Errors" → the Karate task should be stopped;
  3. Create a third Karate task to log in and start OCLC imports. This task can run automatically (started by a scheduler every 15 minutes) in parallel with the second task.
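The control flow of the second task can be sketched as follows. This is a rough model in Python rather than Karate, and it rests on assumptions: `has_active_job` and `start_and_wait` are hypothetical callables (the first reports whether any DI job is still active, the second starts a DI for one file and returns its final status), and the file names are invented.

```python
def run_di_sequence(files, has_active_job, start_and_wait):
    """Run DIs one after another, stopping at the first problem.

    Returns a list of (file_name, status) pairs for the DIs that were started.
    """
    results = []
    for file_name in files:
        if has_active_job():
            # Another DI is still active: do not start a new one.
            break
        status = start_and_wait(file_name)
        results.append((file_name, status))
        if status != "Completed":
            # "Completed with Errors" (or anything unexpected) stops the task
            # so the failure can be analyzed before more time is spent.
            break
    return results


# Simulated outcomes keyed by hypothetical file names.
outcomes = {
    "5K.mrc": "Completed",
    "10K.mrc": "Completed with Errors",
    "25K.mrc": "Completed",
}
print(run_di_sequence(list(outcomes), lambda: False, outcomes.get))
```

In the simulated run the third file is never started, because the second DI finished with errors.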

!!! One problem that still has to be solved is storage (around 1.5 GB) for the test files with 5K, 10K, 25K, 50K, 100K and 500K records.
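For rough capacity planning, the figures above imply an average record size. The sketch below assumes that the 1.5 GB covers exactly one file of each listed size, that "GB" means decimal gigabytes, and that records are of similar size across files; none of this is confirmed, so the results are only ballpark estimates.

```python
# Record counts of the test files listed above.
FILE_SIZES_IN_RECORDS = [5_000, 10_000, 25_000, 50_000, 100_000, 500_000]

# "around 1.5 GB", read as decimal gigabytes (assumption).
TOTAL_STORAGE_BYTES = 1_500_000_000

total_records = sum(FILE_SIZES_IN_RECORDS)
average_bytes_per_record = TOTAL_STORAGE_BYTES / total_records


def estimated_file_bytes(record_count):
    """Estimate a file's size from the average record size (assumption)."""
    return record_count * average_bytes_per_record


print(f"total records: {total_records}")
print(f"average record size: ~{average_bytes_per_record:.0f} bytes")
for count in FILE_SIZES_IN_RECORDS:
    print(f"{count:>7} records: ~{estimated_file_bytes(count) / 1e6:.0f} MB")
```

Under these assumptions the 500K file alone accounts for most of the storage, so that is the file whose location matters most.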