...
...
...
...
...
...
...
...
Info |
---|
Later testing showed that the actual import durations were about two times longer than reported here. The PTF environment was missing a DB trigger; once the trigger was restored, import durations doubled. |
...
A script was created to run subsequent Data Import (DI) updates of MARC authority records across the whole database (see Marc Authorities update instructions#ConfigurationFile). The script does not appear to be performant.
In scope of PERF-456, tests are needed to answer the following questions:
...
- PTF environments contain a lot of corrupted data (SRS records that have no corresponding record in the mod_inventory_storage.authority table).
- To work around this, Shans Kaluhin rewrote the script to use data export for IDs extracted from inventory-storage, so that it generates a valid .mrc file.
- Approximately 100,000 records can be imported in 27-30 minutes on this infrastructure.
- The import limit and inventory limit were set to 100,000 for all tests.
- For a database containing 6.6M records, the whole update took approximately 15 hours.
- Possible memory leak detected in mod-inventory-storage: memory usage grew from 27% to 62% during the first test and from 62% to 95% during the second test.
- DB size (mod_inventory_storage.authority): 6,664,205 records. According to Data Import, 2,688,643 records were updated, which is only about 40% of the total.
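
The figures above can be cross-checked with a little arithmetic. The sketch below is only an illustration: the function and constant names are hypothetical, and it assumes that batches of 100,000 records run strictly one after another with no idle time between them, which the test description does not state.

```python
import math

# Figures quoted in the list above.
TOTAL_AUTHORITY_RECORDS = 6_664_205  # rows in mod_inventory_storage.authority
UPDATED_RECORDS = 2_688_643          # records Data Import reported as updated
BATCH_SIZE = 100_000                 # import limit / inventory limit per run
BATCH_MINUTES = (27, 30)             # observed duration range for one batch


def updated_share(updated: int, total: int) -> float:
    """Percentage of authority records that were actually updated."""
    return 100.0 * updated / total


def batch_count(records: int, batch_size: int) -> int:
    """Number of batches needed, counting a final partial batch."""
    return math.ceil(records / batch_size)


def estimated_hours(records: int, batch_size: int, minutes_per_batch: float) -> float:
    """Total hours, assuming batches run strictly one after another."""
    return batch_count(records, batch_size) * minutes_per_batch / 60.0


if __name__ == "__main__":
    share = updated_share(UPDATED_RECORDS, TOTAL_AUTHORITY_RECORDS)
    print(f"updated share: {share:.1f}%")
    for minutes in BATCH_MINUTES:
        hours = estimated_hours(UPDATED_RECORDS, BATCH_SIZE, minutes)
        print(f"{minutes} min per batch -> about {hours:.1f} h for the updated records")
```

Under these assumptions the updated records fit into 27 batches, which gives roughly 12 to 13.5 hours of pure import time. That is in the same range as the approximately 15 hours observed for the whole run.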
...
- Original ticket to test the script (and Authorities update) performance: PERF-456
- Additional info on Authorities data import (modules, Kafka topics, data flow): SPIKE: Update MARC authority records via Data import
- Instructions on how to use the script to trigger a subsequent MARC Authorities update: Marc Authorities update instructions#ConfigurationFile
- Ticket to rename fields in the inventory_storage.Authority schema: MODINVSTOR-875 (In PTF env we hadn’t updated it for easy update use:
...
Infrastructure
PTF environment: ncp3
- 9 m6i.2xlarge EC2 instances located in US East (N. Virginia), us-east-1
- 2 db.r6.xlarge database instances: one reader and one writer
- MSK ptf-kakfa-3
- 4 m5.2xlarge brokers in 2 zones
- Apache Kafka version 2.8.0
- EBS storage volume per broker: 300 GiB
- auto.create.topics.enable=true
- log.retention.minutes=480
- default.replication.factor=3
- Kafka topics partitioning:
  - DI_RAW_RECORDS_CHUNK_READ: 2
  - DI_RAW_RECORDS_CHUNK_PARSED: 2
  - DI_PARSED_RECORDS_CHUNK_SAVED: 2
  - DI_SRS_MARC_AUTHORITY_RECORD_CREATED: 1
  - DI_COMPLETED: 2
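
The partition counts matter for throughput because, within a single Kafka consumer group, a partition is read by at most one consumer at a time. The sketch below illustrates that limit for the topics listed above. The helper names and the module instance count of 2 are hypothetical and not taken from this test setup; the topic names are used as listed, without any environment or tenant prefix.

```python
# Values taken from the infrastructure list above.
BROKER_COUNT = 4
DEFAULT_REPLICATION_FACTOR = 3
TOPIC_PARTITIONS = {
    "DI_RAW_RECORDS_CHUNK_READ": 2,
    "DI_RAW_RECORDS_CHUNK_PARSED": 2,
    "DI_PARSED_RECORDS_CHUNK_SAVED": 2,
    "DI_SRS_MARC_AUTHORITY_RECORD_CREATED": 1,
    "DI_COMPLETED": 2,
}


def max_parallel_consumers(partitions: int, instances: int) -> int:
    """Consumers of one group that can read a topic at the same time.

    Each partition is assigned to at most one consumer in a group, so
    extra consumers beyond the partition count stay idle.
    """
    return min(partitions, instances)


def replication_is_satisfiable(replication_factor: int, brokers: int) -> bool:
    """A topic cannot have more replicas than there are brokers."""
    return replication_factor <= brokers


if __name__ == "__main__":
    assumed_instances = 2  # hypothetical number of consumer instances per module
    for topic, partitions in TOPIC_PARTITIONS.items():
        active = max_parallel_consumers(partitions, assumed_instances)
        print(f"{topic}: {active} active consumer(s) per group")
    ok = replication_is_satisfiable(DEFAULT_REPLICATION_FACTOR, BROKER_COUNT)
    print(f"replication factor fits broker count: {ok}")
```

With a single partition, DI_SRS_MARC_AUTHORITY_RECORD_CREATED can be consumed by only one instance per group regardless of how many instances are running, so it is a candidate to check if this stage turns out to be a bottleneck.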
...