Data Import Stabilization plan
Steps
- Gather existing issues (Vladimir Shalaev, Kateryna Senchenko)
- Create new features (Vladimir Shalaev, Kateryna Senchenko)
- Provide feature dependencies (Vladimir Shalaev, Kateryna Senchenko)
- Estimate (priorities + complexity) (Vladimir Shalaev, Kateryna Senchenko)
- Remove duplicates (grooming with Ann-Marie)
- Final priorities
- Align to timeline, assign to the appropriate Jira Feature, and review Jira issue priorities (Taisiya Trunova)
Categories
See: Assessment ratings
- Performance: di-performance
- Stability/Reliability: di-data-integrity (more tags to be added)
- Scalability
- Architecture
- Code quality
Priorities
High, Mid, Low
Complexity
S, M, L, XL, XXL
Table
| # | Category | Problem definition | Business impact | Proposed solution | Priority DEV | Priority PO | Complexity | Existing Jira item(s) | Current feature(s) | Final feature(s) |
|---|---|---|---|---|---|---|---|---|---|---|
| 1 | Performance | Kafka producer is closed after sending | Low performance of import | Create a pool of active producers. Start the pool on module launch, close it on shutdown. Reuse connections. Add max/min pool sizes. | High | | L | MODDATAIMP-499 | UXPROD-3135 | UXPROD-3191 |
| 2 | | WARN message when no handler is found | None | Do not subscribe to messages that will not be processed, OR lower the log level for this type of message | Low | | S | MODSOURCE-340 | UXPROD-3135 | UXPROD-3191 |
| 3 | Stability/Reliability | Race condition on start (Kafka consumers start working before the DB is configured) OR periodic DB shutdown after SRS restart. Jobs get stuck if the status cannot be updated in the DB (messages are ACKed even if they could not be processed) | Imports might get stuck on module restart | Needs investigation. Investigate the issue with the DB (possible OOM on the PG server) | Mid | | | MODSOURCE-339 | UXPROD-3135 | UXPROD-3193 |
| 4 | Performance, Stability/Reliability | High CPU/memory consumption on modules | Low performance of import. Higher hosting costs | Significantly decrease the size of the payload | High | | XXL | MODDATAIMP-439, MODSOURMAN-519, MODINV-405, MODINV-408, MODINV-460, MODINVOICE-251, MODINVOICE-252, MODPUBSUB-167, MODSOURCE-286, MODSOURCE-290, MODSOURMAN-463, MODSOURMAN-464, MODSOURMAN-465, MODSOURMAN-466, MODSOURMAN-468, MODSOURMAN-469, MODSOURMAN-474 | UXPROD-3135 | UXPROD-3193 |
| 5 | Performance | Kafka cache resource consumption | Low performance of import. Higher hosting costs | Remove the Kafka cache. Modules that do not make persistent changes will sometimes (on duplicate reads) make unnecessary calls. Can be optimized further by adding a distributed in-memory cache (e.g. Hazelcast) (blocked by 6) | Mid | | M | | UXPROD-3135 | UXPROD-3191 |
| 6 | Stability/Reliability | Duplicates created upon import | Data inconsistency on import | Make consumers behave idempotently. Add a pass-through identifier to de-duplicate messages. | High | | XL | MODDATAIMP-474, MODDATAIMP-440, MODDATAIMP-491, MODDATAIMP-495 | UXPROD-3135 | UXPROD-3193 |
| 7 | Stability/Reliability | Kafka consumers eventually stop reading messages, breaking job progress until module restart | Imports eventually get stuck until module restart | Needs investigation | High | | ? | MODINV-417 | UXPROD-3135 | UXPROD-3193 |
| 8 | Code quality | Test coverage is not high enough (Unit) | Higher number of bugs | Write more tests | Mid | | S | MODPUBSUB-168 | UXPROD-2697 | UXPROD-2697 |
| 9 | Code quality | Test coverage is not high enough (Karate) | Higher number of bugs | Write more tests (define test cases) | Mid | | L | UXPROD-2697 | UXPROD-2697 | UXPROD-2697 |
| 10 | Stability/Reliability | mod-data-import stores the input file in memory, limiting the size of the uploaded file and possibly causing OOM | Data import file size is limited | Split into chunks, put them into the database, work with database/temp storage. Partially done (to be investigated) | Mid | | L | MODDATAIMP-390, MODDATAIMP-392, MODDATAIMP-465 | UXPROD-3135 | UXPROD-3193 |
| 11 | Performance | Data import impacts other processes | Slower system response during data import | Needs investigation (possible solution: configure a rate limiter). Relates to item 4 | | | | MODDATAIMP-517 | UXPROD-3135 | UXPROD-3191 |
| 12 | Performance | High resource consumption to get job status/progress | Slow performance of import and landing page | Add some kind of caching for progress tracking (database or in-memory) | Low | | S | MODSOURMAN-469, UIDATIMP-918 | UXPROD-3135 | UXPROD-3191 |
| 13 | Stability/Reliability | SRS can fail when processing a message during import | Import can end up creating some instances but not creating holdings/items for some MARC records | Generate "INSTANCE CREATED" from mod-inventory. Consume it in SRS to update the HRID in BIB, and in INVENTORY to continue processing. Remove unnecessary topics (* ready for post processing and hrid set) | Mid | | L | MODDATAIMP-500 | UXPROD-3135 | UXPROD-3193 |
| 14 | Stability/Reliability | On an infrastructure issue (DB not available, module being restarted, or network failure), DI_ERROR is sent instead of retrying | Records that could be processed during import are not processed when there are temporary infrastructure issues (DB down, network connectivity loss, etc.) | Do not ACK messages in Kafka when the failure is an infrastructure error/exception rather than a logic error. Split failed processing results into 2 categories. | Mid | | | MODDATAIMP-501 | UXPROD-3135 | UXPROD-3193 |
| 15 | | Consumer gets disconnected from the Kafka cluster | Jobs get stuck until module restart | Needs investigation | Mid | | | MODINV-417 | UXPROD-3135 | UXPROD-3193 |
| 16 | | De-duplication of status messages for the progress bar | Progress bar might display incorrect progress | De-duplicate status messages per record while tracking progress | Mid | | L (depends on 12) | MODSOURMAN-522 | UXPROD-3135 | UXPROD-3193 |
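To make items 6 and 14 more concrete, here is a rough sketch of how a consumer might combine de-duplication by a pass-through identifier with a "do not ACK on infrastructure errors" rule. It is an illustration only, not the modules' actual code: the names `IdempotentConsumer`, `InfrastructureError`, `Outcome`, `should_ack` and the `dedup_id` message field are all hypothetical, no Kafka client is involved, and the in-memory set stands in for whatever persistent store would really hold the processed identifiers.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import Callable


class Outcome(Enum):
    PROCESSED = "processed"      # handled now; safe to ACK
    DUPLICATE = "duplicate"      # identifier seen before; skip and ACK
    LOGIC_ERROR = "logic_error"  # record itself is bad; report an error and ACK
    RETRY = "retry"              # temporary infrastructure failure; do NOT ACK


class InfrastructureError(Exception):
    """Hypothetical marker for temporary failures (DB down, network loss)."""


@dataclass
class IdempotentConsumer:
    # handler is the business logic for one message (assumed to raise on failure)
    handler: Callable[[dict], None]
    # Assumption: an in-memory set; a real module would need a shared, durable store
    seen_ids: set = field(default_factory=set)

    def consume(self, message: dict) -> Outcome:
        dedup_id = message["dedup_id"]  # hypothetical pass-through identifier
        if dedup_id in self.seen_ids:
            return Outcome.DUPLICATE
        try:
            self.handler(message)
        except InfrastructureError:
            # Leave the identifier unrecorded so a redelivery is processed again
            return Outcome.RETRY
        except Exception:
            # Retrying a logic error would not help, so mark the message as done
            self.seen_ids.add(dedup_id)
            return Outcome.LOGIC_ERROR
        self.seen_ids.add(dedup_id)
        return Outcome.PROCESSED


def should_ack(outcome: Outcome) -> bool:
    """Only infrastructure failures leave the message un-ACKed for redelivery."""
    return outcome is not Outcome.RETRY
```

In this sketch, a message that fails because the DB is unavailable returns `Outcome.RETRY` and is not acknowledged, so it can be redelivered and processed later; a message delivered twice after a successful run returns `Outcome.DUPLICATE` and the handler is not called again.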
Filters
Issues to potentially remove from scope
MODDATAIMP-410
MODDATAIMP-430
MODDATAIMP-444
MODSOURCE-300
MODSOURMAN-481
MODSOURMAN-521