...
500 error causes
Issue | Description |
---|---|
Kafka topics configuration | Kafka consumers try to connect to existing consumer groups that have different assignment strategies, so mod-quick-marc does not receive any confirmation about a successful/failed update. mod-quick-marc has a 30s timeout for receiving confirmation; if the timeout is exceeded, it responds with a 500 error. |
Related modules are down | The update process is started in SRM by sending a Kafka event to SRS and then to Inventory. If SRS or Inventory is down, the process cannot finish. Consequently, the confirmation timeout in mod-quick-marc is exceeded. |
Kafka topic does not exist | Kafka consumers cannot connect and receive messages because the topics do not exist. Consequently, the confirmation timeout in mod-quick-marc is exceeded. |
IDs of records are not consistent | When an SRS record is first created, it has two IDs: a record ID and a record source ID, and they are equal. After the first record update, the record source ID is changed. mod-quick-marc expects to receive an event containing an ID equal to the record ID, but it receives a different one because the initial ID was changed. Consequently, the confirmation timeout in mod-quick-marc is exceeded. |
Optimistic locking response changed | When mod-quick-marc gets an optimistic locking error from mod-inventory, it expects a JSON structure. The response was changed to a plain string, which causes an error during message processing. |
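The first cause above is a configuration mismatch: a consumer cannot join an existing group whose members use a different partition assignment strategy, so confirmations are never consumed. A minimal sketch of pinning the strategy via plain Apache Kafka consumer properties (the group id, bootstrap address, and chosen assignor here are illustrative assumptions, not mod-quick-marc's actual values):

```java
import java.util.HashMap;
import java.util.Map;

import org.apache.kafka.clients.consumer.ConsumerConfig;
import org.apache.kafka.clients.consumer.RangeAssignor;

public class QuickMarcConsumerProps {

  // Illustrative values: the group id must be the existing group, and the
  // assignment strategy must match the one that group was created with,
  // otherwise joining fails and no confirmation is ever consumed.
  public static Map<String, Object> consumerProps() {
    Map<String, Object> props = new HashMap<>();
    props.put(ConsumerConfig.BOOTSTRAP_SERVERS_CONFIG, "kafka:9092");
    props.put(ConsumerConfig.GROUP_ID_CONFIG, "DI_COMPLETED.mod-quick-marc");
    props.put(ConsumerConfig.PARTITION_ASSIGNMENT_STRATEGY_CONFIG,
        RangeAssignor.class.getName());
    return props;
  }
}
```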
Warning |
---|
The issue could also be related to data being updated directly in storage, in cases where the inventory-storage record ID is not consistent with or linked to the source-storage record ID. |
Main problems
- The quickMarc update flow is implemented very similarly to the data-import flow but has not been updated with the latest data-import improvements
- The quickMarc update flow uses Kafka settings that differ from data-import's
- The async-to-sync approach used in quickMarc has no error handling (any problem results in a 500 error)
...
- Create default update profiles for each record type (these could be hidden)
- Reuse the code base already implemented for derive MARC bib and create MARC holdings to initialize data-import jobs
- Use ReplyingKafkaTemplate instead of the combination of DeferredResult and cache (see the sketch after this list)
- Configure the template to receive events from the DI_COMPLETED and DI_ERROR topics
- Modify the data-import payload to always populate the correlationId Kafka header if it exists in the initial event.
- Start the data-import record processing by sending DI_RAW_RECORDS_CHUNK_READ instead of using the POST /jobExecutions/{jobExecutionId}/records endpoint
- Specify a timeout for receiving confirmation (1 min?)
- If the timeout is exceeded, use a combination of the GET /metadata-provider/jobLogEntries/{jobExecutionId} and GET /metadata-provider/jobLogEntries/{jobExecutionId}/records/{recordId} endpoints to get the job status and, if it failed, the error message.
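A minimal sketch of the ReplyingKafkaTemplate configuration described above, using Spring Kafka (the bean wiring and group id are assumptions; DI_COMPLETED, DI_ERROR, and the correlationId header come from the list above):

```java
import java.time.Duration;

import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;
import org.springframework.kafka.config.ConcurrentKafkaListenerContainerFactory;
import org.springframework.kafka.core.ProducerFactory;
import org.springframework.kafka.listener.ConcurrentMessageListenerContainer;
import org.springframework.kafka.requestreply.ReplyingKafkaTemplate;

@Configuration
public class QuickMarcKafkaConfig {

  @Bean
  public ReplyingKafkaTemplate<String, String, String> replyingKafkaTemplate(
      ProducerFactory<String, String> producerFactory,
      ConcurrentKafkaListenerContainerFactory<String, String> containerFactory) {

    // The reply container listens on both data-import result topics
    ConcurrentMessageListenerContainer<String, String> replyContainer =
        containerFactory.createContainer("DI_COMPLETED", "DI_ERROR");
    replyContainer.getContainerProperties().setGroupId("mod-quick-marc-replies");

    ReplyingKafkaTemplate<String, String, String> template =
        new ReplyingKafkaTemplate<>(producerFactory, replyContainer);
    // Correlate request and reply by the correlationId header that data-import
    // is expected to propagate from the initial event
    template.setCorrelationHeaderName("correlationId");
    // Other consumers may also read these topics, so unmatched replies are expected
    template.setSharedReplyTopic(true);
    template.setDefaultReplyTimeout(Duration.ofMinutes(1));
    return template;
  }
}
```

With this in place, sending the initial DI_RAW_RECORDS_CHUNK_READ event via template.sendAndReceive(...) returns a future that completes when a DI_COMPLETED or DI_ERROR event with a matching correlationId arrives, or fails after the one-minute timeout, at which point the GET /metadata-provider/jobLogEntries fallback from the last item applies.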
...
Info |
---|
The async-to-sync approach means waiting until the update confirmation is received and only then responding to the UI. |
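For reference, a minimal sketch of the DeferredResult-plus-cache pattern this document proposes to replace (the endpoint path, confirmation topic, and class names are hypothetical; only the pattern matches the description above):

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

import org.apache.kafka.clients.consumer.ConsumerRecord;
import org.springframework.http.ResponseEntity;
import org.springframework.kafka.annotation.KafkaListener;
import org.springframework.web.bind.annotation.PathVariable;
import org.springframework.web.bind.annotation.PutMapping;
import org.springframework.web.bind.annotation.RequestBody;
import org.springframework.web.bind.annotation.RestController;
import org.springframework.web.context.request.async.DeferredResult;

@RestController
public class QuickMarcUpdateController {

  // Pending updates keyed by record id; completed by the Kafka listener below
  private final Map<String, DeferredResult<ResponseEntity<Object>>> pending =
      new ConcurrentHashMap<>();

  @PutMapping("/records-editor/records/{id}")
  public DeferredResult<ResponseEntity<Object>> updateRecord(@PathVariable String id,
                                                             @RequestBody String quickMarcJson) {
    // 30s timeout: if no confirmation event arrives, respond with a 500 error
    DeferredResult<ResponseEntity<Object>> result =
        new DeferredResult<>(30_000L, ResponseEntity.status(500).build());
    result.onTimeout(() -> pending.remove(id));
    pending.put(id, result);
    // ... publish the update event to Kafka here ...
    return result;
  }

  // Assumes the confirmation event is keyed by the record id
  @KafkaListener(topics = "QM_COMPLETED") // hypothetical confirmation topic
  public void onConfirmation(ConsumerRecord<String, String> event) {
    DeferredResult<ResponseEntity<Object>> result = pending.remove(event.key());
    if (result != null) {
      result.setResult(ResponseEntity.accepted().build());
    }
  }
}
```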
Info |
---|
The status ping approach means responding to the UI with the current status of the process; the UI then polls the status endpoint until the status changes to COMPLETED or ERROR. |
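A minimal sketch of what a status endpoint for this approach could look like (the path, parameter name, and status model are assumptions for illustration):

```java
import java.util.Map;
import java.util.UUID;
import java.util.concurrent.ConcurrentHashMap;

import org.springframework.http.ResponseEntity;
import org.springframework.web.bind.annotation.GetMapping;
import org.springframework.web.bind.annotation.RequestParam;
import org.springframework.web.bind.annotation.RestController;

@RestController
public class QuickMarcStatusController {

  public enum UpdateStatus { IN_PROGRESS, COMPLETED, ERROR }

  // Assumed status store, updated elsewhere when confirmation events arrive
  private final Map<UUID, UpdateStatus> statuses = new ConcurrentHashMap<>();

  // The UI polls this endpoint until the status becomes COMPLETED or ERROR
  @GetMapping("/records-editor/records/status")
  public ResponseEntity<UpdateStatus> getStatus(@RequestParam("qmRecordId") UUID qmRecordId) {
    UpdateStatus status = statuses.get(qmRecordId);
    return status == null ? ResponseEntity.notFound().build() : ResponseEntity.ok(status);
  }
}
```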
Warning |
---|
All actions in quickMarc should be moved to one of these approaches. |
Pros | Cons |
---|---|
All processes in quickMarc are consistent and easy to maintain | All future bugs in data-import will also affect quickMarc |
Kafka configuration is inherited from the data-import Kafka configuration | Slight performance degradation |
All future improvements to data-import will also benefit quickMarc | |
Some features could be easily implemented in the future based on these changes.
Prevention plan
- Fix the Karate tests for mod-quick-marc so that they can be relied on. Possible solution: use mod-srs and mod-inventory-storage endpoints to populate records instead of using data-import.
- Continuously update Karate, module integration, and unit tests each time a new edge-case bug is found and fixed.