Improve performance and refactoring of SRS with standard DB tables

Description

There are five use cases identified for MARC source records retrieval.

Use cases (with comments where given):

1. Retrieve single records based on their Instance ID in order to feed the View Source in the UI.

2. Retrieve all records.
   Comment: Retrieve paged responses as quickly as possible for the OAI-PMH modules.

3. Retrieve records for a period based on createdDate and updatedDate.
   Comment: Retrieve records filtered by their createdDate and updatedDate for certain date and time spans for OAI-PMH.

4. Retrieve MARC records for data export.
   Comment: Identifying records for the export will be Inventory-driven, so we will be getting the underlying MARC records from SRS based on the Instance ID (and later by Holdings ID as well).

5. Retrieve MARC records based on custom criteria.

High-level strategy to improve SRS performance:

  1. We have to get rid of the RMB PostgreSQL Client and CQL because they do not give us fine-grained control over how SQL statements, and especially WHERE clauses, are generated.

  2. Fields in JSON documents that are used intensively in search conditions must be extracted into separate columns of the table. Based on the use case analysis, DB indexes must be created for those columns. As a consequence, DB tables must be managed by Liquibase instead of RMB and schema.json.

  3. We must define several strategies for retrieving data, and each strategy must be used for one or more particular use cases.

    1. Retrieve all MARC source records in chunks. This means the row set must be ordered, and the most efficient way to do that is ordering by Id. (Covers use case 2.)

    2. Retrieve MARC source records for a period in chunks. This means the row set must be ordered, and the most efficient way to do that is ordering by updatedDate. To keep the criteria simple, only updatedDate must be considered; createdDate should not be used. (Covers use case 3.)

    3. Retrieve MARC source records by Instance IDs. The InstanceId field must be indexed. (Covers use cases 1 and 4.)

    4. Retrieve MARC records by custom criteria. (Covers use case 5.)

  4. Data structures returned by SQL queries and by HTTP endpoints must be narrowed to contain only the data that is valuable to a consumer (see the SQL functions in the script).

  5. These changes are breaking, so they must be implemented as soon as possible. Migration scripts must also be created (see the script for a baseline).
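The retrieval strategies listed above can be illustrated with a minimal sketch. It is not the SRS implementation: SQLite stands in for PostgreSQL so the example is self-contained, and the table name `marc_records`, its columns, the index names, and all function names are hypothetical. It shows fields taken out of the JSON document into indexed columns, chunked retrieval ordered by id, chunked retrieval for a period ordered by updated date, and lookup by Instance IDs, with each query selecting only the columns a consumer needs.

```python
import json
import sqlite3

# Hypothetical schema: SQLite stands in for PostgreSQL, and every table,
# column, and index name here is an illustrative assumption.
SCHEMA = """
CREATE TABLE marc_records (
    id           TEXT PRIMARY KEY,  -- record id, used to order chunks
    instance_id  TEXT NOT NULL,     -- taken out of the JSON document
    updated_date TEXT NOT NULL,     -- taken out of the JSON document (ISO 8601)
    content      TEXT NOT NULL      -- remaining JSON payload
);
CREATE INDEX idx_marc_instance_id ON marc_records (instance_id);
CREATE INDEX idx_marc_updated_date ON marc_records (updated_date, id);
"""


def fetch_all_chunk(conn, after_id="", limit=2):
    """Strategy 1: next chunk of all records, ordered by id."""
    return conn.execute(
        "SELECT id, content FROM marc_records WHERE id > ? ORDER BY id LIMIT ?",
        (after_id, limit),
    ).fetchall()


def fetch_period_chunk(conn, start, end, after=None, limit=2):
    """Strategy 2: next chunk within [start, end], ordered by updated_date.

    `after` is the (updated_date, id) pair of the last row already returned;
    id breaks ties between rows that share the same updated_date.
    """
    after_date, after_id = after or ("", "")
    return conn.execute(
        "SELECT id, updated_date, content FROM marc_records "
        "WHERE updated_date BETWEEN ? AND ? "
        "AND (updated_date > ? OR (updated_date = ? AND id > ?)) "
        "ORDER BY updated_date, id LIMIT ?",
        (start, end, after_date, after_date, after_id, limit),
    ).fetchall()


def fetch_by_instance_ids(conn, instance_ids):
    """Strategy 3: look up records by Instance ID using the index."""
    if not instance_ids:
        return []
    placeholders = ",".join("?" for _ in instance_ids)
    return conn.execute(
        "SELECT instance_id, content FROM marc_records "
        f"WHERE instance_id IN ({placeholders}) ORDER BY instance_id",
        list(instance_ids),
    ).fetchall()


def demo_connection():
    """Build an in-memory database with four made-up records."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    rows = [
        ("r1", "i1", "2020-01-01T00:00:00Z"),
        ("r2", "i2", "2020-02-01T00:00:00Z"),
        ("r3", "i3", "2020-02-01T00:00:00Z"),
        ("r4", "i4", "2020-03-01T00:00:00Z"),
    ]
    conn.executemany(
        "INSERT INTO marc_records VALUES (?, ?, ?, ?)",
        [(rid, iid, ts, json.dumps({"note": "demo " + rid})) for rid, iid, ts in rows],
    )
    return conn
```

A caller would pass the last id (or the last updated date and id pair) of one chunk as the starting point of the next. The cost of each chunk then does not grow with the number of rows already read, which is one reason to prefer an indexed ordering column over OFFSET-based paging.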

Attachments

4

Activity

Oleksii Kuzminov, May 20, 2020 at 12:24 PM

Hello, there are no changes to the JSON schemas at this point.

Ann-Marie Breaux, May 19, 2020 at 5:38 PM

And pardon a stupid question on my part: after this refactoring of SRS, will the JSON schema for these MARC bibs have changed?

Ann-Marie Breaux, May 19, 2020 at 5:37 PM

Once this refactoring work is completed, will we be able to test the two bugs linked to this umbrella and see if they have improved?

Marc Johnson, April 16, 2020 at 4:10 PM

"I was asking Oleksii (my team lead and the owner of this umbrella) if we need to create a task within this umbrella for that migration task."

Apologies, I shouldn't have butted in.

Ann-Marie Breaux, April 16, 2020 at 4:02 PM (edited)

Hi, yes, I understand that. I was asking Oleksii (my team lead and the owner of this umbrella) if we need to create a task within this umbrella for that migration task. We had a meeting earlier today where we discussed the scope of stories and tasks in this and some other SRS-related performance umbrellas. I appreciate your input.

Done

Details

Development Team

Folijet

Created December 24, 2019 at 2:32 PM
Updated June 26, 2020 at 12:53 PM
Resolved June 26, 2020 at 12:53 PM