Jira: MODKBEKBJ-458

The spike should address the requirements from MODKBEKBJ-304.

The goal of the spike is to find possible solutions for combining existing eHoldings filters with filtering by agreement assignment status.

...

  1. Add a new table to the database for the packageId→agreementId relation (kbCredentialsId, recordId (packageId), recordType (package), agreementId).
    Jira: MODKBEKBJ-651
  2. Implement an API for posting package-agreement relations (it writes the relation to the database and sends it to mod-agreements).
    Jira: MODKBEKBJ-652
  3. Copy existing relations from mod-agreements. Develop an SQL script for this. Note: it should not be executed as an sqlite change but separately, with the necessary rights to both schemas. Add the script somewhere in the classpath and mention it in the release notes.
  4. Implement agreement assignment filtering. For 'assigned': retrieve packageIds from the new table and pass the ids to holdingsIQ (same logic as for tags). For 'not assigned': query packages from holdingsIQ (possibly multiple times), then filter them by existence in the database, repeating until there is a complete page to return to the front-end. holdingsIQ filters can be used along with the 'not assigned' filter because no preprocessing is done before querying holdingsIQ.
    Jira: MODKBEKBJ-653
  5. Replace the call to mod-agreements for assignment with a call to the new mod-kb-ebsco-java API, which sends the corresponding request to mod-agreements itself (UI, eHoldings).
    Jira: UIEH-1289
  6. The agreement should be posted to mod-agreements without the package relation, and then the new assignment API of mod-kb-ebsco-java should be called (it sends the assignment to mod-agreements itself).
  7. Replace the call to mod-agreements for assignment with a call to the new mod-kb-ebsco-java API, which sends the corresponding request to mod-agreements itself (UI, agreements, agreement line page).
    Jira: UIEH-1290
  8. Implement an 'unassign' API in kb-ebsco-java to delete a package-agreement relation.
    Jira: MODKBEKBJ-654
  9. Call the 'unassign' API of kb-ebsco-java when an agreement is unassigned on the package page (UI).
    Jira: UIEH-1291
  10. Call the 'unassign' API of kb-ebsco-java when an agreement line is deleted on the agreement page (UI).
    Jira: UIEH-1292
  11. Make it possible to combine the 'assigned' filter with other kb-ebsco-java internal filters.
    Jira: MODKBEKBJ-655
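
The 'not assigned' filtering from step 4 could look roughly like the sketch below. It is only an illustration under assumptions: `fetch_page` stands in for a holdingsIQ package query and `find_assigned` for a lookup in the new relation table; both names, and the in-memory test data, are hypothetical and not part of any existing API.

```python
from typing import Callable, List, Set


def first_unassigned_page(
    fetch_page: Callable[[int, int], List[str]],
    find_assigned: Callable[[List[str]], Set[str]],
    page_size: int,
) -> List[str]:
    """Collect up to page_size package ids that have no agreement assigned."""
    result: List[str] = []
    page = 1
    while len(result) < page_size:
        package_ids = fetch_page(page, page_size)  # one holdingsIQ request
        if not package_ids:
            break  # holdingsIQ has no more packages
        assigned = find_assigned(package_ids)  # one database query
        result.extend(pid for pid in package_ids if pid not in assigned)
        page += 1
    return result[:page_size]


# Hypothetical in-memory data: 2000 packages, the first 1980 are assigned.
all_ids = [str(i) for i in range(1, 2001)]
assigned_ids = set(all_ids[:1980])
calls = {"holdingsiq": 0, "db": 0}


def fake_fetch_page(page: int, count: int) -> List[str]:
    calls["holdingsiq"] += 1
    start = (page - 1) * count
    return all_ids[start:start + count]


def fake_find_assigned(ids: List[str]) -> Set[str]:
    calls["db"] += 1
    return assigned_ids.intersection(ids)


first_page = first_unassigned_page(fake_fetch_page, fake_find_assigned, 20)
print(len(first_page), calls)
```

The sketch returns only the first page. A real implementation would also have to remember where it stopped in holdingsIQ so that the next page can continue from there.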

...

Internal filters of mod-kb-ebsco-java can also be combined with each other (not implemented yet, see MODKBEKBJ-371). After filtering, kb-ebsco-java passes only packageIds to holdingsIQ to retrieve the packages by id without any further filtering.
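
A minimal sketch of how internal filters might be combined, assuming that "combined" means a package must match every filter (a logical AND). The function name and the sample ids are hypothetical; each internal filter is assumed to yield a set of packageIds from the module's own database.

```python
from typing import Iterable, List, Set


def combine_internal_filters(id_sets: Iterable[Set[str]]) -> List[str]:
    """Return the packageIds that are present in every filter result."""
    sets = list(id_sets)
    if not sets:
        return []
    # Only ids matched by all internal filters survive.
    return sorted(set.intersection(*sets))


# Hypothetical results of two internal filters (tags and 'assigned').
tagged = {"19-100", "19-200", "19-300"}
assigned = {"19-200", "19-300", "19-400"}

package_ids = combine_internal_filters([tagged, assigned])
print(package_ids)
```

Only the resulting ids would then be sent to holdingsIQ, with no holdingsIQ filters applied.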

...

Some universities have about 200-600 packages assigned to an agreement, but there are cases with 1-2 thousand packages assigned to an agreement, so the second solution is more appropriate.

Further discussions

Filters combining clarifications

  1. The 'assigned' packages filter cannot be combined with package name filtering or other holdingsIQ filters (same as tags).
  2. The 'not assigned' packages filter can be used together with package name filtering and other holdingsIQ filters.

Problems

Problem | Description | Metrics

Not assigned packages querying

Querying not assigned packages is quite problematic on large amounts of data.

For example, suppose there are 2000 packages for a kb credentials, 1980 of them are assigned, the unassigned packages happen to be at the end of the list, the page size is 20, and we want the first page of 20 unassigned packages.

In that case we need to query holdingsIQ 100 times and also query assigned packages from the database 100 times (for each page returned from holdingsIQ we need to check whether its packages are assigned to an agreement according to the data in the database).

Approximate metrics for this 'extreme' case are:

  • Querying 20 packages from holdingsIQ takes about 400 ms (queried multiple times via Postman). For 2000 packages this should be around 40000 ms (40 s).
  • Checking in the database whether 20 packages are assigned to an agreement takes about 2 ms (test data queried from a local database via DBeaver; for a production database the number may increase because the database is remote). For 2000 packages this should be around 200 ms. Note: testing was done on a tags table with 3000 tags and a structure similar to the planned table.
  • The same check takes about 500 ms against the spitfire Rancher database (queried via pgAdmin; for a production database the number may be lower because more resources are available). For 2000 packages this should be around 50000 ms (50 s). Note: testing was done on a tags table with 3000 tags and a structure similar to the planned table.

To sum up, getting 20 unassigned packages in a bad scenario can take about 90 s plus some time to return the data to the front-end. Note: these numbers are very approximate and may vary with hardware and internet connection.
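
The estimate can be reproduced with a few lines of arithmetic. This is only a sketch of the calculation using the approximate timings measured above; the function name is hypothetical.

```python
import math


def worst_case_seconds(total_packages: int, page_size: int,
                       holdingsiq_ms: float, db_ms: float) -> float:
    """Time to scan every page when the unassigned packages come last."""
    requests = math.ceil(total_packages / page_size)  # 2000 / 20 = 100
    # Each page costs one holdingsIQ request plus one database check.
    return requests * (holdingsiq_ms + db_ms) / 1000


remote_db = worst_case_seconds(2000, 20, holdingsiq_ms=400, db_ms=500)
local_db = worst_case_seconds(2000, 20, holdingsiq_ms=400, db_ms=2)
print(remote_db, local_db)
```

With the remote database timing the result is the 90 s quoted above; with the local database timing the holdingsIQ requests dominate and the total is close to 40 s.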

Migrating data from mod-agreements

mod-agreements doesn't have any information about kb credentials, and we need to assign each relation (package→agreement) to the matching kb credentials. To achieve this, we need to query the package of each relation from holdingsIQ using each kb credentials (because the same package can exist for multiple kb credentials).

For example, if there are 2000 relations (package→agreement) for a tenant in mod-agreements and 5 kb credentials are configured, we need to query holdingsIQ 10000 times (once for each packageId and each kb credentials).

Approximate metrics:

  • Querying 10 entitlements (package→agreement relations) from mod-agreements, with 2000 entitlements present, takes about 220 ms (tested on the snapshot environment; mod-agreements returns at most 10 per query even if the page size is bigger). For 2000 relations this should be about 44000 ms (44 s).
  • Querying a package by id from holdingsIQ takes about 400 ms (queried multiple times via Postman). For 2000 relations, at one lookup per relation, this should be around 800000 ms (800 s, or 13 min 20 s).

Writing to the database was not tested because the table structure would have to be created and data generated first. It is also not really necessary to test this, because writing 2000 records to a database is fast and should take no more than a second (maybe a few seconds, depending on the environment).

To sum up, migrating 2000 relations from mod-agreements can take about 14 minutes plus some time to save the data to the database. Note: these numbers are very approximate and may vary with hardware and internet connection.
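
The migration estimate follows the same kind of arithmetic. The sketch below uses the approximate timings above; the function and parameter names are hypothetical. Note that the 14-minute figure corresponds to one holdingsIQ lookup per relation; the value for 5 kb credentials is an extrapolation from the same timings, not a measurement.

```python
import math


def migration_seconds(relations: int, credentials: int = 1,
                      entitlements_per_query: int = 10,
                      agreements_ms: float = 220,
                      holdingsiq_ms: float = 400) -> float:
    """Estimate the time to read relations and look up each package."""
    # mod-agreements returns at most 10 entitlements per query.
    agreement_queries = math.ceil(relations / entitlements_per_query)
    # One holdingsIQ lookup per relation and per kb credentials.
    holdingsiq_queries = relations * credentials
    total_ms = (agreement_queries * agreements_ms
                + holdingsiq_queries * holdingsiq_ms)
    return total_ms / 1000


single = migration_seconds(2000)                  # 44 s + 800 s
with_five = migration_seconds(2000, credentials=5)  # 44 s + 4000 s
print(single / 60, with_five / 60)
```

Under these assumptions, querying every package with each of 5 kb credentials would take roughly five times longer than the single-lookup estimate.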

To eliminate the problem with querying 'not assigned' packages, we can periodically query all packages (as we do for holdings now) and store them in the mod-kb-ebsco-java database. In that case, however, we will not be able to use holdingsIQ filters (same as for assigned packages).

We can also already get assigned packages (selected ones only) from the database, because there is a scheduled job that queries holdings, but it runs only once every 5 days.

As an alternative to filtering packages by assignment status, an additional column with the assignment status could be added to the result list.