CallNumber Browse Refactoring
| Status | DONE |
|---|---|
| Impact | high |
| Prod Ticket | |
| Arch Ticket | |
Summary
Current situation or problem: To continue building on the call number browse functionality (including browsing by type and browsing by instance classification), the current implementation needs to be refactored.
Business expectations
Easily navigate through large datasets.
For example, when the call number shares the first 10 characters of the shelving order.
Address preceding and succeeding navigation, especially with large datasets.
This includes leaving the first and last pages and handling exact and non-exact matches.
Address effective location facet issues
Address type-specific browsing issues (i.e. sorting and finding exact matches)
Technical expectations
Streamline code. Significantly decrease complexity of code to make it much easier to implement new enhancements.
Requirements
Functional Requirements
https://folio-org.atlassian.net/wiki/spaces/MM/pages/4669674
Non-functional Requirements
Configurability - The solution should allow disabling/enabling indexing by a feature flag.
Maintainability - The solution should allow changes for different call number types, searching for prefixes/suffixes, etc.
Performance - The solution should not impact reindexing time significantly.
Arch Ticket: https://folio-org.atlassian.net/browse/ARCH-272
[TBD: Create a NFR Page]
Assumptions
https://folio-org.atlassian.net/browse/UXPROD-4892 is done and performance test results are satisfactory
Baseline Architecture
https://github.com/folio-org/mod-search/blob/master/doc/browsing.md#call-number-browsing
Target Architecture
Summary
The solution is based on a new reindexing approach proposed in https://folio-org.atlassian.net/wiki/spaces/DD/pages/262144012. The key aspects of the proposed solution are:
- In the mod-search PostgreSQL DB, create tables for call numbers.
  - The following fields should be present in the `callnumber` table:
    - callnumber_id
    - effective_callnumber_components - set of components for a call number
      - callnumber
      - prefix
      - suffix
      - callnumber_type_id
    - volume
    - enumeration
    - chronology
    - copynumber
  - The following fields should be present in the `callnumber_instances` table:
    - callnumber_id
    - item_id
    - instance_id
    - shared
    - tenant_id
    - location_id
- On create/update/delete events for items, create a new procedure that extracts call numbers from items.
- Adjust the reindexing procedure and the consumption of ongoing domain events for items.
- Create a separate index for call numbers.
- Refactor browse queries to use `search_after`/`search_before` queries.
- The titles for the browse option can be queried on the fly, either from the `instances` table or from the `instances` search index.
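The table layout above can be sketched as two record types plus a mapping from an item to its rows. This is only an illustration: the item field names (`effectiveCallNumberComponents`, `copyNumber`, `effectiveLocationId`), the `extract_call_number` helper, and the hash-based derivation of `callnumber_id` are assumptions, not something this page specifies.

```python
import hashlib
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class CallNumber:
    """One row of the callnumber table (fields as listed above)."""
    callnumber_id: str
    callnumber: str
    prefix: Optional[str]
    suffix: Optional[str]
    callnumber_type_id: Optional[str]
    volume: Optional[str]
    enumeration: Optional[str]
    chronology: Optional[str]
    copynumber: Optional[str]


@dataclass(frozen=True)
class CallNumberInstance:
    """One row of the callnumber_instances table."""
    callnumber_id: str
    item_id: str
    instance_id: str
    shared: bool
    tenant_id: str
    location_id: Optional[str]


def extract_call_number(item: dict, instance_id: str, tenant_id: str, shared: bool = False):
    """Map one item to (CallNumber, CallNumberInstance), or None if it has no call number."""
    components = item.get("effectiveCallNumberComponents") or {}
    number = components.get("callNumber")
    if not number:
        return None
    values = (
        number,
        components.get("prefix"),
        components.get("suffix"),
        components.get("typeId"),
        item.get("volume"),
        item.get("enumeration"),
        item.get("chronology"),
        item.get("copyNumber"),
    )
    # Assumption: the id is a hash of all call number fields, so identical
    # call numbers on different items collapse into a single callnumber row.
    key = "|".join(value or "" for value in values)
    callnumber_id = hashlib.sha1(key.encode("utf-8")).hexdigest()
    call_number = CallNumber(callnumber_id, *values)
    link = CallNumberInstance(
        callnumber_id, item["id"], instance_id, shared, tenant_id,
        item.get("effectiveLocationId"),
    )
    return call_number, link
```

With an id derived from the call number fields, two items carrying the same call number would share one `callnumber` row and differ only in their `callnumber_instances` rows.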
Indexing Sequence Diagram
As per the current reindexing approach, the indexing of call numbers is split into two main phases: merge and upload. The merge phase is already present in the reindexing procedure.
For performance purposes, the extraction of call numbers (step 16) should happen on the database side. The current approach uses batch inserts to insert items into the table. It is proposed to create a new PL/pgSQL procedure to extract call numbers. The next section describes the details of the mentioned procedure.
Extract Call Numbers Activity Diagram
The diagram below describes the procedure that should be created for inserting items and extracting call numbers in the PostgreSQL database. The procedure should be used instead of bulk inserts. Key aspects:
The flag that controls whether call numbers are extracted and stored should be kept in the database.
Arrays of records that mirror the table structure should be created inside the procedure to hold call numbers and call number instances.
Inserts from the arrays into the main tables should be ordered to avoid deadlocks on the main table indexes.
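The key aspects above can be illustrated with a small in-memory emulation. It is a sketch of the control flow only, not the PL/pgSQL procedure itself: the `tables` dictionary, the `extract_call_numbers` flag name, the simplified item shape, and the use of the call number string as its own id are all assumptions made for the example.

```python
def insert_items(tables: dict, items: list) -> list:
    """Store items and, if enabled, extract call numbers in a fixed insert order.

    Returns the order in which call numbers were inserted.
    """
    # The flag lives alongside the data (here: a 'config' table).
    extraction_enabled = tables["config"].get("extract_call_numbers", False)

    call_numbers = []  # array of records shaped like the callnumber table
    links = []         # array of records shaped like callnumber_instances
    for item in items:
        tables["item"][item["id"]] = item  # the item itself is always stored
        call_number = item.get("call_number")
        if extraction_enabled and call_number:
            call_numbers.append(call_number)
            links.append((call_number, item["id"], item["instance_id"]))

    # Deduplicate and sort so that concurrent batches touch rows in the same
    # order, which is the usual way to avoid deadlocks between them.
    insert_order = sorted(set(call_numbers))
    for call_number in insert_order:
        # Keeps an existing row untouched, similar to ON CONFLICT DO NOTHING.
        tables["callnumber"].setdefault(call_number, {"callnumber": call_number})
    for link in sorted(links):
        tables["callnumber_instances"].add(link)
    return insert_order
```

Sorting both arrays before inserting means two batches that contain the same call numbers would request them in the same sequence, regardless of the order in which the items arrived.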
Browsing Sequence Diagram
Call number browsing follows the approach used for the classification browse feature (https://folio-org.atlassian.net/wiki/spaces/DD/pages/1781742).
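As a rough illustration of browsing around an anchor value, the request bodies could be built as follows. The field name `callNumber` and the helper names are hypothetical, and the sketch assumes that the "search before" direction is obtained by reversing the sort order and reusing the `search_after` parameter; the linked classification browse design remains the authoritative description.

```python
SORT_FIELD = "callNumber"  # hypothetical sort field in the call number index


def succeeding_query(anchor: str, size: int) -> dict:
    """Request body for the entries that follow the anchor."""
    return {"size": size, "sort": [{SORT_FIELD: "asc"}], "search_after": [anchor]}


def preceding_query(anchor: str, size: int) -> dict:
    """Request body for the entries that precede the anchor (reversed sort)."""
    return {"size": size, "sort": [{SORT_FIELD: "desc"}], "search_after": [anchor]}


def merge_pages(preceding_hits: list, succeeding_hits: list) -> list:
    """Combine both result sets into one ascending browse page.

    Preceding hits arrive in descending order, so they are reversed first.
    """
    return list(reversed(preceding_hits)) + succeeding_hits
```

Note that `search_after` returns hits strictly after the given sort values, so an exact match on the anchor would need a separate lookup. A production query would normally also add a unique tie-breaker field to the sort.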
Holding-level Call Numbers
Problem statement
Some libraries do not create items for holdings, and they need to be able to browse the call number on the holdings record. The solution should provide the capability to browse call numbers in the following situations:
A library has only holdings related to instances
A library has both holdings and items related to instances
A library has holdings, but some holdings do not have items
Solution Options
| Option | Description | Pros & Cons |
|---|---|---|
| Dedicated holdings/items call number search indexes | Holdings call number browse and item call number browse are separate features and can be enabled/disabled through configuration flags per tenant | Pros: Cons: |
| One search index for all call numbers | If an instance has items, then only item call numbers are filled. If an instance has holdings but no items, the index is filled with call numbers from holdings | Pros: Cons: |
Addressing the holding-level call number browsing
The proposed solution should reuse an approach similar to item-level indexing and use the same call number-related tables. Currently, on the mod-inventory-storage side, items that have no call number inherit the call number from their holdings, hence there is no need to insert them into the call number table. This requires indexing items before holdings at the merge stage.
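One possible reading of this rule is sketched below: items are processed first, and a holdings record contributes its own call number only when no item is attached to it. The record shapes, field names, and the `call_number_sources` helper are hypothetical and only serve to illustrate why the item pass has to come before the holdings pass.

```python
def call_number_sources(holdings: list, items: list) -> list:
    """Return (record_type, record_id, call_number) rows for the call number tables."""
    holdings_with_items = set()
    rows = []
    # Items first: every item with an effective call number contributes a row.
    # An item without its own call number is assumed to already carry the
    # call number inherited from its holdings record.
    for item in items:
        holdings_with_items.add(item["holdings_id"])
        if item.get("call_number"):
            rows.append(("item", item["id"], item["call_number"]))
    # Holdings second: only holdings without any items contribute a row,
    # which is why the item pass has to be finished before this one.
    for holding in holdings:
        if holding["id"] not in holdings_with_items and holding.get("call_number"):
            rows.append(("holding", holding["id"], holding["call_number"]))
    return rows
```

Under this reading, all three situations from the problem statement are covered by the same two passes: only holdings, both holdings and items, and a mix where some holdings have no items.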
Risks