CallNumber Browse Refactoring
Summary
Current situation or problem: In order to continue to build on call number browse functionality (including browsing by type and browsing by instance classification), we need to refactor the current implementation.
Business expectations
Easily navigate through large datasets.
For example, when the call number shares the first 10 characters of the shelving order.
Address preceding and succeeding navigation, especially with large datasets.
Handle leaving the first and last pages, for both exact and non-exact matches
Address effective location facet issues
Address type-specific browsing issues (i.e. sorting and finding exact matches)
Technical expectations
Streamline the code: significantly reduce its complexity so that new enhancements are much easier to implement.
Requirements
Functional Requirements
Call number browse requirements overview - DRAFT
Non-functional Requirements
Configurability - The solution should allow disabling/enabling indexing by a feature flag.
Maintainability - The solution should allow changes for different call number types, searching for prefixes/suffixes, etc.
Performance - The solution should not impact reindexing time significantly.
[TBD: Create a NFR Page]
Assumptions
UXPROD-4892: Reindexing improvements (status: In Review) is done and performance test results are satisfactory
Baseline Architecture
https://github.com/folio-org/mod-search/blob/master/doc/browsing.md#call-number-browsing
Target Architecture
Summary
The solution is based on the new reindexing approach proposed in Reindex Improvements. Key aspects of the proposed solution:
In the mod-search PostgreSQL DB, create tables for call numbers
The following fields should be present in the callnumber table:
callnumber_id
effective_callnumber_components - set of components for a call number
callnumber
prefix
suffix
callnumber_type_id
volume
enumeration
chronology
copynumber
The following fields should be present in the callnumber_instances table:
callnumber_id
item_id
instance_id
shared
tenant_id
location_id
Create a new procedure that extracts call numbers from items on item create/update/delete events
Adjust the reindexing procedure and the consumption of ongoing domain events for items
Create a separate index for call numbers
Refactor browse queries to use search_after/search_before queries
The titles for the browse option can be queried on the fly from either the instances table or the instances search index
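The two tables listed above (callnumber and callnumber_instances) can be pictured as the record shapes below. This is a minimal sketch rather than the mod-search schema: the source only names the columns, so the Python types, the optional fields, the flattening of effective_callnumber_components into separate fields, and the `callnumber_id_for` helper (which derives the id by hashing the components) are all assumptions made for illustration.

```python
import hashlib
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class CallNumber:
    """One row of the callnumber table (column names taken from the list above)."""
    callnumber_id: str
    callnumber: str
    prefix: Optional[str] = None
    suffix: Optional[str] = None
    callnumber_type_id: Optional[str] = None
    volume: Optional[str] = None
    enumeration: Optional[str] = None
    chronology: Optional[str] = None
    copynumber: Optional[str] = None


@dataclass(frozen=True)
class CallNumberInstance:
    """One row of the callnumber_instances table: links a call number to an item."""
    callnumber_id: str
    item_id: str
    instance_id: str
    shared: bool
    tenant_id: str
    location_id: Optional[str] = None


def callnumber_id_for(callnumber, prefix=None, suffix=None, callnumber_type_id=None,
                      volume=None, enumeration=None, chronology=None, copynumber=None):
    """Hypothetical id derivation: hash all components so equal call numbers share one row."""
    parts = [callnumber, prefix, suffix, callnumber_type_id,
             volume, enumeration, chronology, copynumber]
    raw = "|".join(part or "" for part in parts)
    return hashlib.sha1(raw.encode("utf-8")).hexdigest()
```

With a deterministic id of this kind, two items that carry identical components map to a single callnumber row and two callnumber_instances rows, which is what keeps the call number table deduplicated. Whether the real id is a hash, a sequence, or something else is not stated in the source.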
Indexing Sequence Diagram
As in the current reindexing approach, the indexing of call numbers is split into two main phases: merge and upload. The merge phase is already present in the reindexing procedure.
For performance purposes, the extraction of call numbers (step 16) should happen on the database side. The current approach uses batch inserts to insert items into the table. It is proposed to create a new PL/pgSQL procedure to extract call numbers. The next section describes the details of the mentioned procedure.
Extract Call Numbers Activity Diagram
The diagram below describes the procedure that should be created for inserting items and extracting call numbers in the PostgreSQL database. The procedure should be used instead of bulk inserts. Key aspects:
The flag whether to extract and store call numbers should be stored in the database
The arrays of records similar to the table structure should be created inside of the procedure to hold call numbers and call number instances
Inserts from the arrays into the main tables should be ordered to avoid deadlocks on the main table indexes
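The key aspects above can be illustrated outside the database. The sketch below is plain Python, not the proposed PL/pgSQL procedure: the function name `extract_call_numbers`, the dict-shaped items, the chosen component keys, and the hash-based id are hypothetical. It only shows the flow: check the flag, collect call numbers and call number instances into in-memory collections, then emit both in a deterministic order.

```python
import hashlib

# Assumption: these item keys identify a call number; the real procedure may use more.
COMPONENT_KEYS = ("callnumber", "prefix", "suffix", "callnumber_type_id")


def _callnumber_id(components):
    """Hypothetical id: hash of the components, so equal call numbers collapse."""
    raw = "|".join(c or "" for c in components)
    return hashlib.sha1(raw.encode("utf-8")).hexdigest()


def extract_call_numbers(items, extraction_enabled):
    """Return (callnumber_rows, callnumber_instance_rows) for one batch of items."""
    if not extraction_enabled:  # in the proposal this flag is stored in the database
        return [], []

    call_numbers = {}  # callnumber_id -> row, deduplicated within the batch
    instances = []
    for item in items:
        if not item.get("callnumber"):
            continue  # nothing to extract from this item
        components = tuple(item.get(key) for key in COMPONENT_KEYS)
        cn_id = _callnumber_id(components)
        call_numbers[cn_id] = {"callnumber_id": cn_id,
                               **dict(zip(COMPONENT_KEYS, components))}
        instances.append({
            "callnumber_id": cn_id,
            "item_id": item["id"],
            "instance_id": item["instance_id"],
            "tenant_id": item["tenant_id"],
            "shared": item.get("shared", False),
            "location_id": item.get("location_id"),
        })

    # Sort by key so that concurrent batches touch rows and index entries
    # in the same order, which is the usual way to avoid lock-order deadlocks.
    ordered_call_numbers = [call_numbers[key] for key in sorted(call_numbers)]
    ordered_instances = sorted(instances,
                               key=lambda row: (row["callnumber_id"], row["item_id"]))
    return ordered_call_numbers, ordered_instances
```

In the database the two returned lists would correspond to the arrays held inside the procedure, and the final sort to an ordered insert into the main tables.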
Browsing Sequence Diagram
Call number browsing follows the approach used for the classification browse feature (Browse Instance classification numbers - Phase 1 POC).
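The cursor-style navigation behind search_after/search_before queries can be simulated on an in-memory sorted list. The sketch below is illustrative only and does not call the search engine: `page_after`, `page_before`, and `browse_around` are hypothetical names, shelving orders are plain strings, and the `(shelving_order, callnumber_id)` sort key with the id as a tie-breaker is an assumption. The tie-breaker is what lets paging move through many entries that share the same shelving order prefix without skipping or repeating any.

```python
from bisect import bisect_left, bisect_right


def page_after(ordered, cursor, size):
    """Entries strictly after the cursor, like a search_after page."""
    start = bisect_right(ordered, cursor)
    return ordered[start:start + size]


def page_before(ordered, cursor, size):
    """Entries strictly before the cursor, like a search_before page."""
    end = bisect_left(ordered, cursor)
    return ordered[max(0, end - size):end]


def browse_around(entries, anchor, preceding=2, succeeding=3):
    """Return (before, after, is_exact_match) around an anchor shelving order.

    entries: iterable of (shelving_order, callnumber_id) tuples.
    """
    ordered = sorted(entries)  # shelving order first, id as tie-breaker
    # A one-element tuple sorts before every (anchor, id) pair, so it marks
    # the position just in front of any exact matches.
    cursor = (anchor,)
    before = page_before(ordered, cursor, preceding)
    after = page_after(ordered, cursor, succeeding)
    is_exact_match = bool(after) and after[0][0] == anchor
    return before, after, is_exact_match
```

To fetch the next page, the caller passes the last returned tuple as the cursor to `page_after`; to go back, it passes the first returned tuple to `page_before`. An empty `before` or `after` list signals that the first or last page has been reached.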
Holding-level Call Numbers
Problem statement
Some libraries do not create items for holdings, and they need to be able to browse the call number on the holdings record. The solution should provide the capability to browse call numbers in the following situations:
A library has only holdings related to instances
A library has both holdings and items related to instances
A library has holdings, but some holdings do not have items
Solution Options
Option | Description | Pros & Cons |
---|---|---|
Dedicated holdings/items call number search indexes | Holdings call number browse and item call number browse are separate features and can be enabled/disabled through configuration flags per tenant | Pros: Cons: |
One search index for all call numbers | If an instance has items, then only item call numbers are filled. If an instance has holdings but no items, the index is filled with call numbers from holdings | Pros: Cons: |
Addressing the holding-level call number browsing
The proposed solution should reuse an approach similar to item-level indexing and use the same call number-related tables. Currently, on the mod-inventory-storage side, items that have no call number inherit the call number from their holdings, hence there is no need to insert them into the call number table. This requires indexing items before holdings at the merge stage.
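One possible reading of the paragraph above is that a holdings call number only needs to be inserted when the holdings record has no items, because items already carry their own or an inherited call number. The sketch below illustrates that reading; it is an assumption, not a confirmed design. The function name `holdings_needing_call_numbers` and the dict-shaped records with `holdings_id` and `callnumber` keys are hypothetical.

```python
def holdings_needing_call_numbers(holdings, items):
    """Select holdings whose call number is not already covered by an item.

    Items must be processed first so that the set of holdings that have
    items is complete before holdings are examined.
    """
    holdings_with_items = {item["holdings_id"] for item in items}
    return [
        record for record in holdings
        if record.get("callnumber") and record["id"] not in holdings_with_items
    ]
```

Under this reading, the three situations in the problem statement reduce to one rule: a library with only holdings gets every holdings call number, a library where every holdings record has items gets none, and a mixed library gets only the holdings without items. It also shows why the ordering matters: if holdings were handled before items, the set of holdings with items would be incomplete.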
Risks