Summary
Current situation or problem: To keep building on call number browse functionality (including browsing by type and browsing by instance classification), the current implementation needs to be refactored.
Business expectations
Easily navigate through large datasets.
For example, when the call number shares the first 10 characters of the shelving order.
Address preceding and succeeding navigation, especially with large datasets.
Handle leaving the first and last pages, for both exact and non-exact matches
Address effective location facet issues
Address type-specific browsing issues (i.e. sorting and finding exact matches)
Technical expectations
Streamline the code: significantly reduce its complexity so that new enhancements are much easier to implement.
Requirements
Functional Requirements
Call number browse requirements overview - DRAFT
Non-functional Requirements
[TBD: Create a NFR Page]
Assumptions
Baseline Architecture
https://github.com/folio-org/mod-search/blob/master/doc/browsing.md#call-number-browsing
Target Architecture
Summary
The solution is based on the new reindexing approach proposed in Reindex Improvements. The key aspects of the proposed solution:
In the mod-search PostgreSQL DB, create tables for call numbers
The following fields should be present in the callnumber table:
callnumber_id
effective_callnumber_components - set of components for a callnumber
callnumber
prefix
suffix
callnumber_type_id
volume
enumeration
chronology
copynumber
The following fields should be present in the callnumber_instances table:
callnumber_id
instance_id
shared
tenant_id
Create a new procedure that extracts call numbers from items on item create/update/delete events
Adjust the reindexing procedure and the ongoing consumption of domain events for items
Create a separate index for call numbers
Refactor browse queries to use search_after and search_before queries
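The paging idea behind the last point can be sketched as follows. This is a minimal illustration, not the mod-search implementation. It assumes that the search engine exposes only a search_after parameter, so a search_before page is emulated by reversing the sort order and then reversing the returned hits. The field names shelving_order and id, and both function names, are hypothetical.

```python
from typing import Any, Dict, List


def build_page_query(cursor: List[Any], size: int, direction: str) -> Dict[str, Any]:
    """Build the request body for the page next to a cursor.

    cursor    -- sort values of the edge hit of the current page
    direction -- "next" for succeeding entries, "prev" for preceding ones
    """
    if direction not in ("next", "prev"):
        raise ValueError("direction must be 'next' or 'prev'")
    # A preceding page is requested by flipping the sort order, so that
    # search_after walks backwards from the cursor.
    order = "asc" if direction == "next" else "desc"
    return {
        "size": size,
        "query": {"match_all": {}},
        # Hypothetical fields: the second key is a unique tie-breaker,
        # which keeps the sort order total and the paging stable.
        "sort": [{"shelving_order": order}, {"id": order}],
        "search_after": list(cursor),
    }


def hits_in_display_order(hits: List[Any], direction: str) -> List[Any]:
    """Return hits in ascending order regardless of the paging direction."""
    return list(hits) if direction == "next" else list(reversed(hits))
```

With this shape, both directions use the same code path and differ only in the sort order, which is one way the browse queries could be simplified.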
Indexing Sequence Diagram
As in the current reindexing approach, the indexing of call numbers is split into two main phases: merge and upload. The merge phase is already present in the reindexing procedure.
For performance purposes, the extraction of call numbers (step 16) should happen on the database side. The current approach uses batch inserts to insert items into the table. It is proposed to create a new PL/pgSQL procedure to extract call numbers. The next section describes the details of the mentioned procedure.
Extract Call Numbers Activity Diagram
The diagram below describes the procedure that should be created for inserting items and extracting call numbers in the PostgreSQL database. The procedure should be used instead of bulk inserts. Key aspects:
A flag indicating whether to extract and store call numbers should be stored in the database
The temporary table should be created inside the procedure, based on the main table
create temp table tmp_callnumber (like callnumber including indexes);
Insert from the temporary table into the main table should be ordered to avoid deadlocks on main table indices
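The extraction logic can be modelled outside the database as follows. This is only a sketch of the data flow; the proposal itself is a PL/pgSQL procedure. The flat item shape, the SHA-1 based callnumber_id and the function names are assumptions made for illustration. The final sort mirrors the ordered insert: if concurrent batches write rows in the same key order, they acquire index locks in the same order, which is the usual way to avoid deadlocks.

```python
import hashlib
from typing import Dict, List, Tuple

# Column names taken from the callnumber table described above.
COMPONENT_FIELDS = (
    "callnumber", "prefix", "suffix", "callnumber_type_id",
    "volume", "enumeration", "chronology", "copynumber",
)
LINK_FIELDS = ("callnumber_id", "instance_id", "shared", "tenant_id")


def callnumber_id(components: Dict[str, str]) -> str:
    """Hypothetical deterministic id: SHA-1 over the components in a fixed order."""
    joined = "|".join(components.get(field) or "" for field in COMPONENT_FIELDS)
    return hashlib.sha1(joined.encode("utf-8")).hexdigest()


def extract_rows(items: List[dict]) -> Tuple[List[dict], List[dict]]:
    """Derive callnumber and callnumber_instances rows from a batch of items."""
    callnumbers: Dict[str, dict] = {}
    links = set()
    for item in items:
        components = {field: item.get(field) for field in COMPONENT_FIELDS}
        if not components["callnumber"]:
            continue  # items without a call number produce no rows
        cn_id = callnumber_id(components)
        # Identical call numbers collapse into a single row.
        callnumbers[cn_id] = {"callnumber_id": cn_id, **components}
        links.add((cn_id, item["instance_id"], item["shared"], item["tenant_id"]))
    # Sort both row sets by key so every batch inserts in the same order.
    ordered_callnumbers = [callnumbers[key] for key in sorted(callnumbers)]
    ordered_links = [dict(zip(LINK_FIELDS, link)) for link in sorted(links)]
    return ordered_callnumbers, ordered_links
```

In the database, the temporary table would play the role of the in-memory collections here, and the sort would correspond to an ORDER BY on the insert from the temporary table into the main table.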
Browsing Sequence Diagram
Call number browsing follows the approach used for the classification browse feature (Browse Instance classification numbers - Phase 1 POC).