Versions Compared

Key

  • This line was added.
  • This line was removed.
  • Formatting was changed.

...

#RequirementDescriptionValidation ApproachRemarks
1PerformanceSearch on table with <X> 2M records should be done in <Y> under 30 seconds 
  1. Prepopulate table with X records
  2. Do search by itemBarcode
  3. Validate timing
TBC

...

Solution Options

#TitleDescriptionProsCons
1Define retention policy (use archiving or table partitioning)
  1. Archive the old records (maybe not delete them) in another table or use PostgreSQL partitioning - like 1+ year old record.
  2. Create and add an index for createTimeStamp column. This will restrict the search to most recent actions that are taken on the item and improves the search performance.
  1. Future-proof solution. Will have same response time in case of the same amount of records per time period
  1. In case of unpredicted growth of transactional volume search can be slow
  2. Partitioning is not used in other modules, hence breaks the unity or common approach
  3. Search in longer date period may still be slow
  4. Not quite clear what is the common period for retention
2Add gin index for nested items → itemBarcode field
  1. Create gin index for itemBarcode field (see link

Update (Apr 4th, 2023):

  • Gin indexes are already present in the table creation script
  • Queries that are generated by PostgresClient do not hit the indexes. The library auto-generated SQL Query example is:
Code Block
languagesql
SELECT id, jsonb
    FROM %schema%_mod_audit.circulation_logs 
where ((circulation_logs.jsonb->>'items') @@ ('BATTLE10')) LIMIT 100 OFFSET 0;
  •  To enable the index in the search the query should be rewritten in this form:
Code Block
languagesql
select id, jsonb from %schema%_mod_audit.circulation_logs where %schema%_mod_audit.get_tsvector(%schema%_mod_audit.f_unaccent(jsonb ->> 'items'::text))@@ plainto_tsquery('%BATTLE10%') LIMIT 100 OFFSET 0;
  • Below are screenshots of query execution plan for both queries:
  • Image ModifiedImage Modified
  1. Improves performance
  2. Requires minimal backend changes
  1. May negatively affect performance of writing due to calculation of index.
  2. Has limitations
3Limit search date range
  1. Make UI from / to fields in search view mandatory
  2. Force maximum date range to one year
  3. If further search is required add controls to go to next or previous period
  1. No backend changes required
  2. Can serve as a workaround 
  1. Limits user experience


Questions

#QuestionAnswerStatus
1

How many records per year clients produce?

(OLD ACCOUNT) Erin Nettifee : 2M+ records per year

Status
colourGreen
titleANswered

2

Do clients have any retention policy or regulations on circulation records?

(OLD ACCOUNT) Erin Nettifee : No common retention approach for different libraries

Status
colourGreen
titleAnswered

...

#TitleEstimation
1Define retention policy (use archiving or table partitioning)
  1. POC - 1 sprint
  2. Implementation -2 sprints
  3. Performance testing - 1 sprint
2Add gin index for nested items → itemBarcode field
  1. Implementation - 1 sprint
  2. Performance testing - 1 spring
3Limit search date range
  1. Implementation - 1 sprint
  2. Performance testing - 1 spring

Decision

<TBD>Proposed solution is option 2 - <TBC>