...
Type | Ticket | Module |
---|
Module ticket | Jira Legacy |
---|
server | System JiraJIRA |
---|
serverId | 01505d01-b853-3c2e-90f1-ee9b165564fc |
---|
key | MODAUD-155 |
---|
|
| mod-audit |
Arch ticket | Jira Legacy |
---|
server | System JiraJIRA |
---|
serverId | 01505d01-b853-3c2e-90f1-ee9b165564fc |
---|
key | ARCH-51 |
---|
|
|
|
...
# | Title | Description | Pros | Cons |
---|
1 | Define retention policy (use archiving or table partitioning) | - Archive the old records (maybe not delete them) in another table or use PostgreSQL partitioning - like 1+ year old record.
- Create and add an index for createTimeStamp column. This will restrict the search to most recent actions that are taken on the item and improves the search performance.
| - Future-proof solution. Will have same response time in case of the same amount of records per time period
| - In case of unpredicted growth of transactional volume search can be slow
- Partitioning is not used in other modules, hence breaks the unity or common approach
- Search in longer date period may still be slow
- Not quite clear what is the common period for retention
|
2 | Add gin index for nested items → itemBarcode field | - Create gin index for itemBarcode field (see link)
| - Improves performance
- Requires minimal backend changes
| - May negatively affect performance of writing due to calculation of index.
- Has limitations
|
3 | Limit search date range | - Make UI
from / to fields in search view mandatory - Force maximum date range to one year
- If further search is required add controls to go to next or previous period
| - No backend changes required
- Can serve as a workaround
| - Limits user experience
|
Questions
# | Question | Answer | Status |
---|
1 | How many records per year clients produce? | | |
2 | Do clients have any retention policy or regulations on circulation records? | (OLD ACCOUNT) Erin Nettifee : No common retention approach for different libraries | |
Rationale
Option 1 depends on client needs that might be hard to define for the whole community.
Option 2 should be implemented in any case but requires additional performance testing for writing into circulation log.
Option 3 requires minimal effort but has tradeoffs. Can be implemented as quick fix.
Proposed solution is option 2 due to minimal changes on backend and no changes at frontend
4th10th, 2023): - Gin indexes are already present in the table creation script
- Queries that are generated by
PostgresClient do not hit the indexes. The library auto-generated SQL Query example is:
Code Block |
---|
| SELECT id, jsonb
FROM %schema%_mod_audit.circulation_logs
where ((circulation_logs.jsonb->>'items') @@ ('BATTLE10')) LIMIT 100 OFFSET 0; |
- To enable the index in the search the query should be rewritten in this form:
Code Block |
---|
| select id, jsonb
from %schema%_mod_audit.circulation_logs
where %schema%_mod_audit.get_tsvector(%schema%_mod_audit.f_unaccent(jsonb ->> 'items'::text))@@ plainto_tsquery('%BATTLE10%') LIMIT 100 OFFSET 0; |
- Below are screenshots of query execution plan for both queries:
- Image ModifiedImage Modified
|
- Improves performance
- Requires minimal backend changes
| - May negatively affect performance of writing due to calculation of index.
- Has limitations
| 3 | Limit search date range | - Make UI
from / to fields in search view mandatory - Force maximum date range to one year
- If further search is required add controls to go to next or previous period
| - No backend changes required
- Can serve as a workaround
| - Limits user experience
|
Questions
...
How many records per year clients produce?
...
...
Do clients have any retention policy or regulations on circulation records?
...
(OLD ACCOUNT) Erin Nettifee : No common retention approach for different libraries
...
Rationale
Option 1 depends on client needs that might be hard to define for the whole community.
Option 2 should be implemented in any case but requires additional performance testing for writing into circulation log.
Option 3 requires minimal effort but has tradeoffs. Can be implemented as quick fix.
...
Update on option 2 (Apr 18th, 2023): - The SQL generation by PostgresClient already contains options for invoking full-text search with
gin-indexes . The description of mechanism is located in https://dev.folio.org/faqs/explain-cql/. Every approach has its tradeoffs and possible solution may be to leave the choice for users. The available options are as following:- Exact search: requires input of full barcode (e.g. type "12345" to find "12345" barcode). Fastest
- Left-anchor search: requires input starting symbols of barcode (e.g. type "123" to find "12345" barcode).
- "Contains" search: requires input of any substring in barcode (e.g. type "234" to find "12345" barcode). Slowest option
|
Estimation
# | Title | Estimation |
---|
1 | Define retention policy (use archiving or table partitioning) | - POC - 1 sprint
- Implementation -2 sprints
- Performance testing - 1 sprint
|
2 | Add gin index for nested items → itemBarcode field | - Implementation - 1 sprint
- Performance testing - 1 spring
|
3 | Limit search date range | - Implementation - 1 sprint
- Performance testing - 1 spring
|
...