Platform, DevOps and Release Management (UXPROD-1814)

[UXPROD-1815] support for upgrading schemas without complete reload of data (DB migrations) - PoC + devops guide Created: 29/May/19  Updated: 16/Sep/20  Resolved: 11/Oct/19

Status: Closed
Project: UX Product
Components: None
Affects versions: None
Fix versions: Q3 2019
Parent: Platform, DevOps and Release Management

Type: New Feature Priority: P3
Reporter: Tod Olson Assignee: Taras Spashchenko
Resolution: Done Votes: 0
Labels: back-end, cap-mvp, platform-backlog, po-mvp, q3-2019
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original estimate: Not Specified

Issue links:
Blocks
is blocked by FOLIO-2169 SPIKE investigate schema migrations i... Closed
Cloners
is cloned by UXPROD-2120 support for upgrading schemas without... Closed
Duplicate
is duplicated by UXPROD-753 live db upgrades (db migration) Closed
Relates
relates to DEBT-2 No automatic schema/data migrations Closed
relates to FOLIO-464 Design and implement module upgrade p... Closed
relates to FOLIO-1519 Automatic loading of sample and refer... Closed
relates to FOLIO-406 Design module life cycle Closed
relates to FOLIO-838 performance test and bulk import test... Closed
relates to MODVEND-81 Load sample data via tenant API Closed
Epic Link: Platform, DevOps and Release Management
Back End Estimate: XXL < 30 days
Back End Estimator: Jakub Skoczen
Estimation Notes and Assumptions: XXL is the largest available. This item is likely larger.
Development Team: Core: Platform
PO Rank: 9
Rank: Chicago (MVP Sum 2020): R1
Rank: Duke (Full Sum 2021): R1
Rank: 5Colleges (Full Jul 2021): R1
Rank: GBV (MVP Sum 2020): R1
Rank: Lehigh (MVP Summer 2020): R1
Rank: Leipzig (ERM Aut 2019): R2
Rank: TAMU (MVP Jan 2021): R1
Rank: U of AL (MVP Oct 2020): R2

 Description   

(Note: Q3 work includes a PoC and an initial devops migration guide. Rollout and additional Platform functionality are postponed to Q4.)

Currently in FOLIO, if there is a change to the schema used by a storage module, there is no upgrade path for the data already in the module. As we now have test deployments and are trying to move towards production deployments, this will be critical for ongoing testing and operations.

This lack has been identified as Technical Debt by the Technical Council.

Some solution for schema upgrades and data migration will be needed for all storage modules. The solution could include multiple components. For example, one part of the solution might be to include data updates in the Definition of Done for each development team, and there might be something that could be added to the core to make it easier to create these migrations. That's just for illustration. The important thing is that FOLIO needs a way to address schema updates and the corresponding data changes that is consistent across modules.

SysOps identified these sub-issues:

  • Ability to roll back an upgrade
  • Consider the wider scope of how upgrades will occur in FOLIO systems
  • Downtime considerations: can we limit downtime for an application, or avoid it entirely?
  • May need additional work in modules/Okapi/RMB
  • Would want schema upgrade included in Definition of Done

FOLIO June meeting TC tech debt slides:
https://docs.google.com/presentation/d/1Cz5-xhvMdCdm7SXYLhIQE8KV9YIsQW1faNUIWmMANx8



 Comments   
Comment by Jakub Skoczen [ 05/Jun/19 ]

Tod Olson

In general, I think the migrations can be grouped, roughly, into two groups:

  • unattended automatic schema and data migrations – migrations performed by a FOLIO module fully automatically during the module upgrade procedure. Unattended migrations do not require operator intervention unless an error condition arises.

Example for mod-inventory-storage: a new field called "notes" is added to the Instance entity schema; the field is empty by default. The automatic migration updates the DB schema (creates an index) and the stored data. The migration is performed during the module upgrade callback.

Some problems that need to be addressed:
– Data consistency and integrity. The migration should be performed within a transaction to allow for rollbacks in case of errors.
– Error reporting and resiliency. Is it acceptable to report errors in response to the lifecycle callbacks, or is there a requirement for queueing error reports?
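A minimal sketch of the "notes" example run inside a single transaction, so that a failure leaves the schema untouched. This is an illustration only, not an existing FOLIO mechanism: SQLite stands in for the module's database to keep the sketch runnable, and the `instance` table, the `notes` column, the index name, and both function names are hypothetical.

```python
import sqlite3


def column_names(conn: sqlite3.Connection, table: str) -> list[str]:
    """Return the column names of a table (hypothetical helper)."""
    return [row[1] for row in conn.execute(f"PRAGMA table_info({table})")]


def migrate_add_notes(conn: sqlite3.Connection) -> None:
    """Add an empty 'notes' column and an index on it, all or nothing.

    Assumes the connection was opened with isolation_level=None, so the
    transaction boundaries below are the only ones in effect.
    """
    conn.execute("BEGIN")
    try:
        # Schema change: existing rows receive the empty default value.
        conn.execute(
            "ALTER TABLE instance ADD COLUMN notes TEXT NOT NULL DEFAULT ''"
        )
        conn.execute("CREATE INDEX instance_notes_idx ON instance (notes)")
        conn.execute("COMMIT")
    except Exception:
        # Undo every step so the module is left on the old schema,
        # then re-raise so the caller can report the error.
        conn.execute("ROLLBACK")
        raise
```

The exception is re-raised on purpose: whether it is returned in the lifecycle callback response or queued elsewhere is the open question raised above.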

  • migrations that require operator's attention – migrations that can't be performed fully automatically because they require operator intervention (validation/evaluation) or can't be performed in isolation by the module (they require access to information or data that the module does not possess)

Example for mod-inventory-storage: a new field called "alternativeTitle" is added to the Instance schema. The field is marked mandatory and has no default value. The field value should be populated by the data import module from the MARC21 "246$a" "Varying Form of Title" field. The migration can't be performed automatically by the module without access to additional information (the original MARC record).

Strategy 1: the module performs an automatic migration of schema (creates an index) and data. The value of the new field is set to a distinguishable placeholder value, e.g. "NEEDS IMPORT". The placeholder value is later overwritten by a manual data import, using FOLIO's data import functionality.
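A rough sketch of Strategy 1, under the same caveats as any illustration here: SQLite is used only to keep it runnable, and the `instance` table, the `alternative_title` column, the index name, and the function names are hypothetical. It shows the placeholder being applied during the migration and a query an operator could use to find records still waiting for a data import.

```python
import sqlite3

# Placeholder value taken from the strategy description above.
PLACEHOLDER = "NEEDS IMPORT"


def migrate_add_alternative_title(conn: sqlite3.Connection) -> None:
    """Add the mandatory column, filling existing rows with the placeholder.

    Assumes the connection was opened with isolation_level=None.
    """
    conn.execute("BEGIN")
    try:
        # ALTER TABLE cannot take bound parameters, so the constant
        # placeholder is inlined into the statement.
        conn.execute(
            "ALTER TABLE instance ADD COLUMN alternative_title "
            f"TEXT NOT NULL DEFAULT '{PLACEHOLDER}'"
        )
        conn.execute(
            "CREATE INDEX instance_alt_title_idx ON instance (alternative_title)"
        )
        conn.execute("COMMIT")
    except Exception:
        conn.execute("ROLLBACK")
        raise


def records_needing_import(conn: sqlite3.Connection) -> list[int]:
    """Return ids of records that still carry the placeholder value."""
    rows = conn.execute(
        "SELECT id FROM instance WHERE alternative_title = ? ORDER BY id",
        (PLACEHOLDER,),
    )
    return [row[0] for row in rows]
```

Once `records_needing_import` returns an empty list, the manual import step can be considered complete for that tenant.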

Strategy 2: the module performs an automatic migration of schema (creates an index) and requests a data re-import from the FOLIO data import subsystem. The module upgrade is complete when the data is fully re-imported. This strategy increases the complexity of the migration because it introduces a dependency on external modules (additional error conditions). The task also becomes blocked on the availability of the data import subsystem, which is a work in progress.

The above is my own (very rough) take on scenarios and strategies. I am sure there are more, and this is something for the SIG to establish.

In terms of executing on the user stories: we will likely need to reserve some capacity in the Platform team to provide an example in one or more selected modules and to extend the existing automatic integration support (if needed). We also need to create tasks for all other teams to implement migrations targeting specific modules and versions. Finally, the DoDs should also be extended to require automatic migrations (probably for release artifacts).

Comment by Jakub Skoczen [ 09/Jul/19 ]

Taras Spashchenko please touch base with me when you get back from vaca so we can plan some SPIKEs for the schema upgrade work. So far, this issue is the high-level description of the task.

Comment by Cate Boerema (Inactive) [ 29/Jul/19 ]

Jakub Skoczen can you please give this a PO rank?

Generated at Thu Feb 08 23:17:48 UTC 2024 using Jira 1001.0.0-SNAPSHOT#100246-sha1:7a5c50119eb0633d306e14180817ddef5e80c75d.