Brief context

...

Analyze MODINVSTOR-812 / MODINVSTOR-774 to identify the root cause(s) of the long migration (e.g., ongoing indexing on the DB, single-threaded processing, etc.) and optimize the identified bottlenecks.

...

To make the preparation and testing of data migration scripts more transparent, understandable and effective, and to reduce the risk of problems occurring during the release / migration on the production environment, two main steps are proposed:

  1. Analysis and review of data migration scripts at the stage of preparation and development
    1. (plus) Add a specific label data-migration to all Jira features/tasks related to data migration
    2. (plus) Update the Architecture board https://folio-org.atlassian.net/secure/RapidBoard.jspa?rapidView=224 to reflect such labeled tasks in the In Review, In Progress and In Code Review statuses
    3. Organize code review and analysis of labeled tickets and PRs by TechLs/SAs using the Best practices and Check-list for data-migration tasks below
      1. (plus) Communicate this process and the required steps to the involved teams and interested stakeholders, and share this documentation
    4. (plus) Prepare a brief instruction on How to test data migration performance on Rancher, and communicate to dev teams to conduct performance testing of a particular migration on a temporary performance environment
  2. (plus) Performance testing of all data migrations on a dedicated environment and on a sufficiently large amount of data, as a mandatory step, in order to receive early performance metrics
    1. UXPROD-3387 and RANCHER-191
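
As an illustration of the performance-testing step, the sketch below times a batched migration over synthetic data and reports throughput, which can then be extrapolated to production volumes. It is only a sketch under assumptions: the in-memory SQLite database stands in for the real PostgreSQL instance, and the item table, its status column, the 'legacy' / 'migrated' values and the measure_migration helper are all hypothetical names invented for the example.

```python
import sqlite3
import time


def measure_migration(row_count: int, batch_size: int = 1000) -> dict:
    """Time a hypothetical batched migration over row_count synthetic rows."""
    conn = sqlite3.connect(":memory:")  # stand-in for the real database
    conn.execute("CREATE TABLE item (id INTEGER PRIMARY KEY, status TEXT)")
    conn.executemany(
        "INSERT INTO item (id, status) VALUES (?, ?)",
        ((i, "legacy") for i in range(1, row_count + 1)),
    )
    conn.commit()

    started = time.perf_counter()
    migrated = 0
    for low in range(1, row_count + 1, batch_size):
        # Hypothetical migration step: rewrite one id range per transaction.
        cur = conn.execute(
            "UPDATE item SET status = 'migrated' WHERE id >= ? AND id < ?",
            (low, low + batch_size),
        )
        conn.commit()
        migrated += cur.rowcount
    elapsed = time.perf_counter() - started

    remaining = conn.execute(
        "SELECT COUNT(*) FROM item WHERE status = 'legacy'"
    ).fetchone()[0]
    conn.close()
    return {
        "rows": migrated,
        "remaining": remaining,
        "seconds": elapsed,
        "rows_per_second": migrated / elapsed if elapsed > 0 else float("inf"),
    }


if __name__ == "__main__":
    print(measure_migration(50_000))
```

Running the same measurement at two or three data sizes gives an early indication of whether the migration time grows linearly with the data volume or worse.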

...

More examples of data migrations:

  • CIRCSTORE-295

...


#1 Lazy migration

  • Explanation: Update the FOLIO platform without immediate data migration, and start background task(s) for gradual data migration. Applications must be ready to work with both migrated and not-yet-migrated data while the migration is in progress.
  • Benefits: No data migration as a part of the release.
  • Concerns: The data migration process can be less transparent to engineers, and the risk of partially completed migrations remains; application logic gets more complex as well.
  • Efforts: A totally new approach: one needs to implement a background task runner and make sure applications can work with both versions of the data.

#2 Expand-contract pattern

  • Explanation: Update the FOLIO platform without immediate data migration but build the migration into the code, and do a "modify upon get": whenever a record is accessed, perform the necessary changes and write them back to the DB. This way the migration happens over time, and only one schema is in play at a given time. More to read: https://www.prisma.io/dataguide/types/relational/expand-and-contract-pattern
  • Benefits: No data migration as a part of the release.
  • Concerns: Not suitable for cases with breaking changes or when fully migrated data is required for proper functioning (e.g., for filtering or sorting).
  • Efforts: A totally new approach.

#3 Optimization

  • Explanation: Follow the existing way of data migration as a part of the release, but review and analyze the data migration scripts and optimize them.
  • Benefits: No significant changes in the FOLIO platform logic. Potential improvements at minimal cost.
  • Concerns: Performance may still be unsatisfactory while no room is left for further optimization.
  • Efforts: There is a chance that small, targeted optimizations will be efficient enough.

#4 Separate Applications approach

  • Explanation: Deploy FOLIO not as a monolith but as a set of independent applications. As a result, individual applications can be released separately, and data migration is also distributed among applications.
  • Benefits: Even a long migration affects only a particular application rather than the full platform.
  • Concerns: Data migration is distributed across applications, but the approach to migration itself does not change. Therefore, a lengthy migration is still possible, although it affects only one application and not the entire platform.
  • Efforts: This is just a side effect of the Separate Applications approach.

#5 Blue-green deployment

  • Explanation: Before the migration, a) create a clone of the main database, b) execute the data migration on the clone, c) roll over all new changes that happened during the migration, d) during the release, switch the main database from the previous one to this clone and treat the clone as the new main DB.
  • Benefits: Safe, non-blocking migration.
  • Concerns: Step c) requires accurate synchronization of all changes that occurred in the main database between cloning and switching.
  • Efforts: Mainly DevOps efforts to automate the process.

#6 Aggregate several data migrations into one
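
To make the "modify upon get" idea from option #2 and the background task from option #1 more concrete, here is a minimal sketch. It is not FOLIO code: the LazyStore class, the _version field, the upgrade function and the example schema change (splitting name into firstName / lastName) are hypothetical, and a plain dictionary stands in for the database.

```python
from typing import Dict, Optional

CURRENT_VERSION = 2  # hypothetical target schema version


def upgrade(record: dict) -> dict:
    """Hypothetical v1 -> v2 change: split 'name' into firstName / lastName."""
    if record.get("_version", 1) >= CURRENT_VERSION:
        return record
    first, _, last = record.get("name", "").partition(" ")
    return {
        "id": record["id"],
        "firstName": first,
        "lastName": last,
        "_version": CURRENT_VERSION,
    }


class LazyStore:
    """In-memory stand-in for a table holding old and new records side by side."""

    def __init__(self, rows: Dict[str, dict]):
        self.rows = rows

    def get(self, record_id: str) -> Optional[dict]:
        """'Modify upon get': upgrade a stale record on read and write it back."""
        record = self.rows.get(record_id)
        if record is None:
            return None
        if record.get("_version", 1) < CURRENT_VERSION:
            record = upgrade(record)
            self.rows[record_id] = record  # persist the migrated form
        return record

    def migrate_batch(self, limit: int) -> int:
        """Background step: migrate up to `limit` stale records, return the count."""
        stale = [
            rid for rid, rec in self.rows.items()
            if rec.get("_version", 1) < CURRENT_VERSION
        ][:limit]
        for rid in stale:
            self.rows[rid] = upgrade(self.rows[rid])
        return len(stale)
```

A background runner would call migrate_batch repeatedly until it returns 0. Until then, any query that filters or sorts on the new fields sees only part of the data, which is the concern noted for these options.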

...