[UXPROD-3387] Rancher Improvement: Implement Data Migration Functionality Created: 27/Oct/21  Updated: 28/Jul/23  Resolved: 28/Jul/23

Status: Closed
Project: UX Product
Components: None
Affects versions: None
Fix versions: None

Type: New Feature Priority: TBD
Reporter: Hanna Hulevich Assignee: Hanna Hulevich
Resolution: Done Votes: 0
Labels: NFR
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original estimate: Not Specified

Issue links:
Blocks
is blocked by UXPROD-3547 Rancher deployment pipeline NG Closed
Defines
is defined by RANCHER-191 Data Migration time and integrity mea... In Progress
Relates
relates to RANCHER-6 Collect requirements for platform tes... Draft
Development Team: Kitfox
PO Rank: 0

 Description   

Current situation or problem:

Currently it is not possible to run a migration on the performance Rancher environment, so we can't test migration from an old release version to a new one before Bugfest. This is not optimal: teams need to be able to test data migration (including data migration performance) during development in order to identify and solve possible issues at an early stage.

This functionality could probably also be used for schema migration testing, which would save the development teams a lot of effort.

(Currently, schema migration testing is done by the teams locally, using a Vagrant box and a set of data generated by the teams themselves.)

This functionality should allow us to identify the modules that are problematic from the data migration performance point of view, so that SA can focus on improving them and the improvement can be measured after it is implemented.
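
To illustrate how per-module timings could surface the problematic modules, here is a minimal Python sketch. It is not the actual job implementation: `time_migrations`, `build_report`, the threshold value, and the module names are all hypothetical, and each module's migration is modeled as a plain callable.

```python
import time
from typing import Callable, Dict, List, Tuple


def time_migrations(
    migrations: Dict[str, Callable[[], None]],
    clock: Callable[[], float] = time.monotonic,
) -> Dict[str, float]:
    """Run each module's migration callable and record its duration in seconds."""
    durations: Dict[str, float] = {}
    for module, migrate in migrations.items():
        started = clock()
        migrate()
        durations[module] = clock() - started
    return durations


def build_report(
    durations: Dict[str, float], threshold_seconds: float
) -> List[Tuple[str, float, bool]]:
    """Return (module, seconds, over_threshold) rows, slowest module first."""
    rows = [(m, s, s > threshold_seconds) for m, s in durations.items()]
    return sorted(rows, key=lambda row: row[1], reverse=True)


if __name__ == "__main__":
    # Placeholder callables; a real job would trigger the module upgrade instead.
    demo = {
        "mod-a": lambda: time.sleep(0.02),
        "mod-b": lambda: time.sleep(0.01),
    }
    report = build_report(time_migrations(demo), threshold_seconds=0.015)
    for module, seconds, slow in report:
        flag = " (over threshold)" if slow else ""
        print(f"{module}: {seconds:.3f}s{flag}")
```

Running the same measurement before and after an optimization, on the same data set, would give the two numbers needed to compare a module's migration time.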

In scope

  • A Jenkins job needs to be implemented to run a migration from one specified release version to another specified release version on the Rancher performance environment, on a specified data set.
  • Module versions need to be taken from platform-complete by a specified tag or from the master branch. It should be possible to override a module's version/branch if needed (from a configuration file; the version/branch from the file is what will finally be deployed).
  • Logging should be added to record the time taken for data migration per module.
    • (added by Raman Auramau) Logged durations should be aggregated in a report and made available via the Jenkins job.
    • (added by Raman Auramau) A notification is required if the upgrade job runs too long (longer than some threshold).
  • We need to be sure that we are installing the initial release version with a compatible data set.
  • We may want to use the Cornell data set (but sensitive data need to be disclosed). We need to talk to Cornell first, but implementation should not be blocked by this: we can start implementation on the Bugfest snapshot.
  • We want to be able to update only the modules specified in a list, rather than all modules.
  • We want to be able to disable the module compatibility check so we can test an upgrade from any version to any version at any time.
  • We need to be able to restore the database to some state if something goes wrong (ability to restore the database).
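
The version-selection rules in the list above (base versions from platform-complete or master, overrides from a configuration file, and an optional list of modules to upgrade) can be sketched roughly as follows. This is only an illustration under assumptions: `resolve_versions`, its parameters, and the module names and versions are hypothetical, and module versions are modeled as a simple name-to-version mapping rather than any real file format.

```python
from typing import Dict, Iterable, Optional


def resolve_versions(
    base: Dict[str, str],
    overrides: Optional[Dict[str, str]] = None,
    only: Optional[Iterable[str]] = None,
) -> Dict[str, str]:
    """Merge base module versions with overrides; optionally keep a subset.

    base      -- versions taken from a platform-complete tag or from master
    overrides -- versions/branches from the configuration file; these win
    only      -- if given, restrict the result to just these modules
    """
    resolved = dict(base)
    resolved.update(overrides or {})
    if only is not None:
        wanted = set(only)
        unknown = wanted - set(resolved)
        if unknown:
            raise ValueError(f"Unknown modules requested: {sorted(unknown)}")
        resolved = {m: v for m, v in resolved.items() if m in wanted}
    return resolved


if __name__ == "__main__":
    # Hypothetical module names and versions, for illustration only.
    base = {"mod-a": "1.0.0", "mod-b": "2.0.0", "mod-c": "3.0.0"}
    print(resolve_versions(base, {"mod-b": "2.1.0-SNAPSHOT"}, only=["mod-a", "mod-b"]))
```

Applying the overrides last mirrors the requirement that the version/branch from the configuration file is the one finally deployed.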

Out of scope
Implementing this functionality for the development teams' Rancher envs

Preparing Cornell data for use with this env

Use case(s)

Proposed solution/stories

Links to additional info

Questions

  • Should we implement this functionality for the PTF env?
    Kitfox and PTF should discuss this
  • Should the Perf env be supported for multiple teams at the same time?
    Currently it supports 3 teams at a time

 



 Comments   
Comment by Hanna Hulevich [ 08/Nov/21 ]

https://folio-org.atlassian.net/browse/RANCHER-6

Comment by Marc Johnson [ 22/Nov/21 ]

Hanna Hulevich

A Jenkins job needs to be implemented to run a migration from one specified release version to another specified release version

Will the Jenkins job be the only thing that can access the Rancher performance environment?

This specifies release versions; does that mean we would only test this performance after a formal release is made?

We want to be able to disable the module compatibility check so we can test an upgrade from any version to any version at any time.

Is this asking for interface dependency checks to be turned off in Okapi during these builds?

If so, why would we want to test the time taken to upgrade to an invalid definition of the system?

We need to be able to restore the database to some state if something goes wrong (ability to restore the database).

Won't the environment be reset at the beginning of each run of the pipeline?

If so, what is the value of restoring the database? Would this be an operation in the pipeline or elsewhere (e.g. manual task by DevOps)?

Comment by Hanna Hulevich [ 22/Nov/21 ]

Marc Johnson

Thank you for the questions! Please see my comments below:

Will the Jenkins job be the only thing that can access the Rancher performance environment?

This feature is not expected to introduce any changes in this area. Hleb Surnovich, could you please help me clarify this question?

This specifies release versions; does that mean we would only test this performance after a formal release is made?

No, it doesn't. We should be able to upgrade to the module versions from platform-complete, from master, or to any other versions we like (it should be possible to override the versions from platform-complete or master). Let's discuss if you think there is not enough flexibility, and we can add other useful options here.

Is this asking for interface dependency checks to be turned off in Okapi during these builds?

I think yes. Martin Tran please correct me if I'm wrong here.

If so, why would we want to test the time taken to upgrade to an invalid definition of the system?

To be able to test just the upgrade without spending extra time on resolving dependencies. We thought it might be useful in some cases.

Won't the environment be reset at the beginning of each run of the pipeline?

As we are trying to make this flexible and cover a lot of use cases, we think that it should actually be 3 Jenkins jobs: one installs fresh FOLIO on a specified data set, the second one just runs a migration, and the third one does the initial install and runs the migration. As far as I know, the performance Rancher will be down in 8 hours by default, but this behavior can be changed and we can keep it longer if needed.

If so, what is the value of restoring the database? Would this be an operation in the pipeline or elsewhere (e.g. manual task by DevOps)?

It might be needed to preserve some data we set up manually on this env. Martin Tran, please correct me if I'm wrong here. Hleb Surnovich, please describe the implementation plan (in case you have already discussed and decided it).

Thank you,
Hanna

Comment by Hleb Surnovich [ 23/Nov/21 ]

Hi, all!
About the questions above.

Will the Jenkins job be the only thing that can access the Rancher performance environment?
We use a Jenkins job to create/delete the whole perf env. To access this env, developers or our team can either use Rancher (apps, workloads, secrets, etc.) or open the env endpoints directly via their links:

  • UI part (e.g. metadata-perf.ci.folio.org);
  • Okapi (e.g. metadata-perf-okapi.ci.folio.org);
  • PgAdmin (e.g. metadata-perf-pgadmin.ci.folio.org).

If so, what is the value of restoring the database? Would this be an operation in the pipeline or elsewhere (e.g. manual task by DevOps)?
We're planning the following operations, each with its own job:

  • Creating the env with a DB snapshot. We use a snapshot provided by the FSE team now;
  • Creating a DB snapshot if the changes are crucial;
  • Migration operations;
  • Deleting the env.

All the steps have been discussed and will be implemented according to RANCHER-64.

Comment by Marc Johnson [ 23/Nov/21 ]

Hleb Surnovich Thank you for your follow up.

We use a Jenkins job to create/delete the whole perf env.

This suggests to me that the scope of the Jenkins job is to create the (starting) environment, rather than to perform a migration between one set of versions and another, as this statement in the issue suggests:

A Jenkins job needs to be implemented to run a migration from one specified release version to another specified release version on the Rancher performance environment, on a specified data set.

Is this job to create the environment, run a planned migration, or is there a sequence of jobs as suggested by your later comments?

Comment by Marc Johnson [ 23/Nov/21 ]

Hanna Hulevich

one installs fresh FOLIO on a specified data set, the second one just runs a migration, and the third one does the initial install and runs the migration.

I think the first one creates an initial environment, the second one runs a specified module upgrade.

What does the third one do that is different to those two?

As far as I know, the performance Rancher will be down in 8 hours by default, but this behavior can be changed and we can keep it longer if needed.

What do you mean by the performance rancher will be down for 8 hours?

Are all of these jobs going to run against a single shared environment?

Comment by Hanna Hulevich [ 23/Nov/21 ]

Marc Johnson

What does the third one do that is different to those two?

It will be a wrapper for the first and second jobs. It will do both the initial install and the migration, so you do not have to wait until the first job is completed to run the second one.
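
As a rough illustration of the wrapper idea (not the actual Jenkins pipeline), the third job can be thought of as a simple composition of the first two. The function name `make_install_and_migrate` and the callables passed to it are hypothetical.

```python
from typing import Callable


def make_install_and_migrate(
    install: Callable[[], None], migrate: Callable[[], None]
) -> Callable[[], None]:
    """Return a job that runs the install step and then the migration step."""

    def job() -> None:
        install()  # job 1: fresh install on the chosen data set
        migrate()  # job 2: reached only if the install did not raise

    return job


if __name__ == "__main__":
    # Placeholder steps; real jobs would create the env and trigger the upgrade.
    combined = make_install_and_migrate(
        lambda: print("installing initial release"),
        lambda: print("running migration"),
    )
    combined()
```

Because the migration step is reached only when the install step completes without an error, a failed install would stop the combined job early.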

What do you mean by the performance rancher will be down for 8 hours?

For cost-saving purposes, the Perf Rancher env is implemented so that it goes down after 8 hours (by default).

Are all of these jobs going to run against a single shared environment?

Hleb Surnovich, please help me answer this question.

 

Thank you,
Hanna

Comment by Ann-Marie Breaux (Inactive) [ 02/Dec/21 ]

Hi Hanna Hulevich I happened to see your message in the tech leads channel. I'm not sure exactly how this fits, but I wanted to mention it.

In the hosted ref envs (folio-testing, snapshot, snapshot-load), we have default reference data for things like locations, acquisitions unit, all kinds of Inventory settings, etc.

In Bugfest, we have completely different, and much more diverse, reference data. Is it decided what the reference data for this Rancher migration env will be? I would advocate that if you plan to use a large set of ref data that the hosted ref env's ref data also be included, so that tests written for that specific set of ref data don't break.

I'm not sure if that's covered by your 4th bullet point in the requirements, but I wanted to mention it.

Comment by Hanna Hulevich [ 01/Feb/22 ]

Hi Ann-Marie Breaux,

I think this requirement should be covered. The only thing we need to do to support this is to prepare a specific data set for the release version, so that we can use it any time later. Aleh Litasau, what is your view?

We can discuss this at the demo this week.

Thank you,
Hanna

Comment by Khalilah Gambrell [ 03/Mar/22 ]

Hanna Hulevich will this feature be included in the Lotus release?

Generated at Fri Feb 09 00:31:36 UTC 2024 using Jira 1001.0.0-SNAPSHOT#100246-sha1:7a5c50119eb0633d306e14180817ddef5e80c75d.