[FOLIO-2451] SPIKE: figure out data retention plan for continuous build Created: 07/Feb/20 Updated: 03/Jun/20 Resolved: 28/Feb/20 |
|
| Status: | Closed |
| Project: | FOLIO |
| Components: | None |
| Affects versions: | None |
| Fix versions: | None |
| Type: | Task | Priority: | P3 |
| Reporter: | Ian Hardy | Assignee: | Ian Hardy |
| Resolution: | Done | Votes: | 0 |
| Labels: | devops, platform-backlog | ||
| Remaining Estimate: | Not Specified | ||
| Time Spent: | Not Specified | ||
| Original estimate: | Not Specified | ||
| Sprint: | DevOps: sprint 82, DevOps: sprint 83 |
| Development Team: | FOLIO DevOps |
| Description |
|
We have a build at snapshot-core-continuous.dev.folio.org that persists data through upgrades. One of the main reasons to build this was to allow data migrations to occur. There are some open questions on data persistence:

1. Integration tests. If we run integration tests, some data is created but not cleaned up. I've turned off integration tests for now so we don't get a pileup of all that testing data, but we need to decide whether we want them to run at all. Running UI integration tests could help us detect whether something has gone wrong during a migration. On the other hand, it would require work from the UI team to add a cleanup step for each test that creates data. Also, this environment is less likely to be "clean" since data is persisted, which calls into question whether any integration tests performed here are valid at all.

2. Purging/reloading data. It would probably make sense to clean this environment by using the purge=true parameter to get a fresh install at some interval. This would give us a chance to pick up new reference data if it exists, and clean out any cruft anyone may have added. We want to balance this with the need to see actual migrations happen. |
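For context, a purge of a module's data is normally expressed through Okapi's tenant install endpoint. The sketch below only builds the request (it makes no network call); the endpoint path and the purge query parameter follow Okapi's install API, but the Okapi URL, tenant, and module ID are illustrative assumptions.

```python
import json
from urllib.parse import urlencode

def build_purge_request(okapi_url, tenant, module_id):
    """Build a request that disables a module with purge=true,
    dropping its storage so a clean re-enable starts fresh.
    okapi_url/tenant/module_id here are hypothetical examples."""
    query = urlencode({"purge": "true"})
    url = f"{okapi_url}/_/proxy/tenants/{tenant}/install?{query}"
    body = [{"id": module_id, "action": "disable"}]
    return url, json.dumps(body)

url, body = build_purge_request(
    "http://localhost:9130", "diku", "mod-inventory-storage-19.0.0")
print(url)
print(body)
```

Re-enabling the module afterwards (without purge) would then run its tenant init and reload reference/sample data.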
| Comments |
| Comment by Craig McNally [ 12/Feb/20 ] |
|
One thing to keep in mind is that there currently aren't any rollback scripts for data migration. So if a migration script fails part way through the process, the data may be in a funky state, leading to all sorts of unexpected behavior. Also note that reloading data presents some potential issues as well... consider the following scenario: Depending on how 4 is done we could get into trouble.
One option might be to take a snapshot/backup of the database (or individual schema of the module being upgraded) before the migration, and if the upgrade fails, restore the backup. |
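The snapshot/restore option above could be as simple as a per-schema pg_dump before the migration. A minimal sketch, assuming the common FOLIO schema-naming convention (tenant_modulename) and an illustrative database name and dump path:

```python
def backup_cmd(db, schema, outfile):
    """Custom-format dump of just one module's schema."""
    return ["pg_dump", "-Fc", "-n", schema, "-f", outfile, db]

def restore_cmd(db, outfile):
    """Restore the dump; --clean drops the (possibly corrupted)
    objects before recreating them from the backup."""
    return ["pg_restore", "--clean", "--if-exists", "-d", db, outfile]

# Illustrative names: tenant "diku", module mod-inventory-storage
schema = "diku_mod_inventory_storage"
print(" ".join(backup_cmd("okapi_modules", schema, "/tmp/pre_migration.dump")))
print(" ".join(restore_cmd("okapi_modules", "/tmp/pre_migration.dump")))
```

Since each module owns its own schema, this keeps the rollback scoped to the module being upgraded rather than the whole database.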
| Comment by Marc Johnson [ 13/Feb/20 ] |
In order to best answer this question, I think we need to clarify what goals our use of this environment is intended to meet. What are the goals of this environment? For example, if the goal is to better mimic a production environment, then I don't think purging makes sense, as it seems unlikely folks would do this in their production systems. |
| Comment by Ian Hardy [ 13/Feb/20 ] |
|
Thanks Craig and Marc for weighing in. I'd stop short of saying we're trying to mimic a production environment (we're just using sample data, and using snapshots instead of releases, at least for this build). I think a modest goal here is that if someone writes a migration for a core module, it will get executed in this environment. The way this is built now, purging/reloading would build from the top of master, which, as Craig pointed out, would leave the problematic migration behind. Maybe if a full rebuild is necessary, the way to do it would be to build from the platform-core commit before it broke. I realize we'd be dumping any additional data beyond the sample that may have been loaded, but I don't think we can keep track of whatever people put in there indefinitely. Maybe if we had a larger controlled sample we could regularly load in, that would make things more realistic. |
| Comment by Jakub Skoczen [ 18/Feb/20 ] |
|
Ian Hardy Marc Johnson Craig McNally Maybe purging is something that is done at the point where we know the environment is broken – e.g. when data is corrupted due to broken migration scripts? Craig McNally I don't think by "purging" we should assume reloading original Edelweiss (or any release after) data. If we assume that the sample data is kept up to date with the module schema, we could essentially "purge and bootstrap" to the last known-working state. We could also do it on a module-by-module basis. |
| Comment by Marc Johnson [ 18/Feb/20 ] |
If an upgrade breaks the existing data in the system, what are the expectations for this module? Is it assumed that the data is rolled back to a known good state and then a fixed upgrade will upgrade data from that to the new state? Or is it expected that a fixed upgrade would be able to fix the data? |
| Comment by Ian Hardy [ 20/Feb/20 ] |
|
If purging is done when things break, as Jakub Skoczen suggests, maybe what's needed are optional parameters to purge data and build from a particular commit of platform-core. Since we're just working with sample data (acknowledging that it's far from a comprehensive test of a migration), the result would be rolling back to a known good state and then trying the next upgrade. |
| Comment by Ian Hardy [ 21/Feb/20 ] |
|
Actually, trying to build from a particular commit will not work, since the build will just pick up the latest snapshots anyway. Maybe what's needed is to publish the install.json files as an artifact with each build. This would make it more transparent which modules were updated and when, without digging through the console, and make it possible to roll back when things are broken. |
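Once install.json is archived per build, spotting which modules changed between two builds is a small diff. A sketch, assuming install.json is a list of {"id", "action"} entries as consumed by Okapi's install endpoint; the module IDs below are made up for illustration:

```python
import re

def split_id(mod_id):
    """Split e.g. 'mod-users-17.1.0' into ('mod-users', '17.1.0').
    The version is taken to start at the first digit after a hyphen."""
    m = re.match(r"(.+?)-(\d.*)", mod_id)
    return m.group(1), m.group(2)

def module_versions(install):
    return dict(split_id(entry["id"]) for entry in install)

def diff_installs(old, new):
    """Map each added or changed module to its (old, new) versions."""
    old_v, new_v = module_versions(old), module_versions(new)
    return {mod: (old_v.get(mod), ver)
            for mod, ver in new_v.items() if old_v.get(mod) != ver}

# Hypothetical contents of two archived install.json files
old = [{"id": "mod-users-17.0.0", "action": "enable"}]
new = [{"id": "mod-users-17.1.0-SNAPSHOT.123", "action": "enable"}]
print(diff_installs(old, new))
```

The same output also identifies the last known-good install.json to feed back to Okapi when a migration breaks and a rollback is needed.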
| Comment by Ian Hardy [ 28/Feb/20 ] |
|
Plan going forward:
|