[FOLIO-2976] Create tool to gather schema changes since previous FOLIO flower release Created: 25/Jan/21  Updated: 06/Oct/22  Resolved: 02/Mar/21

Status: Closed
Project: FOLIO
Components: None
Affects versions: None
Fix versions: None

Type: New Feature Priority: TBD
Reporter: David Crossley Assignee: David Crossley
Resolution: Done Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original estimate: Not Specified

Issue links:
Relates
relates to FOLIO-3137 gather-schema-changes: Handle case wh... Closed
Sprint: DevOps Sprint 107, DevOps Sprint 108, DevOps Sprint 106
Development Team: FOLIO DevOps

 Description   

Implementers would like regular early warning, throughout the development cycle, of schema changes (both database and API) in each module since the previous flower release. This would let them prepare changes to mapping and ETL code, so that data migration is ready as soon as the next release is available.

Some design notes:
https://docs.google.com/document/d/1Asqeg-lqVjhzW87CE48EjnKEg2Qpy_6OXFHvUjUoe_U/edit#



 Comments   
Comment by David Crossley [ 10/Feb/21 ]

The initial prototype is ready. The branch is not yet merged, and the GitHub Workflow is not yet automated.

For each backend module version in the current q3-2020 branch of platform-complete, it compares each JSON schema under the api directory (e.g. ./ramls/) of the release git checkout with the current main branch.

If a file has differences, the diff is stored at a matching path (e.g. mod-users/api/usergroup-17.2.3.diff and mod-notes/api/types/notes/note-2.10.2.diff), together with a companion file with extension .txt which holds some extra metadata.
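The comparison and path naming described above can be sketched as follows. This is only an illustration, not the tool's code: the function names are hypothetical, the path layout is inferred from the two example paths, schema files are assumed to end in .json, and Python's difflib stands in for the "jd" tool that the prototype actually uses.

```python
import difflib
import json
from pathlib import PurePosixPath


def diff_output_path(module: str, version: str, schema_relpath: str) -> str:
    """Build e.g. 'mod-notes/api/types/notes/note-2.10.2.diff' (assumed layout).

    schema_relpath is the schema's path relative to the module's api directory.
    """
    rel = PurePosixPath(schema_relpath)
    return str(PurePosixPath(module, "api", rel.parent, f"{rel.stem}-{version}.diff"))


def schema_diff(old_text: str, new_text: str) -> str:
    """Return a unified diff of two JSON documents, or '' if they are equivalent.

    Both sides are re-serialised with sorted keys, so changes in key order or
    whitespace alone do not register as differences.
    """
    old = json.dumps(json.loads(old_text), indent=2, sort_keys=True).splitlines()
    new = json.dumps(json.loads(new_text), indent=2, sort_keys=True).splitlines()
    return "\n".join(difflib.unified_diff(old, new, "release", "main", lineterm=""))


if __name__ == "__main__":
    print(diff_output_path("mod-notes", "2.10.2", "types/notes/note.json"))
    print(schema_diff('{"a": 1}', '{"a": 1, "b": 2}'))
```

A diff file would be written only when schema_diff returns a non-empty string.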

For each module that has a database schema, any differences are noted in a top-level file (e.g. mod-notes/db-2.10.2.diff).

The JSON "diff" uses the "jd" tool. Refer to the concise "Diff language".

Each module has a "summary.json" listing the files that are changed, and any processing errors encountered.

The set of changes for all modules is published to an S3 space. There is a complete archive (replaced on each harvest run) at https://s3.amazonaws.com/foliodocs/schemadiff/q3-2020/schema-diff-q3-2020.zip

At this stage, all shared git submodules are ignored (e.g. ramls/raml-util), as these would also contain changes not relevant to the particular module.

The next version of the prototype intends to discover each schema that is declared in each api description file (called "parent schema") and dereference the files that are included by reference ($ref) thereby providing an overview of the total changes for each main parent schema.
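A minimal sketch of such dereferencing, under stated assumptions: every non-local $ref is a file path relative to the including schema, any "#..." fragment is ignored so that the whole referenced file is inlined, and keys that sit beside a $ref are dropped. The function name and the "x-circular" marker are hypothetical and are not taken from the tool.

```python
import json
from pathlib import Path


def dereference(schema_path: Path, _seen: frozenset = frozenset()) -> object:
    """Load a JSON schema and inline every file-based "$ref" it contains."""
    schema_path = schema_path.resolve()
    if schema_path in _seen:
        # Circular reference: leave a marker rather than recurse forever.
        return {"$ref": str(schema_path), "x-circular": True}
    seen = _seen | {schema_path}

    def walk(node):
        if isinstance(node, dict):
            ref = node.get("$ref")
            if isinstance(ref, str) and not ref.startswith("#"):
                # Resolve relative to the including file; ignore any fragment.
                target = schema_path.parent / ref.split("#", 1)[0]
                return dereference(target, seen)
            return {key: walk(value) for key, value in node.items()}
        if isinstance(node, list):
            return [walk(item) for item in node]
        return node

    return walk(json.loads(schema_path.read_text(encoding="utf-8")))
```

Running this on a parent schema from both the release checkout and the main branch, then diffing the two results, would give one combined view of the changes for that parent.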

Comment by David Crossley [ 18/Feb/21 ]

The prototype is now running automatically every Monday and Thursday, early morning UTC.

Comment by David Crossley [ 24/Feb/21 ]

Prototype 2 is now operating: it dereferences each parent schema, as described in the earlier Jira comment.

As explained before, there is a complete archive (replaced on each harvest run) at https://s3.amazonaws.com/foliodocs/schemadiff/q3-2020/schema-diff-q3-2020.zip

See an example at mod-permissions/api/parents/permissionListObject-5.12.2.diff

Comment by David Crossley [ 02/Mar/21 ]

The workflows are automated.

The "get-release-versions" happens daily. If it discovers that new module versions have been added to the q3-2020 platform-complete, then it opens a PR for the updated data file.
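The detection step can be pictured as a comparison of two module-to-version mappings. This is a hedged sketch: the format of the real data file is not described here, so a plain mapping of module name to version is assumed, and the function name is hypothetical.

```python
from __future__ import annotations


def changed_versions(
    recorded: dict[str, str], current: dict[str, str]
) -> dict[str, tuple[str | None, str]]:
    """Map each new or updated module to (recorded_version, current_version).

    'recorded' is the assumed content of the existing data file; 'current' is
    what the release branch lists now. Modules that are new to the release
    have None as their recorded version.
    """
    return {
        module: (recorded.get(module), version)
        for module, version in sorted(current.items())
        if recorded.get(module) != version
    }
```

An empty result would mean that no PR is needed for that day.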

The "gather-schema-changes" workflow runs twice weekly, every Monday and Thursday, early morning UTC.
Before that, the "get-release-versions" workflow runs each day.

The workflow runs are here: https://github.com/folio-org/folio-org.github.io/actions
The workflow code and config: https://github.com/folio-org/folio-org.github.io/tree/master/.github/workflows

(GitHub cron has a delay of up to an hour or so.)

Comment by David Crossley [ 04/May/21 ]

The tool now instead tracks the schema changes since the R1-2021 Iris release branch of platform-complete.

Comment by David Crossley [ 06/Oct/22 ]

The tool is updated soon after each subsequent Flower release, e.g. now tracking morning-glory.

Generated at Thu Feb 08 23:24:40 UTC 2024 using Jira 1001.0.0-SNAPSHOT#100246-sha1:7a5c50119eb0633d306e14180817ddef5e80c75d.