[draft] Configuration management - Improvement process

Team

@Drif Abdenour @Mikita Siadykh @Denis @oleksandr_haimanov @Roman_Fedynyshyn @Viktor Gema @Artem Akimov @Oleksii Petrenko


Action Items

#

Date

Action item

Status

Responsible

Comments

1

Feb 12, 2026

Describe configuration review process and workflow

Ready for FSE review

1 @Mikita Siadykh @Denis

 

2

Feb 12, 2026

Single format of configuration files

Ready for FSE review

1 @oleksandr_haimanov @Drif Abdenour @Artem Akimov

Next steps:

  1. @Artem Akimov Update file format

  2. @Mikita Siadykh @Artem Akimov Select place where configuration should be stored

  3. As PoC use mod-bulk-operations

3

Feb 12, 2026

Create backlog to remove OKAPI based system users variables for Umbrellaleaf

In review

@Eldiiar Duishenaliev

Tickets created for making SYSTEM_USER_ENABLED=false

FOLIO-4465

4

Feb 12, 2026

Configuration management platform -

Specify requirements based on #1 and #2

 

 

 

5

Feb 12, 2026

Who should recommend configuration

 

 

 

6

Feb 24, 2026

Prepare FSE Configuration Management for various setups

Backlog

@Denis

@Drif Abdenour

@Artem Akimov

@Roman_Fedynyshyn

be ready by Mar 17, 2026

7

Mar 4, 2026

Add a config file validation mechanism (step) at the module build stage

 

 

 

8

Mar 5, 2026

Add variants for tracking Feature flags and cache specific parameters

Ready for FSE review

@Mikita Siadykh @oleksandr_haimanov

Added field “type” for EnvVars structure

9

Mar 12, 2026

Create template/list of sidecar env vars as generic

Ready for FSE review

@Mikita Siadykh

@Artem Akimov

 

10

Mar 12, 2026

Describe env vars review process

TBD

@Oleksii Petrenko

 

11

Mar 12, 2026

Schedule a call with Martin to understand the PTF team's place in the process

Scheduled

@Oleksii Petrenko

Mar 16, 2026

Open questions from TLs:

  1. Why not extend the existing ModuleDescriptor?

    • ModuleDescriptor contains information about interface dependencies and permissions. The purpose of the config template is to give SysOps the essential information in one place and in a useful format. We do not want to mix deployment specifics with module-internal assets.

  2. Why not reuse the existing DeploymentDescriptor?

    • It is fully outdated, does not contain useful information, and needs to be decommissioned.

  3. Create a predefined template for Java memory, Kong timeout, etc. that can be overridden by a module

    1. Prepared a generic template file - https://github.com/folio-org/mod-bulk-operations/pull/586/changes#diff-545748d63ff5c19b9477bd520c1c0b6968785407de25d9626d7e3c80cff82cd3

  4. How to track default configuration vs specific workflows?

    1. Provide the ability to use profiles, as in Spring

  5. How to track configuration changes for dev environments?

    • DevOps should use the provided configuration as a reference and adjust it based on the environment's actual consumption
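Questions 3 and 4 above can be pictured as a layered merge: a generic template supplies defaults, a module overrides only the keys it needs, and an optional profile (loosely analogous to Spring profiles) adjusts values for a specific workflow. The following is a minimal sketch under the assumption that configuration is a flat key-to-value mapping. The function name, variable names, and values are hypothetical placeholders, not the contents of the actual generic template file.

```python
def layered_config(*layers: dict) -> dict:
    """Merge layers left to right; later layers override earlier ones."""
    merged: dict = {}
    for layer in layers:
        merged.update(layer)  # keys in later layers win on conflict
    return merged


# Hypothetical values, for illustration only.
generic_template = {"JAVA_OPTS": "-XX:MaxRAMPercentage=66.0", "KONG_TIMEOUT_MS": "60000"}
module_overrides = {"KONG_TIMEOUT_MS": "120000"}
bulk_profile = {"JAVA_OPTS": "-XX:MaxRAMPercentage=80.0"}

# Default configuration: generic template plus module overrides.
default_config = layered_config(generic_template, module_overrides)
# Workflow-specific configuration: the profile is applied last.
bulk_config = layered_config(generic_template, module_overrides, bulk_profile)
```

Because the merge builds a new dictionary, the shared template is never modified, so several modules could reuse it safely.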

Problem Statement: Module Configuration Management in FOLIO

This problem statement is based on how FOLIO is architected: modular, multi-tenant, microservices-based, with runtime module enablement. It is intended as a structured starting point for improving module configuration management.

1. Context

FOLIO is a modular library services platform composed of independently deployable backend modules and UI modules. Modules are:

  • Independently versioned

  • Deployed across distributed infrastructure

  • Configured via environment variables, module descriptors, and tenant-specific settings

As the ecosystem grows, managing configuration across modules, tenants, environments, and releases becomes increasingly complex.


2. Core Problem

There is no unified, standardized, and strongly governed configuration management model across FOLIO modules, resulting in:

  • Inconsistent configuration patterns

  • Limited visibility into active configuration

  • Risk of configuration drift

  • Operational fragility across environments

  • Complex upgrades and tenant onboarding


3. Key Pain Points

A. Configuration Fragmentation

Module configuration is spread across:

  • Environment variables

  • Module descriptors

  • Tenant-level configuration entries (mod-configuration)

  • Database-stored settings

  • Helm charts / Kubernetes manifests (in hosted setups)

There is no single source of truth.


B. Inconsistent Configuration Standards

Different modules:

  • Use different naming conventions

  • Validate configuration inconsistently

  • Handle defaults differently

  • Store sensitive values differently

This leads to unpredictable operational behavior.


C. Poor Discoverability & Observability

Currently:

  • There is no centralized inventory of all configuration keys in the system

  • Operators cannot easily answer:

    • What configuration keys exist?

    • Which are required?

    • Which are tenant-specific vs global?

    • Which are unused or deprecated?

This makes troubleshooting slow and error-prone.


D. Configuration Drift Across Environments

Between:

  • Dev

  • SprintTesting

  • Bugfest

  • Production

Configurations may diverge silently, causing:

  • “Works in test but not in prod” issues

  • Deployment instability

  • Upgrade failures


E. Upgrade & Versioning Challenges

When upgrading modules:

  • Configuration keys may change

  • Defaults may shift

  • Required settings may be added

There is no systematic migration strategy for configuration evolution.


4. Impact

These issues result in:

  • Increased operational overhead

  • Higher risk during releases

  • Reduced reliability in production environments

  • Increased cognitive load for DevOps and support teams

As FOLIO scales across institutions and hosting providers, this risk compounds.


5. Desired Future State

An improved configuration management approach should provide:

  1. Centralized configuration registry

  2. Clear separation of global vs tenant-level configuration

  3. Versioned configuration with migration support

  4. Strong validation of configuration

  5. Environment promotion consistency

  6. Auditability & traceability

Default configuration definition

Resource configuration should assume stable functioning of all features on the following setup:

Data set of 8M+ inventory instance records + 3 standalone tenants with sample and reference data + 11 consortia tenants (1.2M+ at the central tenant)

All required information should be filled out based on [draft] Configuration management - Improvement process | Single format of configuration files

Configuration review process and workflow

 

Configuration Management Process Overview

This document provides a comprehensive explanation of the configuration management process, outlining the flow across three phases. These phases are designed to ensure that code and configuration changes are managed effectively from initial development through release and ongoing testing. The three main phases are as follows:

  1. Phase 1: Development Cycle. In this phase, code and configuration changes are developed by the development team.

  2. Phase 2: Flower/CSP Release. During this phase, the prepared code and configuration changes are released.

  3. Phase 3: Ongoing Testing by PTF. Here, the changes are tested by the PTF team, and the results are fed back into the development process based on the findings.

Phase 1: Development Cycle

The development cycle initiates when the development team identifies the need to make changes to the system. This phase begins with either the initiation of new development work or the receipt of a ticket from the PTF team (as referenced in Phase 3, Step 5).

Step 1: Initiating Work on User Stories or Defects

Developers begin by working on a specific user story or defect that requires attention.

Decision Point A: Determining the Need for Configuration Changes

At this point, the team must decide whether the code changes in progress necessitate configuration changes. If the code changes modify system behavior in a way that requires configuration adjustments (for example, adding new features with configurable parameters, modifying existing functionality that depends on configuration, changing default behaviors, or adding new integrations or services), the process proceeds to Step 2. If configuration updates are not required, the development cycle is considered complete for this item.

Step 2: Creating a Zero-Point Ticket for Configuration Changes

If configuration changes are necessary, the development team creates a ticket in the corresponding module’s Jira project. This ticket should include a description of the required configuration changes.

Decision Point B: Checking for Existing Configuration Artifacts

The next step involves verifying whether an essential configuration artifact exists in the repository for the module under development. If the configuration does not exist, the process moves to Step 4. If it does exist, proceed to Step 3.

Step 3: Updating Configuration Files and Submitting a Pull Request

The development team updates the configuration file with the required values based on the changes implemented previously. A separate pull request (PR) containing only the configuration changes is then submitted in the module's repository for review. The process then moves to Decision Point C.

Step 4: Initiating Configuration in the Repository

If no configuration artifact exists, the development team creates a new configuration based on the values used during Bugfest testing. After this, return to Decision Point B to verify the existence of the new configuration.

Decision Point C: Evaluating Overrides of PTF-Defined Values

The process now checks if any new configuration value will override a value previously defined by the PTF team. This involves consulting PTF configuration documentation, reviewing previously defined values, and checking the configuration change history. If an override occurs, proceed to Step 5; if not, advance to Step 10.
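The check in Decision Point C could be partly automated. The following is a minimal sketch, assuming that both the PTF-defined values and the proposed values are available as flat key-to-value mappings. The function name, the variable names, and the sample values are hypothetical, and the sketch does not replace consulting the PTF documentation or the change history.

```python
def find_ptf_overrides(ptf_defined: dict, proposed: dict) -> dict:
    """Return {key: (ptf_value, proposed_value)} for PTF-defined keys whose value would change."""
    return {
        key: (ptf_defined[key], value)
        for key, value in proposed.items()
        if key in ptf_defined and ptf_defined[key] != value
    }


# Hypothetical values, for illustration only.
ptf_defined = {"DB_MAXPOOLSIZE": "50", "KAFKA_CONCURRENCY": "4"}
proposed = {"DB_MAXPOOLSIZE": "30", "KAFKA_CONCURRENCY": "4", "NEW_FEATURE_FLAG": "true"}

overrides = find_ptf_overrides(ptf_defined, proposed)
needs_ptf_review = bool(overrides)  # True -> Step 5, False -> Step 10
```

New keys that PTF never defined are not reported, since they do not override a PTF value.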

Step 5: Requesting Pull Request Review from PTF

When configuration changes override PTF-defined values, the developer must request a review from the PTF team and properly document the override reason. This includes:

  • Requesting a PR review from the PTF team

  • Documenting the reason for the override

  • Adding the label "PTF-to-review" to the Jira ticket

  • Blocking the ticket until the PTF review is completed

After these steps, the process moves to Step 6.

Step 6: PTF Review of Pull Request

The PTF team reviews the configuration changes and the proposed new values that override existing parameters. This involves:

  • Reviewing the configuration changes

  • Validating the necessity of the override

  • Checking for potential issues

  • Testing in the PTF environment based on the nature and impact of the change

Once the review is complete, proceed to Decision Point D.

Decision Point D: PTF Approval of Configuration Change

The PTF now decides whether to approve the configuration change that overrides their prior value. Approval is granted if the change is necessary, does not negatively impact production, and proper justification is provided. If approved, proceed to Step 7; if not, move to Step 8, which initiates a collaboration loop.

Step 7: PTF Adds "PTF-approved" Label

If the PTF approves the change, they add the "PTF-approved" label to the Jira ticket, document the approval and any associated conditions, and unblock the ticket. With these actions complete, move to Step 9.

Step 8: Collaboration Between PTF and Developers

If concerns arise regarding the proposed configuration parameter, the PTF and development teams collaborate to find an acceptable solution. This includes discussing concerns, proposing alternatives, revising the configuration approach, and updating documentation. The required conditions are open communication, a willingness to compromise, and technical expertise. This step then loops back to Step 3.

Step 9: Merging Configuration Changes

Following PTF approval, the developer merges the configuration changes. This involves merging the PR, notifying relevant teams, ensuring that PTF approval has been obtained, resolving any merge conflicts, and confirming that all checks are passing. The next step is Step 10.

Step 10: Creating a Ticket and Coordinating with Kitfox Team

The development team now creates a ticket in the RANCHER project, labeled with "new-configuration-" followed by the release name or CSP number, and coordinates with the Kitfox team for review and integration. The ticket should include a description of the changes and reference related tickets. The development team notifies the Kitfox team and addresses any review comments. After this, proceed to Step 11.

Step 11: Kitfox Review and Application of Configuration Changes

The Kitfox team moves the ticket to In Progress, reviews the code and configuration changes, then applies them accordingly. Completion of this step marks the end of the development cycle.
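The routing through Phase 1 can be summarized as a small function that follows the decision points exactly as written above. This is only an illustrative sketch: the function name and its boolean parameters are hypothetical, and a rejected PTF review is shown as a single pass through the Step 8 collaboration loop back to Step 3.

```python
def phase1_path(needs_config_change: bool, artifact_exists: bool,
                overrides_ptf_value: bool, ptf_approved: bool = False) -> list:
    """Return the ordered list of Phase 1 steps for one work item."""
    path = ["Step 1"]
    if not needs_config_change:        # Decision Point A
        return path
    path.append("Step 2")
    if not artifact_exists:            # Decision Point B
        path.append("Step 4")          # create the artifact, then re-check
    path.append("Step 3")
    if not overrides_ptf_value:        # Decision Point C
        return path + ["Step 10", "Step 11"]
    path += ["Step 5", "Step 6"]
    if ptf_approved:                   # Decision Point D
        return path + ["Step 7", "Step 9", "Step 10", "Step 11"]
    return path + ["Step 8", "Step 3"]  # collaboration loop back to Step 3


# A new module without a config artifact and without PTF overrides.
example = phase1_path(needs_config_change=True, artifact_exists=False,
                      overrides_ptf_value=False)
```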

Completion of Development Cycle

With the development cycle complete, the following outcomes are achieved:

  • Code is merged into the main or master branch

  • Configurations are updated and made permanent

  • Documentation is current

  • Tickets are closed

  • Changes are deployed to the development environment

  • The process is ready to move to the release phase

A feedback loop ensures that this completion feeds into PTF for ongoing testing, as detailed in Phase 3.

Phase 2: RELEASE (Flower/CSP)

Description: The release process begins when development changes are ready to be released, ensuring that all updates have passed initial development and testing stages.

Entry Conditions

  • Development cycle is complete

  • All changes have been tested in the development environment

  • A release has been scheduled

  • The release is triggered by a need to deploy updates to the Bugfest cluster(s) of the corresponding release (Sunflower, Trillium, etc.).

Step 1: Ticket Creation for Release and Config Review

The Scrum Master (SM), Product Owner (PO), or Team Lead (TL) of the Development Team initiates the release process by creating a ticket for the module to be released, as well as a separate ticket for configuration review. This ensures that both code and configuration aspects are tracked and documented from the outset. Next Step: Proceed to Step 2.

Step 2: Dev Team Configuration Review

The Development Team reviews configuration files for any modules with code changes. This review is essential to identify new or modified configurations that could impact the release. Next Step: Proceed to Decision Point A.

Decision Point A: Were Config Changes Introduced?

The team verifies whether configuration changes were made during the development cycle. The validation check is a comparison of the current config with the previous release version (diff check):

  • If configurations were changed, proceed to Step 3.

  • If configurations were not changed, proceed to Step 11.
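The diff check in Decision Point A could look like the sketch below. It assumes that the configuration of the previous release and the current configuration have each been loaded into a flat key-to-value mapping. The function name and the sample keys are hypothetical.

```python
def diff_config(previous: dict, current: dict) -> dict:
    """Compare two configurations and report added, removed, and changed keys."""
    added = {k: current[k] for k in current.keys() - previous.keys()}
    removed = {k: previous[k] for k in previous.keys() - current.keys()}
    changed = {
        k: (previous[k], current[k])
        for k in previous.keys() & current.keys()
        if previous[k] != current[k]
    }
    return {"added": added, "removed": removed, "changed": changed}


# Hypothetical values, for illustration only.
previous_release = {"DB_MAXPOOLSIZE": "20", "LOG_LEVEL": "INFO"}
current_release = {"DB_MAXPOOLSIZE": "30", "LOG_LEVEL": "INFO", "CACHE_TTL_SEC": "600"}

diff = diff_config(previous_release, current_release)
config_changed = any(diff.values())  # True -> Step 3, False -> Step 11
```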

Step 3: Release Modules Including Config Changes

The Development Team prepares the release packages, ensuring all changes, including those in configuration files, are bundled and ready for deployment. Next Step: Proceed to Step 4.

Step 4: Ticket Creation for FSE

The Development Team creates a ticket for the Field Support Engineering (FSE) group in the Bugfest (BF) project. This ticket should reference related items and include a "new-configuration-" label for easy tracking. Next Step: Proceed to Step 5.

Step 5: Release Notification with Config Changes

The Development Team posts a message in the release communication channels to notify stakeholders about the release and highlight the configuration changes. Eventually, this notification is expected to be automated via pipeline. Next Step: Proceed to Step 6.

Step 6: FSE Makes Config Changes Permanent in Develop Branch

The FSE team updates the develop branch, making configuration changes permanent for enhanced stability and testing. Actions include updating the branch, validating changes, and documenting the branch state. Next Step: Proceed to Step 7 and Step 8.

Step 7: FSE Updates Bugfest Clusters

FSE updates the Bugfest clusters according to requirements, ensuring that all necessary changes are reflected in the test environment. Next Step: Proceed to Decision Point B.

Step 8: FSE Coordinates with PTF for PTF Clusters

FSE works alongside the PTF team to apply configuration changes to PTF clusters. This includes scheduling deployment, coordinating timing, executing the deployment, verifying application, and monitoring for issues. Next Step: Proceed to Step 9.

Step 9: FSE Updates PTF's Clusters

The FSE updates PTF’s clusters as per the defined requirements. Next Step: Proceed to Decision Point B.

Decision Point B: Does Testing Raise Issues?

The team evaluates whether the applied configuration changes perform as expected without introducing new issues.

  • If there are issues, proceed to Step 12.

  • If no issues are found, proceed to Step 10.

Step 10: Update Release Notes

The Development Team updates the release notes to highlight all configuration changes. This process will eventually be automated to ensure consistency and thoroughness.

Step 11: Follow Current Release Process

If no configuration changes were made during the development cycle, the team follows the existing release process without additional configuration steps.

Step 12: Development Teams Review Issues and Re-Release If Necessary

If issues were identified during testing, the development teams investigate, resolve, and re-release the affected modules as required. Next Step: Proceed to Decision Point C.

Decision Point C: Were Configs Correctly Applied?

The team verifies that all configuration changes were correctly applied after merging. Validation checks include ensuring configuration syntax is correct, values are properly set, no deployment errors are present, and the system functions as expected.

  • If configurations are properly applied, return to Phase 1.

  • If there are issues with configuration application, repeat from Step 6.
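Two of the checks in Decision Point C, correct syntax and properly set values, lend themselves to automation. The sketch below assumes the configuration file is JSON with a top-level "EnvVars" list and that each entry carries an "essential" flag, as described in the Complete configuration file section. The "name" and "value" field names and the function name are assumptions made for illustration.

```python
import json


def check_config(text: str) -> list[str]:
    """Return a list of problems; an empty list means the checks passed."""
    try:
        config = json.loads(text)  # syntax check
    except json.JSONDecodeError as exc:
        return [f"invalid JSON: {exc.msg}"]

    env_vars = config.get("EnvVars") if isinstance(config, dict) else None
    if not isinstance(env_vars, list):
        return ['missing or malformed "EnvVars" list']

    problems = []
    for entry in env_vars:
        if not isinstance(entry, dict):
            problems.append("EnvVars entry is not an object")
            continue
        name = entry.get("name", "<unnamed>")
        # An essential variable with an empty value must be supplied by SysOps.
        if entry.get("essential") and entry.get("value") in (None, ""):
            problems.append(f"{name}: essential variable has no value; SysOps must provide one")
    return problems


# Hypothetical file content, for illustration only.
sample = """
{"EnvVars": [
  {"name": "DB_HOST", "value": "", "essential": true},
  {"name": "LOG_LEVEL", "value": "INFO", "essential": false}
]}
"""
problems = check_config(sample)
```

Deployment errors and functional behavior still have to be verified on the cluster itself; this sketch only covers what can be read from the file.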

Phase 3: ONGOING (PTF)

Entry Points:

  • Development Cycle Complete (Phase 1)

  • Release is ready (Phase 2)

  • Continuous monitoring loop

Step 1: PTF Performs Tests & Provides Recommendations

Description: PTF conducts thorough testing on various installations of FOLIO, evaluating system performance, reliability, and configuration integrity. The team then delivers actionable recommendations to development teams, Kitfox, and FSE to optimize performance and address any detected issues.

Testing Activities:

  • Performance testing

  • Load testing

  • Configuration validation

Required Conditions:

  1. Environment is stable

  2. Test scenarios are defined

  3. Monitoring tools are active

  4. Testing schedule is established

Next Step: Proceed to Decision Point A

Decision Point A: Were Any Issues Found?

Description: At this decision point, outcomes are determined by the results of PTF testing. The team assesses whether any issues, bugs, or anomalies have surfaced during evaluation.

Decision Criteria:

  • YES: Issues or bugs discovered → Go to Step 3

  • NO: No issues found, continue monitoring → Loop back to Step 1 (continuous monitoring)

Issue Types:

  • Performance degradation

  • Configuration errors

  • Integration failures

  • Functional defects

Step 3: PTF Opens Ticket in JIRA Project

Description: Upon identifying issues, PTF logs a ticket in the JIRA project for the relevant module(s), formally requesting that the development teams address and resolve the problem.

Ticket Information:

  1. Issue description

  2. Details about the environment

  3. Severity and priority

  4. Steps to reproduce

  5. Expected vs actual behavior

  6. Affected modules

  7. Recommended fixes

Required Conditions:

  • Issue validated and documented

  • JIRA project identified

  • Proper categorization

  • Sufficient detail for developers

Next Step: Proceed to Step 4

Step 4: Teams Work on Ticket Following the Development Cycle Process

Description: Development teams take ownership of the ticket, implementing fixes in accordance with the established Development Cycle procedures. This ensures that all resolutions are systematically tracked and verified.

Next Step (Loop): Return to Phase 1

Decision Point B: Is Release Required for the Fix?

Description: The team evaluates whether a formal release is necessary to deploy the fix, considering the nature and impact of the changes.

Decision Criteria:

  • Not needed: No release required → Loop back to Step 1 (continuous monitoring)

  • Needed: Fixed issues should be released → Go to Step 5 (Release route)

Step 5: Follow Release Process for Configuration Fix

Description: If the fix pertains solely to configuration, the team follows the established Release process to implement changes. This involves updating the configuration and proceeding through the Release Cycle (Phase 2).

  • Update configuration

  • Follow Release Cycle (Phase 2)

Next Step (Loop): Return to Phase 2

Complete configuration file

  1. Purpose - have a single place where all module configurations are stored

  2. Who is responsible - the development team responsible for the particular module

  3. Naming convention - CompleteConfiguration.json

  4. Where to store - root folder of the module repo

  5. What the “essential” parameter means for “EnvVars” - a variable that MUST be checked or updated by the DevOps team during deployment. If the value of an “essential” variable is empty, the SysOps team should provide it

  6. Which environments this configuration applies to - [draft] Configuration management - Improvement process | Default configuration definition

  7. How to highlight that adjustment is required based on load - in the EnvVars description, dev teams should state that the value should be adjusted based on load, and the “essential” flag should be set to “true”

  8. List of env var types (enum):