Spike - Identify opportunities to split mega applications into smaller pieces

Spike Overview

JIRA ID: EUREKA-305

Objective:
We have several large applications, often referred to as "mega applications." Our goal is to break these down into smaller, more granular applications so that we can release more frequently and deploy to a cluster only the components a customer needs. The first step is to do this at the module level, without refactoring code or modifying dependencies. Next, we will align this approach with subject matter experts, who will independently define applications from a FOLIO perspective. We will then work together to find a balance between the two approaches.

Scope:

  • Identify suitable candidates for breaking modules out of the mega applications into smaller, standalone applications.

  • Focus on modules that many others depend on but that have few dependencies themselves, such as inventory or potentially finance.

  • Consider team responsibilities when splitting applications, especially where an application spans multiple teams. For example, mod-inventory and ui-inventory are owned by Folijet, while mod-inventory-storage is managed by Spitfire.

The application descriptor we are analyzing and attempting to split is:

app-platform-complete - GitHub link
It includes 53 backend modules and 54 UI modules.

Representing dependencies using a graph approach:

To illustrate the dependencies between modules, I decided to use a graph representation, where each module is a node, and the dependencies are represented as edges between them.
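As a minimal sketch of this representation, the snippet below builds a directed graph as a dictionary mapping each module to the set of modules it depends on. It assumes each module is described by a record listing the interfaces it provides and requires. The module names, interface names, and the `build_dependency_graph` helper are hypothetical and are not taken from app-platform-complete.

```python
from collections import defaultdict

# Hypothetical module records; names and interfaces are illustrative only.
modules = {
    "mod-a": {"provides": ["a-api"], "requires": ["b-api"]},
    "mod-b": {"provides": ["b-api"], "requires": []},
    "mod-c": {"provides": ["c-api"], "requires": ["a-api", "b-api"]},
}


def build_dependency_graph(modules):
    """Return {module: set of modules it depends on}."""
    # Map each interface to the modules that provide it.
    providers = defaultdict(set)
    for name, record in modules.items():
        for interface in record["provides"]:
            providers[interface].add(name)

    graph = {name: set() for name in modules}
    for name, record in modules.items():
        for interface in record["requires"]:
            # An edge points from the dependent module to each provider.
            graph[name] |= providers[interface] - {name}
    return graph


graph = build_dependency_graph(modules)
```

Here each node is a key of `graph` and each edge is a (module, dependency) pair, which is the form the grouping steps later in this page operate on.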

In the screenshot below, you can see all backend modules in the system, along with their dependencies, arranged in a hierarchical layout.

 

graph_output1.png

 

Evaluation

To aggregate nodes into larger nodes (applications) so that the nodes within each group depend only on each other, with no dependencies leaving the group, the following steps can be followed:

Steps to Aggregate Nodes Based on Dependencies:

  1. Identify Subgraphs:

    • Connected Components: Begin by identifying the connected components (subgraphs) of the graph, ignoring edge direction. Each component is a group of nodes that are all directly or indirectly connected. Because no dependency crosses from one component to another, each component is self-contained and its nodes can be aggregated.

  2. Dependency Analysis:

    • For each node, evaluate its dependencies. If all dependencies are confined to a specific subset of nodes, that subset can potentially be grouped into a larger node.

  3. Cycle Detection:

    • Strongly Connected Components (SCC): In a directed graph, look for SCCs, in which each node is reachable from every other node in the group. An SCC with more than one node is a dependency cycle, so its nodes should be kept together in the same group.

  4. Community Detection Algorithms:

    • Utilize clustering algorithms, such as modularity-based or hierarchical clustering, to group nodes that are more interconnected internally than with nodes outside the group. This method allows for more flexible groupings beyond SCCs or connected components.

  5. Check External Dependencies:

    • After identifying potential groups, ensure no nodes within the group have dependencies outside of it. If external dependencies exist, the group may need further refinement.

  6. Merge Nodes:

    • Once valid groups are identified, where dependencies are self-contained, aggregate them into larger nodes. This process reduces the graph by replacing smaller nodes with single, larger nodes.
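The steps above can be sketched in plain Python as follows. This is an illustration under stated assumptions, not the tooling used for this spike: the graph is a dictionary mapping each module to the set of modules it depends on, every dependency also appears as a key, and the module names are hypothetical. Step 4 (community detection) is not covered here.

```python
# Hypothetical dependency graph: module -> modules it depends on.
graph = {
    "mod-a": {"mod-b"},
    "mod-b": {"mod-a"},   # mod-a and mod-b form a cycle
    "mod-c": {"mod-a"},
    "mod-d": set(),       # no dependencies and no dependents
}


def weakly_connected_components(graph):
    """Step 1: group nodes that are connected when edge direction is ignored."""
    undirected = {node: set(deps) for node, deps in graph.items()}
    for node, deps in graph.items():
        for dep in deps:
            undirected[dep].add(node)
    seen, components = set(), []
    for start in graph:
        if start in seen:
            continue
        component, stack = set(), [start]
        while stack:
            node = stack.pop()
            if node not in component:
                component.add(node)
                stack.extend(undirected[node] - component)
        seen |= component
        components.append(component)
    return components


def strongly_connected_components(graph):
    """Step 3: Tarjan's algorithm (recursive, fine for small graphs)."""
    index, low, on_stack, stack, result = {}, {}, set(), [], []

    def visit(node):
        index[node] = low[node] = len(index)
        stack.append(node)
        on_stack.add(node)
        for dep in graph[node]:
            if dep not in index:
                visit(dep)
                low[node] = min(low[node], low[dep])
            elif dep in on_stack:
                low[node] = min(low[node], index[dep])
        if low[node] == index[node]:  # node is the root of an SCC
            component = set()
            while True:
                member = stack.pop()
                on_stack.discard(member)
                component.add(member)
                if member == node:
                    break
            result.append(component)

    for node in graph:
        if node not in index:
            visit(node)
    return result


def external_dependencies(graph, group):
    """Step 5: dependencies of the group's nodes that lie outside the group."""
    return {dep for node in group for dep in graph[node]} - set(group)


def merge_groups(graph, groups):
    """Step 6: replace each group with one node, keeping only cross-group edges."""
    owner = {node: i for i, group in enumerate(groups) for node in group}
    merged = {i: set() for i in range(len(groups))}
    for node, deps in graph.items():
        for dep in deps:
            if owner[node] != owner[dep]:
                merged[owner[node]].add(owner[dep])
    return merged


sccs = strongly_connected_components(graph)
self_contained = [
    group for group in weakly_connected_components(graph)
    if not external_dependencies(graph, group)
]
merged = merge_groups(graph, sccs)
```

In this example the cycle between mod-a and mod-b collapses into one group, mod-c remains a separate group that depends on it, and mod-d stands alone. Merging SCCs first always yields an acyclic graph of groups, which can then be coarsened further by hand or by a clustering step.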

 

 

The tools used to build the graph and visualize dependencies are outlined below.

TBD: a GitHub link will be added once the code has been cleaned up.

 

Implementation

The following modules have no dependencies and can be easily divided into separate applications: