[FOLIO-1548] SPIKE: a lighter-weight folio/testing-backend VM Created: 03/Oct/18  Updated: 26/Jun/19  Resolved: 26/Nov/18

Status: Closed
Project: FOLIO
Components: None
Affects versions: None
Fix versions: None

Type: Task Priority: P2
Reporter: Wayne Schneider Assignee: Wayne Schneider
Resolution: Done Votes: 0
Labels: ci, integration, sprint48, sprint49
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original estimate: Not Specified

Issue links:
Blocks
blocks FOLIO-1532 Add mod-data-import and mod-source-re... Closed
blocks FOLIO-1603 Add mod-email to CI testing builds Closed
blocks FOLIO-1606 Add mod-event-config to CI testing bu... Closed
is blocked by FOLIO-634 Update network/filesharing configurat... Closed
is blocked by STCLI-15 Dynamic Okapi module installation Closed
is blocked by STCLI-114 CLI support for local backend modules Closed
Relates
relates to FOLIO-1519 Automatic loading of sample and refer... Closed
relates to FOLIO-1632 Create lighter-weight folio core VM Closed
relates to FOLIO-1633 Establish scheme for declaring additi... Open
relates to UXPROD-1827 CI-integrated continuous deployment (... Closed
relates to FOLIO-2118 CI-integrated continuous deployment (... Closed
relates to UXPROD-1156 Re-organization of the CI/CD environm... Closed
relates to UXPROD-1424 Re-organization of the CI/CD environm... Closed
Sprint:
Development Team: Prokopovych

 Description   

As the number of modules jumps, we need to have a more lightweight testing-backend VM for developers. They should be able to enable the modules outside the core that they need using Ansible playbooks.

Possible outcomes/subtasks:

  • Reconfigure the folio/testing-backend Vagrant box to run only modules required by the platform-core Stripes platform, plus additional core modules (e.g. mod-codex-ekb, mod-audit)
  • Create documentation for developers on how to deploy the additional modules they require and load reference and sample data
  • End the nightly build of the full-stack Vagrant boxes folio/testing and folio/snapshot. This will require some coordination with Julian Ladisch, as folio/testing is the basis for folio-demo.gbv.de
  • Reconfigure the AWS builds – is there a need for both folio-testing and folio-snapshot, or could folio-snapshot be reconfigured to be two systems with periodic stripes rebuild like folio-testing is today? Regardless, there should be some kind of snapshot reference system that represents the full set of candidate modules for the next release.

As part of this process, we should create a test Vagrant box and work with some front-end developers to refine the configuration and the procedures for adding/updating modules.



 Comments   
Comment by Jakub Skoczen [ 10/Oct/18 ]

Ian Ibbotson (Use this one) Khalilah Gambrell Ann-Marie Breaux Dennis Bridges guys, I'd like to understand if your teams are using the Vagrant VM for development and if we need to integrate your backend modules here.

Comment by Ann-Marie Breaux (Inactive) [ 10/Oct/18 ]

Jakub Skoczen Will do. Folijet has a standup in a few minutes, so we'll discuss it there.

Comment by Ian Ibbotson (Use this one) [ 10/Oct/18 ]

Heya Jakub Skoczen yes - we're using the testing image for mod-erm, mod-licenses and mod-usage-data. Our dev work requires access to orders, acquisitions, vendors, inventory - pretty much the whole core build. In fact, what would be most useful for us would be everything by default, with the ability to replace any one specific module with a local development version. For ERM work, we're pretty likely to need the full suite of apps I feel at the moment..... Does that answer your question?

Comment by Kateryna Senchenko [ 10/Oct/18 ]

Jakub Skoczen, Ann-Marie Breaux, provided that our modules (mod-data-import, mod-source-record-storage and mod-source-record-manager) are included in folio/testing and folio/snapshot and there is a clear guide on how to deploy additional modules on folio/testing-backend, we probably don't need them among the core ones. So, we are OK with a more lightweight testing-backend VM that doesn't include our modules.

Comment by Kostyantyn Khodarev [ 10/Oct/18 ]

Jakub Skoczen will it be a replacement for folio/testing-backend or an additional lightweight image?

Comment by Wayne Schneider [ 11/Oct/18 ]

Ian Ibbotson (Use this one) – would it meet your needs to produce a backend Vagrant box with just the core apps (NOT including acquisitions, which as I understand it is not "core"), with a mechanism to easily add apps? E.g. a script to deploy acquisitions modules, enable them for the default tenant, and load sample data (this doesn't address your suggestion to replace a delivered version with a local dev version, I know, but that's a slightly separate issue...there are ways to accomplish that now, if you have enough knowledge of VirtualBox networking and Okapi).

My presumption is that developers are building their own webpacks and don't need a stripes UI on the box – is this accurate? There is still the issue of loading all the required permissions for frontend modules, so probably even in the "slimmed down" VM we would need to load and enable all the frontend modules for the tenant, even if we don't build and serve a webpack.

Kostyantyn Khodarev – this would be a replacement for folio/testing-backend.

Comment by Jakub Skoczen [ 17/Oct/18 ]

We discussed that we'd like to provide functionality to "hot plug" modules in order to let developers pick and choose which modules are installed in the black-box (or locally) and ship only the core modules (as defined by the dependencies of platform-core) in the box.

Ian Ibbotson (Use this one) Kostyantyn Khodarev until we have the hot plugging functionality working would it be helpful to provide a guide on how to run modules outside of the box and make them talk to the modules in the box?

Comment by Ian Ibbotson (Use this one) [ 22/Oct/18 ]

All - apologies for missing out on these questions. I don't think I've really expressed our problem well. Our major issue is that the "set" of modules that we would consider a bundle for a v1 ERM app (let's call that "Chalmers" for argument's sake) is made up of some core and some non-core modules. Right now, this collective set of things seems to break "as a whole" with alarming regularity. Perhaps individual modules all work and pass their respective tests, or perhaps they don't - e.g. mod-orders seems to sometimes work and sometimes not, even at the level of creating a PO. It is absolutely essential that mod-erm has a stable footprint of modules upon which we can base a stable distribution. Being able to manually add in modules on a developer desktop is necessary, but not close to being sufficient, for what we need.

What we need is a distribution-level set of modules which collectively work and which we can call a v1 "Chalmers" or "GBV" system - with a set of integration tests. You could call this a "Product" build if you wanted to - but we need a cohesive and complete set of modules that work together and together move from stable version to stable version. That transition does not have to be, nor would I expect it to be, every night. But what we do need is some reporting, so that if it has been a month since that product distribution had a mod-orders and a mod-vendors in a compatible and working state together, then we can report that lack of a working integration test back to the project and say "We're effectively blocked because we have dependencies that won't work together".

Right now we can't salute or wave, we've got modules which apparently work or don't work in accordance with some form of quantum mechanics and a project that only wants to look at the individual parts and not the whole.

Apologies for being slightly ranty/shouty. I was trying to have a quiet 1:1 about this, but apparently not.

Comment by Ann-Marie Breaux (Inactive) [ 23/Oct/18 ]

Hi Ian Ibbotson (Use this one) Please be shouty. I totally agree, but from the user perspective instead of the developer perspective. So many of the individual apps have dependencies on others - whether it's settings, passwords and user permissions controlling things in other apps, or orders having to interact with inventory, or data import having to interact with inventory, MARCcat, orders, and invoices, or checkin/checkout/requests having to interact with inventory. I can understand how testing leading-edge development work in separate environments might make sense, but ASAP, those individual pieces need to come together and form a working whole.

Otherwise we don't have a functioning system or environment or cluster of apps. We have (mostly) functioning apps that can't actually function together. I want someone who looks at FOLIO - whether they are in the community or outside of the community - and logs on using the URLs and logons that we post on the FOLIO wiki home page - to have a decent experience. It's frustrating to me and worse, embarrassing. I am proud of FOLIO and I want to be proud of FOLIO. I want to be its biggest advocate. It's hard to be proud when you're praying you won't see an unexpected error of some kind with every button push or click.

I KNOW that there are many good developers and managers working hard to triage the problems and develop both short and longer term solutions. This comment is not meant as a putdown of their work or to demoralize them. But I also know that we, the FOLIO community, can do better. Maybe we pull back and only promote the Q3 site on our wiki home page, at least until some of the other environments stabilize. Maybe all new development halts and we spend a few days with all the developers in a worldwide retreat, discussing the problems, agreeing on fixes, and making it work. Then we resume development. I don't know what the right answer is. Just please make this system work - this system we all care so much about.

OK, end of rant. Time for bed.

Comment by Jakub Skoczen [ 30/Oct/18 ]

Ann-Marie Breaux Ian Ibbotson (Use this one) I am not convinced this particular issue is the place for discussing how to address the problem of FOLIO working well as an "integrated experience" or a "complete product" but I will follow up with some comments addressing the specific issues wrt ERM and Acquisitions:

  • The suite of ERM apps has dependencies on both the "core" apps and the "acquisition" apps. This makes the integration effort quite complex, probably more complex than anything we have dealt with so far. The Core Team is currently integrating some of the Acquisition apps into the FOLIO builds, but we have no way to verify that those apps work (no automatic integration tests) and no way to address specific issues with those apps other than asking the maintainers to fix the problems and release a new version. On top of that we are dealing with resource constraints due to the sheer number of modules we are trying to integrate into a single VM (this is specifically what this issue is about). Addressing the resources problem is not going to solve the problem Ian Ibbotson (Use this one) laid out in his above comment.
  • I proposed in my earlier comments that the FOLIO VM we ship includes only the "core" set of modules (referred to as "platform-core"). This will help the quality and stability of the VM but will not help the ERM team to solve their problems with Acquisition modules. To address that I'd like to propose that each team responsible for building a suite of apps maintains their own "platform" (along with a set of automatic integration tests) and stable releases of such "platform" are included in what we call "platform-complete". This means Stacks and EPAM teams working on Acq would provide "platform-acquisitions" and the ERM team would provide "platform-erm". With a clearer structure we are more likely to create a stable build of "platform-complete". Maintenance of "platform-complete" will require a shared effort from all teams.

Comment by Wayne Schneider [ 08/Nov/18 ]

A branch of folio-ansible has been updated to provide a "testing-core" build, composed of the modules specified in https://github.com/folio-org/platform-core and their dependencies, plus mod-codex-inventory and mod-codex-ekb. A VM built using this build target sports 19 backend modules (rather than the 30+ currently included in folio/testing), and can comfortably live in 7-8 GB of RAM.

I would suggest that as the work that Matthew Jones is doing on stripes-cli ( STCLI-15 Closed ) to support managing Okapi module deployment progresses, we do the following:

  1. Begin creating a nightly Vagrant build of "testing-core" (and testing-backend-core? Not sure if this is necessary)
  2. Actively encourage developers to begin using VMs based on this box rather than on testing-backend (perhaps discontinue testing-backend altogether?). Provide documentation for using stripes-cli to manage module deployment on the box.
  3. Continue to produce a "testing" build based on platform-complete as a Vagrant box, but boost the default RAM allocation to 10 GB. GBV uses this image as the basis for their local demo system, and a few developers have reached out to me about using it for testing. Continue to build this environment as "folio-testing" on AWS, as well.
  4. Discontinue the "snapshot" Vagrant box (but continue to build the snapshot environment as folio-snapshot, at least for the time being).

Some questions:

  1. Currently the "testing" build includes several modules that are not required by the frontend (mod-graphql, mod-rtac, mod-audit, mod-audit-filter, etc.). What should be the process for a module that is not required by another module to get into the testing-core build?
  2. What should be the process for new backend modules to be introduced to either the testing or snapshot builds? Currently, it happens either by someone reaching out to me and asking for it, or by a frontend dev including a new requirement. Obviously, it would be good to know about the module and try to get it integrated before it is included as a frontend requirement.

Other thoughts or comments?

Comment by Jakub Skoczen [ 08/Nov/18 ]

Wayne Schneider

I think we need to discuss the implications of point #3 – it's likely that even 10 GB will quickly become too little to fit all platform-complete modules comfortably. I'd like to discuss an alternative solution for the GBV deployment (e.g. based on folio-install and/or Ansible). Who is the contact person, Julian Ladisch? In terms of testing and development support, I think a cloud-hosted backend and the dynamic loading capability should be used instead.

I'll take a stab at answering your questions:

1. It would be good if we could list all "additional" modules in a configuration file that is part of "platform-core" and discuss their inclusion on a case-by-case basis. E.g. CODEX provider modules are needed so the system can function. Everything else is subject to removal from platform-core – the smaller the set, the more lightweight and stable the "core" system is.

2. I think the set of modules for "platform-core" will be relatively stable so I propose we rely on inclusion requests in form of JIRA issues in the FOLIO project.

Comment by Julian Ladisch [ 08/Nov/18 ]

Yes, I can switch to use https://github.com/folio-org/folio-ansible to deploy https://github.com/folio-org/platform-complete on the GBV server.

Comment by Matthew Jones [ 08/Nov/18 ]

What should be the process for a module that is not required by another module to get into the testing-core build?

It would be good if we could list all "additional" modules in a configuration file part of "platform-core"

I was thinking the same thing. STCLI-15 Closed already offers an --include option to manually list additional modules not defined in the platform's tenant configuration (stripes.config.js). A new section of the tenant configuration JSON seems like a good fit for defining such modules.

Comment by Wayne Schneider [ 08/Nov/18 ]

Matthew Jones Jakub Skoczen I like this approach. It also means that platform-core/platform-complete stripes.config.js would be the one place where this configuration lives, rather than split between the platform and folio-ansible, which is great. The configuration would need to be additive, right? So that you get all the additional modules from the platform you're working on, as well as from platforms you might pull in as dependencies.

The fact that the tenant config is JS (rather than JSON) does mean that for orchestration, users will pretty much have to use stripes-cli. It will (I think) be necessary to provide a command that returns required modules in a JSON list that can be used to do deployment outside of Okapi, as well as a command to do the tenant install without deploy=true.

Comment by Wayne Schneider [ 08/Nov/18 ]

The list of modules that are not required by any frontend at this time, but are included in the current "testing" VM:

mod-user-import
mod-graphql
mod-codex-inventory
mod-codex-ekb
mod-calendar (the Calendar app is currently commented out in platform-complete)
mod-rtac
mod-template-engine
mod-audit
mod-audit-filter

These are on AWS only:
mod-source-record-storage
mod-source-record-manager
mod-data-import
mod-gobi
mod-oai-pmh
mod-patron
mod-sender
mod-email
mod-event-config

Thoughts about which of these are "core"? mod-codex-inventory, at least, right? (otherwise the Codex app doesn't function at all)

Comment by Wayne Schneider [ 08/Nov/18 ]

Matthew Jones observed on STCLI-15:

Wayne Schneider, While testing the new CLI commands, I've been noticing that some platform installs are overlooking required back-end modules. After debugging the issue, I've determined that module descriptors that come pre-installed with folio/testing-backend omit the "requires" array of interfaces.

This limits the ability of a reasonably fresh VM to be useful for installing a platform on top of another, as even after a call to /_/proxy/pull/modules, the VM could very well have the latest descriptors for several modules. Manually deleting the descriptors then pulling the latest will retrieve the necessary descriptors.

I observed this with folio/testing-backend 5.0.0-20181107.1246 and modules like folio_orders-0.1.100041 and folio_vendors-1.1.100052 while performing various install operations to emulate installing of platform-erm on top of platform-core. Could the VMs (or the new core VM at least) be published with the complete descriptors matching the registry?

I think this is a good plan, but it will mean that for a developer to work with a new interface version of a "core" interface, they will need to deploy it themselves (using the stripes CLI) – so the "core" build will be much more like a "snapshot" build.
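The descriptor mismatch described in the quoted observation could be detected before an install with a small check like the one below. This is only a sketch under assumptions: it takes two already-fetched lists of module descriptors (how they are fetched from the VM and from the registry is left out), the helper name stale_descriptor_ids is hypothetical, and the interface ids and versions in the sample data are made up.

```python
def stale_descriptor_ids(local, registry):
    """Return ids whose local descriptor lacks "requires" although the registry copy has it."""
    registry_by_id = {d["id"]: d for d in registry}
    stale = []
    for desc in local:
        upstream = registry_by_id.get(desc["id"])
        if upstream is None:
            continue  # no registry copy to compare against
        if upstream.get("requires") and not desc.get("requires"):
            stale.append(desc["id"])
    return stale


# Made-up sample data: real descriptors carry many more fields,
# and the interface ids/versions here are invented for illustration.
local = [
    {"id": "folio_orders-0.1.100041"},
    {"id": "folio_vendors-1.1.100052", "requires": [{"id": "vendor", "version": "1.0"}]},
]
registry = [
    {"id": "folio_orders-0.1.100041", "requires": [{"id": "orders", "version": "1.0"}]},
    {"id": "folio_vendors-1.1.100052", "requires": [{"id": "vendor", "version": "1.0"}]},
]

print(stale_descriptor_ids(local, registry))
```

The ids it returns would be the candidates for the manual delete-and-pull workaround mentioned in the observation.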

Comment by Wayne Schneider [ 08/Nov/18 ]

Jakub Skoczen Adam Dickmeiss John Malconian – how hard would it be to include a -Xmx Java option in either the Dockerfile for modules or as a JAVA_OPTIONS environment variable within the launchDescriptor (embedded in the module descriptor)? Right now this is part of the configuration in folio-ansible, but if developers are using stripes-cli to deploy modules in their VMs, there is no way for them to do this. This is a problem in containers, where the JVM by default sees all of the system RAM as available and so doesn't set appropriate heap limits (see https://docs.docker.com/samples/library/openjdk/#make-jvm-respect-cpu-and-ram-limits for more discussion).
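To make the JAVA_OPTIONS idea concrete, here is a rough sketch of deriving an -Xmx value from a per-module memory budget and placing it in a launch descriptor's environment. Several things are assumptions rather than anything confirmed in this ticket: the "env" list of name/value entries as the descriptor shape, the 50 percent heap share, the helper names, and the image name. Whether the JVM actually picks up JAVA_OPTIONS also depends on the image's entrypoint.

```python
def java_options(container_limit_mb, heap_percent=50):
    """Build a -Xmx setting that leaves headroom under the memory budget."""
    if not 0 < heap_percent <= 100:
        raise ValueError("heap_percent must be between 1 and 100")
    heap_mb = container_limit_mb * heap_percent // 100
    return f"-Xmx{heap_mb}m"


def with_heap_limit(launch_descriptor, container_limit_mb):
    """Return a copy of the launch descriptor with JAVA_OPTIONS set in its env list."""
    # Drop any existing JAVA_OPTIONS entry so the new value replaces it.
    env = [e for e in launch_descriptor.get("env", []) if e["name"] != "JAVA_OPTIONS"]
    env.append({"name": "JAVA_OPTIONS", "value": java_options(container_limit_mb)})
    return {**launch_descriptor, "env": env}


# Hypothetical descriptor; the image name and env entry are placeholders.
descriptor = {
    "dockerImage": "example/mod-example:1.0.0",
    "env": [{"name": "DB_HOST", "value": "localhost"}],
}
limited = with_heap_limit(descriptor, 512)
print(limited["env"])
```

Carrying the setting in the descriptor would mean any deployment path that honours the descriptor gets the same limit, instead of relying on folio-ansible configuration alone.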

Comment by Matthew Jones [ 08/Nov/18 ]

The configuration would need to be additive, right?

Yes

The fact that the tenant config is JS (rather than JSON) does mean that for orchestration, users will pretty much have to use stripes-cli.

Tenant configs can also be authored in JSON. The CLI accepts either format. I'm not sure why the format for the tenant config has historically been JS. Using JS can be handy for development, but perhaps should be avoided for production.

Platform-complete does make use of JS to merge configs, but I wonder if it really should. That logic could be moved into the CLI so that any tenant config could extend another without imposing custom logic. Defining something like "extends": "platform-core/or/some/other/stripes.config.js" may work here.

It will (I think) be necessary to provide a command that returns required modules in a JSON list that can be used to do deployment outside of Okapi

The tenant config JSON is available via stripes status stripes.config.js. We just need to isolate the output for it to be effective as a converter. Of course, then you'd need the CLI to avoid using the CLI! Seeing as how it's installed with the platform, maybe this is okay. Would the format of stripes status stripes.config.js be sufficient (without the surrounding text) or are you looking for something more specific, perhaps a list of module descriptor ids?

as well as a command to do the tenant install without deploy=true.

Toggling deploy=true is supported.
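As a rough illustration of the additive "extends" idea discussed in this comment, the sketch below resolves the extra backend modules for a tenant config by walking its "extends" chain and merging the lists without duplicates. It is not how stripes-cli works today. The "additionalModules" key, the function name, and the "platform-example" config are hypothetical, and configs are modelled as plain dictionaries keyed by name rather than loaded from files.

```python
def resolve_modules(name, configs, _seen=None):
    """Collect a config's additional modules plus those of every config it extends."""
    seen = _seen if _seen is not None else set()
    if name in seen:
        return []  # guard against a circular "extends" chain
    seen.add(name)

    config = configs[name]
    modules = []
    parent = config.get("extends")
    if parent is not None:
        # Inherited modules come first, so the base platform keeps its order.
        modules.extend(resolve_modules(parent, configs, seen))
    for module_id in config.get("additionalModules", []):
        if module_id not in modules:
            modules.append(module_id)
    return modules


# Made-up configs for illustration only.
configs = {
    "platform-core": {"additionalModules": ["mod-codex-inventory", "mod-codex-ekb"]},
    "platform-example": {
        "extends": "platform-core",
        "additionalModules": ["mod-audit", "mod-codex-ekb"],
    },
}
print(resolve_modules("platform-example", configs))
```

Because the merge only ever adds ids, a platform that extends another can never drop a module its base declares, which matches the "additive" requirement raised earlier in the thread.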

Comment by Wayne Schneider [ 08/Nov/18 ]

I think using the CLI as part of your orchestration is totally fine, not to be avoided – it's just that asking to extend it to (for example) deploy containers on AWS for a non-Okapi-based deployment seems out of scope.

After more exploration and testing of the CLI, it looks like stripes platform backend stripes.config.js --simulate --detail is very close to what I was thinking of.

Comment by Matthew Jones [ 09/Nov/18 ]

Wayne Schneider, For reference, that simulate operation can (mostly) be broken up into the following commands for more granular control.

$ stripes mod descriptor stripes.config.js > module-ids
$ cat module-ids | stripes mod install --simulate

The one piece currently missing, which platform backend provides, is automatically appending ids like "mod-codex-inventory" when they are needed.

Let me know if you find the output format of any commands could be improved. Also, you may find this one:

$ stripes mod descriptor stripes.config.js --full --strict

could, with some tweaking to the output, become a suitable substitute for build-module-descriptors.js (currently outputs an array instead of individual files).
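The array-versus-individual-files gap mentioned above could be bridged with a small post-processing step like the following sketch. It assumes the command output has already been parsed into a list of descriptor objects that each have an "id", and it assumes one file per module named after that id. The naming convention that build-module-descriptors.js actually uses is not stated in this ticket, and the sample descriptors are made up.

```python
import json
import tempfile
from pathlib import Path


def split_descriptors(descriptors, out_dir):
    """Write each module descriptor in the array to <out_dir>/<id>.json."""
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    written = []
    for desc in descriptors:
        path = out / f"{desc['id']}.json"
        path.write_text(json.dumps(desc, indent=2) + "\n", encoding="utf-8")
        written.append(path)
    return written


# Made-up descriptors standing in for the parsed CLI output.
sample = [
    {"id": "folio_orders-0.1.100041", "name": "Orders UI"},
    {"id": "mod-example-1.0.0", "name": "Example backend"},
]

with tempfile.TemporaryDirectory() as tmp:
    paths = split_descriptors(sample, tmp)
    names = sorted(p.name for p in paths)
    roundtrip = json.loads(paths[0].read_text(encoding="utf-8"))

print(names)
```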

Comment by Matthew Jones [ 27/Nov/18 ]

Jakub Skoczen Wayne Schneider, I see this spike is now closed. Is there a corresponding ticket that captures the creation of this new VM?

Comment by Matthew Jones [ 28/Nov/18 ]

I created FOLIO-1632 Closed for the creation of the VM.

Comment by Matthew Jones [ 28/Nov/18 ]

I also created FOLIO-1633 Open to come up with a way to declare the modules that by design are not required by any front-end module, but still need to be included in a platform.

Generated at Thu Feb 08 23:14:09 UTC 2024 using Jira 1001.0.0-SNAPSHOT#100246-sha1:7a5c50119eb0633d306e14180817ddef5e80c75d.