2022-06-17 - Sys Ops & Management SIG Agenda and Meeting notes

Date

Attendees

Discussion items

Time | Item | Who | Notes
5 | Welcome | Ingolf

RDA for Morning Glory and Kiwi support period | Ingolf Kuss

Julian Ladisch wrote:

"The FOLIO Security Team has requested a formal approval of the Morning Glory and Kiwi support period: https://folio-org.atlassian.net/wiki/display/TC/ADR-000003+-+Morning+Glory+support+period"

  • First feedback was given on Wednesday in the Pain Points subgroup. Supporting only two major flower releases with backported security fixes is generally regarded as critically low.
  • Index Data reports that some clients are still on Juniper and have already fallen out of security support.
  • Self-hosting institutions (Ingolf et al.) also find two releases a tight schedule: it would force self-hosting institutions and hosting providers to upgrade all clients at least once every 8 months. This could be problematic for providers who host many (small) libraries, since release testing phases and training sessions need to be coordinated with library staff.
  • Hkaplanian will ask Mark Veksler about the situation and the point of view at EBSCO.

Installation and Migration experiences at Stanford University

Stanford is running a K8s cluster on VMs, not using Rancher. They manage Kubernetes with Lens, the Kubernetes IDE (k8slens.dev).

They report problems with running Postgres in Kubernetes. This led to very slow data loading: only 3,000 records were loaded in 45 minutes.

They need to load 9 million records and are using Apache Airflow (as Jeremy had shown in a demo in one of these sessions).
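To put the reported load rate in perspective, here is a rough back-of-the-envelope calculation. It assumes the observed rate of 3,000 records per 45 minutes would stay constant over the full 9 million records, which is a simplification and not something that was measured.

```python
# Figures reported in the notes
records_loaded = 3_000        # records loaded in the observed run
minutes_elapsed = 45          # duration of the observed run
total_records = 9_000_000     # records Stanford needs to load

# Assumption: the load rate stays constant for the whole data set.
total_minutes = total_records * minutes_elapsed / records_loaded
total_days = total_minutes / (60 * 24)

print(f"Estimated load time: {total_minutes:,.0f} minutes ({total_days:.2f} days)")
```

At that rate the full load would take about three months, which is why the slow Postgres performance had to be resolved before migration.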

Interactions between Postgres, the modules that issue SQL commands, and other modules running in the cluster made Postgres this slow. They solved the problem by running Postgres on a separate VM. However, they want to bring Postgres back into the cluster, using some kind of clustered, highly available (HA) Postgres.


TC Pain Points subgroup of SysOps | Tod Olson

The subgroup met on Wednesday but could not meet again on Thursday because of overlapping work. Progress was made in defining the concerns about the Release Upgrade procedure.

Tod says issues in batch import and data migration still need to be worked out. 

If an import job fails, there is not enough information about where exactly it failed. It may have succeeded for 49,999 of 50,000 records, but the operator would not know and would have to re-run the whole import job. The re-run would in turn run into conflicts with optimistic locking in upserts, which have recently been reported and are already being discussed extensively elsewhere (MODINVSTOR-924).
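The kind of per-record reporting that is missing could look roughly like the sketch below. It is purely illustrative and not the FOLIO data import API: `ImportReport`, `run_import`, and `load_one` are hypothetical names, and `load_one` stands in for whatever call actually stores a single record.

```python
from __future__ import annotations

from dataclasses import dataclass, field
from typing import Callable, Iterable


@dataclass
class ImportReport:
    """Hypothetical summary of one import job."""
    succeeded: int = 0
    # Each failure is (index of the record in the batch, error message).
    failures: list[tuple[int, str]] = field(default_factory=list)


def run_import(records: Iterable[dict], load_one: Callable[[dict], None]) -> ImportReport:
    """Load records one by one and remember exactly which ones failed."""
    report = ImportReport()
    for index, record in enumerate(records):
        try:
            load_one(record)
            report.succeeded += 1
        except Exception as exc:  # record the failure instead of aborting the job
            report.failures.append((index, str(exc)))
    return report


# Usage with a stand-in loader that rejects one record.
def load_one(record: dict) -> None:
    if record["id"] == 7:
        raise ValueError("missing title")


report = run_import(({"id": i} for i in range(10)), load_one)
print(report.succeeded, report.failures)
```

With a report like this, only the failed records would need to be retried, which avoids re-sending records that were already stored.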

Module logging and behavior in case of failure should follow a common expectation that is shared across all modules.

Tod will edit the SysOps Pain Points paper accordingly and ask for feedback in the SysOps Slack chat.

The subgroup is planning to meet again next Wednesday (June 22nd).

Action items


  • Ingolf Kuss: Plan Operational Needs session for WOLFCon 2022