2024-09-19 Product Council Agenda and Meeting Notes (SIG Conveners)

 Date

Sep 19, 2024

 Participants

  • @Alexis Manheim @Jennifer Eustis @Kristin Martin @Charlotte Whitt @Martina Tumulla @Scott Perry @Martin Scholz @Jeremy Huff Lucy (Galileo) @Gang Zhou @Marc Johnson @Brooks Travis @Felix Hemme @Tara Barnett @Ian Walls @Martina Schildt @Jesse Koennecke @Oleksii Petrenko @Owen Stephens @Mark Veksler @Kirstin Kemner-Heek @Dung-Lan Chen @Yogesh Kumar @Maura Byrne @Caitlin Stewart @Jenn Colt

  • Note taker: @Jennifer Eustis

 Discussion topics

Time

Item

Presenter

Notes

5 min

Announcements

all

Please remember to vote on the review for the Reading Room functionality presented last week.

Please add your ideas to the retro board for the PC meeting at Wolfcon.

10 min

September SIG Reports



SIG Conveners

Questions about this month’s SIG reports?
No questions.

40 min

Critical Service Patch (CSP) process

@Oleksii Petrenko @Kristin Martin

PC, Release Management Stakeholders, and SIGs discuss the CSP process and how it is working in different functional areas and for different libraries

A Critical Service Patch (CSP) is a small, targeted software patch that addresses a critical issue. CSP requestors are product owners (POs), and CSP approvers are the Release Management Stakeholders (RMS): Khalilah Gambrell, Mike Gorrell, Harry Kaplanian, Jakub Sckoczen, Mark Veksler, and representatives from different councils and institutions (Current members). Issues labeled as P1 and P2 are included in a CSP.

Before CSPs there were hot fixes. The CSP process started around 2022. Since then, the number of CSPs has increased, and the majority of them are security related.

CSP Process: There are 4 steps:

  • POs send a message to RMS

  • Issues are added based on the POs' understanding of priority

  • Planned ahead by 2-3 weeks

  • Bugs addressed

There are new Jira fields, for instance “CSP approved” and “CSP justification”. The requestor needs to describe 7 points (impact on business, institutions affected, whether workarounds exist, areas impacted, technical implementation description, testing required, rollback plan).

CSP Logistics: The requestor submits a Jira issue and uses the existing Slack channel to request a CSP, providing the release and patch number, a filled-out RCA field, a linked TestRail case, and filled-out “CSP Request details”. RMS reviews the request, provides feedback, and says yes or no; if no, RMS provides details.
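As a rough illustration of the checklist described above, the sketch below checks whether a CSP request is complete before it goes to RMS. It is not the FOLIO Jira configuration: the class, the field names, and the issue key are hypothetical and were chosen only to mirror the items listed in these notes.

```python
from dataclasses import dataclass, field

# Hypothetical names for the 7 points the requestor has to describe.
JUSTIFICATION_POINTS = (
    "business_impact",
    "institutions_affected",
    "workarounds",
    "areas_impacted",
    "technical_implementation",
    "testing_required",
    "rollback_plan",
)


@dataclass
class CspRequest:
    """Hypothetical stand-in for a Jira issue submitted as a CSP request."""
    issue_key: str
    priority: str                       # expected to be "P1" or "P2"
    release_and_patch: str = ""         # release and patch number
    rca: str = ""                       # root cause analysis field
    testrail_case_linked: bool = False
    details: dict = field(default_factory=dict)  # "CSP Request details"


def missing_items(request: CspRequest) -> list:
    """Return the checklist items that are still missing from the request."""
    missing = []
    if request.priority not in ("P1", "P2"):
        missing.append("priority must be P1 or P2")
    if not request.release_and_patch.strip():
        missing.append("release and patch number")
    if not request.rca.strip():
        missing.append("RCA field")
    if not request.testrail_case_linked:
        missing.append("linked TestRail case")
    for point in JUSTIFICATION_POINTS:
        if not str(request.details.get(point, "")).strip():
            missing.append(f"CSP request details: {point}")
    return missing


# A draft request with only the key and priority filled in (made-up key).
draft = CspRequest(issue_key="EXAMPLE-1", priority="P2")
for item in missing_items(draft):
    print("missing:", item)
```

A request with an empty result from `missing_items` would be ready for RMS review. The approval decision itself stays with RMS and is not modeled here.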

Fields to be added to Jira soon: proceed to implementing, test, and module release, when the fix has been certified, information about the CSP release’s release notes page, dev-ops team to add tag to platform-complete, release coordinator announces CSP.

Owen: Do we have stats on how many CSP requests have been made, how many were rejected, and why? Oleksii: We don’t have that information now, but Oleksii will look it up.

Mark: The number of rejections is low. So far the issues are real ones that prevent people from doing business. POs have been providing great justifications that make it easy to reach a decision. Maybe 96% are approved.

Kristin: It seems that for requests that are not approved, the patch can be delayed if there is a workaround.

Dung-Lan Chen: How do bugs become P1? Oleksii: There is a defect prioritization wiki page.

Owen: In my experience, a P1 is usually really obvious - some essential functionality that simply doesn’t work. The P2s are more difficult to agree on with the question of a workaround.

Example:

A FOLIO library reports an issue that can’t be addressed by its system admin or hosting provider. The issue is created and assigned to a PO, who then reviews it and prioritizes it as P1/P2 (without reasonable workaround).

  • PO works with dev team to investigate, determine fix strategy, effort, risk, and a test plan

  • PO prepares the Jira ticket and sends a message to the RMS Slack channel

  • RMS reviews

Security CSPs: These differ from other CSPs in that they address a security vulnerability. They must carry the “security-reviewed” label before going to RMS for the approval process.

Alexis: The chart you showed of CSPs vs. hot fixes indicates that many CSPs are security related. Are security issues bundled in with functional CSPs? Oleksii: Yes.

Alexis: Would it make sense to differentiate between the two because they are different enough?

Mark: A security CSP that is brought up shouldn’t hold up other functional CSPs. Security issues are given priority, and some need to be addressed right away. EBSCO conducts regular penetration testing for every flower release, which produces a list of new vulnerabilities that the security team reviews and prioritizes. EBSCO also uses code and container scans for any changes.

Marc: We’re mixing two things together. How do we reduce the number of changes we apply? Root cause analysis makes things easier. From Jira, we can gather information from which to learn how to do things moving forward. There are the individual fixes that go into modules in FOLIO, and there are the security patches, which are done as soon as possible. The release manager can bundle these 2 types at their discretion. There are mechanisms for both.

Kristin: I’d be interested in hearing from SIGs on how the CSP is working for their SIGs and institutions. The process seems very similar to the hot fix process. It has still been a struggle to keep up with the CSPs.

Mark: One reason we are finding additional issues is the unique workflows uncovered during testing. We need to keep identifying those workflows. We also need different data sets that are representative of different institutions.

Jenn: I definitely use the CSP Jira boards to see what is happening.

Yogesh: With all these CSPs, my team regularly reviews Jira issues and puts them into different categories to improve analysis. Yogesh will be presenting at Wolfcon on this: https://wolfcon2024.sched.com/event/1eetR .

Jeremy: From their perspective, Texas A&M have been happy with the CSP process. The process has been smooth.

Dung-Lan Chen: Their institution is relatively small. They get a notification for the upgrade and get a week or so to test. They try to look into the release information and test as much as possible. Sometimes they just roll into the new release.

Owen: Do we know in how many situations something released through a CSP has caused regressions or further issues that then had to be fixed through another CSP?

Kristin: It is a conversation about when to do the CSPs. There is an understanding about open source and how to live with some things and what is broken and needs to be fixed. Poppy was challenging.

Marc: If an organisation doesn’t have capacity to test specifically, they could adopt a more cautious approach, leverage other orgs testing and learning, and thus wait for multiple CSPs to come out before upgrading.

Yogesh: There are data set errors in test and testing. There are also unusual things, like the environment not working, that are just hard to predict. That’s why this data set is important.

5 min

Future topics

all

See you at WOLFCon!

 Action items

 Decisions