[FOLIO-3013] Schedule execution of age to lost processes in hosted reference environments (and vagrant boxes) Created: 12/Feb/21  Updated: 12/Apr/21  Resolved: 11/Mar/21

Status: Closed
Project: FOLIO
Components: None
Affects versions: None
Fix versions: None

Type: Task Priority: P2
Reporter: Marc Johnson Assignee: Wayne Schneider
Resolution: Done Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original estimate: Not Specified

Issue links:
Blocks
blocks CIRC-1084 Stop periodic execution of age to los... Closed
Relates
relates to CIRC-1057 SPIKE: Aged to lost process needs to... Closed
Sprint: DevOps Sprint 109, DevOps Sprint 108
Development Team: FOLIO DevOps

 Description   

Context
Implementing organisations need to be able to schedule the age to lost and aged to lost billing processes to run when they choose. They will often choose to run these overnight in order to reduce the load on systems and the impact on day to day activities. This capability is needed for the 2021 R1 release.
 
The current implementation of these processes uses the Okapi _timer interface. They are scheduled to run approximately every 30 minutes. The Okapi _timer interface only allows endpoints to be invoked periodically (e.g. every 2 hours), not according to a specific schedule (e.g. 2am every Tuesday). There are fewer than 3 weeks of development time remaining for the 2021 R1 release (for platform core modules) and the Core Functional team does not have capacity to make significant changes to this implementation.
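
To illustrate the difference between periodic invocation and a specific schedule, here is a minimal Python sketch. It is not Okapi code; the function names are hypothetical and exist only to show that a fixed interval is relative to the last run, whereas a schedule such as "2am every Tuesday" is tied to the calendar.

```python
from datetime import datetime, timedelta


def next_periodic_run(last_run, every):
    """Periodic invocation: the next run is a fixed interval after the last one."""
    return last_run + every


def next_weekly_run(now, weekday, hour):
    """Specific schedule: the next occurrence of `hour`:00 on `weekday`.

    `weekday` follows datetime.weekday(): Monday is 0, Sunday is 6.
    """
    candidate = now.replace(hour=hour, minute=0, second=0, microsecond=0)
    candidate += timedelta(days=(weekday - now.weekday()) % 7)
    if candidate <= now:
        # This week's slot has already passed, so use next week's.
        candidate += timedelta(days=7)
    return candidate
```

With a periodic timer the time of day drifts with whenever the module was last started, which is why it cannot express "overnight only".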
 
Proposed Short Term Approach
It is believed that many organisations already have the ability to schedule processes in their operational infrastructure.

Hence it has been suggested that we use each organisation's chosen mechanism for doing this outside of FOLIO. I think this would involve the following development:

  • Removal of the timer interface definition that runs these processes
  • Provision of a new interface that defines endpoints for each of these processes

Configuration by the hosting provider (including the hosted reference environments) of:

  • a user that has permissions to invoke these processes
  • a scheduled task that logs in as the user and invokes the relevant process

Organisations will likely need to schedule these processes separately, meaning that two separate tasks and schedules are likely to be needed.

New External Endpoints

  • /circulation/scheduled-age-to-lost
  • /circulation/scheduled-age-to-lost-fee-charging
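
A scheduled task that logs in as the configured user and invokes one of these endpoints could look roughly like the Python sketch below. This is an illustration under assumptions, not the tooling used for the reference environments: it assumes login via POST to /authn/login with the token returned in an x-okapi-token response header, the X-Okapi-Tenant and X-Okapi-Token request headers, and POST as the method for the new endpoints. The function names are hypothetical.

```python
import json
import urllib.request

# The two endpoints listed in this issue.
AGE_TO_LOST_PATHS = (
    "/circulation/scheduled-age-to-lost",
    "/circulation/scheduled-age-to-lost-fee-charging",
)


def build_login_request(okapi_url, tenant, username, password):
    """Build the login request (assumed path: /authn/login)."""
    body = json.dumps({"username": username, "password": password}).encode("utf-8")
    return urllib.request.Request(
        okapi_url.rstrip("/") + "/authn/login",
        data=body,
        method="POST",
        headers={"Content-Type": "application/json", "X-Okapi-Tenant": tenant},
    )


def build_process_request(okapi_url, tenant, token, path):
    """Build the request that triggers one of the scheduled processes."""
    return urllib.request.Request(
        okapi_url.rstrip("/") + path,
        method="POST",
        headers={"X-Okapi-Tenant": tenant, "X-Okapi-Token": token},
    )


def run_process(okapi_url, tenant, username, password, path,
                opener=urllib.request.urlopen):
    """Log in, invoke the process at `path`, and return the HTTP status."""
    login_request = build_login_request(okapi_url, tenant, username, password)
    with opener(login_request) as login_response:
        # Assumption: the token is returned in this response header.
        token = login_response.headers["x-okapi-token"]
    process_request = build_process_request(okapi_url, tenant, token, path)
    with opener(process_request) as process_response:
        return process_response.status
```

Because the two processes are likely to need separate schedules, each scheduled task would call run_process with one of the two paths. The opener parameter is only there so the HTTP calls can be replaced in tests.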

A preview of the module changes can be found here



 Comments   
Comment by Marc Johnson [ 12/Feb/21 ]

John Malconian Ian Hardy Jakub Skoczen 

We need to make changes to how the age to lost processes are scheduled for 2021 R1.

The proposed approach requires hosting providers (including the hosted reference environments) to set up some scheduling outside of Okapi.

Wayne Schneider wisely suggested I raise this sooner rather than later to raise awareness of the need for this work.

Please do ask me any questions about it you may have.

I will update the issue with more concrete details once the back end development has been planned more.

cc: Holly Mistlebauer

Comment by Jakub Skoczen [ 16/Feb/21 ]

Marc Johnson Can you provide the APIs that need to be invoked through the scheduler (cron)? 

Comment by Marc Johnson [ 16/Feb/21 ]

Jakub Skoczen

Can you provide the APIs that need to be invoked through the scheduler (cron)?

I've updated the description to include them. They aren't usable yet, until the development work is done.

Comment by mark.stacy [ 16/Feb/21 ]

Marc Johnson Thank you for the updates. I will work on a tool/service that can be used with crontab on VM infrastructure or a CronJob template that can be used within K8s infrastructure.

Comment by Marc Johnson [ 17/Feb/21 ]

mark.stacy

Thanks. Please let me know if you need additional information from me.

Comment by Marc Johnson [ 17/Feb/21 ]

Holly Mistlebauer For the hosted environments, when are these processes intended to be scheduled? Every 30 / 35 minutes like they are now?

Comment by Marc Johnson [ 25/Feb/21 ]

mark.stacy Have you had a chance to try out the new endpoints?

Comment by mark.stacy [ 25/Feb/21 ]

Marc Johnson I have created the folioCronService and created an ansible role to deploy. I tested out folio-snapshot-core and the endpoints have not been deployed. 

Status:404 Method: POST Request: /circulation/scheduled-age-to-lost-fee-charging
No suitable module found for path /circulation/scheduled-age-to-lost-fee-charging for tenant diku

Status:404 Method: POST Request: /circulation/scheduled-age-to-lost
No suitable module found for path /circulation/scheduled-age-to-lost for tenant diku

'mod-circulation', 'version': '19.3.0-SNAPSHOT.807' ---> does not have the endpoint 

 

Comment by mark.stacy [ 25/Feb/21 ]

Marc Johnson I set the frequency to every 30 mins. The job parameters to add or update are located in folio-tools.

Comment by mark.stacy [ 25/Feb/21 ]

Marc Johnson Sorry, I missed the email notification regarding the PR review. Reviewed and looks good. Let me know if you have any questions regarding folioCronService. 

Comment by Marc Johnson [ 25/Feb/21 ]

mark.stacy

I tested out folio-snapshot-core and the endpoints have not been deployed.

Ah, my intent had been to leave it on a branch for you to test prior to merging. I can get this merged early tomorrow and into the folio-snapshot-core environment for you to test this post-merge if that works for you?

Comment by mark.stacy [ 26/Feb/21 ]

Marc Johnson Yes, that would be great! 👍 

Comment by Marc Johnson [ 26/Feb/21 ]

mark.stacy

The code has been merged and I've confirmed that the expected version has been deployed to folio-snapshot-core (not folio-snapshot)

Comment by mark.stacy [ 26/Feb/21 ]

Marc Johnson tested and all seems to be working.

2021-02-26T17:15:33.266645 Status:204 Method: POST Request: /circulation/scheduled-age-to-lost
2021-02-26T17:16:16.905381 Status:204 Method: POST Request: /circulation/scheduled-age-to-lost-fee-charging
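
For anyone wanting to check these results automatically, the log lines quoted in this thread can be parsed with a few lines of Python. This is a sketch based only on the line format shown in these comments (an optional timestamp, then Status, Method and Request); it is not part of folioCronService and the function names are hypothetical.

```python
import re

# Matches e.g. "Status:404 Method: POST Request: /path", with or without
# a leading timestamp.
LOG_LINE = re.compile(
    r"^(?:(?P<timestamp>\S+)\s+)?"
    r"Status:(?P<status>\d{3}) Method: (?P<method>[A-Z]+) Request: (?P<path>\S+)$"
)


def parse_log_line(line):
    """Return the fields of a log line as a dict, or None if it does not match."""
    match = LOG_LINE.match(line.strip())
    if match is None:
        return None
    entry = match.groupdict()
    entry["status"] = int(entry["status"])
    return entry


def succeeded(entry):
    """Treat any 2xx status (such as the 204 above) as a successful run."""
    return 200 <= entry["status"] < 300
```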

Jakub Skoczen I have created the service and Ansible role (FOLIO-3013). We need to add an issue to deploy them in all reference environments.

Comment by Marc Johnson [ 01/Mar/21 ]

Holly Mistlebauer The age to lost processes will not happen on the hosted reference environments until this and the follow-up issue referred to above are completed.

Comment by Marc Johnson [ 04/Mar/21 ]

Jakub Skoczen

I think you said that mark.stacy was now working on other things. Who is taking over this work? Who is reviewing these changes? Is it Wayne Schneider ?

Comment by Wayne Schneider [ 04/Mar/21 ]

Marc Johnson yes I'm doing this review, and hope to have it integrated today.

Comment by Marc Johnson [ 04/Mar/21 ]

Wayne Schneider Thanks

yes I'm doing this review, and hope to have it integrated today.

By integrated, are you referring to this code being merged or to this code being used to invoke the processes on the hosted reference environments?

Comment by Wayne Schneider [ 04/Mar/21 ]

Yes to both. I hope.

Comment by Marc Johnson [ 10/Mar/21 ]

Wayne Schneider Please can you help me understand where this work is?

Comment by Wayne Schneider [ 10/Mar/21 ]

Marc Johnson apologies. This work is still in process. Some of the code written for the cron job configuration and installation needed some work. It is actively being worked on today, and will hopefully be resolved by tomorrow.

Comment by Marc Johnson [ 10/Mar/21 ]

Wayne Schneider

apologies. This work is still in process. Some of the code written for the cron job configuration and installation needed some work. It is actively being worked on today, and will hopefully be resolved by tomorrow.

No need to apologise, I was only checking in to understand where the work is. Thank you.

Comment by Wayne Schneider [ 10/Mar/21 ]

This work has been tested and merged. The next daily builds of the reference environments and Vagrant boxes should include crontab entries to log in and hit the /circulation/scheduled-age-to-lost endpoints. Both jobs are currently scheduled to run every 30 minutes.
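
As a rough illustration of what an every-30-minutes crontab entry looks like, the Python sketch below renders one using standard cron step syntax. The command shown in the usage note is a hypothetical placeholder, not the command installed on the reference environments.

```python
def crontab_line(every_minutes, command):
    """Render a crontab entry that runs `command` every `every_minutes` minutes.

    Uses the "*/N" step syntax in the minute field, so the interval is only
    even when N divides 60 (for example 30 gives minutes 0 and 30).
    """
    if not 1 <= every_minutes <= 59:
        raise ValueError("every_minutes must be between 1 and 59")
    return f"*/{every_minutes} * * * * {command}"
```

For example, crontab_line(30, "/path/to/invoke-age-to-lost") yields an entry that fires at minute 0 and minute 30 of every hour. A site wanting an overnight run would instead put fixed values in the minute and hour fields.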

Comment by Wayne Schneider [ 11/Mar/21 ]

Confirmed that the crontab entries are installed and jobs are running on reference environments.

Comment by Hongwei Ji [ 08/Apr/21 ]

Okapi added schedule cron expression support recently. See https://folio-org.atlassian.net/browse/OKAPI-1000. It would be really nice to release Okapi 4.8 for Iris so hosting environment does not have to set up FOLIO schedulers separately. Holly Mistlebauer, Marc Johnson what do you think? Cc Adam Dickmeiss and Jakub Skoczen.

Comment by Wayne Schneider [ 08/Apr/21 ]

Adam Dickmeiss Hongwei Ji Marc Johnson it looks like the API for timer management requires the timer to be defined by the module in the module descriptor (that is, an operator can't just create an arbitrary timer task) – so this would also require an update to the mod-circulation module descriptor to put the default timer tasks back in. Am I understanding correctly?

That said, it might be attractive to sysops to avoid a one-release solution to a problem that would be resolved by releasing Okapi 4.8.0 and a new version of mod-circulation, if indeed it would be that simple.

Comment by Adam Dickmeiss [ 09/Apr/21 ]

Correct. A timer must be defined in the module descriptor for a module before it's called.
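
For readers unfamiliar with what such a definition involves, the Python sketch below builds the kind of _timer interface fragment a module descriptor would need to carry. The field names (unit, delay, and a schedule object with cron and zone) are my recollection of the Okapi guide and should be treated as assumptions to verify against the Okapi documentation; the helper names are hypothetical.

```python
import json


def periodic_handler(path, delay, unit):
    """Timer handler invoked at a fixed interval (assumed fields: unit, delay)."""
    return {"methods": ["POST"], "pathPattern": path,
            "unit": unit, "delay": str(delay)}


def cron_handler(path, cron, zone=None):
    """Timer handler with a cron schedule (assumed fields: schedule.cron, schedule.zone)."""
    schedule = {"cron": cron}
    if zone is not None:
        schedule["zone"] = zone
    return {"methods": ["POST"], "pathPattern": path, "schedule": schedule}


def timer_interface(handlers):
    """Wrap handlers in a system interface with id _timer (version is assumed)."""
    return {"id": "_timer", "version": "1.0",
            "interfaceType": "system", "handlers": list(handlers)}


def to_descriptor_fragment(handlers):
    """Serialise the interface as JSON for inclusion under "provides"."""
    return json.dumps(timer_interface(handlers), indent=2)
```

Under these assumptions, cron_handler("/circulation/scheduled-age-to-lost", "0 2 * * 2") would describe a run at 2am every Tuesday, but only once mod-circulation's descriptor declares it.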

Comment by Marc Johnson [ 12/Apr/21 ]

Hongwei Ji Wayne Schneider Adam Dickmeiss

It would be really nice to release Okapi 4.8 for Iris so hosting environment does not have to set up FOLIO schedulers separately. Holly Mistlebauer, Marc Johnson what do you think?

That said, it might be attractive to sysops to avoid a one-release solution to a problem that would be resolved by releasing Okapi 4.8.0 and a new version of mod-circulation, if indeed it would be that simple.

It is unfortunate that the Okapi timer extension work was planned separately to the changes to the scheduling of these processes in mod-circulation. I don't know if the Okapi improvements are considered the longer term approach to scheduling in FOLIO.

I understand that the short term approach taken increases the burden on system operators. This approach was discussed with the SysOps SIG and with some of the DevOps folks.

We could use some of the capacity of the Prokopovych team to change this again during the bug fixing window for 2021 R1. I believe we have higher priority changes that have a direct user impact. Holly Mistlebauer Is this something that you would want to do?

Were we to decide to move ahead with this, we would need a feature release of mod-circulation. That in itself, at this point in the release schedule, requires work to manage.

We are currently the best part of a month and a half past the feature freeze (2021-02-26) for 2021 R1 and two months past the freeze (2021-02-05) for infrastructure changes. As I understand it, these limits are intended to provide a stable point for the basis of the release distribution. I think making these changes at this point would run counter to the intent of that schedule.

Who makes the decision whether we do this or not? I imagine it would need to include the release triage folks.

cc: Anton Emelianov Oleksii Petrenko Jakub Skoczen

Generated at Thu Feb 08 23:24:57 UTC 2024 using Jira 1001.0.0-SNAPSHOT#100246-sha1:7a5c50119eb0633d306e14180817ddef5e80c75d.