2024-01-24 - Direct DB Migration Scripts
Date
2024-01-24
Attendees
- Craig McNally
- Ingolf Kuss
- Maccabee Levine
- Tod Olson
- Taras Spashchenko
- Oleksii Petrenko
- Ankita Sen
- Jenn Colt
- Ian Walls
- Jeremy Huff
- Marc Johnson
- Florian Gleixner
Discussion items
Time | Item | Who | Notes |
---|---|---|---|
1 min | Scribe | All | Jeremy Huff is next, followed by Marc Johnson. Reminder: Please copy/paste the Zoom chat into the notes. If you miss it, the chat is saved along with the meeting recording, but having it here has benefits. |
1 min | Reminders | All | Quick reminders to TC members...
|
* | Direct DB Migration Scripts | All | Context/Background: From Ingolf Kuss in #tc-internal. See full thread here.
Notes:
Ingolf Kuss mentioned that Wayne from Index Data broached the subject of the number of database update scripts and whether this is an ideal situation. The unanimous response was that, no, this isn't ideal. Oleksii Petrenko agreed that it is not ideal and clarified that this is not specific to Poppy.
Taras Spashchenko explained why this decision was made: performance tests with plain SQL scripts were taking up to 14 hours. This was not acceptable, so the scripts were rewritten by splitting the data sets into 16 separate chunks. Each chunk took an hour, but the chunks can be run in parallel. This decision was made to reduce the overall runtime of the migration. This approach also allows for remediation per chunk if something goes wrong.
Craig McNally: Is this decision handled on a case-by-case basis, or are there guidelines?
Taras Spashchenko: It was made based on the specific circumstances.
Marc Johnson: What Ingolf wants to talk about is a general set of procedures for this process. Marc questioned where the practice of splitting SQL into chunks originated.
Oleksii Petrenko: This change is an improvement when there is a significant amount of data.
Marc Johnson: These changes might be improvements, but where do we want this conversation to go? He is hearing both that this is what we have to do and that others are not happy with this approach.
Craig McNally: It is helpful to hear the background on how this decision was made. Since the improvement brought the runtime from 15 hours to 15 minutes, it might be possible to run it in-band. Can we do these optimizations beforehand, so that splitting the scripts out into out-of-band steps is not needed?
Oleksii Petrenko: It would be good to get feedback from EBSCO. It would be beneficial to involve team leads who have performed these migrations.
Taras Spashchenko: When new fields are added to the instance table that need to be populated with values based on the holdings and items data, the SQL update runs on a single thread, which makes it a candidate for parallelization. Regarding the call number updates, the update requires updating the JSON object; as a single update it does not take advantage of the database resources or of parallelization.
Marc Johnson: If we are going to talk about specific examples we should get the dev teams' observations. It would be good to have an EBSCO representative in the SysOps SIG. He appreciates the background information, but is not sure this information will help us move forward with the questions at hand.
Ingolf Kuss: He is hearing for the first time that it is necessary to run these updates in parallel. Maybe this should be expressed in the release notes. Ingolf has invited EBSCO representatives to the SysOps SIG.
Craig McNally: It seems clear that keeping these as in-band scripts is not going to work. It is also a pain point to run them out of band; it is doable but not ideal, better but still inconvenient. Maybe it is sufficient for us to have a better understanding of the situation, and maybe we can document these things as general guidelines for how to approach these decisions. It is being handled on a case-by-case basis. Can we parallelize within the in-band process?
Ingolf Kuss: There was not enough time to test this. This explanation helps him understand, and we can produce standardized documentation.
Craig McNally: Documenting what the process is will help. If we can look at improvements, that will also be useful.
Florian Gleixner: Do out-of-band scripts need to be run on a pristine FOLIO system? Maybe in-band scripts are better, since the usage of modules during migration can be controlled. There are possible situations where the upgrade of one tenant may have a negative impact on other tenants. Even if this upgrade does not present these issues, future updates may.
Jeremy Huff: What are the blockers for parallelization during an in-band upgrade?
Taras Spashchenko: RMB cannot parallelize database interactions. Spring modules may be able to do this with some changes. Maybe providing some sort of driver script for the out-of-band script approach would make sense.
Maccabee Levine: What is really missing here is documentation of the out-of-band approach.
Craig McNally: Communication is always important. He is not sure if this was mentioned in the Poppy release notes.
Florian Gleixner: Two questions: do you have to shut down the tenant for the upgrade (the answer is yes), and how big was the tenant that took 14 hours to upgrade?
Taras Spashchenko: It was 9 million records.
Florian Gleixner: Maybe we only do parallelism on large tenants? As for providing a shell script with the out-of-band scripts, this would be nice to have but is not necessary.
Craig McNally: If the script could be parameterized for the number of threads or data chunks, this could be a good idea (a rough illustrative sketch of such a driver appears below, after the Zoom Chat row).
Ingolf Kuss: He thinks a shell script could be helpful. What sort of deployment should we document the process for? He feels it should only be the single-server deployment. Sometimes you need to deactivate Kafka during the upgrade; he has heard that jroot does this. Was this a factor in this upgrade?
Craig McNally would prefer that this question be addressed in the SysOps group.
Oleksii Petrenko appreciates this feedback. He is happy to see us in the development team; feel free to join.
Craig McNally: Maybe there could be a tighter integration between development and SysOps; better communication between these two groups might make sense. What are the concrete action items? We want to document the decision process for when migrations need to be split out into asynchronous migrations. Do we have a volunteer?
Marc Johnson: The TC has limited experience with this. It would be better for this documentation to be produced by the people who have the most experience with it.
Taras Spashchenko will draft the rationale that was used for Poppy, and we can derive general rules from that.
Craig McNally: We can use the Poppy release as a case study for creating general guidelines. We can also take a look at how these can be improved. We will have additional follow-up conversations about this topic. |
NA | Zoom Chat | 11:09:32 From Jenn Colt to Everyone: |
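The discussion above mentions splitting a slow, single-threaded migration UPDATE into chunks that run in parallel, and a possible driver script parameterized by the number of threads or chunks. The following is only a minimal illustrative sketch of that idea, not the actual Poppy migration scripts: the schema/table name (diku_mod_inventory_storage.instance), the jsonb change, and the chunk-by-UUID-prefix strategy (fixed at 16 chunks) are hypothetical placeholders.

```python
#!/usr/bin/env python3
# Sketch only: splits one long-running UPDATE into 16 chunks (by the first hex
# character of the record UUID) and runs them in parallel, each chunk in its
# own connection and transaction so a failed chunk can be remediated and re-run.
import argparse
from concurrent.futures import ThreadPoolExecutor

import psycopg2  # assumes a PostgreSQL database, as used by FOLIO storage modules

# Hypothetical statement standing in for "populate a new instance field
# derived from holdings/items data"; schema, table, and jsonb change are placeholders.
CHUNK_SQL = """
UPDATE diku_mod_inventory_storage.instance
   SET jsonb = jsonb || '{"migrated": true}'::jsonb
 WHERE left(id::text, 1) = %(prefix)s;
"""

def run_chunk(dsn: str, prefix: str) -> None:
    """Run one chunk in its own connection; the connection context commits on success."""
    with psycopg2.connect(dsn) as conn:
        with conn.cursor() as cur:
            cur.execute(CHUNK_SQL, {"prefix": prefix})
    conn.close()

if __name__ == "__main__":
    parser = argparse.ArgumentParser(description="Parallel chunked migration driver (sketch)")
    parser.add_argument("--dsn", required=True, help="PostgreSQL connection string")
    parser.add_argument("--threads", type=int, default=4, help="number of chunks to run concurrently")
    args = parser.parse_args()

    prefixes = "0123456789abcdef"  # 16 chunks, one per leading UUID hex digit
    with ThreadPoolExecutor(max_workers=args.threads) as pool:
        futures = {pool.submit(run_chunk, args.dsn, p): p for p in prefixes}
        for future, prefix in futures.items():
            future.result()  # re-raises if that chunk failed, so only that chunk needs re-running
            print(f"chunk {prefix} done")
```

Running each chunk in its own transaction mirrors the per-chunk remediation point raised in the discussion; a real driver would likely also parameterize the chunk boundaries (for example by indexed id ranges) rather than fixing them at 16 prefixes.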
Topic Backlog | ||
Decision Log Review | All | Review decisions which are in progress. Can any of them be accepted? Rejected?
Translation Subgroup | All | Since we're having trouble finding volunteers for a subgroup, maybe we can make progress during a dedicated discussion session? |
Communicating Breaking Changes | All | Since we're having trouble finding volunteers for a subgroup, maybe we can make progress during a dedicated discussion session? |
Officially Supported Technologies - Upkeep | All | Previous Notes:
|
Dev Documentation Visibility | All | Possible topic/activity for a Wednesday session:
|