Batch Importer (Bib/Acq)
(UXPROD-47)
| Status: | Closed |
| Project: | UX Product |
| Components: | None |
| Affects versions: | None |
| Fix versions: | R1 2021 | Parent: | Batch Importer (Bib/Acq) |
| Type: | New Feature | Priority: | P2 |
| Reporter: | Kateryna Senchenko | Assignee: | Ann-Marie Breaux (Inactive) |
| Resolution: | Done | Votes: | 0 |
| Labels: | data-import, epam-folijet, performance |
| Remaining Estimate: | Not Specified |
| Time Spent: | Not Specified |
| Original estimate: | Not Specified |
| Issue links: |
| Release: | Q3 2020 |
| Epic Link: | Batch Importer (Bib/Acq) |
| Front End Estimate: | Very Small (VS) < 1day |
| Front-End Confidence factor: | Medium |
| Back End Estimate: | XXXL: 30-45 days |
| Back End Estimator: | Oleksii Kuzminov |
| Development Team: | Folijet |
| PO Rank: | 97 |
| Rank: Chicago (MVP Sum 2020): | R2 |
| Rank: Cornell (Full Sum 2021): | R2 |
| Rank: 5Colleges (Full Jul 2021): | R2 |
| Rank: FLO (MVP Sum 2020): | R1 |
| Rank: GBV (MVP Sum 2020): | R2 |
| Rank: MO State (MVP June 2020): | R1 |
| Rank: TAMU (MVP Jan 2021): | R2 |
| Rank: U of AL (MVP Oct 2020): | R1 |
| Description |
Steps
Change the SRM DB approach. For now, it is a performance bottleneck. Move it to R2 and create a separate feature for it (should be smaller than this feature; same as SRS, plus will need migration scripts).
Yellow = partly done
Notes on maximum file size from Data Import Subgroup Sept 2020
|
| Comments |
| Comment by Oleksii Kuzminov [ 05/Aug/20 ] |
|
Kateryna Senchenko Ann-Marie Breaux I changed the status to draft. After the final PoC and approvals, we can adapt this umbrella and stories to the new requirements. |
| Comment by Ann-Marie Breaux (Inactive) [ 05/Aug/20 ] |
|
Sounds good Oleksii Kuzminov Thank you! |
| Comment by Marc Johnson [ 12/Aug/20 ] |
What aspect of the source record storage database approach is a bottleneck? |
| Comment by Ann-Marie Breaux (Inactive) [ 15/Sep/20 ] |
|
Hi Oleksii Kuzminov I changed
Thank you! |
| Comment by Marc Johnson [ 15/Sep/20 ] |
|
Ann-Marie Breaux Does that mean that the approach outlined in this issue has been agreed and development will start on it? |
| Comment by Ann-Marie Breaux (Inactive) [ 15/Sep/20 ] |
|
Taras Spashchenko and VBar Are you comfortable with the path forward on this, or should we seek review/approval from the broader FOLIO tech community? |
| Comment by Taras Spashchenko [ 19/Sep/20 ] |
|
The steps are OK, and we can proceed with detailed stories and implementation. |
| Comment by Marc Johnson [ 21/Sep/20 ] |
Does that mean that folks like the Technical Leads / Technical Council will not have an opportunity to provide feedback on this change? |
| Comment by Taras Spashchenko [ 21/Sep/20 ] |
|
Marc Johnson, this concerns the internals of Data Import; it is not a platform-wide change. I am not sure that it makes sense to bring it to the TL or TC; to be honest, that would take a lot of time. But based on the completed PoC, it will bring the required reliability with quite good performance, and as far as I know, we do not have a real alternative that could be implemented with reasonable effort to achieve the same results. |
| Comment by Marc Johnson [ 21/Sep/20 ] |
There is every chance that I am misunderstanding the scope of this work. Is this the work that changes the integration between the various modules involved in data import from using the HTTP API provided by mod-pubsub to using Kafka directly? |
| Comment by Taras Spashchenko [ 21/Sep/20 ] |
|
Yes, you are right, the solution for Data Import is to change the interaction between components from HTTP to direct Kafka connections. But it is not a substitution for HTTP interaction nor Pubsub that is proposed for the whole platform. |
| Comment by Marc Johnson [ 21/Sep/20 ] |
Thank you for confirming the scope of this work. I had thought this design was intended to be shared with a broader audience for feedback. I imagine I might have a different sense of what is considered a significant technical decision. I think a side-effect of this work is that Kafka moves from being a design decision of mod-pubsub to a platform-level capability that modules can use (much like how introducing mod-pubsub for the first generation of data import made it available to other modules). To me, even if we consider the changes to data import itself to not be a significant design decision, I think this change in the emphasis and visibility of Kafka is a significant architectural change. I don't know if we want to explore that topic on this issue. |
| Comment by Ann-Marie Breaux (Inactive) [ 28/Sep/20 ] |
|
Kateryna Senchenko Oleksii Kuzminov Taras Spashchenko The Capacity Planning Team is already starting to plan for R1 2021. The sooner we can get these draft stories changed to open, add any other necessary stories, and have a t-shirt size for backend, the better. Please let me know if there's anything I can do to help. Thank you! |
| Comment by Ian Walls [ 29/Sep/20 ] |
|
I would agree with Marc Johnson on this... direct access to Kafka, instead of via mod-pubsub, makes it a core piece of the platform. I'd be in favor of it; I think having messaging built in "close to the ground" would have lots of utility for the project and make extensibility much easier. But it is a significant choice to make. |
| Comment by Ann-Marie Breaux (Inactive) [ 03/Dec/20 ] |
|
Hi Oleksii Kuzminov At a check-in today, EBSCO was emphasizing that once this work is done, we should get with the Performance Task Force to check a couple of standard scenarios, especially with regard to very large files being imported. Should we include a couple of tasks in this feature to account for this? |