FOLIO Harvester - new FOLIO module (UXPROD-2293)

[UXPROD-4463] GBV. FOLIO module wrapper (mod-harvester-admin, ui-harvester-admin) around existing Legacy Harvester (Quesnelia work) Created: 12/Sep/23  Updated: 08/Feb/24

Status: In Progress
Project: UX Product
Components: None
Affects versions: None
Fix versions: Quesnelia (R1 2024)
Parent: FOLIO Harvester - new FOLIO module

Type: New Feature Priority: P2
Reporter: Charlotte Whitt Assignee: Charlotte Whitt
Resolution: Unresolved Votes: 0
Labels: CBS2FOLIO, GBV, prio1
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original estimate: Not Specified

Attachments: PNG File Skärmavbild 2021-06-16 kl. 13.11.01.png     PNG File Skärmavbild 2021-06-16 kl. 13.11.27.png     PNG File Skärmavbild 2021-06-16 kl. 13.11.51.png    
Issue links:
Cloners
is cloned by UXPROD-4521 GBV. FOLIO module wrapper (mod-harves... Open
Defines
is defined by MODHAADM-80 Release mod-harvester-admin v2.0.x Open
is defined by MODHAADM-84 Investigate memory issues of the Lega... Closed
is defined by MODHAADM-68 Cannot PUT or POST transformation pip... Closed
is defined by MODHAADM-83 localindices: when batch processing, ... Closed
is defined by UIHAADM-8 Support management of transformation ... Closed
is defined by UIHAADM-69 Harvester-admin. Datepicker problems Closed
is defined by MODHAADM-63 Enable storageBatchLimit in PUT/POST ... Closed
is defined by MODHAADM-69 Deletion of steps that are in use sho... Closed
is defined by MODHAADM-76 BE in Lecacy Harvester: Can't access ... Closed
is defined by UIHAADM-9 Support management of transformation ... Closed
is defined by UIHAADM-94 Harvester-admin. Sort jobs in descend... Closed
is defined by UIHAADM-95 Harvester-admin. Paginate list of jobs Closed
is defined by UIHAADM-98 Harvester-admin. Datepicker calendar ... Closed
is defined by UIHAADM-100 Harvester admin settings improvements Closed
is defined by UIHAADM-102 Harvester admin result list improvements Closed
is defined by UIHAADM-103 Harvester admin job detail view > The... Closed
is defined by UIHAADM-106 Harvester admin: ability to export fa... Closed
is defined by UIHAADM-107 When viewing a Storage Engine, obscur... Closed
is defined by UIHAADM-108 When editing a Transformation Pipelin... Closed
is defined by UIHAADM-109 When editing a Transformation Step, v... Closed
is defined by UIHAADM-110 When viewing/editing a Transformation... Closed
is defined by UIHAADM-111 When editing a Transformation Step, p... Closed
is defined by UIHAADM-117 Failed Records tab: in result list, R... Closed
is defined by UIHAADM-120 Status not displaying in Job pane-title Closed
is defined by UIHAADM-122 Remove developer information from rec... Closed
is defined by UIHAADM-10 Add settings page Closed
is defined by MODHAADM-71 Harvester-admin: Deletion of log files Draft
is defined by UIHAADM-101 Harvester admin search improvements Blocked
is defined by UIHAADM-104 Release ui-harvester-admin v2.1.x Blocked
is defined by UIHAADM-105 Harvester admin: investigate failed jobs In Refinement
Release: Quesnelia (R1 2024)
Epic Link: FOLIO Harvester - new FOLIO module
Front End Estimate: XL < 15 days
Front End Estimator: Charlotte Whitt
Back End Estimate: Large < 10 days
Back End Estimator: Niels Erik Nielsen
Development Team: Sif
PO Rank: 0
Rank: Cornell (Full Sum 2021): R5
Rank: GBV (MVP Sum 2020): R1
Rank: Mainz (Full TBD): R1

 Description   

Current situation or problem: Remaining work on the FOLIO wrapper for Harvester Admin. The wrapper interacts for now with the legacy Harvester.

In scope: FOLIO module wrapper around existing Harvester for GBV. 2nd phase of FOLIO Harvester module.

Use case(s): For the German libraries - GBV, Hebis, Mainz then ...

Out of scope:
Harvester version 2.0. 2nd phase FOLIO Harvester module

Use case(s):
As an admin from a library in the GBV consortium,
I need to be able to run (start, stop, and generally access) my library's jobs in the Harvester.

Proposed solution/stories:
The solution should meet GBV’s requirements for giving different admins access to only their own jobs in the Harvester. It assumes that libraries will not have access to transformation pipelines, transformation steps, or back-end (Inventory storage) configurations; these are assumed to be maintained by a central maintainer in the existing Harvester Admin UI.

The basic idea is a solution that uses the current FOLIO user management modules and authentication, and applies row-level access control in the existing Harvester database. The existing Harvester admin UI for jobs management would thus be discarded, but the existing Harvester admin REST APIs would be kept: these XML-based REST APIs would be wrapped in JSON-based APIs, which makes it possible to put a Stripes admin UI on top of them. The underlying MySQL database could be kept as is, at least initially, possibly save for a few tweaks. The harvest scheduler and the harvest job running logic would not change either.
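As a rough illustration of the wrapping idea, the sketch below converts an XML response body into a JSON document. The element names in SAMPLE_XML are hypothetical and are not taken from the legacy Harvester's actual schema; a real wrapper would also need to handle attributes, namespaces and type conversion.

```python
import json
import xml.etree.ElementTree as ET

# Hypothetical XML payload; the element names are illustrative only and
# are NOT taken from the legacy Harvester's actual schema.
SAMPLE_XML = """<harvestable>
  <id>1001</id>
  <name>Example job</name>
  <enabled>true</enabled>
</harvestable>"""


def xml_to_dict(element):
    """Convert an element to a dict (if it has children) or a string (leaf)."""
    children = list(element)
    if not children:
        return (element.text or "").strip()
    result = {}
    for child in children:
        value = xml_to_dict(child)
        if child.tag in result:
            # A repeated tag is collected into a JSON array.
            if not isinstance(result[child.tag], list):
                result[child.tag] = [result[child.tag]]
            result[child.tag].append(value)
        else:
            result[child.tag] = value
    return result


def wrap_as_json(xml_text):
    """Parse an XML document and return it as a JSON string."""
    root = ET.fromstring(xml_text)
    return json.dumps({root.tag: xml_to_dict(root)}, sort_keys=True)


if __name__ == "__main__":
    print(wrap_as_json(SAMPLE_XML))
```

All values stay strings in this sketch; mapping them to booleans or numbers would depend on the JSON schemas eventually chosen.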

The suggestion is to implement access control by tagging all records in the Harvester database with tokens that specify who created each record and who should therefore be able to retrieve it again. The simplest approach is possibly to install the Harvester module in a given FOLIO, create a user per library in that FOLIO, and filter jobs by user ID. Alternatively, it might be possible to control access per tenant.
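A minimal sketch of the filter-by-user-ID variant, assuming each job record carries a tag with the FOLIO user ID that created it. The Job class and the owner_id field are hypothetical names for illustration; in practice the filter would more likely be applied in the database query than in memory.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Job:
    id: int
    name: str
    owner_id: str  # hypothetical tag: FOLIO user ID that created the record


def visible_jobs(jobs, user_id):
    """Return only the jobs tagged with the requesting user's ID."""
    return [job for job in jobs if job.owner_id == user_id]


def get_job(jobs, job_id, user_id):
    """Look up one job; a job owned by another user is treated as not found."""
    for job in visible_jobs(jobs, user_id):
        if job.id == job_id:
            return job
    return None


if __name__ == "__main__":
    all_jobs = [
        Job(1, "Library A nightly", "user-a"),
        Job(2, "Library B nightly", "user-b"),
    ]
    print(visible_jobs(all_jobs, "user-a"))
```

Treating another library's job as "not found" rather than "forbidden" avoids revealing that the job exists; whether that is the desired behavior is a design decision left open here.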

The existing Harvester database and its REST APIs would in other words become the storage layer for the new JSON APIs.

The FOLIO module, i.e. mod-harvester-admin, would need to know where the Harvester resides. This could be a FOLIO tenant setting, but that would require the module to have its own storage and the UI to provide a settings page. Mod-configuration might be another option. A simpler solution might be to declare the Harvester’s location in the deployment descriptor - pretty much the same way access to PostgreSQL is configured in other FOLIO storage modules.
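The deployment descriptor option amounts to passing the Harvester's location to the module as environment variables. The sketch below shows how a module might read them; the variable names HARVESTER_PROTOCOL, HARVESTER_HOST and HARVESTER_PORT and the defaults are assumptions made for this example, not the names used by mod-harvester-admin.

```python
import os


def harvester_base_url(env=None):
    """Build the Harvester base URL from environment-style settings.

    The variable names and default values are hypothetical.
    """
    env = os.environ if env is None else env
    host = env.get("HARVESTER_HOST")
    if not host:
        raise RuntimeError("HARVESTER_HOST must be set in the deployment descriptor")
    protocol = env.get("HARVESTER_PROTOCOL", "http")
    port = env.get("HARVESTER_PORT", "8080")
    return f"{protocol}://{host}:{port}"


if __name__ == "__main__":
    print(harvester_base_url({"HARVESTER_HOST": "harvester.example.org"}))
```

Failing fast when the host is missing keeps a misconfigured deployment from starting and then failing on the first request.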

Changing the current primary key scheme in the Harvester database from sequence-generated primary keys to UUIDs should be considered. This would bring the schema closer to FOLIO conventions (and would have unrelated configuration maintenance benefits).
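One conceivable way to migrate existing rows, sketched under the assumption that a stable mapping from old keys to new ones is wanted, is to derive a name-based (version 5) UUID from the table name and the legacy integer key. The namespace and the table name used here are hypothetical; generating random UUIDs and keeping a mapping table would be an equally valid approach.

```python
import uuid

# Hypothetical namespace for this illustration only.
HARVESTER_NAMESPACE = uuid.uuid5(uuid.NAMESPACE_DNS, "harvester.example.org")


def legacy_id_to_uuid(table, legacy_id):
    """Derive a deterministic version 5 UUID from a table name and integer key."""
    return uuid.uuid5(HARVESTER_NAMESPACE, f"{table}/{legacy_id}")


if __name__ == "__main__":
    print(legacy_id_to_uuid("jobs", 1001))
```

Because the mapping is deterministic, foreign keys in other tables can be converted independently and still line up.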

The project would thus include:

  • Wrap select XML APIs as JSON APIs, provide unit tests.
  • Perhaps make a few changes to the underlying database for more FOLIO-like JSON schemas
  • Write Stripes UIs with unit tests, to replace existing JSF UIs
  • Implement row level access control

The following APIs would need to be implemented in JSON:

  • Storages (GET, to populate drop-down in jobs edit page)
  • Jobs (POST, PUT, GET, DELETE)
  • Transformation pipelines (GET, to populate drop-down in jobs edit page)
  • Job logs (GET, possibly just text passthrough rather than JSON)
  • Failed records (GET)
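
The list above can be summarized as a table of resources and the HTTP methods each would support. The resource keys below are hypothetical identifiers chosen for this sketch; the ticket names the resources and methods but not the paths.

```python
# Hypothetical resource keys; only the resources and methods come from the list above.
ALLOWED_METHODS = {
    "storages": {"GET"},
    "jobs": {"POST", "PUT", "GET", "DELETE"},
    "transformation-pipelines": {"GET"},
    "job-logs": {"GET"},
    "failed-records": {"GET"},
}


def is_allowed(resource, method):
    """Return True if the JSON API would expose this method for the resource."""
    return method.upper() in ALLOWED_METHODS.get(resource, set())


if __name__ == "__main__":
    print(is_allowed("jobs", "delete"))
    print(is_allowed("storages", "POST"))
```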

The following existing JSF-based UIs would be implemented as Stripes front-ends, not necessarily in a one-to-one fashion, but at least so that existing functionality continues to be supported.

Links to additional info:
https://folio-org.atlassian.net/wiki/pages/viewpage.action?pageId=1387435

Questions


Generated at Fri Feb 09 00:40:07 UTC 2024 using Jira 1001.0.0-SNAPSHOT#100246-sha1:7a5c50119eb0633d306e14180817ddef5e80c75d.