FOLIO Harvester - new FOLIO module (UXPROD-2293)

[UXPROD-4034] GBV. FOLIO module wrapper around existing Harvester (Poppy) Created: 06/Feb/23  Updated: 05/Jul/23  Resolved: 12/Jun/23

Status: Closed
Project: UX Product
Components: None
Affects versions: None
Fix versions: Poppy (R2 2023)
Parent: FOLIO Harvester - new FOLIO module

Type: New Feature Priority: P2
Reporter: Charlotte Whitt Assignee: Charlotte Whitt
Resolution: Done Votes: 0
Labels: CBS2FOLIO, GBV, prio1
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original estimate: Not Specified

Attachments: PNG File Skärmavbild 2021-06-16 kl. 13.11.01.png     PNG File Skärmavbild 2021-06-16 kl. 13.11.27.png     PNG File Skärmavbild 2021-06-16 kl. 13.11.51.png    
Issue links:
Blocks
is blocked by UXPROD-3013 GBV. Consider design for integration ... Closed
Cloners
clones UXPROD-3892 GBV. FOLIO module wrapper around exis... Closed
Defines
defines MODHAADM-41 Failed-records API: Translate XML fie... Closed
defines MODHAADM-45 Add option to return previous harvest... Closed
is defined by MODHAADM-39 Describe new APIs etc in README. Cont... Closed
is defined by MODHAADM-40 Declared response type is "text" but ... Closed
is defined by MODHAADM-42 Support paging parameters (offset,lim... Closed
is defined by MODHAADM-43 Create API /previous-jobs/failed-reco... Closed
is defined by MODHAADM-46 OpenApi: Document {id} path parameter... Closed
is defined by MODHAADM-47 Enable suitable CQL operators for que... Closed
is defined by MODHAADM-48 Add the log text to list of queryable... Closed
is defined by MODHAADM-49 Don't set `finished` in `previousJob`... Closed
is defined by MODHAADM-50 Searching old logs by message does no... Closed
is defined by MODHAADM-51 Better searching for job names Closed
is defined by MODHAADM-55 Cannot update harvestable: No schema ... Closed
is defined by MODHAADM-60 Filtering by timeStamp does not work ... Closed
is defined by UIHAADM-5 Provide access to most recent harvest... Closed
is defined by UIHAADM-11 Tidy up Harvestable editing form in a... Closed
is defined by UIHAADM-25 Display simple list of errors on the ... Closed
is defined by UIHAADM-29 Provide access to harvester logs from... Closed
is defined by UIHAADM-33 Rename various entites Closed
is defined by UIHAADM-34 Activate the search pane for the harv... Closed
is defined by UIHAADM-35 Searching jobs by ID or Harvestable I... Closed
is defined by UIHAADM-36 On old-jobs page, add filters for dat... Closed
is defined by UIHAADM-44 Trying to view logs for a new (never ... Closed
is defined by UIHAADM-45 Trying to view logs for a job that ha... Closed
is defined by UIHAADM-46 Start/Stop Job do not work due to inc... Closed
is defined by UIHAADM-47 Sanitize XML-bulk records before savi... Closed
is defined by UIHAADM-48 Can't save harvestables when usedBy o... Closed
is defined by UIHAADM-49 Sanitize status values in records bef... Closed
is defined by MODHAADM-30 Populate "previous-job" with a correc... Closed
is defined by MODHAADM-32 OUF/Legacy Harvester: usedBy can't be... Closed
is defined by MODHAADM-58 Connector and Status harvestable cann... Closed
is defined by UIHAADM-12 Tidy up Harvestable read-only display... Closed
is defined by UIHAADM-23 [@formatjs/cli] [WARN] Error validati... Closed
is defined by UIHAADM-37 Minor quality-of-life improvements fo... Closed
is defined by UIHAADM-38 Single-line summary of stats from har... Closed
is defined by UIHAADM-39 Make logs-page wording dependent on w... Closed
is defined by UIHAADM-40 On the Logs page, move the plain-text... Closed
is defined by UIHAADM-41 Add a button on the Logs page to re-r... Closed
is defined by UIHAADM-42 Remove the “source” data from all dis... Closed
is defined by UIHAADM-43 Move Storage engines/Transformation p... Closed
is defined by UIHAADM-51 Prevent UI from attempting to sort Ol... Closed
is defined by UIHAADM-52 Minor UI changes decided Closed
is defined by UIHAADM-53 Aggregated view of failed records Closed
is defined by UIHAADM-54 Section Harvestables. Change display ... Closed
Release: Poppy (R2 2023)
Epic Link: FOLIO Harvester - new FOLIO module
Front End Estimate: XL < 15 days
Front End Estimator: Charlotte Whitt
Back End Estimate: Large < 10 days
Back End Estimator: Niels Erik Nielsen
Development Team: Sif
PO Rank: 0
Rank: Cornell (Full Sum 2021): R5
Rank: GBV (MVP Sum 2020): R1
Rank: Mainz (Full TBD): R1

 Description   

Current situation or problem: Wrap up the work on the FOLIO wrapper for Harvester Admin. For now, the wrapper interacts with the legacy Harvester.

In scope: FOLIO module wrapper around existing Harvester for GBV. 1st phase of FOLIO Harvester module.

  1. Log files should be stored
  2. UI display of log data

Use case(s): For the German libraries - GBV, Hebis, Mainz then ...

Out of scope:
Harvester version 2.0. 2nd phase FOLIO Harvester module

Use case(s):
As an admin from a library in the GBV consortium,
I need to be able to run (start, stop, general access to) my library's jobs in the Harvester

Proposed solution/stories:
A solution that meets GBV’s requirements for giving each admin access to only their own jobs in the Harvester. It assumes that libraries will not have access to transformation pipelines, transformation steps or back-end (Inventory storage) configurations; these are assumed to be maintained by a central maintainer in the existing Harvester Admin UI.

The basic idea is a solution that uses FOLIO's current user management modules and authentication, and applies row-level access control in the existing Harvester database. The existing Harvester admin UI for jobs management would thus be discarded, but the existing Harvester admin REST APIs would be kept and their XML-based endpoints wrapped in JSON-based APIs, making it possible to put a Stripes admin UI on top of them. The underlying MySQL database could be kept as is, at least initially, possibly save for a few tweaks. The harvest scheduler and the logic for running harvest jobs would also not change.
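A minimal sketch of the XML-to-JSON wrapping idea, in Python for illustration only. The XML element names and the sample payload are hypothetical and do not reflect the Harvester's actual schema; the real wrapper would fetch the XML from the legacy REST API rather than from a string.

```python
import json
import xml.etree.ElementTree as ET

# Hypothetical payload; element names are illustrative, not the real Harvester schema.
SAMPLE_XML = """
<harvestable>
  <id>10001</id>
  <name>Example job</name>
  <enabled>true</enabled>
</harvestable>
"""


def element_to_dict(element):
    """Recursively convert an XML element into nested dicts, lists and strings."""
    children = list(element)
    if not children:
        # Leaf element: keep its text content as a string.
        return (element.text or "").strip()
    result = {}
    for child in children:
        value = element_to_dict(child)
        if child.tag in result:
            # A repeated tag is collected into a list.
            if not isinstance(result[child.tag], list):
                result[child.tag] = [result[child.tag]]
            result[child.tag].append(value)
        else:
            result[child.tag] = value
    return result


def xml_to_json(xml_text):
    """Wrap an XML document as a JSON string keyed by the root element name."""
    root = ET.fromstring(xml_text.strip())
    return json.dumps({root.tag: element_to_dict(root)})
```

Calling xml_to_json(SAMPLE_XML) yields a JSON object with a single "harvestable" key. All values stay strings in this sketch; mapping them to typed JSON fields would be a separate design decision.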

The suggestion is to implement access control by tagging all records in the Harvester database with tokens specifying who created the records and who should thus be able to retrieve them again. The simplest approach is possibly to install the Harvester module in a given FOLIO installation, create a user per library in that installation and filter jobs by user ID. Alternatively, it might be possible to control access per tenant.
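The filter-by-user-ID approach could be sketched roughly as follows, in Python for illustration. The Job record, its owner field and the function names are assumptions made for this example; in practice the filtering would more likely be applied in the query against the Harvester database than in memory.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Job:
    job_id: int
    name: str
    owner: str  # hypothetical tag: ID of the FOLIO user who created the record


def jobs_visible_to(jobs, user_id):
    """Return only the jobs tagged with the requesting user's ID."""
    return [job for job in jobs if job.owner == user_id]


def get_job(jobs, job_id, user_id):
    """Fetch one job; jobs owned by other users are treated as not found."""
    for job in jobs_visible_to(jobs, user_id):
        if job.job_id == job_id:
            return job
    return None
```

Returning "not found" instead of "forbidden" for another library's job avoids revealing that the job exists; that choice is part of the sketch, not a stated requirement.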

In other words, the existing Harvester database and its REST APIs would become the storage layer for the new JSON APIs.

The FOLIO module, i.e. mod-harvester-admin, would need to know where the Harvester resides. This could be a FOLIO tenant setting, but that would require the module to have its own storage and the UI to provide a settings page. Mod-configuration might be another option. A simpler solution might be to declare the Harvester’s location in the deployment descriptor, in much the same way that access to PostgreSQL is configured in other FOLIO storage modules.
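As a rough illustration of the deployment-descriptor option, the module could read the Harvester's location from environment variables at startup. The variable names HARVESTER_HOST, HARVESTER_PORT and HARVESTER_PROTOCOL and the defaults below are hypothetical, chosen for this Python sketch only.

```python
import os


def harvester_base_url(env=None):
    """Build the legacy Harvester's base URL from environment settings.

    The variable names are assumptions for this sketch, not the module's
    documented configuration.
    """
    env = os.environ if env is None else env
    host = env.get("HARVESTER_HOST")
    if not host:
        # Fail early: without a host the module cannot reach its storage layer.
        raise RuntimeError("HARVESTER_HOST is not set")
    protocol = env.get("HARVESTER_PROTOCOL", "http")
    port = env.get("HARVESTER_PORT", "8080")
    return f"{protocol}://{host}:{port}"
```

Passing a plain dict as env makes the function easy to exercise without touching the real process environment.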

Changing the current primary key scheme in the Harvester database from sequence-generated primary keys to UUIDs should be considered. This would bring the schema closer to FOLIO conventions (and have unrelated configuration maintenance benefits).
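One possible way to move existing rows from sequence numbers to UUIDs is to derive each UUID deterministically from the table name and the legacy key, so that repeated migrations produce the same identifiers. This is only one option, not something the proposal prescribes; the namespace value and the function name in this Python sketch are hypothetical.

```python
import uuid

# Hypothetical fixed namespace; any UUID works as long as it never changes.
HARVESTER_NAMESPACE = uuid.uuid5(uuid.NAMESPACE_DNS, "harvester.example.org")


def legacy_id_to_uuid(table, legacy_id):
    """Derive a stable version-5 UUID from a table name and a sequence-generated key."""
    return str(uuid.uuid5(HARVESTER_NAMESPACE, f"{table}:{legacy_id}"))
```

Including the table name keeps row 1 of one table from colliding with row 1 of another. Newly created records could simply use random UUIDs (uuid.uuid4) instead.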

The project would thus include:

  • Wrap select XML APIs as JSON APIs and provide unit tests
  • Perhaps make a few changes to the underlying database for more FOLIO-like JSON schemas
  • Write Stripes UIs with unit tests, to replace existing JSF UIs
  • Implement row-level access control

The following APIs would need to be implemented in JSON:

  • Storages (GET, to populate drop-down in jobs edit page)
  • Jobs (POST, PUT, GET, DELETE)
  • Transformation pipelines (GET, to populate drop-down in jobs edit page)
  • Job logs (GET, possibly just text passthrough rather than JSON)
  • Failed records (GET)
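The list above can be summarized as a small table of resources and the HTTP methods each one would accept. The resource keys in this Python sketch are hypothetical labels, not the module's actual endpoint paths.

```python
# Resource names are illustrative labels, not actual endpoint paths.
ALLOWED_METHODS = {
    "storages": {"GET"},
    "jobs": {"POST", "PUT", "GET", "DELETE"},
    "transformation-pipelines": {"GET"},
    "job-logs": {"GET"},
    "failed-records": {"GET"},
}


def is_allowed(resource, method):
    """Return True if the listed API would accept this HTTP method."""
    return method.upper() in ALLOWED_METHODS.get(resource, set())
```

Only jobs are writable through the wrapper in this outline; the other resources are read-only, which matches the assumption that pipelines and storages are maintained centrally.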

The following existing JSF-based UIs would be implemented as Stripes front-ends, not necessarily in a one-to-one fashion, but at least so that existing functionality continues to be supported.

Links to additional info:
https://folio-org.atlassian.net/wiki/pages/viewpage.action?pageId=1387435

Questions


Generated at Fri Feb 09 00:36:44 UTC 2024 using Jira 1001.0.0-SNAPSHOT#100246-sha1:7a5c50119eb0633d306e14180817ddef5e80c75d.