FOLIO module wrapper around existing Index Data Harvester for GBV.
In scope:
A FOLIO module wrapper around the existing Harvester for GBV. This is the first phase of the FOLIO Harvester module.
Use case(s):
As an admin from a library in the GBV consortium,
I need to be able to run (start, stop, and generally access) my library's jobs in the Harvester.
Proposed solution/stories:
The proposed solution meets GBV’s requirement that each admin can access only their own jobs in the Harvester. It assumes that libraries will not have access to transformation pipelines, transformation steps or back-end (Inventory storage) configurations; these are assumed to be maintained in the existing Harvester Admin UI by a central maintainer.
The basic idea is a solution that uses the current FOLIO user management modules and authentication and applies row-level access control in the existing Harvester database. The existing Harvester admin UI for jobs management would be discarded, but the existing Harvester admin REST APIs would be kept: these XML-based REST APIs would be wrapped in JSON-based APIs, which makes it possible to put a Stripes admin UI on top of them. The underlying MySQL database could be kept as is, at least initially, possibly save for a few tweaks. The harvest scheduler and the logic for running harvest jobs would not change either.
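As a rough illustration of the wrapping idea, the sketch below flattens a single XML record into a JSON object. The element names (`harvestable`, `id`, `name`, `enabled`) are assumptions made for the example and are not taken from the actual Harvester schema; a real wrapper would also need to handle nested elements, attributes and data types.

```python
import json
import xml.etree.ElementTree as ET

# Hypothetical XML payload; the real Harvester schema may differ.
SAMPLE_XML = """<harvestable>
  <id>1001</id>
  <name>GBV library A nightly</name>
  <enabled>true</enabled>
</harvestable>"""


def xml_record_to_json(xml_text: str) -> str:
    """Flatten a single-level XML record into a JSON object string."""
    root = ET.fromstring(xml_text)
    # Each child element becomes one JSON property; values stay strings.
    record = {child.tag: (child.text or "").strip() for child in root}
    return json.dumps(record)


if __name__ == "__main__":
    print(xml_record_to_json(SAMPLE_XML))
```

In the proposed module this kind of translation would sit between the JSON API exposed to Stripes and the existing XML API, in both directions.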
The suggestion is to implement access control by tagging all records in the Harvester database with tokens that specify who created each record and who should thus be able to retrieve it again. The simplest approach is possibly to install the Harvester module in a given FOLIO installation, create a user per library in that installation, and filter jobs by user ID. Alternatively, it might be possible to control access per tenant.
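The tagging approach could look roughly like the sketch below. It uses an in-memory SQLite database purely as a stand-in for the Harvester's MySQL database, and the table and column names (`job`, `owner_id`) as well as the user IDs are hypothetical.

```python
import sqlite3


def create_demo_db() -> sqlite3.Connection:
    """Build a tiny stand-in database with an owner tag on every job row."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE job ("
        "id INTEGER PRIMARY KEY, name TEXT NOT NULL, owner_id TEXT NOT NULL)"
    )
    conn.executemany(
        "INSERT INTO job (name, owner_id) VALUES (?, ?)",
        [
            ("Library A nightly", "user-a"),
            ("Library A weekly", "user-a"),
            ("Library B nightly", "user-b"),
        ],
    )
    conn.commit()
    return conn


def jobs_for_user(conn: sqlite3.Connection, user_id: str) -> list:
    """Return only the job names tagged with the requesting user's ID."""
    rows = conn.execute(
        "SELECT name FROM job WHERE owner_id = ? ORDER BY id", (user_id,)
    )
    return [name for (name,) in rows]


if __name__ == "__main__":
    print(jobs_for_user(create_demo_db(), "user-a"))
```

The user ID would come from the authenticated FOLIO request rather than from the client, so that a library cannot ask for another library's rows.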
The existing Harvester database and its REST APIs would in other words become the storage layer for the new JSON APIs.
The FOLIO module, i.e. mod-harvester-admin, would need to know where the Harvester resides. This could be a FOLIO tenant setting, but that would require the module to have its own storage and the UI to provide a settings page. Mod-configuration might be another option. A simpler solution might be to declare the Harvester’s location in the deployment descriptor, in much the same way that access to PostgreSQL is configured in other FOLIO storage modules.
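If the location is declared in the deployment descriptor, the module would typically receive it as environment variables at start-up. A minimal sketch, assuming hypothetical variable names (`HARVESTER_PROTOCOL`, `HARVESTER_HOST`, `HARVESTER_PORT`) and arbitrary defaults:

```python
import os
from typing import Mapping


def harvester_base_url(env: Mapping[str, str] = os.environ) -> str:
    """Build the Harvester base URL from (hypothetical) environment settings."""
    protocol = env.get("HARVESTER_PROTOCOL", "http")
    host = env.get("HARVESTER_HOST", "localhost")
    port = env.get("HARVESTER_PORT", "8080")
    return f"{protocol}://{host}:{port}"


if __name__ == "__main__":
    print(harvester_base_url())
```

This keeps the module stateless with respect to its own configuration, at the cost of requiring a redeployment to change the Harvester's location.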
Changing the current primary key scheme in the Harvester database from sequence-generated primary keys to UUIDs should be considered. This would bring the schema closer to FOLIO conventions (and would have unrelated configuration maintenance benefits).
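A key-scheme change of this kind also has to rewrite foreign keys consistently. The sketch below shows the idea on plain dictionaries; the record shapes (`id`, `pipeline_id`) are hypothetical, and an actual migration would of course be done in the database.

```python
import uuid


def assign_uuids(jobs: list, pipelines: list) -> tuple:
    """Return copies of both record lists with UUID keys.

    Foreign keys from jobs to pipelines are remapped so that they
    still point at the same pipeline after the change.
    """
    # Map each old sequence-generated pipeline ID to a fresh UUID.
    pipeline_ids = {p["id"]: str(uuid.uuid4()) for p in pipelines}
    new_pipelines = [{**p, "id": pipeline_ids[p["id"]]} for p in pipelines]
    new_jobs = [
        {
            **j,
            "id": str(uuid.uuid4()),
            "pipeline_id": pipeline_ids[j["pipeline_id"]],
        }
        for j in jobs
    ]
    return new_jobs, new_pipelines


if __name__ == "__main__":
    demo_jobs = [{"id": 10, "pipeline_id": 1}, {"id": 11, "pipeline_id": 1}]
    demo_pipelines = [{"id": 1, "name": "example pipeline"}]
    print(assign_uuids(demo_jobs, demo_pipelines))
```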
The project would thus include:
* Wrap select XML APIs as JSON APIs, provide unit tests.
* Perhaps make a few changes to the underlying database for more FOLIO-like JSON schemas
* Write Stripes UIs with unit tests, to replace existing JSF UIs
* Implement row level access control
The following APIs would need to be implemented in JSON:
* Storages (GET, to populate drop-down in jobs edit page)
* Jobs (POST, PUT, GET, DELETE)
* Transformation pipelines (GET, to populate drop-down in jobs edit page)
* Job logs (GET, possibly just text passthrough rather than JSON)
* Failed records (GET)
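The list above can be restated as a small method table, for example to reject unsupported requests early. The resource keys used here are hypothetical placeholders, not actual endpoint paths.

```python
# Allowed HTTP methods per resource, restating the list above.
# The resource keys are illustrative placeholders, not real paths.
ALLOWED_METHODS = {
    "storages": {"GET"},
    "jobs": {"GET", "POST", "PUT", "DELETE"},
    "transformation-pipelines": {"GET"},
    "job-logs": {"GET"},
    "failed-records": {"GET"},
}


def is_allowed(resource: str, method: str) -> bool:
    """Return True if the given HTTP method is in scope for the resource."""
    return method.upper() in ALLOWED_METHODS.get(resource, set())


if __name__ == "__main__":
    print(is_allowed("jobs", "DELETE"), is_allowed("storages", "POST"))
```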
The following existing JSF-based UIs would be reimplemented as Stripes front-ends - not necessarily in a one-to-one fashion, but at least so that existing functionality continues to be supported.
Current screens: