...
mod-data-export-spring is designed to manage, configure, and run jobs. This module is the entry point for starting a data export: it calls mod-data-export-worker to execute jobs by sending events to mod-data-export-worker's Kafka topic.
What should be done in this module:
- add a new export type, 'eHoldings';
- add new request parameters (to ExportTypeSpecificParameters.json) to pass the export fields, the search parameters for the title search, and other parameters (Jira: MODEXPS-94);
- add a new JobCommandBuilder that takes the request parameters and passes them in the Kafka event;
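As a rough illustration of the builder step, the sketch below flattens export request parameters into string job parameters that could travel in the Kafka event. It is not the module's actual code: the class names EHoldingsExportParams and EHoldingsJobCommandBuilderSketch, and every parameter name (recordId, recordType, packageFields, titleFields, titleSearchFilters), are assumptions made for this example, and the real JobCommandBuilder contract in mod-data-export-spring may differ.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical request parameters for the 'eHoldings' export type.
// Field names are assumptions, not the actual ExportTypeSpecificParameters.json schema.
final class EHoldingsExportParams {
  final String recordId;
  final String recordType;
  final List<String> packageFields;
  final List<String> titleFields;
  final String titleSearchFilters; // may be null when no title search is requested

  EHoldingsExportParams(String recordId, String recordType, List<String> packageFields,
                        List<String> titleFields, String titleSearchFilters) {
    this.recordId = recordId;
    this.recordType = recordType;
    this.packageFields = packageFields;
    this.titleFields = titleFields;
    this.titleSearchFilters = titleSearchFilters;
  }
}

// Hypothetical builder: turns request parameters into flat string job parameters.
final class EHoldingsJobCommandBuilderSketch {

  static Map<String, String> buildJobParameters(EHoldingsExportParams params) {
    Map<String, String> jobParameters = new LinkedHashMap<>();
    jobParameters.put("recordId", params.recordId);
    jobParameters.put("recordType", params.recordType);
    // Lists are joined with commas so each job parameter stays a plain string.
    jobParameters.put("packageFields", String.join(",", params.packageFields));
    jobParameters.put("titleFields", String.join(",", params.titleFields));
    // The search filter is optional and is omitted when absent.
    if (params.titleSearchFilters != null) {
      jobParameters.put("titleSearchFilters", params.titleSearchFilters);
    }
    return jobParameters;
  }
}
```

Keeping every value a plain string makes the parameters easy to serialize into the event and to read back on the worker side, at the cost of having to split the comma-joined lists again there.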
mod-data-export-worker is intended to receive events from mod-data-export-spring and execute its jobs. The module is built on the Spring Batch framework, and jobs are configured as a set of steps. A job executes in three stages: retrieve data, process data, and write data to a temporary file. Uploading files to vendor-specific storage is already preconfigured (using an AWS S3 bucket) via a listener and happens once the file is written.
What should be done in this module:
- create a Reader extending base functionality (CsvItemReader.java, see CirculationLogCsvItemReader.java as an example). The reader should retrieve packages/titles using REST clients, taking search parameters from the incoming Kafka event (from job parameters);
- create a Processor (implementing ItemProcessor). The processor has to take only selected fields for export from the incoming packages/titles. The list of fields for export comes from job parameters;
- create a Writer extending base functionality (we can just use CsvWriter.java if nothing special is needed);
- create a Configuration to build a job and set the Reader, Writer, and Processor (see CirculationLogJobConfig.java);
- configure a cleaner to purge outdated files (those generated more than 30 days ago);
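The Processor item in the list above can be sketched as follows. This is only an illustration under stated assumptions: ItemProcessorSketch is a local stand-in that mirrors the shape of Spring Batch's ItemProcessor<I, O> so the example compiles without the framework, and FieldSelectingProcessor, CsvLineSketch, and the representation of a package/title as a Map<String, String> are all hypothetical, not existing classes in mod-data-export-worker.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

// Local stand-in with the same shape as Spring Batch's ItemProcessor<I, O>,
// declared here only so the sketch is self-contained.
interface ItemProcessorSketch<I, O> {
  O process(I item) throws Exception;
}

// Hypothetical processor: keeps only the fields selected for export, in the requested order.
final class FieldSelectingProcessor
    implements ItemProcessorSketch<Map<String, String>, List<String>> {

  private final List<String> exportFields; // would come from the job parameters

  FieldSelectingProcessor(List<String> exportFields) {
    this.exportFields = new ArrayList<>(exportFields);
  }

  @Override
  public List<String> process(Map<String, String> item) {
    List<String> row = new ArrayList<>();
    for (String field : exportFields) {
      // A missing field becomes an empty cell so the columns stay aligned.
      String value = item.get(field);
      row.add(value == null ? "" : value);
    }
    return row;
  }
}

// Hypothetical helper showing what a writer would emit for one processed row.
final class CsvLineSketch {

  // Quotes a value when it contains a comma, a double quote, or a line break.
  static String escape(String value) {
    if (value.contains(",") || value.contains("\"")
        || value.contains("\n") || value.contains("\r")) {
      return "\"" + value.replace("\"", "\"\"") + "\"";
    }
    return value;
  }

  static String toLine(List<String> row) {
    List<String> cells = new ArrayList<>();
    for (String cell : row) {
      cells.add(escape(cell));
    }
    return String.join(",", cells);
  }
}
```

Selecting fields in the processor rather than in the reader keeps the reader reusable for any field list, since the reader only needs the search parameters.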
...
- Should a user be automatically directed to the Export Manager after pressing the 'Export' button?
- Should the list of package and title fields be configurable in Settings, or will it always be hardcoded?
It makes sense to add a timestamp to the generated file name (to make it unique).
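A minimal sketch of such a file name, assuming a "prefix_timestamp.csv" layout. The class name ExportFileNameSketch, the prefix, and the timestamp pattern are all illustrative choices, not an agreed naming convention.

```java
import java.time.LocalDateTime;
import java.time.format.DateTimeFormatter;

// Hypothetical helper that appends a timestamp to the export file name.
final class ExportFileNameSketch {

  // Second-resolution timestamp, e.g. 20220315_090507 (pattern is an assumption).
  private static final DateTimeFormatter STAMP =
      DateTimeFormatter.ofPattern("yyyyMMdd_HHmmss");

  static String build(String prefix, LocalDateTime generatedAt) {
    return prefix + "_" + generatedAt.format(STAMP) + ".csv";
  }
}
```

With second resolution, two exports started within the same second would still produce the same name, so combining the timestamp with a job identifier may be worth considering.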
...