IN PROGRESS
Participants:
Role | Name | Approval |
---|---|---|
Solution Architect | - approved | |
Java Lead | | |
Product Owner | | |
Spike goals:
- Define fields set
- Define a format for the export document
- Design an endpoint for export
- Define possible difficulties
General view of the export tab
The first phase of the "Cost-per-use" feature should include the option for the user to export title records tied to a particular package. As seen from the screenshots, the export option is located under the "Actions" button.
1. Define fields set
Based on MODKBEKBJ-479, the exported document should include the following fields:
- Title
- Type
- Cost
- Usage
- Cost per use
- % of usage
Below you can find an example table with sample data:
The "Package" column is added only to provide the context of the package and will not be included in the result document.
The last row, "Product Owner Approval", shows whether the Product Owner agrees with the suggested format for the values:
- will not be included in the result file
- approved to be added to the document
- under discussion
Column Name | Package | Title | Type | Cost | Usage | Cost per use | % of usage | year |
---|---|---|---|---|---|---|---|---|
Example value | EBSCO Open Access Journals | Writings of Professor B. B. Edwards | Book | 500.00 | 2225 | 0.22 | 16 | 2019 |
 | EBSCO Open Access Journals | The Seasons and the Symphony | Streaming Video | 800.00 | 4544 | 0.18 | 20 | 2019 |
Product Owner Approval | | | | | | | | |
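The sample values in the table are consistent with "Cost per use" being the cost divided by the usage, rounded to two decimal places. The spike does not state the formula or the rounding mode, so the sketch below is only an illustration under that assumption; the class and method names are hypothetical.

```java
import java.math.BigDecimal;
import java.math.RoundingMode;

public class CostPerUseExample {

  // Assumption: cost per use = cost / usage, rounded half-up to two decimals.
  // The rounding mode is not defined in the spike.
  static BigDecimal costPerUse(BigDecimal cost, long usage) {
    if (usage <= 0) {
      // No usage recorded: the ratio is undefined, so return null
      // instead of dividing by zero.
      return null;
    }
    return cost.divide(BigDecimal.valueOf(usage), 2, RoundingMode.HALF_UP);
  }

  public static void main(String[] args) {
    // Sample rows from the table above.
    System.out.println(costPerUse(new BigDecimal("500.00"), 2225)); // 0.22
    System.out.println(costPerUse(new BigDecimal("800.00"), 4544)); // 0.18
  }
}
```

How a title with zero usage should appear in the export (empty cell, 0, or a dash) is left open here and would need to be agreed with the Product Owner.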
2. Define a format for the export document
The result format of the export file should be csv.
An example of the export file content, based on the columns defined in section 1 (Define fields set), is shown below. The name of the file is <package_name>.csv.
Title,Type,Cost,Usage,Cost per use,% of usage
Writings of Professor B. B. Edwards,Book,500.00,2225,0.22,16
The Seasons and the Symphony,Streaming Video,800.00,4544,0.18,20
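Titles may themselves contain commas (for example "Reading: Harvard Views of Readers, Readership, and Reading History"), so a comma-delimited export needs field quoting. The sketch below shows one possible way to build a CSV line with RFC 4180 style quoting; it is not the module's actual implementation, and the class and method names are hypothetical.

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

public class CsvExportExample {

  // Wrap a field in double quotes when it contains a comma, a quote or a
  // line break; embedded quotes are doubled (RFC 4180 style).
  static String escape(String field) {
    boolean needsQuoting = field.contains(",") || field.contains("\"")
        || field.contains("\n") || field.contains("\r");
    if (!needsQuoting) {
      return field;
    }
    return "\"" + field.replace("\"", "\"\"") + "\"";
  }

  // Join already formatted field values into one comma-delimited line.
  static String toCsvLine(List<String> fields) {
    return fields.stream()
        .map(CsvExportExample::escape)
        .collect(Collectors.joining(","));
  }

  public static void main(String[] args) {
    System.out.println(toCsvLine(Arrays.asList(
        "Title", "Type", "Cost", "Usage", "Cost per use", "% of usage")));
    System.out.println(toCsvLine(Arrays.asList(
        "The Seasons and the Symphony", "Streaming Video",
        "800.00", "4544", "0.18", "20")));
    // A title containing commas is emitted as a single quoted field.
    System.out.println(escape(
        "Reading: Harvard Views of Readers, Readership, and Reading History"));
  }
}
```

With quoting in place a comma inside a title does not shift the columns. A tab delimiter would avoid most quoting but is handled less uniformly by spreadsheet tools.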
3. Design an endpoint for export
The "mod-kb-ebsco-java" module should provide an endpoint for the package-title export option.
The definition for the ModuleDescriptor.json file:
Method | GET |
---|---|
Endpoint | /eholdings/packages/{packageId}/resources/costperuse/export |
Permission Required | "kb-ebsco.package-resources-costperuse.export.collection.get" |
Description | Get cost-per-use information for the titles in csv format. |
The definition for the raml file:
...
/export:
  get:
    description: |
      Endpoint provides cost-per-use information about the titles included in the package in csv format.
    responses:
      200:
        description: OK
        body:
          text/csv:
            example:
              strict: false
              value: !include examples/export/package_title_get_response.csv
4. Define possible difficulties
The main concern in this section is the response time of the APIGEE service for a large number of entities (i.e. ~10000 items).
The actual log files can be found in the MODKBEKBJ-500 issue. They show that the APIGEE service is currently not able to process a large number of requests quickly. This directly slows down the export process, so, from the user experience point of view, the end user should be notified with a message that the export file is being prepared.
Proposed export flow:
The steps of the title export process:
- UI sends a request to the backend to get the csv information about the package titles.
- Backend:
  - fetches title ids from the holdings table
    - if records are in the table - fetch them
    - if there are no records - return an empty result
  - defines the number of requests to be sent to APIGEE
    - divide the total number of title ids by 1000 to get the number of requests
    - prepare batches of at most 1000 title ids each
  - fetches title cost-per-use info
    - if APIGEE returns [200 OK] - continue processing
    - if APIGEE returns [<error>] - return an error message to the user
  - calculates title cost-per-use info
  - sends the export information in text/csv format
- UI gets the text file and
  - creates a file named after the package
  - downloads the file for the user

The existing ui-erm-usage code base can be reused for this.
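The batching step of the flow above can be sketched as follows. This is a minimal illustration only: the class, method, and constant names are hypothetical, and the number of requests is assumed to be rounded up so that a final partial batch is still sent.

```java
import java.util.ArrayList;
import java.util.List;

public class TitleBatchExample {

  // Batch size taken from the flow above: at most 1000 title ids per request.
  static final int MAX_BATCH_SIZE = 1000;

  // Number of APIGEE requests needed: total / batchSize, rounded up.
  static int requestCount(int totalIds, int batchSize) {
    return (totalIds + batchSize - 1) / batchSize;
  }

  // Split the ids into consecutive batches of at most batchSize elements.
  static <T> List<List<T>> partition(List<T> ids, int batchSize) {
    List<List<T>> batches = new ArrayList<>();
    for (int from = 0; from < ids.size(); from += batchSize) {
      int to = Math.min(from + batchSize, ids.size());
      batches.add(new ArrayList<>(ids.subList(from, to)));
    }
    return batches;
  }

  public static void main(String[] args) {
    List<Integer> titleIds = new ArrayList<>();
    for (int i = 0; i < 2500; i++) {
      titleIds.add(i);
    }
    List<List<Integer>> batches = partition(titleIds, MAX_BATCH_SIZE);
    System.out.println(requestCount(titleIds.size(), MAX_BATCH_SIZE)); // 3
    System.out.println(batches.get(2).size()); // 500
  }
}
```

An empty id list produces zero batches, which matches the "no records - return an empty result" branch of the flow.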
Questions:
# | Q | A |
---|---|---|
1 | What is the preferred delimiter for the exported file: comma (,) or tab (\t)? If the delimiter is a comma (,), then: example Reading: Harvard Views of Readers, Readership, and Reading History. If the delimiter is a tab (\t), then: | Answer from Khalilah Gambrell: |
2 | Why can't the backend return a file? | Answer from Natalia Zaitseva: the backend has to get the package name for the export file name from RM API. Since the service limits the number of requests, the backend should avoid sending additional requests unless strictly necessary. |
- The maximum amount of title cost-per-use information that can be exported should be defined in a separate spike.
Update:
For the PoC testing of the maximum number of titles that can be exported, the following setup was used:
Vagrant box - 'testing'
holdings credentials - sandbox
Export findings
Number of exported titles | Response time | Size | Notes/example files |
---|---|---|---|
< 20 000 | < 22 s | ~1.7 MB | |
56 508 | 51 s 15 ms | 3.71 MB | |
71 482 | 1 m 14 s | 7.93 MB | |
184 963 | 2 m 50 s | 16.74 MB | |
200 336 | 3 m 02 s | 17.7 MB | |
300 000 | - | - | OOM error java_pid1.hprof.zip |
An OutOfMemoryError occurs during the export when the number of titles is somewhere between 200 000 and 300 000. The current setup of the sandbox KB credentials does not contain that many titles.