Import is considered finished when all the chunks are either processed successfully or marked as ERROR, the appropriate JobExecution status is set, and the file is visible in the logs section on the UI.
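The completion criteria above can be expressed as a single check. The following is only a minimal sketch: the names `ChunkState`, `ImportSnapshot` and `is_import_finished` are hypothetical and are not taken from mod-source-record-manager or mod-data-import; the chunk states other than ERROR are assumptions made for illustration.

```python
from dataclasses import dataclass
from enum import Enum
from typing import List


class ChunkState(Enum):
    """Hypothetical chunk states; only ERROR is named on this page."""
    IN_PROGRESS = "IN_PROGRESS"
    COMPLETED = "COMPLETED"
    ERROR = "ERROR"


@dataclass
class ImportSnapshot:
    """Hypothetical view of one import at a point in time."""
    chunk_states: List[ChunkState]
    job_status_set: bool       # the appropriate JobExecution status has been set
    visible_in_ui_logs: bool   # the file is shown in the logs section on the UI


def is_import_finished(snapshot: ImportSnapshot) -> bool:
    """Return True only when all three completion conditions hold."""
    terminal = {ChunkState.COMPLETED, ChunkState.ERROR}
    all_chunks_done = all(state in terminal for state in snapshot.chunk_states)
    return all_chunks_done and snapshot.job_status_set and snapshot.visible_in_ui_logs


if __name__ == "__main__":
    done = ImportSnapshot([ChunkState.COMPLETED, ChunkState.ERROR], True, True)
    running = ImportSnapshot([ChunkState.COMPLETED, ChunkState.IN_PROGRESS], True, True)
    print(is_import_finished(done))
    print(is_import_finished(running))
```

Note that a chunk marked as ERROR still counts as finished here, which matches the definition used for the timings on this page.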

Performance was measured locally on folio-testing-backend Vagrant box version 5.0.0-20190619.2334 (allocated memory 16 GB) with mod-source-record-manager and mod-data-import deployed additionally. Each module running in a Docker container was allocated 256 MB of JVM heap memory.


Files used for testing:

msplit30000.mrc
RecordsForSRS_20190322.json

msplit30000.mrc contains 30,000 raw MARC bibliographic records; RecordsForSRS_20190322.json contains 28,306 MARC records in JSON format.

Results are shown below; default values are highlighted. On average, it takes about 25 sec per 1000 raw MARC records and about 22 sec per 1000 JSON records.

[Image: performance results]


Data-import performance was also tested on https://folio-snapshot-load.aws.indexdata.com using the same files, but only with the default chunk size and queue size parameters (50 and 10 respectively). It consistently takes 8 min to load each of the files, which works out to about 17 sec per 1000 records.
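The per-1000-record figure follows directly from the 8-minute load time and the record counts of the two test files. A short sketch of that arithmetic is given below; the helper name `seconds_per_thousand` is made up for illustration, and the inputs are the numbers quoted on this page.

```python
def seconds_per_thousand(total_seconds: float, record_count: int) -> float:
    """Average load time in seconds per 1000 records."""
    return total_seconds / record_count * 1000


# Load time reported for each file on the snapshot-load environment.
LOAD_SECONDS = 8 * 60

# Record counts of the two test files used on this page.
FILES = {
    "msplit30000.mrc": 30_000,
    "RecordsForSRS_20190322.json": 28_306,
}

if __name__ == "__main__":
    for name, count in FILES.items():
        rate = seconds_per_thousand(LOAD_SECONDS, count)
        print(f"{name}: {rate:.1f} sec per 1000 records")
```

This gives 16.0 sec per 1000 records for the raw MARC file and 17.0 sec for the JSON file, consistent with the rounded figure of about 17 sec.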
