SPIKE: MODSOURMAN-526 Verify persist value in DB during parsing 004 field
Participants

Role | Name |
|---|---|
Solution Architect | @Vladimir Shalaev |
Product Owner | @Khalilah Gambrell |
Java Lead | @Kateryna Senchenko |
Goal and requirements
Provide a solution to verify that the 004 value is an Instance record's HRID in the database
Requirements
Ensure a MARC holdings record always has a 004 value AND only one such value
Ensure that the 004 value is an Instance record's HRID
Ensure the 004 value does not contain a subfield delimiter
Cannot have multiple 004 values on a MARC Holdings record
Ensure that if an invalid 004 value is set in the MARC Holdings record, an error message is returned and the record is not created/saved to SRS
Ensure that a valid 004 value links an Instance record to the MARC Holdings record as shown in the above screenshots
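The requirements above can be read as a sequence of checks on the 004 values of one MARC Holdings record. The following is a minimal sketch only, not the module's code: the class name, the error messages, and the in-memory set standing in for the database lookup are all assumptions made for illustration. It also assumes that either the raw 0x1F character or a "$" counts as a subfield delimiter.

```java
import java.util.List;
import java.util.Optional;
import java.util.Set;

/** Hypothetical helper for illustration; not part of SRM or SRS. */
final class Holdings004Validator {

  // Raw MARC subfield delimiter (unit separator, 0x1F).
  private static final char SUBFIELD_DELIMITER = '\u001F';

  private Holdings004Validator() {}

  /**
   * Returns an error message, or Optional.empty() when the record passes all checks.
   *
   * @param values004     every 004 value found in the MARC Holdings record
   * @param instanceHrids known Instance HRIDs (assumption: stands in for the database)
   */
  static Optional<String> validate(List<String> values004, Set<String> instanceHrids) {
    if (values004.isEmpty()) {
      return Optional.of("004 field is missing");
    }
    if (values004.size() > 1) {
      return Optional.of("More than one 004 field");
    }
    String value = values004.get(0);
    if (value == null || value.isBlank()) {
      return Optional.of("004 field is empty");
    }
    // Assumption: "$" is treated as the displayed form of the subfield delimiter.
    if (value.indexOf(SUBFIELD_DELIMITER) >= 0 || value.indexOf('$') >= 0) {
      return Optional.of("004 field contains a subfield delimiter");
    }
    if (!instanceHrids.contains(value)) {
      return Optional.of("004 value is not an existing Instance HRID: " + value);
    }
    return Optional.empty();
  }
}
```

A record is accepted only when `validate` returns an empty Optional; any returned message corresponds to the error that should prevent the record from being saved to SRS.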
Example
A Holdings record represents the location where one will find a title (referred to in FOLIO as an instance)
Example: Book Title Harry Potter is held at the Main Library - Dekin Wing
Harry Potter is a FOLIO instance record
Main Library - Dekin Wing is a Holdings record in FOLIO
Every MARC Holdings record must have only one 004 value
The 004 value is the HRID of the Instance record that the Holdings record is linked to
Having a valid Instance record HRID in the 004 field is the only way that a user can view the Holdings record on FOLIO (see below examples)
Without a valid Instance record HRID, the Holdings record is not discoverable via FOLIO; a Holdings record with no linked instance is meaningless
Create MARC bib record
First, create a MARC bib record. To initiate record parsing, send a POST request to /change-manager/jobExecutions/{jobExecutionId}/records containing a RawRecordsDto, which holds the list of raw records in its "initialRecords" field. The list can contain records in different formats ("MARC_RAW", "MARC_JSON", "MARC_XML").
{jobExecutionId} is the JobExecution id, which can be retrieved from the response of the previous request.
POST request to create a MARC bib
POST /change-manager/jobExecutions/{jobExecutionId}/records
curl -w '\n' -X POST -D - \
  -H "Content-type: application/json" \
  -H "Accept: text/plain, application/json" \
  -H "x-okapi-tenant: diku" \
  -H "x-okapi-token: eyJhbGciOiJIUzI1NiJ9.eyJzdWIiOiJkaWt1X2FkbWluIiwidXNlcl9pZCI6IjQwZDFiZDcxLWVhN2QtNTk4Ny1iZTEwLTEyOGUzODJiZDMwNyIsImNhY2hlX2tleSI6IjMyYTJhNDQ3LWE4MzQtNDE1Ni1iYmZjLTk4YTEyZWVhNzliMyIsImlhdCI6MTU1NzkyMzI2NSwidGVuYW50IjoiZGlrdSJ9.AgPDmXIOsudFB_ugWYvJCdyqq-1AQpsRWLNt9EvzCy0" \
  -d @rawRecordsDto.json \
  https://folio-testing-okapi.dev.folio.org:443/change-manager/jobExecutions/647c2dee-70a8-4ae8-aba4-81579ee17e58/records
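For callers that issue this request from Java rather than curl, the same call could be built with the JDK's built-in HTTP client (Java 11+). This is a sketch under assumptions: the class and method names are hypothetical, and the token, tenant and body are passed in by the caller rather than hard-coded.

```java
import java.net.URI;
import java.net.http.HttpRequest;

/** Hypothetical helper for illustration; mirrors the curl call above. */
final class PostRecordsRequest {

  private PostRecordsRequest() {}

  /** Builds (but does not send) the POST request carrying a RawRecordsDto as JSON. */
  static HttpRequest build(String okapiUrl, String tenant, String token,
                           String jobExecutionId, String rawRecordsDtoJson) {
    URI uri = URI.create(
        okapiUrl + "/change-manager/jobExecutions/" + jobExecutionId + "/records");
    return HttpRequest.newBuilder()
        .uri(uri)
        .header("Content-type", "application/json")
        .header("Accept", "text/plain, application/json")
        .header("x-okapi-tenant", tenant)
        .header("x-okapi-token", token)
        .POST(HttpRequest.BodyPublishers.ofString(rawRecordsDtoJson))
        .build();
  }
}
```

The request can then be sent with `HttpClient.newHttpClient().send(request, HttpResponse.BodyHandlers.ofString())`.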
Example of rawRecordsDto.json for parsing MARC records in JSON format:
json format
{
  "id": "22fafcc3-f582-493d-88b0-3c538480cd83",
  "recordsMetadata": {
    "last": false,
    "counter": 1,
    "total": 1,
    "contentType": "MARC_JSON"
  },
  "initialRecords": [
    {
      "record": "{\"leader\": \"00648cam a2200193 a 4500\",\r\n \"fields\": [\r\n {\r\n \"001\": \"FOLIOstorage\"\r\n },\r\n {\r\n \"008\": \"960521s1972\\\\\\\\\\\\\\\\se\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\0\\\\\\\\\\\\swe\\\\\\\\\"\r\n },\r\n {\r\n \"041\": {\r\n \"ind1\": \"1\",\r\n \"ind2\": \"\\\\\",\r\n \"subfields\": [\r\n {\r\n \"a\": \"swe\"\r\n }\r\n ]\r\n }\r\n },\r\n {\r\n \"096\": {\r\n \"ind1\": \"\\\\\",\r\n \"ind2\": \"\\\\\",\r\n \"subfields\": [\r\n {\r\n \"y\": \"Z\"\r\n },\r\n {\r\n \"b\": \"TAp Chalmers tekniska h\u00F6gskola. Inst. f\u00F6r byggnadsstatik. Skrift. 1972:4\"\r\n },\r\n {\r\n \"s\": \"g\"\r\n }\r\n ]\r\n }\r\n },\r\n {\r\n \"100\": {\r\n \"ind1\": \"1\",\r\n \"ind2\": \"\\\\\",\r\n \"subfields\": [\r\n {\r\n \"a\": \"Sahlin, Sven\"\r\n }\r\n ]\r\n }\r\n },\r\n {\r\n \"245\": {\r\n \"ind1\": \"0\",\r\n \"ind2\": \"0\",\r\n \"subfields\": [\r\n {\r\n \"a\": \"P\u00E5lslagning\"\r\n }\r\n ]\r\n }\r\n },\r\n {\r\n \"260\": {\r\n \"ind1\": \"\\\\\",\r\n \"ind2\": \"\\\\\",\r\n \"subfields\": [\r\n {\r\n \"c\": \"1972\"\r\n }\r\n ]\r\n }\r\n },\r\n {\r\n \"300\": {\r\n \"ind1\": \"\\\\\",\r\n \"ind2\": \"\\\\\",\r\n \"subfields\": [\r\n {\r\n \"a\": \"19 bl.\"\r\n }\r\n ]\r\n }\r\n },\r\n {\r\n \"440\": {\r\n \"ind1\": \"\\\\\",\r\n \"ind2\": \"\\\\\",\r\n \"subfields\": [\r\n {\r\n \"a\": \"Skrift, Chalmers tekniska h\u00F6gskola, Institutionen f\u00F6r byggnadsstatik\"\r\n },\r\n {\r\n \"x\": \"9903909802 ;\"\r\n },\r\n {\r\n \"v\": \"72:4\"\r\n }\r\n ]\r\n }\r\n },\r\n {\r\n \"907\": {\r\n \"ind1\": \"\\\\\",\r\n \"ind2\": \"\\\\\",\r\n \"subfields\": [\r\n {\r\n \"a\": \".b11154585\"\r\n },\r\n {\r\n \"b\": \"hbib \"\r\n },\r\n {\r\n \"c\": \"s\"\r\n }\r\n ]\r\n }\r\n },\r\n {\r\n \"902\": {\r\n \"ind1\": \"\\\\\",\r\n \"ind2\": \"\\\\\",\r\n \"subfields\": [\r\n {\r\n \"a\": \"190206\"\r\n }\r\n ]\r\n }\r\n },\r\n {\r\n \"998\": {\r\n \"ind1\": \"\\\\\",\r\n \"ind2\": \"\\\\\",\r\n \"subfields\": [\r\n {\r\n \"b\": \"0\"\r\n },\r\n {\r\n \"c\": \"990511\"\r\n },\r\n {\r\n \"d\": \"m\"\r\n },\r\n {\r\n \"e\": \"b \"\r\n },\r\n {\r\n \"f\": \"s\"\r\n },\r\n {\r\n \"g\": \"0\"\r\n }\r\n ]\r\n }\r\n },\r\n {\r\n \"909\": {\r\n \"ind1\": \"0\",\r\n \"ind2\": \"0\",\r\n \"subfields\": [\r\n {\r\n \"a\": \"m\"\r\n },\r\n {\r\n \"c\": \"a\"\r\n },\r\n {\r\n \"d\": \"b\"\r\n }\r\n ]\r\n }\r\n },\r\n {\r\n \"945\": {\r\n \"ind1\": \"\\\\\",\r\n \"ind2\": \"\\\\\",\r\n \"subfields\": [\r\n {\r\n \"l\": \"hbib3\"\r\n },\r\n {\r\n \"a\": \"TAp Chalmers tekniska h\u00F6gskola.Inst. f\u00F6r byggnadsstatik. Skrift 72:4\"\r\n }\r\n ]\r\n }\r\n }\r\n ]\r\n }"
    }
  ]
}
Note: each chunk needs its own unique UUID in the "id" field.
Or you can download via UI with
Create MARC Holdings record with valid 004 field
When a MARC Holdings record with a 004 field is created, the SRM module runs a verification that sends a request to SRS with the 004 value as a parameter. If the 004 value is the HRID of the Instance record that the Holdings record is linked to, and that record exists in the database, the MARC Holdings record is saved without any error.
Example of a MARC Holdings raw record:
During testing on Rancher, a file of MARC Holdings records with a valid 004 field was uploaded. As a result, the record was found in SRS and the MARC Holdings record was saved correctly:
Create MARC Holdings record without/with invalid 004 field
When a MARC Holdings record is created, the SRM module runs a verification that sends a request to SRS with the 004 value as a parameter. If the 004 value is missing, or is not the HRID of an Instance record that exists in the database, the MARC Holdings record is not saved and an error is returned.
Example of a MARC Holdings raw record:
During testing on Rancher, the file CornellFOLIOExemplars_Holdings.mrc was used to load MARC Holdings records whose 004 field contains an HRID that does not exist in the database. As a result, an error appears in the console:
When MARC Holdings records are imported with a mix of existing and non-existing MARC Bib ids, the following log is produced:
Module changes
SRM
Change the logic of ChangeEngineServiceImpl by adding an SRS client that retrieves the MARC bib record by its 001 field.
Example of logic
private void postProcessMarcHoldingsRecord(Record record, InitialRecord rawRecord, OkapiConnectionParams okapiParams) {
  var controlFieldValue = getControlFieldValue(record, TAG_004);
  if (isBlank(controlFieldValue)) {
    // No 004 value at all: mark the record as an error without calling SRS.
    LOGGER.error(HOLDINGS_004_TAG_ERROR_MESSAGE);
    record.setParsedRecord(null);
    record.setErrorRecord(new ErrorRecord()
      .withContent(rawRecord)
      .withDescription(new JsonObject().put("message", HOLDINGS_004_TAG_ERROR_MESSAGE).encode()));
  } else {
    // Ask SRS whether a MARC bib whose 001 equals the 004 value exists.
    SourceStorageStreamClient sourceStorageStreamClient = getSourceStorageStreamClient(okapiParams);
    MarcRecordSearchRequest marcRecordSearchRequest = new MarcRecordSearchRequest();
    marcRecordSearchRequest.setFieldsSearchExpression("001.value = '" + controlFieldValue + "'");
    try {
      // The callback is invoked asynchronously, so the record may be marked
      // as an error only after this method has already returned.
      sourceStorageStreamClient.postSourceStorageStreamMarcRecordIdentifiers(marcRecordSearchRequest, asyncResult -> {
        if (asyncResult.succeeded()) {
          var body = asyncResult.result().body();
          LOGGER.info("Response from SRS with MARC Bib 001 field: {} and body: {}", controlFieldValue, body);
          var object = new JsonObject(body);
          var records = object.getJsonArray("records");
          if (records.isEmpty()) {
            // No matching MARC bib: reject the holdings record.
            LOGGER.error(HOLDINGS_004_TAG_ERROR_MESSAGE);
            record.setParsedRecord(null);
            record.setErrorRecord(new ErrorRecord()
              .withContent(rawRecord)
              .withDescription(new JsonObject().put("message", HOLDINGS_004_TAG_ERROR_MESSAGE).encode()));
          }
        } else {
          LOGGER.error("Error during call post request to SRS", asyncResult.cause());
        }
      });
    } catch (Exception e) {
      // Log the exception itself; e.getCause() may be null.
      LOGGER.error("Error during call post request to SRS", e);
    }
  }
}

private SourceStorageStreamClient getSourceStorageStreamClient(OkapiConnectionParams okapiParams) {
  var token = okapiParams.getToken();
  var okapiUrl = okapiParams.getOkapiUrl();
  var tenantId = okapiParams.getTenantId();
  return new SourceStorageStreamClient(okapiUrl, tenantId, token);
}

Write tests to cover the new logic.
SRS
Implement new endpoint for retrieving invalid marc bib ids
Create separate endpoint for searching invalid marc bib
Create new DTO for response
Extend the RAML file with the new endpoint
Create service and dao layer
Write tests for new functionality
Query for receiving invalid marc bib ids from database
Query for receiving invalid marc bib ids
SELECT marc.hrid
FROM (
  SELECT unnest(ARRAY['222222222222','in00000000313','111111111111','in00000000316']) AS hrid
) AS marc
LEFT JOIN diku_mod_source_record_storage.records_lb lb
  ON (lb.instance_hrid = marc.hrid AND lb.record_type = 'MARC_BIB')
WHERE lb.instance_hrid IS NULL
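The query is an anti-join: it keeps only those candidate HRIDs that have no matching MARC_BIB row in records_lb. The same semantics can be sketched in plain Java, which may be handy when writing unit tests for the service layer. The class name is hypothetical, and the set of existing HRIDs stands in for the table contents.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;

/** Hypothetical in-memory equivalent of the LEFT JOIN ... IS NULL query above. */
final class MissingHridFinder {

  private MissingHridFinder() {}

  /**
   * Returns the candidate HRIDs that are not present among the stored MARC bib HRIDs.
   * Input order is preserved here; the SQL query has no ORDER BY, so it gives no such guarantee.
   */
  static List<String> findMissing(List<String> candidateHrids, Set<String> marcBibHrids) {
    List<String> missing = new ArrayList<>();
    for (String hrid : candidateHrids) {
      if (!marcBibHrids.contains(hrid)) {
        missing.add(hrid);
      }
    }
    return missing;
  }
}
```

For instance, if only in00000000313 and in00000000316 were stored (an assumption made for this example), the four HRIDs from the query would yield 222222222222 and 111111111111 as invalid.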
Testing process
Changes should be tested on the Rancher environment.
File to load MARC Holdings with an invalid 004 field. The MARC bib HRIDs do not exist in the database:
File to load MARC Holdings with a valid 004 field. The MARC bib was created before the MARC Holdings:
Problems
The main problem found during the investigation:
Suppose a MARC Bib is loaded with a 001 field, for example 366832. After the MARC Bib is loaded, the original 001 value is moved to 035 and 001 is replaced by a newly generated HRID, for example in00000012415. A MARC Holdings record is then loaded with 004 = 366832. The proposed approach searches SRS for 366832 and does not find the MARC Bib, because it was saved under the new HRID in00000012415. As a result, the MARC Holdings record is not loaded.
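The mismatch can be reproduced with a small in-memory model. This is only an illustration of the problem described above, not SRS behaviour in detail: the class is hypothetical and a map keyed by the generated HRID stands in for the stored MARC Bib records.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical model of the 001 -> 035 move that happens when a MARC Bib is loaded. */
final class HridReassignmentDemo {

  // Key: generated HRID (the new 001). Value: incoming 001, now kept in 035.
  private final Map<String, String> legacyIdByHrid = new HashMap<>();

  /** Stores a bib under its generated HRID, remembering the incoming 001 value. */
  void importBib(String incoming001, String generatedHrid) {
    legacyIdByHrid.put(generatedHrid, incoming001);
  }

  /** Mirrors the proposed check: look the 004 value up among the current HRIDs only. */
  boolean existsByHrid(String value004) {
    return legacyIdByHrid.containsKey(value004);
  }
}
```

After `importBib("366832", "in00000012415")`, a lookup with `existsByHrid("366832")` returns false while `existsByHrid("in00000012415")` returns true, which is why a holdings record carrying the legacy identifier in 004 is rejected.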
Questions
Question | Answer |
|---|---|
Which status should be returned after data import has imported MARC Holdings with an invalid 004 field? Completed with errors? Failed? | From Khalilah: Failed |
If a MARC Holdings file contains both valid and invalid 004 fields (for example, a file with 3 records, one of which is correct and the others are not), will only the one valid record be saved? And what will the status be? Completed with errors? Failed? | From Khalilah: Completed with errors |
Data import must support a similar requirement today. Data import supports the ability to create/update Holdings records with the source = FOLIO. We need to find out what currently links a Holdings record to an Instance record and/or MARC bib record. Is any validation in place? | |
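The two answers above amount to a simple rule for the job status. A minimal sketch, assuming hypothetical enum and method names (the real status values used by data import may differ), and assuming that a file with no invalid records simply completes:

```java
/** Hypothetical status resolution based on the answers in the table above. */
final class HoldingsImportStatus {

  enum Status { COMPLETED, COMPLETED_WITH_ERRORS, FAILED }

  private HoldingsImportStatus() {}

  /**
   * @param validRecords   number of holdings records with a valid 004 value
   * @param invalidRecords number of holdings records with a missing or invalid 004 value
   */
  static Status resolve(int validRecords, int invalidRecords) {
    if (invalidRecords == 0) {
      return Status.COMPLETED;            // assumption: nothing went wrong
    }
    if (validRecords == 0) {
      return Status.FAILED;               // every record had an invalid 004
    }
    return Status.COMPLETED_WITH_ERRORS;  // valid records are saved, invalid ones rejected
  }
}
```

For the 3-record example in the table, `resolve(1, 2)` gives COMPLETED_WITH_ERRORS, and a file where every 004 is invalid, such as `resolve(0, 3)`, gives FAILED.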
Stories
Story | Jira | High level estimation (story points) |
|---|---|---|
Validate MARC Holdings 004 field from MARC Bib HRID | MODSOURMAN-544 | 5 |
Endpoint to verify invalid MARC Bib ids in the system | MODSOURCE-351 | 3 |