MARC records migration (authority)

Target release: Quesnelia (R1 2024)
Epic: UXPROD-4082
Document status: DRAFT

Feature Overview

Upgrading a large set of MARC authority records after mapping rule changes is difficult, as is migrating authority records from other systems to FOLIO. Updating authority records to apply mapping rule changes is a time-consuming process that takes multiple days, and data import (DI) does not scale well when libraries are doing other work with data import at the same time. Failure to update these records leads to data inconsistency. With the introduction of manual authority control/linking in Orchid, this issue may become more problematic for libraries, and it will become more challenging still as automated linking is introduced.

The proposed solution addresses issues related to remapping and initial record migration in terms of data volumes and expected performance.


Requirements

  • The solution should not use data import, as the core issue lies in migrating a large volume of records. Data import is not designed to handle millions of records.
  • Operate in the background without disrupting other FOLIO workflows, including data import.
  • Support UTF-8 encoding.
  • Support optimistic locking mechanism.
  • Enforce data validation rules during the creation of MARC authority records to prevent duplicates.
  • Apply the same data validation rules enforced during the updating of MARC authority records.
  • Provide a response that includes details on records that failed to create or update in SRS and Inventory.
  • Support two types of migration:
      • migrating all authority records from one release to another (remapping)
      • migrating a library with more than 500,000 authority records to FOLIO
  • For migration cases, the import file format is MARC21.
  • Provide users with a straightforward way to obtain and analyze all errors.
  • Provide users with a simple way to receive files containing the records with errors (the records that were not processed successfully).
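The optimistic locking and failure-reporting requirements above can be illustrated with a minimal sketch. It assumes a version number stored with each record; `InMemoryStore`, `VersionConflict`, and `save_all` are hypothetical names used only for illustration and do not describe the actual storage API.

```python
from dataclasses import dataclass, field


class VersionConflict(Exception):
    """Raised when the stored record changed since it was read."""


@dataclass
class InMemoryStore:
    # Hypothetical stand-in for authority storage: id -> (version, payload)
    records: dict = field(default_factory=dict)

    def update(self, record_id, expected_version, payload):
        current_version, _ = self.records[record_id]
        if current_version != expected_version:
            raise VersionConflict(
                f"{record_id}: expected v{expected_version}, found v{current_version}"
            )
        # The write succeeds only for the version the caller read, then bumps it.
        self.records[record_id] = (current_version + 1, payload)


def save_all(store, updates):
    """Apply every update and collect failure details instead of aborting."""
    failures = []
    for record_id, expected_version, payload in updates:
        try:
            store.update(record_id, expected_version, payload)
        except (VersionConflict, KeyError) as exc:
            failures.append({"id": record_id, "error": str(exc)})
    return failures
```

Collecting failures per record, rather than stopping at the first conflict, is one way to produce the error details and the file of unprocessed records that the requirements ask for.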

User Interface

No user interface


Technical Design

Architecture

Long term solution for migrating authority records

Database Schema

If the feature involves changes to the database, provide a schema or data model.

API Endpoints

Method | Path | Body | Response | Note
POST /marc-migrations

Body:

{
  "entityType": {
    "type": "string",
    "enum": ["AUTHORITY"]
  },
  "operationType": {
    "type": "string",
    "enum": ["REMAPPING"]
  }
}

Response:

{
  "id": {
    "type": "uuid"
  },
  "entityType": {
    "type": "string",
    "enum": ["AUTHORITY"]
  },
  "operationType": {
    "type": "string",
    "enum": ["REMAPPING"]
  },
  "status": {
    "type": "string",
    "enum": ["NEW"]
  }
}

Note: Register new marc-migration operation
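As an illustration of the registration request, the following sketch builds a body that conforms to the enums listed for POST /marc-migrations. The helper name `build_registration_body` is hypothetical, and the sketch only constructs and validates the JSON; it does not perform the HTTP call.

```python
import json

# Values allowed by the request schema shown for POST /marc-migrations.
ALLOWED_ENTITY_TYPES = {"AUTHORITY"}
ALLOWED_OPERATION_TYPES = {"REMAPPING"}


def build_registration_body(entity_type, operation_type):
    """Return the JSON request body, rejecting values outside the enums."""
    if entity_type not in ALLOWED_ENTITY_TYPES:
        raise ValueError(f"unsupported entityType: {entity_type}")
    if operation_type not in ALLOWED_OPERATION_TYPES:
        raise ValueError(f"unsupported operationType: {operation_type}")
    return json.dumps({"entityType": entity_type, "operationType": operation_type})


body = build_registration_body("AUTHORITY", "REMAPPING")
```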
PUT /marc-migrations/<id>/steps

Body:

{
  "operationStepType": {
    "type": "string",
    "enum": ["DATA_SAVING"]
  }
}

Response:

{
  "id": {
    "type": "uuid"
  },
  "entityType": {
    "type": "string",
    "enum": ["AUTHORITY"]
  },
  "operationType": {
    "type": "string",
    "enum": ["REMAPPING"]
  },
  "status": {
    "type": "string",
    "enum": ["DATA_SAVING"]
  }
}

Note: Trigger data-saving step for marc-migration operation
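A client would probably want to trigger the data-saving step only once mapping has finished. The sketch below assumes that DATA_MAPPING_COMPLETED is the required precondition; this document does not state that rule explicitly, and the helper name `data_saving_request` is hypothetical.

```python
def data_saving_request(operation):
    """Return (path, body) for the PUT call, or raise if mapping is not finished.

    Assumption: the step may start only from DATA_MAPPING_COMPLETED.
    """
    if operation["status"] != "DATA_MAPPING_COMPLETED":
        raise ValueError(
            f"cannot start data saving from status {operation['status']}"
        )
    path = f"/marc-migrations/{operation['id']}/steps"
    body = {"operationStepType": "DATA_SAVING"}
    return path, body
```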
GET /marc-migrations/<id>

Response:

{
  "id": {
    "type": "uuid"
  },
  "userId": {
    "type": "uuid"
  },
  "entityType": {
    "type": "string",
    "enum": ["AUTHORITY"]
  },
  "operationType": {
    "type": "string",
    "enum": ["REMAPPING", "IMPORT"]
  },
  "status": {
    "type": "string",
    "enum": [
      "NEW",
      "DATA_MAPPING",
      "DATA_MAPPING_COMPLETED",
      "DATA_MAPPING_FAILED",
      "DATA_SAVING",
      "DATA_SAVING_COMPLETED",
      "DATA_SAVING_FAILED"
    ]
  },
  "total_num_of_records": {
    "type": "integer"
  },
  "processed_num_of_records": {
    "type": "integer"
  },
  "start_time_mapping": {
    "type": "string",
    "format": "date-time"
  },
  "end_time_mapping": {
    "type": "string",
    "format": "date-time"
  },
  "start_time_saving": {
    "type": "string",
    "format": "date-time"
  },
  "end_time_saving": {
    "type": "string",
    "format": "date-time"
  }
}

Note: Return the marc-migration operation
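The fields returned by the GET endpoint are enough to derive progress and timing on the client side. The following sketch is one possible reading of such a response: the `summarize` helper is hypothetical, and treating the three listed statuses as final is an assumption rather than something this document defines.

```python
from datetime import datetime

# Assumption: these statuses mean no further work will happen on the operation.
FINAL_STATUSES = {"DATA_MAPPING_FAILED", "DATA_SAVING_COMPLETED", "DATA_SAVING_FAILED"}


def summarize(operation):
    """Derive percent processed, mapping duration, and a finished flag."""
    total = operation.get("total_num_of_records") or 0
    processed = operation.get("processed_num_of_records") or 0
    percent = round(100 * processed / total, 1) if total else 0.0

    start = operation.get("start_time_mapping")
    end = operation.get("end_time_mapping")
    mapping_seconds = None
    if start and end:
        # Timestamps are expected with an explicit offset, e.g. "+00:00".
        delta = datetime.fromisoformat(end) - datetime.fromisoformat(start)
        mapping_seconds = delta.total_seconds()

    return {
        "percent": percent,
        "mapping_seconds": mapping_seconds,
        "finished": operation.get("status") in FINAL_STATUSES,
    }
```

A polling client could call the GET endpoint at intervals and stop once `finished` is true.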

Data Flow

Described in the architecture design.


Development Tasks

  1. mod-marc-migrations
    1. create repository and base structure of the module
    2. create endpoints
      1. POST endpoint for registering new marc-migration operation
      2. PUT endpoint for triggering data-saving step for marc-migration operation
      3. GET endpoint for fetching marc-migration operation
    3. implement mechanism for async records mapping
      1. prepare chunks
      2. mapping chunks
    4. implement mechanism for async data saving
  2. mod-entities-links
    1. create a new endpoint that accepts a file name and processes the file
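The "prepare chunks" and "mapping chunks" tasks listed under mod-marc-migrations could be sketched as follows. This is only an illustration in Python using a thread pool: the function names, chunk size, and worker count are assumptions, and the real module may organize asynchronous mapping quite differently.

```python
from concurrent.futures import ThreadPoolExecutor
from itertools import islice


def prepare_chunks(record_ids, chunk_size):
    """Yield consecutive lists of at most chunk_size record ids."""
    iterator = iter(record_ids)
    while True:
        chunk = list(islice(iterator, chunk_size))
        if not chunk:
            return
        yield chunk


def map_chunk(chunk, mapper):
    """Map one chunk, keeping failed records separate from mapped ones."""
    mapped, failed = [], []
    for record in chunk:
        try:
            mapped.append(mapper(record))
        except Exception as exc:  # record the failure and continue with the chunk
            failed.append((record, str(exc)))
    return mapped, failed


def map_all(record_ids, mapper, chunk_size=2, workers=2):
    """Map all chunks concurrently; results come back in chunk order."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        results = pool.map(
            lambda chunk: map_chunk(chunk, mapper),
            prepare_chunks(record_ids, chunk_size),
        )
        return list(results)
```

Working in chunks keeps each unit of work small, so a failure affects only the records in that chunk and progress can be reported as chunks complete.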



Testing

Test Cases

  1. Test case 1
  2. Test case 2
  3. ...

Performance testing

Provide results of performance testing if needed.

Load testing

Provide results of load testing if needed.


Deployment

Deployment notes

Describe the deployment process.

Migration

Describe the migration process.


Documentation

Provide links to or include any documentation related to the feature, such as API documentation or user guides.


Dependencies

List any external dependencies to other teams and features required for this feature.


Risks and Mitigations

Identify potential risks associated with the feature and describe mitigation strategies.