...

More details on the feature: [UXPROD-3874]; spike story: [MODELINKS-79]

Requirements

Functional Requirements

...

  1. The UI sends a request to the backend API.
  2. mod-quick-marc receives the request and converts the record into an SRS-like format.
  3. mod-quick-marc sends a request to the mod-entities-links API.
  4. mod-entities-links receives the request and fetches linking rules from the database using cache.
  5. From MARC bib fields that are applicable for linking according to linking rules, $0 subfield values are extracted.
  6. The $0 values are used to search for authorities in mod-search. The current mod-search endpoint also counts the links that already exist in the instance index; this counting should be skipped to speed up the process. TBD: the authority naturalId already exists in the internal database table 'authority_data'. Should this data be used before searching in mod-search?
  7. mod-entities-links receives a collection of authority records and prepares a request to the mod-source-record-storage.
  8. mod-entities-links sends a request to the mod-source-record-storage bulk endpoint.
  9. mod-entities-links receives a collection of authority source records.
  10. mod-entities-links analyzes the results, prepares link data according to the linking rules, and sets the constructed links in the record.
  11. mod-quick-marc receives the record with links.
  12. mod-quick-marc converts the record into the appropriate format.
  13. UI receives the record with suggested links.
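
Steps 5 and 10 of the flow can be sketched roughly as follows. This is a minimal illustration, not the module's implementation: the rule table `LINKING_RULES`, the function names, and the shape of the returned dictionary are all assumptions made for the example. Fields are assumed to arrive in the SRS-like format, and the search result is assumed to be already reduced to a mapping of naturalId to authority IDs.

```python
from __future__ import annotations

# Hypothetical linking rules (bib tag -> rule id). The real rules are stored
# in the mod-entities-links database; these values are placeholders.
LINKING_RULES = {"100": 1, "110": 2, "600": 8}


def extract_zero_values(fields: list[dict]) -> dict[int, list[str]]:
    """Step 5: collect $0 values from fields whose tag has a linking rule.

    Returns a mapping of field position -> list of $0 values.
    """
    result: dict[int, list[str]] = {}
    for position, field in enumerate(fields):
        for tag, body in field.items():
            # Control fields carry a plain string, so they are skipped here.
            if tag not in LINKING_RULES or not isinstance(body, dict):
                continue
            zeros = [sf["0"] for sf in body.get("subfields", []) if "0" in sf]
            if zeros:
                result[position] = zeros
    return result


def decide_link(zero_values: list[str],
                authority_ids_by_natural_id: dict[str, list[str]]) -> dict:
    """Step 10: choose a link status for one field.

    Exactly one matching authority gives a NEW link; none gives error
    cause 101; two or more give error cause 102.
    """
    matches: set[str] = set()
    for value in zero_values:
        matches.update(authority_ids_by_natural_id.get(value, []))
    if not matches:
        return {"status": "ERROR", "errorCause": "101"}
    if len(matches) > 1:
        return {"status": "ERROR", "errorCause": "102"}
    return {"status": "NEW", "authorityId": next(iter(matches))}
```

The sketch compares $0 values to naturalId values verbatim; any normalization of $0 (for example, stripping a URL prefix) is out of scope here.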

API Design

mod-quick-marc

POST /records-editor/links/suggestion

This endpoint will be used to find valid links for a record and provide them to the UI. The request will include a JSON payload with the record data:

Request body
{
  "marcFormat": "BIBLIOGRAPHIC",
  "leader": "01587ccm a2200361   4500",
  "fields": [
    {
      "tag": "001",
      "content": "393893"
    },
    {
      "tag": "100",
      "content": "$a 393893 $b test $0 n1234567890 $9 312da284-a8fd-4c84-ae90-927539d6df93",
      "indicators": [
        "1",
        "2"
      ],
      "link": {
        "authorityId": "312da284-a8fd-4c84-ae90-927539d6df93",
        "authorityNaturalId": "n1234567890",
        "linkingRuleId": 1,
        "status": "ACTUAL"
      }
    },
    {
      "tag": "100",
      "content": "$a 393893 $b test $0 n1234567890 $9 312da284-a8fd-4c84-ae90-927539d6df93",
      "indicators": [
        "1",
        "2"
      ],
      "link": {
        "authorityId": "312da284-a8fd-4c84-ae90-927539d6df93",
        "authorityNaturalId": "n1234567890",
        "linkingRuleId": 1,
        "status": "ERROR"
      }
    },
    {
      "tag": "600",
      "content": "$a 393893 $b test",
      "indicators": [
        "1",
        "2"
      ]
    }
  ]
}

The response will include: suggested links with the status "NEW"; corrected data and the status "ACTUAL" for links that had the status "ERROR"; and links with the status "ERROR" and a cause code for fields where a link can't be assigned.

Response body
{
  "marcFormat": "BIBLIOGRAPHIC",
  "leader": "01587ccm a2200361   4500",
  "fields": [
    {
      "tag": "001",
      "content": "393893"
    },
    {
      "tag": "100",
      "content": "$a 393893 $b test $0 n1234567890 $9 312da284-a8fd-4c84-ae90-927539d6df93",
      "indicators": [
        "1",
        "2"
      ],
      "link": {
        "authorityId": "312da284-a8fd-4c84-ae90-927539d6df93",
        "authorityNaturalId": "n1234567890",
        "linkingRuleId": 1,
        "status": "ACTUAL"
      }
    },
    {
      "tag": "110",
      "content": "$a 393893 $b updated $0 n1234567890 $9 312da284-a8fd-4c84-ae90-927539d6df93",
      "indicators": [
        "1",
        "2"
      ],
      "link": {
        "authorityId": "312da284-a8fd-4c84-ae90-927539d6df93",
        "authorityNaturalId": "n1234567890",
        "linkingRuleId": 1,
        "status": "NEW"
      }
    },
    {
      "tag": "600",
      "content": "$a 393893 $b test",
      "indicators": [
        "1",
        "2"
      ],
      "link": {
        "status": "ERROR",
        "errorCauseCode": "101"
      }
    }
  ]
}
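
A consumer of this response might group the returned fields by link status before presenting them. The helper below is only a sketch of that idea; the function name and the "UNLINKED" label for fields without a link object are assumptions, not part of the API.

```python
from collections import defaultdict


def group_tags_by_link_status(record: dict) -> dict:
    """Group field tags of a quickMARC-style record by their link status.

    Fields without a "link" object are collected under the hypothetical
    label "UNLINKED".
    """
    groups = defaultdict(list)
    for field in record.get("fields", []):
        link = field.get("link")
        status = link["status"] if link else "UNLINKED"
        groups[status].append(field["tag"])
    return dict(groups)
```

For the response body shown above, this would place tag 100 under "ACTUAL", tag 110 under "NEW", tag 600 under "ERROR", and tag 001 under "UNLINKED".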

Error cause types:

Error cause code | Description
101 | Applicable authority was not found
102 | 2 or more applicable authorities were found
103 | Auto-linking feature is disabled
TBD
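
On the client side, the cause codes could be modeled as an enumeration. The sketch below covers only the three codes listed so far; the class and function names are assumptions, and unknown codes are tolerated because the list is still TBD.

```python
from enum import Enum


class LinkErrorCause(Enum):
    """Error cause codes listed in this document (the list is not final)."""
    AUTHORITY_NOT_FOUND = "101"
    MULTIPLE_AUTHORITIES_FOUND = "102"
    AUTO_LINKING_DISABLED = "103"


_DESCRIPTIONS = {
    LinkErrorCause.AUTHORITY_NOT_FOUND: "applicable authority was not found",
    LinkErrorCause.MULTIPLE_AUTHORITIES_FOUND: "2 or more applicable authorities were found",
    LinkErrorCause.AUTO_LINKING_DISABLED: "auto linking feature is disabled",
}


def describe_error_cause(code: str) -> str:
    """Return a description for a cause code, tolerating codes added later."""
    try:
        return _DESCRIPTIONS[LinkErrorCause(code)]
    except ValueError:
        return f"unknown error cause ({code})"
```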


Request body
{
  "records": [
    {
      "fields": [
        {
          "001": "393893"
        },
        {
          "100": {
            "ind1": "/",
            "ind2": "/",
            "subfields": [
              {
                "a": "Mozart, Wolfgang Amadeus,"
              },
              {
                "d": "1756-1791."
              },
              {
                "0": "12345"
              },
              {
                "9": "b9a5f035-de63-4e2c-92c2-07240c88b817"
              }
            ],
            "linkStatus": "ACTUAL"
          }
        },
        {
          "110": {
            "ind1": "/",
            "ind2": "/",
            "subfields": [
              {
                "a": "Mozart"
              }
            ]
          }
        }
      ],
      "leader": "01706ccm a2200361   4500"
    }
  ]
}

The response will include: suggested links with the status "NEW"; corrected data and the status "ACTUAL" for links that had the status "ERROR"; and links with the status "ERROR" and a cause code for fields where a link can't be assigned.

Response body
{
  "records": [
    {
      "fields": [
        {
          "001": "393893"
        },
        {
          "100": {
            "ind1": "/",
            "ind2": "/",
            "subfields": [
              {
                "a": "Mozart, Wolfgang Amadeus,"
              },
              {
                "d": "1756-1791."
              },
              {
                "0": "12345"
              },
              {
                "9": "b9a5f035-de63-4e2c-92c2-07240c88b817"
              }
            ],
            "linkStatus": "ACTUAL"
          }
        },
        {
          "110": {
            "ind1": "/",
            "ind2": "/",
            "subfields": [
              {
                "a": "Mozart"
              },
              {
                "0": "12345"
              },
              {
                "9": "b9a5f035-de63-4e2c-92c2-07240c88b817"
              }
            ],
            "linkStatus": "NEW"
          }
        },
        {
          "130": {
            "ind1": "/",
            "ind2": "/",
            "subfields": [
              {
                "a": "Mozart"
              }
            ],
            "linkStatus": "ERROR",
            "errorStatusCode": "101"
          }
        }
      ],
      "leader": "01706ccm a2200361   4500"
    }
  ]
}
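
The quickMARC-style payload keeps subfields in a single "content" string, while the SRS-like payload lists them individually. The conversion in step 2 of the flow could look roughly like the sketch below. The function names are hypothetical, and the parsing is deliberately simplified: it assumes every subfield is written as "$<code> <value>" and that values do not contain a "$" character.

```python
import re

# Matches "$a value" pairs; assumes values never contain "$".
_SUBFIELD_PATTERN = re.compile(r"\$(\w) ([^$]*)")


def parse_subfields(content: str) -> list:
    """Split a quickMARC content string into SRS-like subfield objects."""
    return [
        {code: value.strip()}
        for code, value in _SUBFIELD_PATTERN.findall(content)
    ]


def to_srs_like_field(field: dict) -> dict:
    """Convert one quickMARC-style field into the SRS-like shape.

    Fields without indicators are treated as control fields and keep
    their content as a plain string.
    """
    indicators = field.get("indicators")
    if not indicators:
        return {field["tag"]: field["content"]}
    return {
        field["tag"]: {
            "ind1": indicators[0],
            "ind2": indicators[1],
            "subfields": parse_subfields(field["content"]),
        }
    }
```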

mod-source-record-storage

POST /source-storage/batch/parsed-records/fetch


Request body
{
  "conditions": {
    "ids": [
      "312da284-a8fd-4c84-ae90-927539d6df93",
      "934fee76-89e5-4046-89f0-d812e5368e1c"
    ],
    "idType": "EXTERNAL"
  },
  "data": {
    "fieldsRange": "010,100-199"
  },
  "recordType": "MARC_AUTHORITY"
}
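
A caller could assemble this request body from the authority IDs returned by the search. The builder below is a sketch only: the function name and its default argument are assumptions, and the de-duplication step is added because several bib fields may point to the same authority.

```python
def build_fetch_request(authority_ids: list, fields_range: str = "010,100-199") -> dict:
    """Build the batch fetch request body for MARC authority records.

    Duplicate IDs are dropped while the original order is preserved.
    """
    unique_ids = list(dict.fromkeys(authority_ids))
    return {
        "conditions": {"ids": unique_ids, "idType": "EXTERNAL"},
        "data": {"fieldsRange": fields_range},
        "recordType": "MARC_AUTHORITY",
    }
```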



The response will include the collection of records that match the conditions. Each record contains all IDs related to it and only the fields covered by the fieldsRange field.

Response body
{
  "records": [
    {
      "id": "c56b70ce-4ef6-47ef-8bc3-c470bafa0b8c",
      "externalIdsHolder": {
        "authorityId": "b9a5f035-de63-4e2c-92c2-07240c89b817"
      },
      "recordType": "MARC_AUTHORITY",
      "recordState": "ACTUAL",
      "parsedRecord": {
        "id": "c9db5d7a-e1d4-11e8-9f32-f2801f1b9fd1",
        "content": {
          "fields": [
            {
              "010": {
                "ind1": " ",
                "ind2": " ",
                "subfields": [
                  {
                    "a": "2001000234"
                  }
                ]
              }
            },
            {
              "100": {
                "ind1": "/",
                "ind2": "/",
                "subfields": [
                  {
                    "a": "Mozart, Wolfgang Amadeus"
                  },
                  {
                    "d": "1756-1791"
                  }
                ]
              }
            },
            {
              "110": {
                "ind1": "1",
                "ind2": "0",
                "subfields": [
                  {
                    "a": "Works"
                  }
                ]
              }
            }
          ],
          "leader": "01706ccm a2200361   4500"
        }
      }
    }
  ],
  "totalRecords": 1
}
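
The effect of fieldsRange on a record can be illustrated with a small filter. This sketch assumes the value is a comma-separated list of single tags and inclusive "start-end" ranges, as in "010,100-199", and that tags are three-character strings so that plain string comparison orders them correctly. The function names are hypothetical.

```python
def parse_fields_range(fields_range: str) -> list:
    """Parse "010,100-199" into inclusive (start, end) tag pairs."""
    ranges = []
    for part in fields_range.split(","):
        part = part.strip()
        if not part:
            continue
        start, _, end = part.partition("-")
        # A single tag such as "010" is treated as the range 010-010.
        ranges.append((start, end or start))
    return ranges


def filter_fields(fields: list, fields_range: str) -> list:
    """Keep only SRS-like fields whose tag falls inside the given ranges."""
    ranges = parse_fields_range(fields_range)
    return [
        field
        for field in fields
        if any(start <= tag <= end for tag in field for start, end in ranges)
    ]
```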

GET /search/authorities

New query parameter to add:

Parameter | Type | Note
includeNumberOfTitles | boolean (default = true) | If false, the search for the number of linked instances is not performed
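
A request that skips the linked-instances count could be built as in the sketch below. The CQL expression used to match several naturalId values is an assumption for illustration and should be checked against the mod-search query syntax; the function name and base URL are hypothetical as well.

```python
from urllib.parse import urlencode


def build_authority_search_url(base_url: str, natural_ids: list) -> str:
    """Build a GET /search/authorities URL without title counting.

    The query syntax (naturalId==("a" or "b")) is assumed, not confirmed.
    """
    quoted = " or ".join(f'"{natural_id}"' for natural_id in natural_ids)
    params = {
        "query": f"naturalId==({quoted})",
        "includeNumberOfTitles": "false",
    }
    return f"{base_url}/search/authorities?{urlencode(params)}"
```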


Performance

Considerations

  1. Searching by naturalId in mod-search, rather than directly in mod-source-record-storage, is expected to decrease response time when the system holds more than 1M records. (mod-search will also be needed for possible future requirements to automate linking by data other than $0.)
  2. Disabling the linked-instances count in the mod-search authority request is expected to decrease response time.
  3. Returning only the required fields in the mod-source-record-storage response will decrease the amount of data transferred over HTTP. This should be tested to determine whether the extra processing degrades performance. There are two options: get the record as jsonb from the marc_records table and retain only the needed fields, or construct the record field by field from the partitioned marc_indexers table.
  4. Using the mod-search and mod-source-record-storage bulk endpoints will decrease response time.

Testing

Performance testing has to be done in an environment with:

  • > 1M authority records
  • > 1M MARC-based instance records
  • Prepared MARC bib records that have more than 50 fields applicable for linking, where every such field has a $0 value that matches an authority existing in the system.
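
Test records of this kind could be generated rather than prepared by hand. The sketch below builds a quickMARC-style record with one linkable field per naturalId. The function name, the default tag "650", the indicator values, and the heading text are all assumptions; the tag must be one that the linking rules actually cover, and the naturalId values must belong to authorities loaded into the test environment.

```python
def build_test_bib_record(natural_ids: list, tag: str = "650") -> dict:
    """Build a quickMARC-style bib record with one $0 field per naturalId."""
    fields = [{"tag": "001", "content": "393893"}]
    for index, natural_id in enumerate(natural_ids):
        fields.append({
            "tag": tag,
            "content": f"$a Test heading {index} $0 {natural_id}",
            "indicators": ["1", "2"],
        })
    return {
        "marcFormat": "BIBLIOGRAPHIC",
        "leader": "01587ccm a2200361   4500",
        "fields": fields,
    }
```

Passing 51 or more naturalId values yields a record that satisfies the ">50 linkable fields" condition.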

Tests are needed for:

  • 1 request/sec
  • 10 requests/sec
  • 100 requests/sec
  • 1000 requests/sec

Info: Based on the test results, performance improvements may be suggested if required.